A Deep Coarse-to-Fine Network for Head Pose Estimation from Synthetic Data

Yujia Wang¹ Wei Liang¹ Jianbing Shen¹ Yunde Jia¹ Lap-Fai Yu²

¹Beijing Institute of Technology ²George Mason University

Abstract

Many human-computer interaction applications rely on head pose estimation, which is challenging because of variations in facial appearance, inhomogeneous illumination, partial occlusion, and other factors. In this paper, we propose a deep neural network that follows a coarse-to-fine strategy to estimate head poses. The network has two branches: a coarse classification phase that classifies the input image into four categories, and a fine regression phase that estimates the accurate pose parameters. The two sub-networks are trained jointly. To address the shortage of annotated training data, we design a rendering pipeline that synthesizes realistic head images and generates an annotated dataset of 310k head poses. Results on benchmark datasets and on the synthetic dataset validate the effectiveness of our approach, as do results on images with diverse illumination, occlusion, and motion blur. Moreover, our method can be easily extended to estimate head poses from depth images.
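Joint training of the two branches can be pictured as minimizing one objective that adds a classification term for the coarse phase to a regression term for the fine phase. The sketch below is only an illustration under assumptions that are not stated on this page: the pose is taken to be three angles (yaw, pitch, roll), the coarse term is a cross-entropy over the four categories, the fine term is a mean squared error, and `weight` is a hypothetical balancing factor. The loss actually used in the paper may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_loss(coarse_logits, coarse_label, fine_pred, fine_target, weight=1.0):
    """Hypothetical joint objective: coarse cross-entropy + weight * fine MSE.

    coarse_logits: four raw scores, one per coarse pose category.
    coarse_label:  index (0..3) of the true coarse category.
    fine_pred:     predicted pose angles (assumed here: yaw, pitch, roll).
    fine_target:   ground-truth pose angles in the same order and units.
    weight:        assumed factor balancing the two terms.
    """
    probs = softmax(coarse_logits)
    cross_entropy = -math.log(probs[coarse_label])
    mse = sum((p - t) ** 2 for p, t in zip(fine_pred, fine_target)) / len(fine_target)
    return cross_entropy + weight * mse


# Example: an uninformative coarse prediction and a 3-degree yaw error.
loss = joint_loss([0.0, 0.0, 0.0, 0.0], 2, [13.0, 0.0, 0.0], [10.0, 0.0, 0.0], weight=0.5)
```

Because both terms enter a single scalar, gradients from the regression error and from the classification error would both reach any layers the two branches share, which is one plausible reading of "trained jointly".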

Keywords

Head pose estimation, Coarse-to-Fine, Joint learning.

Publication

A Deep Coarse-to-Fine Network for Head Pose Estimation from Synthetic Data
Yujia Wang, Wei Liang, Jianbing Shen, Yunde Jia, Lap-Fai Yu
Pattern Recognition (PR 2019)
Paper, Video, Data (Coming Soon)

BibTeX

@article{wang2019deep,
    title = {A Deep Coarse-to-Fine Network for Head Pose Estimation from Synthetic Data},
    author = {Wang, Yujia and Liang, Wei and Shen, Jianbing and Jia, Yunde and Yu, Lap-Fai},
    journal = {Pattern Recognition},
    year = {2019},
    publisher = {Elsevier}
}

Media Computing and Intelligent Systems Lab

Copyright Beijing Institute of Technology. Address: 5 South Zhongguancun Street, Haidian District, Beijing. Postcode: 100081