
One Millisecond Face Alignment with an Ensemble of Regression Trees

V. Kazemi and J. Sullivan, "One millisecond face alignment with an ensemble of regression trees," 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1867-1874, doi: 10.1109/CVPR.2014.241.

Abstract

This paper addresses the problem of Face Alignment for a single image. We show how an ensemble of regression trees can be used to estimate the face’s landmark positions directly from a sparse subset of pixel intensities, achieving super-realtime performance with high quality predictions. We present a general framework based on gradient boosting for learning an ensemble of regression trees that optimizes the sum of square error loss and naturally handles missing or partially labelled data. We show how using appropriate priors exploiting the structure of image data helps with efficient feature selection. Different regularization strategies and its importance to combat overfitting are also investigated. In addition, we analyse the effect of the quantity of training data on the accuracy of the predictions and explore the effect of data augmentation using synthesized data.

1. Introduction

  • The four main contributions are:
  1. A novel method for alignment based on ensemble of regression trees that performs shape invariant feature selection while minimizing the same loss function during training time as we want to minimize at test time.
  2. We present a natural extension of our method that handles missing or uncertain labels.
  3. Quantitative and qualitative results are presented that confirm that our method produces high quality predictions while being much more efficient than the best previous method (Figure 1).
  4. The effect of quantity of training data, use of partially labeled data and synthesized data on quality of predictions are analyzed.
  • The regressor's input is a sparse set of pixels, chosen by combining a prior derived from the data with gradient boosting. This is explained in Section 2.3.3 (Feature selection):
    "The sparse pixel set, used as the regressor’s input, is selected via a combination of the gradient boosting algorithm and a prior probability on the distance between pairs of input pixels"

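The pixel-pair selection quoted above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes the distance prior has the exponential form exp(-lam * ||u - v||), which this summary does not spell out, and the names `sample_pixel_pairs`, `pair_features`, and `lam` are hypothetical. Pairs are drawn by rejection sampling so that nearby pixels are chosen more often, and each pair yields one intensity-difference feature that a tree split could threshold.

```python
import numpy as np

def sample_pixel_pairs(candidates, num_pairs, lam, rng):
    """Draw index pairs (i, j) into `candidates`, favoring nearby pixels.

    A pair is accepted with probability exp(-lam * distance). This form of
    the prior and the parameter `lam` are assumptions for illustration.
    """
    pairs = []
    n = len(candidates)
    while len(pairs) < num_pairs:
        i, j = rng.integers(0, n, size=2)
        if i == j:
            continue  # a pixel paired with itself carries no information
        dist = np.linalg.norm(candidates[i] - candidates[j])
        if rng.random() < np.exp(-lam * dist):
            pairs.append((i, j))
    return np.array(pairs)

def pair_features(image, candidates, pairs):
    """Intensity difference for each sampled pair of (row, col) pixels."""
    u = candidates[pairs[:, 0]]
    v = candidates[pairs[:, 1]]
    return (image[u[:, 0], u[:, 1]].astype(float)
            - image[v[:, 0], v[:, 1]].astype(float))

# Usage: 200 candidate pixels in a 100x100 image, 300 sampled pairs.
rng = np.random.default_rng(0)
candidates = rng.integers(0, 100, size=(200, 2))  # (row, col) coordinates
image = rng.random((100, 100))
pairs = sample_pixel_pairs(candidates, num_pairs=300, lam=0.1, rng=rng)
features = pair_features(image, candidates, pairs)
```

A larger `lam` concentrates the sampled pairs on closer pixels; in the full method the boosting step would then pick, among such candidates, the split that best reduces the squared error.
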
2. Method

2.1. The cascade of regressors

2.2. Learning each regressor in the cascade

  • Regression trees are used to regress the position of each landmark point.
  • The trees are trained as an ensemble using gradient boosting.
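
The two points above can be illustrated with a toy gradient-boosting loop for the sum-of-squares loss. This is a simplified sketch, not the paper's code: it uses depth-1 trees (stumps) on generic feature vectors instead of pixel-difference splits, and the names `fit_stump`, `fit_boosted`, `predict_boosted`, and the `shrinkage` value are assumptions. With squared error, the negative gradient is just the residual, so each new tree is fit to what the current ensemble still gets wrong.

```python
import numpy as np

def fit_stump(X, residuals):
    """Fit a depth-1 regression tree to vector-valued residuals."""
    mean = residuals.mean(axis=0)
    best = (np.inf, 0, np.inf, mean, mean)  # fallback: no split possible
    for feat in range(X.shape[1]):
        # Drop the largest value so the right side is never empty
        for thresh in np.unique(X[:, feat])[:-1]:
            left = X[:, feat] <= thresh
            left_mean = residuals[left].mean(axis=0)
            right_mean = residuals[~left].mean(axis=0)
            err = (((residuals[left] - left_mean) ** 2).sum()
                   + ((residuals[~left] - right_mean) ** 2).sum())
            if err < best[0]:
                best = (err, feat, thresh, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, X):
    feat, thresh, left_mean, right_mean = stump
    go_left = (X[:, feat] <= thresh)[:, None]
    return np.where(go_left, left_mean, right_mean)

def fit_boosted(X, targets, num_trees=50, shrinkage=0.1):
    """Boost stumps on residuals, starting from the mean target."""
    f0 = targets.mean(axis=0)
    pred = np.tile(f0, (len(X), 1))
    stumps = []
    for _ in range(num_trees):
        residuals = targets - pred  # negative gradient of squared loss
        stump = fit_stump(X, residuals)
        pred = pred + shrinkage * predict_stump(stump, X)
        stumps.append(stump)
    return f0, shrinkage, stumps

def predict_boosted(model, X):
    f0, shrinkage, stumps = model
    pred = np.tile(f0, (len(X), 1))
    for stump in stumps:
        pred = pred + shrinkage * predict_stump(stump, X)
    return pred
```

In the landmark setting, `targets` would hold the shape corrections (the difference between the true and currently estimated landmark coordinates) and each row of `X` would hold the pixel features of one training sample.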

2.3. Tree based regressor

  • The regressor's input is the selected set of sparse pixels; the exact selection procedure still needs to be checked (TODO).


Analysis of regression accuracy at each cascade level. The estimate initially starts from the mean landmark positions over the whole dataset.

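  • The method reports a detection time of 1 ms per image.
  • Each landmark point starts at its mean position, and the regression trees gradually move the shape.
  • This follows Algorithm 1; see also the corresponding formula in the paper (s is the scale, R is the rotation).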
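
The points above can be sketched in two small pieces. This is a hedged illustration rather than the paper's implementation: it assumes that s and R are the scale and rotation of a least-squares similarity transform between two landmark sets (for example the mean shape and the current estimate), fitted here with a standard Procrustes/Umeyama-style SVD solution. The functions `fit_similarity` and `run_cascade`, and the regressor callables, are hypothetical names.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares s, R, t such that dst is close to s * src @ R.T + t.

    src and dst are (num_landmarks, 2) arrays of matching points.
    """
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / len(src)          # cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    # Flip the last axis if needed so R is a rotation, not a reflection
    sign = np.sign(np.linalg.det(U @ Vt))
    E = np.diag([1.0, sign])
    R = U @ E @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = (D * np.diag(E)).sum() / var_src
    t = dst_mean - s * R @ src_mean
    return s, R, t

def run_cascade(image, mean_shape, regressors):
    """Start from the mean shape and add each stage's predicted correction.

    Each regressor is a hypothetical callable (image, shape) -> update
    with the same array shape as `shape`.
    """
    shape = mean_shape.copy()
    for regressor in regressors:
        shape = shape + regressor(image, shape)
    return shape
```

A stage's regressor could call `fit_similarity(mean_shape, shape)` to express pixel offsets defined relative to the mean shape in the coordinate frame of the current estimate, which keeps the sampled pixels roughly aligned with the face as the shape moves.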