3D Pose Estimation in the Wild
Below is an implementation of the 3D pose estimation algorithm described in the paper:
Action Recognition Using a Distributed Representation of Pose and Appearance,
For the action classification software go here: http://www.cs.berkeley.edu/~smaji/projects/action
The code is built on top of the poselets framework.
A Dataset for 3D Pose Estimation
We collected 3D pose annotations on the PASCAL VOC 2010 validation dataset containing people using Amazon Mechanical Turk. These annotations were then cleaned up resulting in 3240 including reflections. The figure on top shows the AMT interface(a), the agreement between human subjects across views(b), as well as the distibution of the yaw in the dataset (c,d,e).
You can download the annotations + images here (annotations.tar.gz). Run view_annots.m to view these annotations.
You can also download these annotations in text format here (3dannots.txt). Each line contains one annotation specifying the bounding box and the yaw in the following format: image_name x_min y_min width height headyaw torsoyaw
Yaw convention, 0 : front facing, -90 : right facing, +90 : left facing, -180/180 : back facing.
Below are the results of pose estimation using 10 fold cross validation (50-50 split). Note: For each split we make sure that both the image and its flipped version are both in either the training or test set.
Download and Instructions
Below are the outputs of the program:
Last updated: March 26, 2011