Penn Action Dataset (University of Pennsylvania) contains 2326 video sequences of 15 different actions and human joint annotations for each sequence. The dataset is available for download via the following link:
If you use our dataset, please cite the following paper:
Weiyu Zhang, Menglong Zhu and Konstantinos Derpanis, "From Actemes to Action: A Strongly-supervised Representation for Detailed Action Understanding," International Conference on Computer Vision (ICCV), December 2013.
The dataset is organized in the following format:
/frames   ( all image sequences )
    /0001
        000001.jpg
        000002.jpg
        ...
    /0002
    ...
/labels   ( all annotations )
    0001.mat
    0002.mat
    ...
/tools    ( visualization scripts )
    visualize.m
    ...
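The layout above maps directly onto file paths. The following Python sketch shows one way to build them. The function names and the root folder name "Penn_Action" are hypothetical and chosen for illustration only. The zero-padding widths (4 digits for sequences, 6 for frames) are read off the listing above.

```python
from pathlib import Path


def sequence_paths(root, seq_id):
    """Return (frames_dir, label_file) for a 1-based sequence number."""
    root = Path(root)
    name = f"{seq_id:04d}"  # sequences are named 0001, 0002, ...
    return root / "frames" / name, root / "labels" / f"{name}.mat"


def frame_path(root, seq_id, frame_idx):
    """Return the path of one frame; frames are named 000001.jpg, 000002.jpg, ..."""
    frames_dir, _ = sequence_paths(root, seq_id)
    return frames_dir / f"{frame_idx:06d}.jpg"


if __name__ == "__main__":
    # "Penn_Action" is a placeholder for wherever the dataset was unpacked.
    print(frame_path("Penn_Action", 1, 1))
```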
The image frames are located in the /frames folder. All frames are in RGB. The frame resolution does not exceed 640x480.
The annotations are in the /labels folder. The sequence annotations include class label, coarse viewpoint, human body joints (2D locations and visibility), 2D bounding boxes and training/testing label. Each annotation is a separate MATLAB .mat file under /labels.
An example annotation looks as follows in MATLAB:
annotation =

        action: 'tennis_serve'
          pose: 'back'
             x: [46x13 double]
             y: [46x13 double]
    visibility: [46x13 logical]
         train: 1
          bbox: [46x4 double]
    dimensions: [272 481 46]
       nframes: 46
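For users working outside MATLAB, the annotation shown above could be read in Python roughly as follows. This is a minimal sketch, not an official loader. It assumes SciPy is installed, that the fields are stored as top-level variables in each .mat file, and that row i of x, y and visibility corresponds to frame i+1 with one column per joint (as the 46x13 shapes and nframes: 46 suggest). The function names are hypothetical.

```python
def load_annotation(path):
    """Read one /labels/*.mat file into a plain dict.

    Assumes SciPy is installed and that the fields (action, x, y, ...)
    are stored as top-level variables in the file.
    """
    from scipy.io import loadmat  # third-party, imported only when needed

    raw = loadmat(path, squeeze_me=True)
    # loadmat adds bookkeeping keys such as __header__; drop them.
    return {k: v for k, v in raw.items() if not k.startswith("__")}


def visible_joints(ann, frame):
    """Return [(joint_index, x, y), ...] for the joints visible in one frame.

    `frame` is 0-based here, whereas the frame files are numbered from 000001.
    """
    xs, ys, vis = ann["x"][frame], ann["y"][frame], ann["visibility"][frame]
    return [(j, float(x), float(y))
            for j, (x, y, v) in enumerate(zip(xs, ys, vis)) if v]
```

For example, visible_joints(load_annotation(label_file), 0) would list the visible joints of the first frame, where label_file is the path to one of the files under /labels.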
The 15 action classes are:

    baseball_pitch    clean_and_jerk    pull_ups    strumming_guitar
    baseball_swing    golf_swing        push_ups    tennis_forehand
    bench_press       jumping_jacks     sit_ups     tennis_serve
    bowling           jump_rope         squats
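When training a classifier, the action names listed above usually need integer labels. The sketch below assigns indices in alphabetical order. This ordering is an assumption made for illustration; the dataset itself does not define numeric class indices.

```python
# The 15 action names, as they appear in the "action" field of each annotation.
ACTIONS = [
    "baseball_pitch", "clean_and_jerk", "pull_ups", "strumming_guitar",
    "baseball_swing", "golf_swing", "push_ups", "tennis_forehand",
    "bench_press", "jumping_jacks", "sit_ups", "tennis_serve",
    "bowling", "jump_rope", "squats",
]

# Alphabetical order is an arbitrary but reproducible choice (not part of the dataset).
ACTION_TO_INDEX = {name: i for i, name in enumerate(sorted(ACTIONS))}
```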
The annotation tool used in creating this dataset is also available. Please refer to http://dreamdragon.github.io/vatic/ for more details.
Please direct any questions regarding the dataset to
Menglong Zhu firstname.lastname@example.org