Menglong Zhu's Homepage - Computer Vision & Robotics

CV Contact: menglong AT google.com

I'm currently at Google, working on many interesting Computer Vision & Deep Learning problems. I was a core member of the team behind Google's winning entry in the 2016 COCO detection challenge. Try out our open-source TensorFlow Object Detection API!

Before joining Google, I obtained a PhD in Computer and Information Science in 2016 at the GRASP Lab, University of Pennsylvania, advised by Kostas Daniilidis. I received a Master's degree in Robotics from the University of Pennsylvania in 2012 and a Bachelor's degree in Computer Science from Fudan University in 2010.

My research interests are Computer Vision, Robotics and Machine Learning, especially in object detection & recognition, human pose estimation, action recognition, visual SLAM and text recognition.

DetectionAPI

TensorFlow Object Detection API: a collection of detection models.

MobileNets

MobileNets for on-device vision.

ADPM

Fast object detector - Active Deformable Part Models.

Vatic-Action

Video annotation tool for human joints, deployable on Amazon Mechanical Turk.

Solve3Plus1

Camera pose estimation, involved in Google Project Tango.

Literate PR2

Text detection and recognition, ROS package, PR2 deployable.

Quadtree2D

Fast A* planning algorithm with quadtree decomposition.

Penn Action

15-class action dataset, frame-by-frame annotated human body joints.

1. MobileNets: Efficient Convolutional Neural Networks
for Mobile Vision Applications
A. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang,
T. Weyand, M. Andreetto, H. Adam
[arxiv]

2. Speed/accuracy trade-offs for modern convolutional
object detectors
J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi,
I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, K. Murphy,
Computer Vision and Pattern Recognition (CVPR), 2017.
[arxiv]

3. Sparseness Meets Deepness: 3D Human Pose Estimation
from Monocular Video
X. Zhou*, M. Zhu*, S. Leonardos, K. Derpanis and K. Daniilidis,
*Equal Contribution
Computer Vision and Pattern Recognition (CVPR), 2016.
[PDF]

4. Single Image Pop-Up from Discriminatively Learned Parts
M. Zhu*, X. Zhou* and K. Daniilidis, *Equal Contribution
International Conference on Computer Vision (ICCV), 2015.
[PDF / video]

5. Multi-Image Matching via Fast Alternating Minimization
X. Zhou, M. Zhu and K. Daniilidis,
International Conference on Computer Vision (ICCV), 2015. (Oral)
[PDF]

6. Active Deformable Part Models Inference
M. Zhu, N. Atanasov, G. J. Pappas, and K. Daniilidis,
European Conference on Computer Vision (ECCV), 2014.
[PDF / bibtex / video / project page (code)]

@InProceedings{ZhuADPM2014,
  author    = {M. Zhu and N. Atanasov and G. Pappas and K. Daniilidis},
  title     = {{Active Deformable Part Models Inference}},
  year      = {2014},
  booktitle = {European Conference on Computer Vision (ECCV)}
}

7. Localization from Semantic Observations
via the Matrix Permanent
N. Atanasov, M. Zhu, K. Daniilidis and G. J. Pappas
International Journal of Robotics Research (IJRR), 2015.
[bibtex / video]

@Article{Atanasov_SemanticLocalization_IJRR15,
  author = {N. Atanasov and M. Zhu and K. Daniilidis and G. Pappas},
  title = {{Localization from Semantic Observations via the Matrix Permanent}},
  year = {2015},
  journal = {International Journal of Robotics Research}
}

8. Semantic Localization Via the Matrix Permanent
N. Atanasov, M. Zhu, K. Daniilidis and G. J. Pappas
Robotics: Science and Systems (RSS), 2014.
[PDF / bibtex / video]

@InProceedings{Atanasov_SemanticLocalization_RSS14,
  author = {N. Atanasov and M. Zhu and K. Daniilidis and G. Pappas},
  title = {{Semantic Localization Via the Matrix Permanent}},
  year = {2014},
  booktitle={Robotics: Science and Systems (RSS)}
}

9. Single Image 3D Object Detection and Pose Estimation
for Grasping
M. Zhu, K. Derpanis, Y. Yang, S. Brahmbhatt, M. Zhang,
C. Phillips, M. Lecce and K. Daniilidis,
International Conference on Robotics and Automation (ICRA), 2014.
[PDF / bibtex / video / project page]

@inproceedings{zhu2014grasping,
  title     = {Single Image 3D Object Detection
               and Pose Estimation for Grasping},
  author    = {Zhu, Menglong and Derpanis, Konstantinos G
               and Yang, Yinfei and Brahmbhatt, Samarth
               and Zhang, Mabel and Phillips, Cody
               and Lecce, Matthieu and Daniilidis, Kostas},
  booktitle = {International Conference on Robotics and Automation},
  year      = {2014}
}

10. From Actemes to Action: A Strongly-supervised
Representation for Detailed Action Understanding
W. Zhang, M. Zhu and K. Derpanis,
International Conference on Computer Vision (ICCV), 2013.
[PDF / bibtex / video / project page / action dataset]

@inproceedings{zhang2013actemes,
  title     = {From Actemes to Action: A Strongly-supervised 
               Representation for Detailed Action Understanding},
  author    = {Zhang, Weiyu and Zhu, Menglong 
               and Derpanis, Konstantinos G},
  booktitle = {International Conference on Computer Vision},
  pages     = {2248--2255},
  year      = {2013},
}

11. Monocular Visual Odometry and Dense 3D Reconstruction
for On-Road Vehicles
M. Zhu, S. Ramalingam, Y. Taguchi and T. Garaas,
European Conference on Computer Vision (ECCV), CVVT, 2012.
[PDF / bibtex / video]

@inproceedings{zhu2012monocular,
  title     = {Monocular visual odometry and dense 3d 
               reconstruction for on-road vehicles},
  author    = {Zhu, Menglong and Ramalingam, Srikumar 
               and Taguchi, Yuichi and Garaas, Tyler},
  booktitle = {European Conference on Computer Vision, 
               Workshops and Demonstrations},
  pages     = {596--606},
  year      = {2012},
}

12. Literate PR2: Text detection and recognition for
indoor environment
M. Zhu, K. Derpanis, K. Daniilidis,
[Open Source] Robot Operating System (ROS), 2011.
[ROS wiki / video]

Method and System for Determining Poses of Vehicle-Mounted Cameras
for In-Road Obstacle Detection, US 20140037136 A1.
M. Zhu, S. Ramalingam and Y. Taguchi, [Link]

April 09, 2014, VASC Seminar at CMU

Advanced Robotics, 2015 Spring, Learn to program a quadrotor

Machine Perception, 2013 Spring

Machine Learning, 2012 Fall, Remembering Ben Taskar

Introduction to Computer Programming, 2012 Spring

Programming Languages and Techniques III, 2011 Spring

Programming Languages and Techniques III, 2011 Fall
