Yingwei Li's personal homepage

About Me

Before coming to UCSD, I spent four wonderful years at the University of Science and Technology of China (USTC), where I earned a B.E. in Electronic Engineering and Information Science (1006).

Education

University of California, San Diego

Sep 2014 - Nov 2017

Master of Science, Electrical and Computer Engineering

Overall GPA: 3.945/4.0

University of Science and Technology of China (USTC)

Sep 2010 - Jul 2014

Bachelor of Engineering, Electronic Engineering and Information Science

Overall GPA: 4.02/4.3, Rank: 2/133

Projects

Large-scale Isolated Gesture Recognition Challenge

Aug 2017 - Sep 2017, ChaLearn LAP Challenge @ ICCV 2017

Task: recognize 249 classes of gestures from RGB-D (color plus depth) videos
Trained a multi-modal 3D-CNN with more than 50,000 samples for gesture recognition
Proposed three techniques to improve performance: 1) region-of-interest masking, 2) multi-modality fine-tuning, and 3) spatial-temporal pyramid encoding
Ranked 3rd out of 12 participating teams; code and model are available.

Image Tagging and Search with Image-text Embedding

Jun 2016 - Sep 2016, Adobe Research, Research Intern

Improved the in-house image auto-tagging and search system with deep learning.
Trained an image embedding network with PMI word vectors
Improved the ranking of the image search system by using click-through data for positive enhancement
Derived an online update algorithm for PMI word embedding with a large dictionary
Applied the embedding network for image dense tagging in an online web demo
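
As a small illustration of the statistic behind the word vectors above (taking PMI to mean pointwise mutual information), here is a minimal sketch that computes PMI between tags from co-occurrence counts. It is not the in-house system: the function name pmi_table and the toy tag sets are hypothetical, and probabilities are estimated simply as the fraction of tag sets containing a tag or a pair of tags.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return {(w1, w2): PMI} for every pair of words that co-occurs at least once.

    Probabilities are estimated as the fraction of documents containing
    the word (or both words of the pair). This is an illustrative choice.
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = sorted(set(doc))  # de-duplicate; sort so pair keys are ordered
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))

    pmi = {}
    for (w1, w2), n_pair in pair_counts.items():
        p_pair = n_pair / n_docs
        p1 = word_counts[w1] / n_docs
        p2 = word_counts[w2] / n_docs
        pmi[(w1, w2)] = math.log(p_pair / (p1 * p2))
    return pmi


# Hypothetical toy data: each inner list is the tag set of one image.
tag_sets = [
    ["dog", "grass", "park"],
    ["dog", "park"],
    ["cat", "sofa"],
    ["cat", "grass"],
]
table = pmi_table(tag_sets)
```

In this toy data, "dog" and "park" always appear together, so their PMI is positive, while "cat" and "grass" co-occur exactly as often as independence would predict, so their PMI is zero.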

Playing FlappyBird with Reinforcement Learning

Apr 2017 - Jun 2017, Side Project

An experiment in applying reinforcement learning to game playing: video link
Implemented a game simulator that interacts with the reinforcement learning model
Designed a Deep Q-learning model with memory replay
Compared against a hand-crafted baseline policy; the Deep Q-learning model achieved better results: video link
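
To show what memory replay means in Deep Q-learning, here is a minimal sketch of a fixed-size replay buffer together with the standard one-step Q-learning target. It is an illustration under assumptions, not the project's code: the names ReplayMemory and td_target, the capacity, and the discount factor are all hypothetical, and the Q-network itself is omitted.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of past transitions; the oldest entries are dropped first."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the correlation
        # between consecutive frames of the game.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r if terminal, else r + gamma * max_a Q(s', a)."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical usage: states are plain integers standing in for game frames.
memory = ReplayMemory(capacity=3, seed=0)
for step in range(5):
    memory.push(state=step, action=step % 2, reward=1.0,
                next_state=step + 1, done=(step == 4))
batch = memory.sample(2)
```

With a capacity of 3, only the three most recent transitions remain after five pushes, so every sampled transition comes from steps 2 to 4.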

VLAD³: Encoding Dynamics of Deep Features for Action Recognition

Jul 2015 - Nov 2015, Research Project

Studied the importance of dynamics modeling in video action recognition
Derived a VLAD encoding (VLAD³) that uses a Linear Dynamic System (LDS) model for deep feature sequences
Implemented the codebook learning and encoding of VLAD³ with a modified Kalman smoothing algorithm
Benchmarked against common baselines and achieved superior performance on standard datasets (e.g., Olympic Sports, UCF101, and THUMOS14)

Building Large Scale 3D Map with Kinect

Dec 2013 - May 2014, Microsoft Research Asia, Research Intern

Designed and implemented an indoor 3D map reconstruction system: video link
Input: RGB-D data from a Microsoft Kinect sensor
Process I: Registration of consecutive frames by visual feature matching
Process II: Global optimization of the pose graph and alignment of large planes
Process III: De-noising and color re-assignment of 3D points.
Output: Full-size 3D map of a large indoor scene.
Map Demo