Security Camera

Action recognition



Human activity recognition from videos contributes to a myriad of real-life applications, primarily those dealing with human-centric problems. As deep learning is set to become the heart of automation, recent work on vision-based human action recognition focuses on designing complex deep learning models for the task.


Most of these solutions are designed for large training datasets. However, collecting and processing video data is usually expensive and time-consuming, so achieving comparable performance with little data is essential. In addition, because they lack depth information, RGB-only solutions perform poorly compared with RGB-D video-based solutions. But acquiring depth, inertial, and similar data is costly and requires special equipment, whereas RGB videos are available from ordinary cameras.


This work deals with an action recognition task to automate surveillance at a Japanese manufacturing company. The solution attempts to achieve strong activity recognition performance from RGB-only videos using little training data, addressing both issues through techniques such as data augmentation and autoencoders.
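
As an illustration of the kind of data augmentation mentioned above, here is a minimal sketch of clip-level augmentation for a small video dataset. It is not the project's actual pipeline: the function name `augment_clip`, the `(T, H, W, C)` array layout, and the choice of random crop plus horizontal flip are all assumptions made for this example. The one point it is meant to show is that the same random crop and flip are applied to every frame of a clip, so the motion across frames stays consistent.

```python
import numpy as np


def augment_clip(clip, crop_size, rng):
    """Apply one random crop and one random horizontal flip to a whole clip.

    Hypothetical helper for illustration only.
    clip:      array of shape (T, H, W, C), i.e. frames, height, width, channels
    crop_size: (crop_height, crop_width)
    rng:       a numpy.random.Generator
    """
    _, h, w, _ = clip.shape
    ch, cw = crop_size
    if ch > h or cw > w:
        raise ValueError("crop_size must fit inside the frame")

    # Draw the crop offsets once so that every frame uses the same window.
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    out = clip[:, top:top + ch, left:left + cw, :]

    # Flip the whole clip horizontally with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, :, ::-1, :]

    # Return a contiguous copy so the result does not alias the input.
    return np.ascontiguousarray(out)
```

Calling `augment_clip(clip, (112, 112), np.random.default_rng())` several times on the same clip would yield different crops and flips, which is one inexpensive way to stretch a small training set. Note that a horizontal flip is only appropriate when the action labels do not depend on left versus right.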



Prof. Ravi Kiran Sarvadevabhatla


Team of two


Python, Excel

My contribution

I collaborated on analyzing and formatting the raw video data to convert it into a dataset. I also researched multiple action recognition architectures and worked on designing and optimizing a neural network model for action recognition.

5 months (Aug - Dec 2019)