
Statement of Purpose Essay - MIT

Program: PhD, ML, Robotics
Type: PhD
License: CC BY-NC-SA 4.0
Source: Public Success Story (Siddharth Nayak)

Statement of Purpose
Siddharth Nayak, PhD Applicant, AeroAstro, Fall 2020

My primary research interests lie in the areas of Reinforcement Learning and Robust Control Theory as applied to Robotics. My career aspiration is to lead a research laboratory where I will strive to create technology with a positive societal impact.

My first exposure to research was an internship at Daimler AG in Germany after my sophomore year, where I studied how the performance of pre-trained object detection networks depends on camera parameters. I created a dataset of images taken with different combinations of shutter speeds and voltage gains, which we used to determine the combination of camera parameters that maximises the detection performance of a pre-trained network. This internship gave me an opportunity to experience industrial research, which kindled my interest in applied work. I was fascinated by how engineers there used ideas from seemingly disparate fields to improve the performance of the cars.

While labelling the dataset with bounding boxes at Daimler, it struck me that this process could be automated using Reinforcement Learning (RL). Moreover, an RL agent could generalise across lighting conditions, which was not possible with a fixed captured dataset. After the internship, I started working on this idea under the guidance of Prof. Balaraman Ravindran at IIT Madras. Initially, we worked on a simpler version of the idea that involved altering the images digitally; here, using human-based reward signals to recover digitally distorted images worked successfully. After this, we began experiments with the object detector added into the pipeline. However, even after working on this model (called ObjectRL) for two semesters, I did not get the expected results. Fixated on making the model work by trying different reward systems and state representations, I spent so much time on the project that my grades took a hit in the sixth semester. This setback showed me the side of research one never hears about in the success stories. Nevertheless, I learnt a lot about reinforcement learning, and about research in general, from this project.

In the summer after my junior year, I interned under Dr. Harshad Khadilkar at TCS Innovation Lab, working on using reinforcement learning to solve the online version of 3D bin-packing. The lessons learnt from the ObjectRL project came in handy here: they helped me design different methods by varying the state representations and the MDP associated with the agent, and when a method did not work, I could identify the likely reasons and try to improve it. The most successful method, which we called PackMan, combined heuristics with RL. This work resulted in a manuscript that we submitted to a top conference. Finally, after a year of experimenting and research in reinforcement learning, I had a research paper, which motivated me to keep working and innovating in this field. Working on this project also firmed up my penchant for application-oriented research.

The success with PackMan inspired me to give the ObjectRL project one last shot. The experimental methods I had learnt during the internship helped me greatly this time, and the perseverance paid off: we got the expected results with ObjectRL.
The difference this time was that I logged all the experiment traces, which helped me rectify the flaws in the original model. The ObjectRL agent learned to find the optimal changes in brightness to improve object detection, and in a few of the test cases the agent-proposed image performed better than the undistorted image. This led to the conclusion that images need to be pre-processed to extract maximum performance from object detection networks. The project culminated in a manuscript that we submitted to a top conference. Not getting the expected results immediately, and then trying different approaches systematically, taught me a lot about what graduate studies would be like.

I have taken a lot of interest in the work done in the Aerospace Controls Laboratory (ACL), especially the projects involving reinforcement learning. I even discussed the paper "Safe Reinforcement Learning with Model Uncertainty Estimates" with one of its authors, a graduate student at ACL, and learnt about several extensions of the project the group was working on. One extension I particularly liked used a HoloLens to create an environment with virtual agents and real people; this struck me as perhaps the most practical way of training an RL agent in the real world without damaging any physical objects. With many of ACL's recent research projects focussed on applying RL in real-world scenarios, I feel that my goals are well aligned with the research objectives of the lab. An opportunity to work there during my PhD would also enable me to apply these algorithms on physical robots, which was not possible during my undergraduate studies. Furthermore, since people from different departments can work in affiliated labs, MIT would provide an ideal climate to develop my interdisciplinary interests in Robotics.

Though I am open to a wide variety of research within control in robotics, my experience with RL projects has inspired an interest in working at the intersection of reinforcement learning and robotics, with a focus on application-oriented projects. The ObjectRL and PackMan projects taught me that making RL work in practice is quite hard, yet it seems the most feasible approach to generalisation. Hence, learning with little data, providing PAC guarantees, and reducing failure rates are critical to bridging this gap. During my graduate studies, I would like to work on multi-agent reinforcement learning, targeting different sub-goals sequentially: avoiding collisions effectively by traversing the environment in a socially acceptable manner, and being robust enough during training that the agent does not enter dangerous or unsafe states during exploration. I would also like to tackle the problem of transferring policies from simulated environments to the real world when training RL agents for robots. One implication of this problem is that we should be able to initialise policies intelligently through imitation learning, or design modular policies so that changes in the environment map easily to changes in the policies.

With respect to my thesis, I would like to work with Prof. Jonathan How, as I have been fascinated by the projects his group at ACL is working on. I also find Prof. Richard Linares's work quite interesting: using RL to control spacecraft during atmospheric re-entry, where the complex dynamics are difficult to model. I am also interested in working with Professors Julie Shah and Nicholas Roy, as many of their groups' research projects are likewise application-oriented.

The uncertainty involved in research is similar to the fate of Schrödinger's cat: one does not know whether an idea will succeed or fail before getting one's hands dirty. My research journey has had its ups and downs, but the positives have certainly outweighed the frustrations. Once I started enjoying the process and the results started coming in, the satisfaction of creating something that probably no one had done before was immense. Thus, I think that pursuing graduate studies would be the perfect way to continue working on what I love.