Statement of Purpose Essay - Deakin University
"A glimpse of the future, captured through the lens of technology" - this line has always driven my passion for computer vision and its potential to shape our world. As an undergraduate student majoring in computer science, I have been fascinated by the potential of computer vision to revolutionize industries such as healthcare, autonomous driving, and transportation. I am excited to take the next step in my academic journey by applying to the Ph.D. program in Computer Science at the University of Pittsburgh, where I am confident that I can delve deeper into my research interests and make a meaningful impact on the field of computer vision.
My fascination with machine perception, however, did not become productive until I joined the small machine learning research group at my undergraduate institute. There we searched databases like PubMed, IEEE, Web of Science, and bioRxiv for state-of-the-art deep learning models and their benchmarks, and implemented them ourselves in PyTorch. After implementing the proposed architectures of YOLO, SSD, R-CNN, SegNet, and U-Net from existing research for applications in healthcare, industrial automation, and transportation, I developed the ability to come up with novel ideas of my own. It was during this process that I realized that, despite the stellar performance of AI, research products need interpretable models before they can be applied to real-world problems. Later that summer, I finally got the chance to work on our own research project, in which we used a publicly available structured dataset to correlate Dressler's Syndrome with Myocardial Infarction and to predict the risk that a patient who has suffered a myocardial infarction develops Dressler's Syndrome. Our supervisor, Dr. Diganta Sengupta, taught me how to compare our models with the existing literature and formulate novel pipelines to optimize them.
Under his guidance, I was soon able to perform literature surveys and find research gaps in existing work on my own. During this project, we encountered problems such as a high risk of bias due to inadequate data points or a lack of external validation, and outliers due to randomized data collection processes. I found myself intuitively trying out the necessary steps to overcome them, such as synthetic oversampling and transforming data distributions. When this project was accepted at the International Conference on Intelligent Systems and Human Machine Collaboration 2022, I attended a conference for the very first time. There I was introduced to concepts like federated learning and edge computing, which broadened my ideas on applications of computer vision in distributed systems.
Shortly after that, I authored another paper (https://doi.org/10.1007/s11334-022-00473-3), published in the Springer journal “Innovations in Systems and Software Engineering”, this time on detecting Diabetic Retinopathy (DR) from a low-resource dataset derived from OCT images of DR patients. We observed that extracting features from the OCT scans to train neural networks takes a toll on computing cost, and decided to follow a classical machine learning approach instead. I tuned the meta-learners built on the base classifiers and eventually arrived at an ensemble model that not only outperformed the existing work but also ran faster than the available deep learning based solutions. Understanding the inherent parameters of the classifiers and tuning the hyperparameters scrupulously helped me build an intuitive sense of which model suits which kind of data distribution. I also took into account the need to balance algorithmic performance with the robustness of the solution, as Dr. Sengupta had in mind, to gain insights into explainability and model interpretation. Even though we mostly used classical machine learning in Dr. Sengupta's lab, the research projects there gave me the grounding in research methodology that matured me as a deep learning practitioner.
Having gained a deeper perspective on these technologies and on the challenges of making computer vision fair and accessible, I embarked on several projects that are presently underway. One is a GAN-based medical image augmentation task that addresses the shortage of training data for medical computer vision use cases. Beyond its significance in addressing the inherent dearth of balanced and well-annotated medical image datasets, this work led me to learn and apply attention models, along with the common evaluation metrics and loss functions for the task, which I previously knew very little about. Another such project was the automatic segmentation of brain tumors from MRI scans using the 3D U-Net architecture, in which I later tried using Transformers, a model widely used in natural language processing but, at the time, far less established in image analysis. I was surprised by the results and their possibilities, and realized how a single new outlook could unsettle the state of the art and open vast new avenues to explore. This process of working on vital problems with a zeal for finding a novel and impactful solution, thinking it through, and appreciating the solutions already attempted made me realize my passion for conducting meaningful research to make AI-assisted vision a reality.
I have also gained practical experience in computer vision through my internships. As a computer vision intern at the Indian Institute of Technology, Bombay, I have been implementing existing state-of-the-art solutions from research papers on autonomous driving. Under the guidance of Dr. Rishabh Iyer, I have been training models in simulators like CARLA and AirSim and experimenting with 2D and 3D object detection and object tracking.
I have also worked as a Deep Learning Engineer at Resolute Ai Software Pvt Ltd, where I applied real-time object detection to factory automation and safety assurance. During my time there, I collected and annotated video and image data, used image processing techniques, built custom object detection models, and used pre-trained models such as YOLO.
The School of Computing and Information at the University of Pittsburgh offers an ideal environment for me to pursue my research interests in computer vision. Its strong interdisciplinary research centers and labs dedicated to Human Computer Interaction and to Machine Perception and Cognitive Robotics, and professors like Adriana Kovashka, match my goal of becoming a researcher in the field of computer vision. The research being conducted in these centers aligns closely with my interests in medical imaging and autonomous driving. Additionally, the Center for Artificial Intelligence Innovation for Medical Imaging, led by Dr. Shandong Wu, is especially inspiring to me, and I am excited about the prospect of working with him and his team to develop AI-based solutions for improving medical imaging. The opportunity to work with such renowned researchers and to take part in state-of-the-art research at the University of Pittsburgh would be invaluable in helping me achieve my goal of staying at the frontier of computer vision as a professor and an entrepreneur.