
Statement of Purpose Essay - Stanford University

Program: PhD, CV
Type: PhD
License: CC BY-NC-SA 4.0
Source: Public Success Story (Frieda Rong)

Statement of Purpose for Stanford CS PhD∗

Frieda Rong
December 3, 2019

My goal is to pursue a Ph.D. in computer science focused on deep learning. My current research interests are computer vision and deep learning, motivated by my multiple experiences in machine learning and, most recently, the work I did (initially on an internship and subsequently part-time between classwork and research) at the Uber Advanced Technologies Group under Professor Urtasun on image synthesis of urban driving scenes using geometry-aware composition.

In truth, my background tells a slightly more complicated story. More of my coursework has to do with theoretical computer science, including two graduate courses in the area with Associate Professor Lap Chi Lau (for which I wrote course reports on spectral graph sparsification and combinatorial expansion in high-dimensional simplicial complexes). I also took a summer to give extremal combinatorics a serious effort, despite my plans to turn fully to machine learning after graduation. While I deeply enjoyed the material, my professional desire is to work on problems that have a clear purpose to society. Since my school is better known for professional experience opportunities than for research, I have so far striven to do so by working on software systems in internships; however, the same guiding motivation holds for the research I hope to do. Although my undergraduate career has taken a relatively exploratory trajectory, in graduate school I hope to focus above all on driving innovative, productive research.

Even before securing any formal research opportunity with an advisor, my desire to work on technically fascinating projects led me to reproduce computer science papers from the existing literature. One early personal project was a real-time collaboration system for hierarchical lists, inspired by Google Docs and the list-based note-taking system Workflowy, for which I read the original Jupiter collaboration system paper and filled in the details for applying the idea of operational transform to multiple clients. During this period of independent study, I also experimented with modifying the deep learning framework Caffe to prune neural network connections, and wrote a regression tree library for facial keypoint recognition following [1].

Prior to joining Uber, I completed internships at Petuum (an enterprise machine learning startup spun out of CMU) and Apple. At Apple, in the summer following my sophomore year, we explored ways of visualizing the intermediate representations of trained neural networks using t-distributed Stochastic Neighbour Embeddings (t-SNE) and investigated root-cause analysis of model errors by fitting a more interpretable logistic regression model to predict the model's predictions. The visualization tool I wrote also helped data scientists in their work on improving the user product Siri!

The following winter, at Petuum, I worked on projects at the intersection of distributed systems and machine learning, first on reducing data transfers for sparse gradients of indexed tensors (think word embedding matrices) under the parameter server model. After finishing that first major project earlier than expected, we discussed implementing ring allreduce, a bandwidth-optimal MPI communication primitive used by Baidu, as a potential follow-up; however, I ultimately worked on adding support vector machine solvers to the library of distributed applications.
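To make the sparse-gradient idea above concrete, here is a minimal sketch in plain NumPy of shipping only the rows of an embedding gradient that a batch actually touched, rather than the full dense matrix; the helper names are invented for illustration, and this is not the actual Petuum/Litz code.

```python
# Illustrative sketch only (hypothetical helpers, not the Petuum/Litz implementation):
# an embedding gradient is nonzero only on the rows a batch indexed, so a worker can
# send just those row indices and slices to the parameter server.
import numpy as np

def sparsify_embedding_grad(grad, touched_rows):
    """Pack only the rows of an embedding gradient that a batch actually touched."""
    rows = np.unique(touched_rows)   # deduplicated row indices hit by the lookups
    return rows, grad[rows]          # transfer indices + their gradient slices only

def apply_sparse_update(params, rows, row_grads, lr=0.1):
    """Server side: apply the sparse update to the shared parameter table."""
    params[rows] -= lr * row_grads

# Toy usage: a 10,000 x 64 embedding table where one batch touches only 5 rows,
# so roughly 5 * 64 floats cross the wire instead of 10,000 * 64.
vocab, dim = 10_000, 64
table = np.random.randn(vocab, dim)
full_grad = np.zeros_like(table)
batch_rows = np.array([3, 17, 17, 42, 99, 512])
full_grad[batch_rows] = np.random.randn(len(batch_rows), dim)

rows, row_grads = sparsify_embedding_grad(full_grad, batch_rows)
apply_sparse_update(table, rows, row_grads)
```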
The SVM project was, in effect, independent work done without a real supervisor, from understanding the literature on convex optimization to writing an application in Litz, an out-of-core elastic framework for distributed machine learning. The approaches I settled on implementing were stochastic dual coordinate ascent and, for the kernel case, parallel block minimization. While at Petuum, Hao Zhang and I also discussed ideas for outside research, in particular a systems-for-machine-learning project similar in nature to [2] but for batched execution of the computation graph in DyNet (we dropped the idea after deciding that we would be unlikely to obtain significant speedups).

My project at Uber has been comprehensive in scope and excitingly open-ended, exposing me to many problems and areas under the broad umbrella of computer vision, from the history of image-based rendering and mesh reconstruction methods in traditional graphics to advances motivated by deep learning, such as deep active contours, differentiable (inverse) rendering, and cycle-consistency losses. As the research projects in the lab are individually owned, I worked under the guidance of Shenlong Wang (my direct mentor), Professor Urtasun, and Ersin Yumer (who leads the San Francisco R&D team) through weekly meetings to propose approaches or sub-problems to focus on, implement them, and discuss the next directions to take. The work from this project has since been highlighted in several invited talks at CVPR.

There are multiple faculty at Stanford whose work inspired me to apply to this school in particular, and there are many half-baked questions related to their work that I am curious about, beyond those named in ApplyWeb. Under Percy Liang, Daniel Selsam investigated neural SAT solvers, but what about the amenability of different NP-complete problems to deep learning? After taking a course in logic using Coq, I was for a while interested in contributing to Lean. Certigrad is already two years old, but there are currently efforts to learn formal-to-formal proofs of IMO problems by encoding them in Lean. Admittedly, I only ever participated at the national level in math olympiads, but I would be interested in contributing to this task! I'm also interested in applying techniques from distributionally robust optimization to deep learning, as John Duchi did in his work providing a training procedure and certificates of robustness against adversarial attacks. Can we also apply ideas from robust estimation for distribution learning (in particular, multivariate Gaussians in the high-dimensional setting) to counteract noise in datasets? (Does this hurt the learning of rare events, or can it provide a way of automatically detecting outliers? Or are much simpler methods sufficient?)

Another question that might be worth investigating is that of structured, omnidirectional generation of text and spatial or other data, or more flexible conditioning schemes. Currently, GPT-2 has prefixed prompts and hacks like "tl;dr", but these are limited options. Somewhat relatedly, I've wondered how effective (or how much overkill) the ostentatious-sounding wave function collapse algorithm would be for generating novel maps for simulation environments.

Of course, inspired by the research problems I encountered through my project at Uber, I am interested in deep learning on different geometric representations (e.g. voxelization, point clouds, implicit surfaces), differentiable rendering (connecting visual representations with symbolic ones, as Jiajun Wu has done!), and modelling temporal information (as well as how ideas in theory may be practically realized with modest hardware requirements by exploiting computational optimizations such as spatial and temporal sparsity). I am also curious about coupling GANs with learning an intermediate geometric representation, or making more explicit what implicit information a GAN or VAE has learned (such as spatial rotations). Many more questions remain! So far, I have been extremely fortunate to have worked with bright, inspiring collaborators on rewarding projects that had us tossing questions around, and I only wish to advance further into unknown ground.

References

[1] Vahid Kazemi and Josephine Sullivan. One millisecond face alignment with an ensemble of regression trees. In 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1867–1874, Columbus, OH, June 2014. IEEE.

[2] Azalia Mirhoseini, Hieu Pham, Quoc V. Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, and Jeff Dean. Device Placement Optimization with Reinforcement Learning. arXiv:1706.04972 [cs], June 2017.

Footnotes

∗ Alternative working title: Me and My Research (cf. Richard Hamming's You and Your Research).

1. You could jokingly frame life as a multi-armed bandit problem with non-stationary rewards as you try various fields in an online fashion to determine your respective aptitudes... but this is a limited model :)

2. This is interesting because while two NP-complete problems may have polynomial reductions to each other, they can have vastly different approximability behaviour.

3. Is creating a dataset a natural milestone for an aspiring graduate student in machine learning? Richard Feynman may have said "What I cannot create, I do not understand", but should the maxim really be "What I cannot compress, I do not understand"?