Statement of Purpose Essay - University of Southern California
From the swift growth of public transportation infrastructure to the widespread adoption of electronic payment systems, my childhood in Shanghai saw endless changes brought about by rapid technological advancement. This experience nurtured in me the desire to shape this force of change for a safer future. AI has taken a prominent role in this force of change due to its ability to automate decision-making on a massive scale, but an AI agent never acts in isolation: its actions and learning affect those of other agents. It is therefore crucial to ensure that the interaction dynamics of multiagent systems are both effective and well understood. Motivated by this need, I am driven by the following questions: How can agents acquire skills and model the world and its inhabitants without external supervision? How can agents communicate explicitly via language, and implicitly via theory-of-mind reasoning? How can language in turn be used as a tool for long-horizon planning and human interpretability? How can agents effectively coordinate in common-payoff settings, even without prior interactions? How can we encourage collaboration among agents in mixed-motive settings?

Halfway through my undergraduate study in aerospace engineering, I switched majors to CS because I wanted to study a more multidisciplinary set of problems. I was introduced to the field of AI shortly after, and became captivated by reinforcement learning (RL) due to its generality and its connections to fields as disparate as game theory, psychology, and optimal control. To gain more experience in this field, I joined MIT Lincoln Laboratory (MITLL) as a researcher after graduation, working in collaboration with Prof. Mykel Kochenderfer and his students at the Stanford Intelligent Systems Laboratory (SISL).
During this time, I crystallized my research direction, built up my research skills, and established for myself the career goal of advancing the fundamentals of AI, with applications to robotics and virtual assistants, as a research scientist. To this end, I plan to earn my PhD at the intersection of machine learning and multiagent systems to further my foundational knowledge and capabilities as a researcher.

Learning without labels: Inspired by the ability of self-supervised learning to learn without laborious annotations, I adopted Deep Clustering for an MITLL project on pattern-of-life classification of unlabeled air traffic data. The method first clusters the sample embeddings, then uses the centroid assignments as pseudo-labels to train a model to produce better embeddings. This two-step procedure is repeated to iteratively improve both the model and the pseudo-labels. In my experiments on the sponsor’s dataset, the learned clusters exhibited distinct modes of aircraft behavior (e.g., fast or slow, high or low altitude), showing the applicability of this method to unsupervised classification of air traffic data. Similarly, RL often relies on laboriously handcrafted reward functions for supervision, whereas humans can learn useful behaviors without external supervision by engaging with the world, consuming large amounts of text and video, and interacting with others competitively and collaboratively. This inspired me to study how AI agents can learn useful behaviors in a similarly unsupervised manner, and to characterize such emergent behaviors via metrics beyond task-specific reward maximization.

Explicit communication: Communication can be crucial for coordination. I was intrigued by the question of how the vocabulary size of emergent communication protocols affects coordination among learning agents.
Quantifying this experimentally, my SISL colleagues and I showed that communication protocols learned through discrete channels achieved task performance similar to that of protocols learned through continuous channels, even though discrete protocols lack the infinite vocabulary size of continuous ones. We showed that sufficient protocol bandwidth can make up for this limitation, and demonstrated that the symbolic nature of discrete messages made them more interpretable to humans than continuous messages. This work was published at ICRA 2022. Recent advancements in natural language processing have unlocked many new possibilities for building human-compatible AI. Seeing language as a natural interface for human-AI interaction, I aim to use those advancements to develop agents that can leverage the structure and knowledge inherent in language to learn useful behaviors, communicate with other agents, and explain their own internal processes.

Implicit communication and human-AI coordination: Agents must develop theory of mind for successful zero-shot coordination (ZSC) without explicit communication. In the card game Hanabi, algorithms designed specifically for ZSC have produced RL agents that collaborate well together despite zero prior interactions. My MITLL colleagues and I set out to test the compatibility of such agents with humans, and showed in our NeurIPS 2021 paper that humans strongly disliked those agents. The main cause was the agents’ tendency to be collaborative for most of a game but to take unpredictable and suboptimal actions at crucial moments, frustrating their human counterparts. In follow-up work, I am currently developing methods to predict such human-incompatible behaviors without laborious human experiments. Specifically, I am focusing on the brittleness of deep RL agents, which expect their teammates to choose a best-response action from some strategy equilibrium that was converged upon during training.
Humans, due to bounded rationality, often adopt dominated strategies at crucial moments, causing the RL agents to behave unpredictably. I am actively working on using concepts from the Cognitive Hierarchy Theory literature to measure an RL agent’s brittleness to this “human-like” quality of its teammates. This research thread is due for publication in early 2023. For future work, I hope to continue my game-theoretic study of multiagent interactions, and apply its mathematical rigor to more complex domains using insights from recent successes in computational game theory.

Plans for the future: Alongside my professional career, I have continued my education by taking remote courses from Stanford University on decision-making under uncertainty and optimization, where I became fascinated by neuroevolution and AlphaZero through my final projects. I have also completed MOOCs to fill the gaps in my knowledge of topics like deep RL and game theory. My self-directed learning has greatly benefited my research, but it has been secondary to my full-time job. To make better progress towards my career goal, I plan to return to academia to take classes and work with professors at the pinnacle of the fields I care about. The work done at the USC Department of Computer Science is highly aligned with my research direction. In particular, I am interested in Prof. Erdem Biyik’s work on human-AI collaboration, Prof. Sven Koenig’s work on search and optimization for multiagent systems, Prof. Gaurav Sukhatme’s work on multi-robot systems, and Prof. Stefanos Nikolaidis’s work on quality diversity optimization and scene generation for human-robot interaction. I would also be strongly interested in working on natural language-informed robotics with Prof. Jesse Thomason when he is taking on new students again. The research opportunities and phenomenal pedagogy of USC’s PhD program will prepare me well for my career goals.
Outside of my academic pursuits, my personal history with childhood obesity and diabetes motivates me to keep an active lifestyle: I regularly train and compete in powerlifting, and I served as the vice president of the University of Florida’s Gator Powerlifting Club. With my in-depth research experience, my motivation to take courses while working full-time, my collaborative spirit in working with colleagues in both industry and academia, and my discipline in balancing it all with a demanding sport, I wholeheartedly believe that I would be a valuable addition to the academic community at the University of Southern California, where I will thrive as a community member while pushing the frontier of AI as a researcher.