
Statement of Purpose Essay - MIT

Program: PhD, NLP, HCI, Computational Social Science
Type: PhD
License: CC BY-NC-SA 4.0
Source: Public Success Story (Hang Jiang)

My interest in artificial intelligence (AI) stems from courses in linguistic theory, cognitive psychology, and programming during my freshman year at Emory University. These classes prompted me to ask two fundamental questions. First, how do we create machines that can master human language? Second, how do humans, especially children, learn to express meaning using language? During my undergraduate years at Emory, I became particularly interested in natural language processing (NLP) while working with Jinho D. Choi. I then enrolled at Stanford as a graduate student in Symbolic Systems and worked with Michael C. Frank on studying cognitive development with computational models. I am also passionate about applying AI for social good. To me, the highly interdisciplinary nature of your program represents my favorite approach to the study of AI. In the following sections, I discuss my research interests in three areas: machine language understanding, human language acquisition, and AI for social good. My research and work experience in these fields has prepared me well for graduate study in the program.

Machine language understanding. To "create machines that can master human language," I have been doing research in NLP and deep learning. I joined Prof. Choi's Emory NLP lab in 2016 and worked on several research projects, including adapting Universal Dependencies to Chinese, applying recurrent neural networks to disfluency detection, using a sequence-to-sequence model for Chinese word segmentation, and creating a novel dialogue dataset and approach for personality recognition. Through the personality project, I received training in collecting annotations on Amazon Mechanical Turk, publishing a new dialogue dataset for personality recognition, and implementing novel attentive neural models for the task.
This project helped me win highest honors with my thesis, Automatic Personality Prediction with Attention-Based Neural Networks [1]. I recently extended the thesis with additional results from pre-trained contextual embeddings such as BERT and RoBERTa [2,3], and an abstract was accepted (strong accept) to the Student Abstract and Poster Program at AAAI 2020 [4]. At Stanford, I began tackling more challenging NLP problems, such as question answering (QA) and language modeling (LM). In the NLP with Deep Learning course taught by Christopher Manning, my team won the best poster prize for our work, "Ensemble BERT with Data Augmentation and Linguistic Knowledge on SQuAD 2.0" [5]. We used a novel data augmentation technique, replacing words with their WordNet synonyms in the SQuAD 2.0 dataset, and fine-tuned BERT models on TPUs. We then collated predictions from BERT, Bi-Directional Attention Flow (BiDAF) [6], and prior linguistic knowledge to build a robust QA system. The following quarter, I started a new project in Chris Potts's natural language understanding class and collaborated with Dr. Vivek Kulkarni to publish my co-first-authored paper, DialectGram: Detecting Dialectal Variation at Multiple Geographic Resolutions, at the Society for Computation in Linguistics (SCiL), including an oral presentation [7]. For this project, I implemented DialectGram, a novel and scalable approach based on a multi-sense skip-gram model, to encode linguistic variation among English dialects. I also constructed a new corpus, Geo-Twitter2019, and created a validation set, DialectSim, for quantifying linguistic variation between British and American English. DialectGram not only outperforms the baseline methods in accuracy but also detects dialectal variation at multiple levels of geographic resolution, obviating the need for an a priori choice of resolution level.

Human language acquisition.
To understand how "humans, especially children, learn to express meaning using language," I have also studied language learning with computational methods. I was first introduced to theories of language acquisition at Emory. Through countless readings and debates, I learned to critically examine linguistic and cognitive theories from Saussure, Wittgenstein, Chomsky, and Tomasello. On the one hand, I am amazed by the underlying linguistic structures depicted by Chomsky's Universal Grammar; on the other, I tend to favor emergentist theories such as MacWhinney's competition model [8] and Tomasello's functional theory of language development [9]. In summer 2017, I contacted Brian MacWhinney by email and got the opportunity to work closely with him on improving recognition performance on speech from AphasiaBank using EESEN [10], an open-source LSTM-based automatic speech recognition (ASR) system. By optimizing the training process, I reduced the word error rate (WER) on aphasic speech, making the ASR system a promising tool for aphasia patients. I also designed and built Funetics, an intelligent tutoring application for language learning. The app, built on the MEAN (MongoDB, Express, Angular, Node) stack, interacts with English learners and provides them with customized feedback on pronunciation. This experience with Prof. MacWhinney allowed me to combine my interests in language learning and machine learning, which motivated me to apply to the unique Symbolic Systems program at Stanford. After entering Stanford, I was fortunate to work closely with my advisor, Prof. Michael C. Frank, and Dr. Abdellah Fourtassi, who studies child language acquisition using computational models. We have applied various computational tools, including social networks and word embeddings, to model how children learn words.
Our current project aims to automatically detect linguistic changes in semantic development on the CHILDES dataset [11] and to explore how concepts evolve over time; an accompanying paper spanning NLP and computational cognitive science is in preparation. Beyond this, I am also actively working on a project that combines the Rational Speech Act framework [12] with deep learning to model pragmatic reasoning. Going forward, I believe that a better understanding of language acquisition will prove to be the key to breaking the bottleneck in contemporary AI research.

AI for Social Good. In terms of AI applications, I am highly motivated to use computational tools for the wider social good, especially in areas such as mental health, education, and sustainable development. I want to make AI tools available to everyone and applicable across different fields. My favorite quote is Andrew Ng's "AI is the new electricity," and I believe that AI is freeing humans from repetitive labor, allowing people to make the best use of their creativity in the arts and sciences. At Emory, I worked with Prof. Roberto Franzosi in Sociology to co-lead development of the Program for Computer-Assisted Coding of Events (PC-ACE). Built on Stanford CoreNLP and ClausIE, this software is designed to help social science researchers without a programming background conduct research in computational social science. I built two major pipelines that extract subject-verb-object (SVO) triplets, one for constructing interpersonal social networks and one for detecting change points in narrative analysis. I also volunteered to teach more than fifty students in a Linguistics capstone class how to use the program for their research projects. Seeing the tools I developed put to use in their research was deeply fulfilling.
The following summer, I interned at the Educational Testing Service (ETS), the organization that runs the TOEFL and GRE exams, and applied NLP technologies to practical educational projects. Advised by Martin Chodorow, Dr. Nitin Madnani, and Dr. Aoife Cahill, I rebuilt the preposition error detection component of the e-rater system [13] by integrating hand-crafted linguistic features from both constituency and dependency parse trees. The new system achieved a 10% increase in recall while maintaining the previous 90% precision. ETS integrated my component into e-rater, a tool that marks grammar mistakes for human raters, eases the monotonous process of grading, and safeguards the fairness of the grading process, since human raters are not always able to detect grammar errors as consistently as machines. This past summer, I interned at Apple, working alongside Dr. Bing Zhao and Dr. Vivek Kumar Rangarajan Sridhar. During this internship I implemented text summarization models, built iOS prototypes, and contributed to the internal NLP framework to make text input easier for iPhone users. Most importantly, I learned a great deal about how to apply AI in products without compromising ethics. At present, I am working with PhD students from the Geophysics Lab and the Future Data Systems Lab to detect earthquakes using neural networks. These applications excite me, and I hope to wield the power of AI wisely to solve more social problems. In the long term, I aim to become a leading researcher and professor in AI, exploring how to teach machines language and how humans learn language. I enjoy teaching as much as research and hope to nurture the next generation of leaders in AI. I am currently a teaching assistant (TA) for CS145: Data Management and Data Systems, taught by Dr. Shiva Shivakumar, in Fall 2019, and will be a TA for CS224N: NLP with Deep Learning, taught by Christopher Manning, in Winter 2020.
I am therefore very excited about the PhD training and the research projects in your program. I am ready to take my next steps toward a lifelong career in artificial intelligence, and I will go the extra mile to make a lasting contribution to the program's cutting-edge research, to this field, and to the world at large.

References

[1]: Jiang, Hang. Automatic Personality Prediction with Attention-Based Neural Networks. Electronic Theses and Dissertations, Emory University, Atlanta, GA. (2018).
[2]: Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
[3]: Liu, Yinhan, et al. "RoBERTa: A robustly optimized BERT pretraining approach." arXiv preprint arXiv:1907.11692 (2019).
[4]: Jiang, Hang, Xianzhe Zhang, and Jinho D. Choi. "Automatic text-based personality recognition on monologues and multiparty dialogues using attentive networks and contextual embeddings." To appear in Proceedings of the 34th AAAI Conference on Artificial Intelligence: Student Abstract and Poster Program (AAAI:SAP'20). New York, USA. (2019).
[5]: Zhou, Wen, Xianzhe Zhang, and Hang Jiang. "Ensemble BERT with data augmentation and linguistic knowledge on SQuAD 2.0." CS224N Technical Report. (2019).
[6]: Seo, Minjoon, et al. "Bidirectional attention flow for machine comprehension." arXiv preprint arXiv:1611.01603 (2016).
[7]: Jiang, Hang*, Yuxing Chen*, Haoshen Hong*, and Vivek Kulkarni. "DialectGram: Automatic detection of dialectal variation at multiple geographic resolutions." To appear in Proceedings of the Society for Computation in Linguistics. New Orleans: Linguistic Society of America. (2019).
[8]: Bates, Elizabeth, and Brian MacWhinney. "Competition, variation, and language learning." Mechanisms of Language Acquisition (1987): 157-193.
[9]: Tomasello, Michael. "Language is not an instinct." (1995): 131-156.
[10]: Miao, Yajie, Mohammad Gowayyed, and Florian Metze. "EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding." 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). IEEE, 2015.
[11]: MacWhinney, Brian. The CHILDES Project: Tools for Analyzing Talk, Volume II: The Database. Psychology Press, 2014.
[12]: Frank, Michael C. "Rational speech act models of pragmatic reasoning in reference games." (2016).
[13]: Attali, Yigal, and Jill Burstein. "Automated essay scoring with e-rater® v.2.0." ETS Research Report Series 2004.2 (2004): i-21.