Statement of Purpose Essay - University of Chicago

Program: PhD, NLP/ML
Type: PhD
License: CC BY-NC-SA 4.0
Source: Public Success Story (Chenghao Yang)

My research focuses on the intersection of machine learning (ML) and natural language processing (NLP). My goal is to build user-centric ML/NLP systems enriched with human knowledge. My research vision has been shaped by my experiences at Tsinghua University, Columbia University, Johns Hopkins University, IBM, and AWS AI. In this statement, I introduce my research views, discuss the previous work that built my research skills and motivates my research, and outline my future plans.

My Research Views: Build User-Centric ML/NLP Systems via Human Knowledge Support

A Call for User-Centric Research

Over the last decade, the rise of neural networks has sparked a series of remarkable achievements in ML/NLP. Commercial applications such as chatbots have entered our daily lives and substantially changed our behavior. Nevertheless, from users' perspectives, I see two major problems in current neural models: 1) over-sensitivity to data noise: even a typo in a user's input can drastically change the model's output; 2) poor explainability for undesirable outputs: state-of-the-art (SOTA) models can produce harmful outputs that contain social bias or expose users' privacy, hurting users' interests, yet we still know little about how these problems arise and cannot give users reasonable explanations. These urgent practical concerns call for user-centric ML/NLP research.

Proposal: Deep Involvement of Human Knowledge

As bigger and bigger models produced by large companies demonstrate surprisingly good performance, a natural yet controversial question arises: do we still need human knowledge to build an intelligent system? My answer is an emphatic yes. First, I believe humans, drawing on their domain-specific knowledge, should play the central role in value judgments about AI systems, rather than relying on automatic judges. Human involvement should therefore go beyond traditional model design and implementation.
It should also include the design of evaluation protocols (e.g., for robustness) and control interfaces (e.g., for controllable generation). Second, human knowledge enables smarter use of resources because it can guide systems to prioritize what to explore. For example, incorporating structured knowledge into models, as in neural-symbolic systems, yields faster convergence and better performance than purely data-driven models.

My Previous Work: Knowledge-Guided ML/NLP Modeling

The first work that shaped my research views is my ACL'20 paper on adversarial robustness (Yang et al., 2020). We made the first attempt to use ontology knowledge to efficiently find high-quality adversarial examples that break SOTA NLP models. Surprisingly, the neural models we examined (including BERT) were so vulnerable that they could be fooled even by simple synonym replacement. In many real-world scenarios, however, users' input will by no means be well formatted, and a user-centric system should be robust to such input perturbations. I realized there was still a long way to go, which motivated my research on building user-centric ML/NLP systems.

Another work that represents my research views well is the neural-symbolic system I built for event stream modeling (Yang et al., 2021a), which, given the history of an event stream, predicts the timing and type of the next event. This is joint work with Prof. Jason Eisner and Hongyuan Mei at JHU. We let domain experts use a formal language (Datalog) to specify the logical structure between events, and our system automatically builds the corresponding neural architectures. The system achieves superior performance over purely data-driven counterparts on e-commerce and healthcare data, and it is highly interpretable because the model closely follows the logic program at run time.
I love this work because it is the first time I managed to offer users a truly flexible interface. It frees users from having to understand the backbone models and lets them concentrate fully on application design. I am excited that this work is a significant step toward my goal. I have also worked on utilizing human knowledge in other domains, including narrative QA (Yang et al., 2021b), social media (Yang et al., 2021c), and compositional generalization (Qi et al., 2019).

My Future Directions: User-Centric ML/NLP Analysis and Evaluation

Holistic Diagnosis for Text Generation Models

Background: There has been a lot of research on analyzing and diagnosing text generation models (e.g., via attribution, probing, or behavioral testing) when undesired outputs (e.g., hallucination, social bias) occur. Diagnosis of individual problems has been widely studied. Unfortunately, easing one problem can give rise to another, because the fix may overfit to spurious input-output correlations and fail to learn transferable knowledge. This is an important lesson I learned from my ACL'20 work.

Proposal: I believe there are inherent correlations among these generation problems, and I propose a holistic diagnosis that simultaneously evaluates and analyzes the identified problems for generation models. We can then better understand what makes a model generate undesired outputs and may be able to develop a systematic fix for many problems at once. As a working example of holistic analysis in text classification, CheckList (Ribeiro et al., 2020) shows that commercial NLP systems that fail lexical perturbation tests also fail negation and SRL tests and many related tasks, indicating potential flaws in their lexical modeling capabilities. With holistic diagnosis, users will be able to dive deeper and better understand why models fail.
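To make the perturbation testing discussed above concrete, here is a minimal sketch of a CheckList-style invariance test. Everything in it is a toy stand-in of my own construction (not code from CheckList or from my papers): a deliberately brittle keyword classifier, a tiny synonym table, and a check that counts inputs whose prediction flips under a label-preserving synonym swap.

```python
# Toy sketch of a CheckList-style invariance test: a prediction should
# survive label-preserving lexical perturbations such as synonym swaps.
# The classifier and synonym table below are illustrative stand-ins.

SYNONYMS = {"great": "terrific", "awful": "dreadful"}

def brittle_sentiment(text):
    # Over-reliant on two surface cue words -- the flaw the test exposes.
    words = text.lower().split()
    if "great" in words:
        return "positive"
    if "awful" in words:
        return "negative"
    return "neutral"

def synonym_perturb(text):
    # Label-preserving perturbation: swap each known word for a synonym.
    return " ".join(SYNONYMS.get(w, w) for w in text.lower().split())

def invariance_failures(model, texts):
    """Return the inputs whose prediction flips under a synonym swap."""
    return [t for t in texts
            if model(t) != model(synonym_perturb(t))]

failures = invariance_failures(
    brittle_sentiment,
    ["a great movie", "an awful plot", "an average film"],
)
# Both cue-word sentences flip (e.g., "great" -> "terrific" loses the
# only cue the model knows), so they appear in `failures`.
```

Real behavioral test suites batch many such perturbation templates (negation, entity swaps, typos) and aggregate failure rates per capability, which is what makes the diagnosis holistic rather than one test at a time.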
Building Robust ML/NLP Models

Background: From both previous work and my own experience, the SOTA models we currently rely on are vulnerable and unstable in two ways: 1) existing learning paradigms (e.g., fine-tuning, prompt-tuning) are sensitive to hyper-parameter and even random-seed choices; 2) models lack domain-specific knowledge and generalize poorly to unseen data, even when those data follow a similar generation process. These problems can bring serious risks to applications.

Proposal: What causes this vulnerability and instability? My first hypothesis is that certain components significantly hurt the stability of the optimization process; if we can identify them and handle their updates carefully, we can obtain a more robust model. For example, the optimization of the classification layer's parameters can be critical because that layer is randomly initialized and can bottleneck the performance of the whole network. Another hypothesis is that, given limited data, intensive tuning increases the risk of overfitting. We can regularize models to behave consistently over neighborhoods of data points (e.g., paraphrased data), where the neighborhoods are specified using domain knowledge (e.g., for sentiment analysis, replacing an irrelevant entity should not change the prediction).

My Fitness to the University of Chicago (UChicago)

I appreciate UChicago for its strong NLP and ML background and communities. Specifically, I am interested in working with: 1) Prof. Allyson Ettinger: I enjoyed reading her work on testing and evaluating the robustness of pre-trained language models (Pandia and Ettinger, 2021) and want to explore building robust models with her; 2) Prof. Chenhao Tan: I like his recent work on decision-focused summarization (Hsu and Tan, 2021), which matches my goal of building user-centric ML/NLP models, and I want to explore this direction further with him.
I believe my in-depth research experience, solid engineering and research skills, willingness to develop real and cutting-edge technology, and collaborative mindset make me a clear fit for UChicago.

Part I – My Selected Publications

Fanchao Qi, Junjie Huang, Chenghao Yang, Zhiyuan Liu, Xiao Chen, Qun Liu, and Maosong Sun. 2019. Modeling semantic compositionality with sememe knowledge. In Proceedings of ACL.

Chenghao Yang, Hongyuan Mei, and Jason Eisner. 2021a. Transformer embeddings of irregularly spaced events and their participants. In submission to ICLR.

Chenghao Yang, Xiangyang Mou, Mo Yu, Bingsheng Yao, Xiaoxiao Guo, Saloni Potdar, and Hui Su. 2021b. Narrative question answering with cutting-edge open-domain QA techniques: A comprehensive study. Transactions of the Association for Computational Linguistics, 9:1032–1046.

Chenghao Yang, Yuan Zang, Fanchao Qi, Zhiyuan Liu, Meng Zhang, Qun Liu, and Maosong Sun. 2020. Word-level textual adversarial attacking as combinatorial optimization. In Proceedings of ACL.

Chenghao Yang, Yudong Zhang, and Smaranda Muresan. 2021c. Weakly-supervised methods for suicide risk assessment: Role of related domains. In Proceedings of ACL-IJCNLP, pages 1049–1057.

Part II – Other References

Chao-Chun Hsu and Chenhao Tan. 2021. Decision-focused summarization. In Proceedings of EMNLP, pages 117–132.

Lalchand Pandia and Allyson Ettinger. 2021. Sorting through the noise: Testing robustness of information processing in pre-trained language models. In Proceedings of EMNLP, pages 1583–1596.

Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, and Sameer Singh. 2020. Beyond accuracy: Behavioral testing of NLP models with CheckList. In Proceedings of ACL, pages 4902–4912.

Notes

1. Work done during an internship at Tsinghua with Prof. Zhiyuan Liu. I was the project leader and proposed the whole project.
2. This work has been submitted to ICLR 2022. I am the main contributor.
3. For prompt-tuning, the corresponding sensitivity is to prompt design.