Statement of Purpose Essay - University of Pittsburgh

Program: PhD, NLP, Computational Social Science
Type: PhD
License: CC BY-NC-SA 4.0
Source: Public Success Story (Andrew Aquilina)

While most European countries observed a downward trend in the number of persons at risk of poverty or social exclusion by late 2017, my home country reported the contrary [1], with roughly one in five Maltese falling into such a category. This is the alarming statistic I was introduced to at the outset of a weekend-long hackathon. Shocked by this, I spent the following 48 hours seeking ways in which loneliness within the elderly community could be alleviated through social media and language use. This experience prompted me to reflect upon the societal implications of the AI technologies I was studying back at university and eventually shaped my interests at the intersection of NLP and computational social science. As a result, I am motivated to harness language and its situated nature to effect positive societal change by pursuing the following research questions and career aspirations.

1. How can we utilise language to understand human social behaviour, and in turn, guide policy addressing societal issues?

The NLP courses I undertook during my time at the University of Malta deepened my understanding of language, specifically how it can inadvertently reveal our own thoughts and shape the way we present ourselves to one another. Inspired by this, I reached out to Dr. Charlie Abela with a research proposal in which I aimed to explore the relationship between users’ personalities and social connections within a real-world Twitter network. By employing a wide range of personality-annotated datasets and psycholinguistic lexicons, such as LIWC and Sensicon, I trained ML models to recognise users’ personality traits from their own tweets and evaluated them against an open-vocabulary alternative [2]. I subsequently integrated users’ personality preferences for the accounts they follow into topological link predictors and demonstrated improved precision.

This research not only gave me first-hand experience in initiating and leading a project concerning NLP and network science, but also resulted in a first-author publication at the ACM Symposium on Applied Computing [3] and culminated in my graduation with the highest honours. I took great pride in achieving such milestones, especially given my personal background and the limited opportunities available at my university.

In my future research, I hope to continue studying how individual differences manifest in linguistic expression, and how they play out in relation to broader social dynamics, such as group cohesion, misinformation spread, and bias perpetuation. By doing so, I intend to shed light on collective behaviours and uncover why individuals behave the way they do when confronted with such phenomena. My end goal is to inform social science research and pave the way for healthier, more inclusive societies. However, such an objective cannot be pursued if the NLP models we use are not ethically grounded.

2. How can we develop NLP models and pipelines that effectively mitigate their potential societal risks and maintain their ethical integrity?

As part of my coursework in ‘LIN3012 Data-driven NLP’, I observed the direct consequences of overexposing NLP models to real-world data, specifically when I chose to investigate gender bias within Maltese word embeddings [4]. I showed that certain occupational roles were biased exclusively towards men, with few corresponding biases present exclusively towards women. I mitigated this effect by removing gender inflections [5] and applying the ‘hard-debias’ method [6]. The limited prior efforts addressing bias in my own native language made this project a challenging one, but through it, I became aware of the difficulties surrounding gender bias mitigation techniques in gender-marked languages.

In light of this experience, I am drawn to research that seeks to uncover the potential risks associated with NLP technologies, especially those that manifest themselves in various forms and across different languages. I also hope to contextualise such a pursuit within an intersectional perspective, taking into account social dimensions such as race and class.

Unfortunately, NLP technologies have far-reaching societal ramifications that extend beyond the propagation of biases. During my master’s at Stockholm University, I collaborated with Prof. Panagiotis Papapetrou to investigate how podcast comprehension can be improved using a pipeline combining topic segmentation and text summarisation methods [7]. The short time frame allocated to this project, while a formidable challenge, prompted me to take a step back and consider the broader ethical implications of such work. For instance, text summarisation models trained on human-generated text, or evaluated by annotators with skewed demographics [8], are prone to producing inaccurate or even harmful perceptions of the original content. I hope to delve deeper into how ethical frameworks could be established for similar NLP pipelines, with the aim of ensuring that they produce desirable outputs when deployed.

Career Plans and PittSCI

With my educational background grounded in AI, paired with my industry and research experience, I am confident in my decision to take the next step into research. My long-term career goal is to become a professor or academic researcher in the domains of NLP and computational social science, while also collaborating with researchers from other fields, such as social science and linguistics. With this goal in mind, I believe that pursuing a Ph.D. is the next essential stepping stone in my journey.

At PittSCI, I hope to work with Prof. Yu-Ru Lin and Prof. Xiang Lorraine Li. Specifically, I hope to work with Prof. Lin on ways to leverage language use and individual differences to enrich our understanding of social phenomena. In a conversation with Prof. Lin, I was privileged to learn more about her research within the PICSO lab, and in particular, her recent work mapping language literacy using Facebook data. Working on similar projects under her guidance would enable me to make meaningful contributions in the areas I have outlined above.

I would also like to collaborate with Prof. Li on ways to understand the societal risks of NLP technologies. I see great potential in benefiting from her guidance and expertise in evaluating large language models’ implicit commonsense knowledge as I seek to uncover and mitigate their social repercussions.

The collaborative and interdisciplinary nature of PittSCI, alongside its commitment to socially impactful research, is highly apparent. Therefore, given my research vision and its interdisciplinary component, I firmly believe it is the ideal place to pursue my doctorate.