Statement of Purpose Essay - University of Washington
Michael Merrill – Research Statement

My interests lie in designing and building population-scale systems which make the internal processes of human cognition and behavior machine readable, particularly in the contexts of mental health and digital interventions. In situ human behavior represents a largely untapped trove of unstructured data, and by learning more about the link between these behaviors and mental illness we can build new tools to combat the growing mental health crisis we face in this country and abroad. I am particularly interested in measuring circadian disruptions and their relevance to mental health and wellbeing, large-scale population sensing, and developing tools for digitally augmented care in clinical contexts.

I began studying mental health at a population scale in the summer of 2014 when, after my freshman year at Cornell, I interned at a public health non-profit. With only my introductory programming course under my belt, I taught myself how to build a very simple Naïve Bayes sentiment analysis engine for geo-tagged tweets. The classifier performed not far above baseline, but I enjoyed the challenge and found it rewarding to watch the live results pour in, knowing that there was a person and a thought behind each hastily classified tweet. I know now that it’s not always valid to assume that how someone behaves online is linked to their “real world” experience, and I remember that I neglected to account for potential sampling biases in the Twitter API and demographic differences between neighborhoods. Rather than discouraging me, the project taught me the importance of a theoretical framework to underpin applied experiments like these.

The following summer at Cornell I began working in the People Aware Computing Lab under the supervision of Dr. Tanzeem Choudhury.
While I contributed to several projects in the two years between then and graduation, my primary focus was on the groundwork for CrossCheck: a passive mobile monitoring system for the prediction of relapse in patients with schizophrenia. At the time of publication, CrossCheck was the longest longitudinal smartphone-based study of patients with severe mental illness. Since our cohort experienced fewer relapse events than we expected, our first paper focused on finding correlates with self-reported daily measures of mood. Schizophrenic relapse is not a binary event. Patients and caregivers alike report lengthy declines, in which any of a variety of symptoms may manifest, including sleep disruptions, polar swings in sociability, or trouble organizing thoughts. Accordingly, the project’s greatest challenge was adapting our analysis to account for these crucial variations in behavior between participants. We found that models which were trained on the union of population data and just 20% of a participant’s data performed 60% better than models trained on population data alone. Reflecting on this experience gave me insight into what excites me professionally and personally: that we were able to find results that not only confirmed clinical expectations but also delivered vital new insights into private and subtle signals which may have been impossible to measure before the proliferation of mobile sensing.

Due in part to the significance of my contributions to PAC Lab and CrossCheck, I was able to attend UbiComp 2016 in Heidelberg, Germany. Above all else, the talks I heard and conversations I shared imbued in me a humbling sense of community. One night, after the main reception, I found myself back in the Airbnb of a group of students. For an hour or two we went back and forth, helping one student reshape the presentation he’d give the next morning. I was amazed by the camaraderie and commitment to excellence I’d felt not just in that room but also at the conference at large.
During the summer before my senior year I interned with Dr. Anind Dey in the Ubicomp Lab at Carnegie Mellon. While at CMU I recognized that although there were several open-source mobile sensing platforms (AWARE, Purple Robot, Funf), there was no unifying standard for the analysis and visualization of these data. My vision was to engineer a data pipeline which could load smartphone sensor data and self-reports from an arbitrary source and then perform visualization and feature extraction. In hindsight, given my skill set at the time, it was ambitious of me to think the project could be finished within the scope of one summer. I was able to complete the feature processing library and deliver early analysis on a cohort of CMU students, but the single-core SQL and Pandas pipeline ran too slowly to be useful in the long term. However, since entering industry, I’ve successfully built a similar system for internal use.

Following my summer at CMU I built upon my work with one of Dr. Dey’s students by recruiting 66 of 79 members of a Cornell social fraternity for a cohort analysis study on the feasibility of predicting friendship through smartphone sensor data. We found some promising results, and are working to develop our findings for future publication. The study and the dataset have left me wondering if it is possible to measure the propagation of negative affect through an insular network and whether there is a measurable community component to depression. If we observe negative affect in one member of a network of digital traces, what does this tell us about another member who shares social or physical ties? Could we measure how a bad mood spreads through a group?

At the start of my senior year I began working part-time for Dr. Choudhury's mobile sensing startup, HealthRhythms, and following graduation I joined the company as its second full-time employee.
It was a difficult decision to attach myself to the company rather than apply directly to graduate school, but I wanted to take two years to refine my technical skills and see the end of the research threads that started in PAC Lab and led into the startup. Presently, my research focuses on discovering links between smartphone sensor data and self-reported symptoms in patients with bipolar disorder, depression, and chronic anxiety. My publicly available work through the startup has produced two posters, and we are currently engaged with an intellectual property law firm to begin filing my first patent on behalf of the company.

A key tenet of HealthRhythms’ mission is understanding the role that circadian disruptions play in the perpetuation of mental health disorders. For one project, I designed and implemented a proprietary sleep detection algorithm that performs on par with existing state-of-the-art classifiers but with significantly lower computational overhead. I have also prototyped novel methods for measuring deviations from daily routine which may be linked to downturns in mental health.

As part of my day-to-day work at HealthRhythms I’ve been involved with the planning, management, and analysis of a dozen studies of varying size through our academic and pharmaceutical partners. This access has been a crash course on what separates a successful clinical study from a failure.

My work at HealthRhythms has been rewarding, but has also opened up new curiosities about how smartphone data can be fused with other modalities. We have found that the larger the scale of a clinical study, the harder it becomes to practically collect detailed data. While it’s possible to capture behavioral data from massive cohorts with mobile sensing, studies that employ ground truth measurements like sleep actigraphy are often limited in scale due to budgetary constraints and logistical practicality.
These challenges ignited my interest, leading me to wonder how we could employ a technique like transfer learning to combine broad information from large datasets with higher resolution data from smaller deployments. Moreover, how do we leverage population-level data to improve the measurement of mental illness with personalized models?

Most recently my interests have expanded beyond mobile interventions. Time and time again the gold standard for the treatment of depression and anxiety has proven to be a combination of medication and, crucially, a strong clinician-patient relationship. I’ve become interested in employing natural language processing methods to study the ways in which patients with depression and anxiety use language to express their symptoms in clinical settings. Standard clinical assessments like the HAM-D rely on the subjective translation of patient interviews into ordinal scales. Perhaps if we can better measure the latent patterns that link our words and mental state we will be able to build new assessments that better augment existing therapeutic practices. Furthermore, the prospect of healing after a mental health diagnosis can be daunting for patients, particularly without a constructive vocabulary with which to discuss trauma. Could a digital therapeutic assistant leverage conversations with previous patients to direct a conversation with another patient in a promising new direction?

To me, a PhD would mean the opportunity to develop the cognitive tools I need to answer these questions and branch out in new directions. I’ve tried tackling some of these problems on my own, but every time I feel like I’ve made a dent I realize that I’ve only broken into another layer of depth. It’s easy enough for me to read a paper on some new technique or model and implement it, but I feel it’s another thing entirely to leverage ideas from the community and build upon them to contribute my own answers.
When I look forward and wonder about what the rest of my career might look like, I simply can’t imagine spending it floating on top of superficial understanding while the most interesting problems flow underneath. My two years in industry have rewarded me with hands-on experience and grown my understanding of existing limitations, but above all they’ve helped me realize that if I stay on this path it’s unlikely I’ll have the chance to tackle long-shot problems and take my conceptual mastery of math, mental health, machine learning, and human-computer interaction to the next level. From my perspective, only a research-based education can properly do that.

What excites me about the University of Washington is that it represents an intersection of familiar territory and new opportunities to grow and learn. Two of my collaborators, Dr. Dey and Dr. Dror Ben-Zeev (Department of Global Health), have recently made the move to join UW’s faculty, and I’d be overjoyed to pick up old threads with them. I would be particularly interested in working with a new faculty member, Dr. Tim Althoff. The scale and importance of his questions have created a trail of papers that I’ve read and admired. His work provides grounded solutions without abandoning rigor or theoretical fundamentals, and I believe that my background in mobile sensing, mental health, and circadian rhythmicity puts me in a position to make meaningful contributions early on.