Carnegie Mellon University Ph.D. Statement of Purpose Essay Sample

I want to spend my career doing work that uses math and computer science to make the world more private, secure, and fair. I am excited about cryptography, both applied and theoretical. Within cryptography, I am interested in all kinds of privacy-enhancing technologies, but I’m particularly excited about work in Zero-Knowledge Proofs (or arguments) and Private Information Retrieval (PIR). I would like to contribute to the theoretical development and practical applications of new ideas related to these primitives.

Inspired by my work with Shay Gueron at Meta (and work like [6]), I am interested in one concrete direction in Zero-Knowledge Proofs: designing proof systems that are fast on modern computer hardware, such as CPUs, GPUs, or the more novel hardware accelerators that are becoming common in data centers. There is some existing work on designing hardware accelerators for Zero-Knowledge Proofs, such as [10], but less work on designing schemes with properties suited to hardware acceleration, such as convenient memory access patterns or the ability to precompute or reuse expensive intermediate results.

A direction in PIR that I’m interested in is finding new real-world use cases for PIR and building systems for them. There are some existing use cases, like blockchain light clients or privately accessing embeddings for machine learning models [7], but I think PIR is an interesting primitive that can likely solve other important problems. In addition to solving real-world problems, having more use cases would provide richer benchmarks and practical experience to guide the development of new PIR protocols.

**Previous Research Work:** In my undergraduate research, I mostly worked with Ari Juels’ group. I am proud of the work I did on CanDID (published in IEEE S&P 2021 [8]), a system exploring designs to solve the problem of “decentralized identity”.
The system could “bootstrap” a decentralized and pseudonymous identity from existing real-world credentials, like a Social Security Number. I used Intel SGX to implement a part of the system that generated these identities. This part could also be a “drop in” replacement for the equivalent Multi-Party Computation-based implementation we had developed for the project. One interesting challenge I worked on was implementing an elliptic curve-based hash function (Baby Jubjub) from scratch within the constraints of an SGX enclave.

I also worked on a project called “Clockwork Finance” [2], which tried to systematically analyze “MEV” (see [4] for an introduction), a consequence of how some cryptocurrency protocols are underspecified. I worked on distributing our large computations on AWS so they could be done faster and more cheaply.

Outside of school, I am proud of some of my contributions to the cryptography community. I made some contributions to the open-source book Mastering Ethereum [1], and I actively edit sections of Wikipedia related to cryptography and theoretical computer science. I have also made some cryptography-related contributions to open-source projects like Facebook’s folly [9].

As a member of the Cryptography Infrastructure team at Meta, I have worked on some interesting cryptography-related projects, collaborating with the cryptographers Shay Gueron and Rafael Misoczki. For instance, we worked on creating a new mode of operation for the AES block cipher with a nonce space large enough for all practical purposes (as well as features like authentication and key commitment). I wrote an optimized assembly implementation, gave feedback on the design and security proof, and deployed it to production. The new “Double Nonce AES GCM” mode was 2-3 times faster (depending on the input size) on modern processors than XChaCha20-Poly1305, the most popular “extended nonce” encryption algorithm at Meta.
I have also contributed to discussions about designs for cryptographic systems, such as NIST discussions around PQC and block cipher modes of operation, applications of new Trusted Execution Environments (TEEs) at the company, and uses of Fully Homomorphic Encryption and Multi-Party Computation for serving ads. I also worked on a project to measure the security of code produced by different LLMs [3]. In recognition of my work, I received a “Redefines Expectations” performance rating in 2022, the highest rating, given to 1-5% of engineers at Meta.

**Why a Ph.D.?** After 2 years as an engineer at Meta, I feel that I have a good understanding of what I like about my job. I love the parts of my job that involve reading papers, tuning algorithms, and having input on designs for important real-world systems. I wish I could do more of this work. I also wish there were more freedom to pursue more ambitious or abstract projects: the projects I work on need “sign-offs” from many people, as well as business justification. I also like doing math, and there is not very much mathematics at my job. I get to do some mentorship, but I miss the experiences I had teaching students in my 7 semesters as a Teaching Assistant at Cornell. Finally, I am very excited about how fast research in all kinds of privacy-enhancing technologies is moving, and I would desperately like to be a part of it.

**Interest in CMU:** A major reason for my interest in Carnegie Mellon is the set of possible advisors there. Elaine Shi is one advisor whom I would like to work with. I attended her talk at Meta about Piano [11], and we had a short discussion about the idea of using PIR in the context of a cloud authorization system. I thought the talk and its ideas were very cool. My previous experience working on research in blockchains may be useful for working together. We are interested in similar cryptographic primitives, and I think we could be productive working together.

Another potential advisor whom I would be interested in working with is Riad Wahby. We are both excited about proof systems and other work in secure computation. He has recent papers on building efficient proof systems (like [5], which is very close to the kind of work I would like to do in graduate school). Additionally, I have experience from my career working with all kinds of cryptographic hardware, like a few kinds of proprietary Hardware Security Modules, Trusted Execution Environments (Intel SGX and AMD SEV-SNP), and hardware instructions for cryptography (Intel’s AES-NI, GFNI, and SHA extensions). Overall, I think there is a good alignment of my interests and experiences with Riad Wahby, and that we could do interesting work together.

I am also excited about the potential to work with Wenting Zheng. As previously mentioned, I really enjoyed Elaine Shi’s talk about Piano [11], which Wenting Zheng co-authored. Furthermore, I am also excited about building practical systems that use interesting cryptographic primitives, and I think my experience doing similar work at Meta may be useful in such research. We are both excited about primitives like MPC or PIR, and I think it would be great to work together in these areas during a Ph.D.

Besides advisors, I am also interested in CMU because CyLab seems like a good environment to grow as a researcher. Additionally, I enjoyed a talk by CMU Ph.D. student Quang Dao at the Cornell Security Seminar. It would be great to have him as a colleague.

**Conclusion:** I want to earn a Ph.D. in Computer Science because I think it’s the best way to continue doing work that I find impactful and interesting. This is in pursuit of my eventual goal of working as a researcher at a company or another large institution. Thank you for your consideration.

**References**

[1] A. Antonopoulos and G. Wood. Mastering Ethereum: Building Smart Contracts and DApps. O’Reilly Media, 2018. https://github.com/ethereumbook/ethereumbook.

[2] K. Babel, P. Daian, M. Kelkar, and A. Juels. Clockwork Finance: Automated analysis of economic security in smart contracts. 2023 IEEE Symposium on Security and Privacy, 2023. https://arxiv.org/abs/2109.04347.

[3] M. Bhatt et al. Purple Llama CyberSecEval: A secure coding benchmark for language models. Preprint/Workshop at NeurIPS 2023, 2023. https://arxiv.org/abs/2312.04724.

[4] Ethereum Foundation. Maximal extractable value (MEV). https://ethereum.org/en/developers/docs/mev/.

[5] A. Golovnev, J. Lee, S. Setty, J. Thaler, and R. S. Wahby. Brakedown: Linear-time and field-agnostic SNARKs for R1CS. Cryptology ePrint Archive, Paper 2021/1043, 2021. https://eprint.iacr.org/2021/1043.

[6] S. Gueron. Intel® Advanced Encryption Standard (AES) New Instructions Set. 2010. https://www.intel.com/content/dam/doc/white-paper/advanced-encryption-standard-new-instructions-set-paper.pdf.

[7] M. Lam, J. Johnson, W. Xiong, K. Maeng, U. Gupta, Y. Li, L. Lai, I. Leontiadis, M. Rhu, H.-H. S. Lee, V. J. Reddi, G.-Y. Wei, D. Brooks, and G. E. Suh. GPU-based private information retrieval for on-device machine learning inference, 2023.

[8] D. Maram, H. Malvai, F. Zhang, N. Jean-Louis, A. Frolov, T. Kell, T. Lobban, C. Moy, A. Juels, and A. Miller. CanDID: Can-do decentralized identity with legacy compatibility, sybil-resistance, and accountability. 2021 IEEE Symposium on Security and Privacy, 2021. https://eprint.iacr.org/2020/934.

[9] Meta Platforms. Folly: Facebook open-source library. https://github.com/facebook/folly.

[10] Y. Zhang, S. Wang, X. Zhang, J. Dong, X. Mao, F. Long, C. Wang, D. Zhou, M. Gao, and G. Sun. PipeZK: Accelerating zero-knowledge proof with a pipelined architecture. 2021 International Symposium on Computer Architecture, 2021. https://people.iiis.tsinghua.edu.cn/~gaomy/pubs/pipezk.isca21.pdf.

[11] M. Zhou, A. Park, E. Shi, and W. Zheng. Piano: Extremely simple, single-server PIR with sublinear server computation. IEEE S&P 2024, 2023. https://eprint.iacr.org/2023/452.