Statement of Purpose Essay - Carnegie Mellon University

Program: PhD (CV, ML, Graphics)
Type: PhD
License: CC BY-NC-SA 4.0
Source: Public Success Story (Ava Pun)

Statement of Purpose
Ava Pun

I am applying for a PhD in computer science focused on graphics, vision, and artificial intelligence. My objective is to craft better tools for visual artists. This is a broad goal that encompasses many topics, with applications in areas like digital illustration, animation, and photography. However, influenced by recent research experiences, I am currently most interested in investigating neural rendering, scene understanding, and image generation in an artistic context.

I am uniquely motivated to work in this area because I myself am an avid user of artistic tools. I have been involved in educational cartooning for the past eight years, leading projects such as MathSoc Cartoons and Academy 118. My work as a cartoon artist has taught me the value of using engaging visuals and memorable stories to inspire students. By developing more powerful image-creation tools, I hope to enable more educators to enhance their lessons with visual storytelling.

My research background lies primarily in scene understanding and neural rendering of virtual environments. This was in the context of simulating training environments for self-driving vehicles, but the same ideas apply to artistic uses such as augmented reality. I conducted this research over several internships at Uber ATG (later Waabi), publishing one first- and one second-authored paper in top conferences under the guidance of Prof. Raquel Urtasun.

In my first project, AdvSim, I devised an algorithm to seamlessly merge real and simulated LiDAR points. This enabled me to generate safety-critical driving simulations by realistically perturbing the vehicles in a LiDAR scene. I moved from LiDAR simulation to neural rendering in my second project, LightSim, where I developed an urban scene relighting method by first using physically-based rendering (PBR) to relight a reconstructed “digital twin” of the scene. As the relit twin lacked realism, I then used the PBR buffers to inform a neural deferred rendering network, which produced more lifelike images.

The final model could realistically relight most scenes but still struggled to completely remove strong lighting effects such as cast shadows. Hence, I am currently improving it by designing a NeRF-based inverse rendering technique to disentangle lighting effects from intrinsic material properties. My task is particularly challenging because it deals with large outdoor scenes, which current state-of-the-art methods struggle to scale to. I hope to overcome this challenge by exploiting the additional sensor inputs available in a driving scene. For instance, the material that a LiDAR beam hits affects the intensity of the received light, which could be used to infer reflectance. My end goal is to be able to extract 3D geometry, materials, and lighting from any urban scene, enabling the rendering of diverse virtual environments from any viewpoint and under any lighting condition.
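To make the reflectance intuition concrete, here is a minimal sketch (not the method described above) of how a relative albedo could be backed out of LiDAR returns under a simplified Lambertian intensity model; the function name estimate_reflectance, the calibration constant, and the intensity model itself are illustrative assumptions.

```python
import numpy as np

def estimate_reflectance(intensity, points, normals, sensor_origin, calib=1.0, eps=1e-6):
    """Rough relative-albedo estimate from LiDAR returns.

    Assumes a simplified Lambertian sensor model,
        intensity ~= calib * albedo * cos(incidence) / range**2,
    and inverts it per point. Real sensors need per-device calibration
    and care around specular or retroreflective surfaces.
    """
    rays = points - sensor_origin                      # sensor-to-point vectors, shape (N, 3)
    ranges = np.linalg.norm(rays, axis=1)              # beam travel distances, shape (N,)
    dirs = rays / (ranges[:, None] + eps)              # unit ray directions
    # Cosine of the incidence angle between the incoming beam and the surface normal.
    cos_inc = np.clip(np.abs(np.sum(-dirs * normals, axis=1)), eps, 1.0)
    # Invert the intensity model to recover a relative albedo.
    albedo = intensity * ranges**2 / (calib * cos_inc)
    return np.clip(albedo, 0.0, None)

# Toy example: one point 10 m away, hit at a 30-degree incidence angle.
pts = np.array([[10.0, 0.0, 0.0]])
nrm = np.array([[-np.cos(np.pi / 6), np.sin(np.pi / 6), 0.0]])
print(estimate_reflectance(np.array([0.008]), pts, nrm, np.zeros(3)))
```

In an inverse rendering pipeline, an estimate like this would more plausibly serve as a prior or supervision signal refined jointly with geometry and lighting, rather than being used in closed form.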
Before jumping into neural rendering, I spent time exploring other computer graphics topics with artistic applications. Working with Prof. Christopher Batty at the University of Waterloo, I designed and implemented a hybrid method for better artistic control of fluid animations. This method fused Eulerian (grid) with Lagrangian (vortex filament) animation techniques. The former could reproduce effects such as viscosity and buoyancy forces, while the latter was better at preserving small vortex details. Combining the two enabled me to support unique animations such as colliding vortex rings with buoyancy effects, allowing various creative scenes to be produced.

As a double major in Computer Science and Combinatorics & Optimization, I also have a strong mathematical background, and I have done a few part-time research projects on this more mathematical side. In one project, I enumerated and visualized polyform tilings under the supervision of Prof. Craig Kaplan. Though I was initially attracted to this topic due to the use of tilings in art, my work ended up aiding in the discovery of an aperiodic monotile, resolving a 60-year-old open mathematics problem! I also worked with Prof. Ian Munro to design a compact AVL tree representation. I proved that my representation used less than one bit per node, very close to the information-theoretic lower bound of 0.939 bits. These research experiences granted me a more holistic view of computer science and an appreciation for its underlying mathematics.
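A brief note on where the 0.939-bit figure comes from (the notation below is mine, introduced only for this sketch): if $A_n$ denotes the number of AVL trees on $n$ nodes, any representation that distinguishes all of them must spend at least $\log_2 A_n$ bits on a tree of size $n$, so the per-node lower bound is the base-2 exponential growth rate of $A_n$:

\[
\frac{\log_2 A_n}{n} \;\longrightarrow\; \log_2 \alpha \approx 0.939 \quad (n \to \infty),
\]

where $\alpha$ is the growth constant of the number of AVL trees.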
In the future, drawing from my experience as both an artist and a researcher, I am curious to investigate non-photorealistic neural rendering. I have already worked on image relighting for real scenes, but how would this task change when applied to illustrations? The problem of relighting an illustration is very ill-defined, as 2D drawings do not necessarily represent objects that can exist in 3D. Hence, the visual appeal of the shadows becomes more important than their physical plausibility. Perhaps a generative model, trained on many aesthetic artworks, could be tuned to achieve this task? As another example, what about novel view synthesis for cartoons? I would save an incredible amount of time as an illustrator if I could generate new views of a character from a few input drawings. However, little research has been done in this direction thus far.

I applied to CMU in particular because it boasts many faculty whose interests match mine. I am exceptionally eager to work in Prof. Jun-Yan Zhu’s Generative Intelligence Lab, as we share the goal of developing intelligent systems to help visual creators tell their stories. My artistic background uniquely motivates me to work in this lab; as generative AI becomes more widespread, I have seen fellow artists cry out for more creative control and more say over whether their art is used in training datasets. Joining the lab would enable me to address their concerns. As I am a photography enthusiast, I would also love to work with Profs. Srinivasa Narasimhan and Ioannis Gkioulekas in computational imaging.

Ultimately, my career mission is to research intelligent tools that empower creative minds, either by becoming a professor or by joining a department like Adobe Research. I am excited to take the next step toward this goal with a PhD from CMU.

References

[1] Jeremy Chizewer, Stephen Melczer, J. Ian Munro, and Ava Pun. A Compact Representation for AVL Trees. 2023. arXiv: 2311.15511 [math.CO].
[2] Ava Pun, Gary Sun, Jingkang Wang, Yun Chen, Ze Yang, Sivabalan Manivasagam, Wei-Chiu Ma, and Raquel Urtasun. “Neural Lighting Simulation for Urban Scenes”. In: Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS). 2023. url: https://openreview.net/forum?id=mcx8IGneYw.
[3] David Smith, Joseph Samuel Myers, Craig S. Kaplan, and Chaim Goodman-Strauss. An aperiodic monotile. 2023. arXiv: 2303.10798 [math.CO].
[4] Jingkang Wang, Ava Pun, James Tu, Sivabalan Manivasagam, Abbas Sadat, Sergio Casas, Mengye Ren, and Raquel Urtasun. “AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). June 2021, pp. 9909–9918.