Jie-Ying Lee 李杰穎

Ph.D. Student @ NYCU CS
Software Engineer @ Google Pixel Camera

Email: [email protected]

CV / Scholar / LinkedIn / GitHub / X / Threads / Blog


About Me

I’m a Ph.D. student in Computer Science at National Yang Ming Chiao Tung University, advised by Prof. Yu-Lun Liu, and a Software Engineer on Google’s Pixel Camera Team. I work on 3D scene synthesis, generative models for vision, and embodied AI, particularly focusing on Neural Radiance Fields, 3D Gaussian Splatting, vision-language navigation, and on-device perception.

I received my B.S. in Computer Science from National Yang Ming Chiao Tung University, with an exchange semester at ETH Zurich. My industry experience includes internships at Google (Pixel Camera Team), Microsoft, and Appier.

I am actively seeking research collaborations. If you are interested in working with me, please don't hesitate to reach out.

Google: Software Engineer (2025 - Present)
NYCU: Ph.D. Student (2025 - Present); B.S. in Computer Science (2021 - 2025)
ETH Zurich: Exchange Student (2024 - 2025)

News

Publications

Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery
We present Skyfall-GS, a framework that synthesizes photorealistic, city-block-scale 3D urban scenes from satellite imagery using diffusion models, eliminating the need for expensive 3D scanning and manual annotation while enabling real-time exploration.
LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal
We present LightsOut, a diffusion-based outpainting framework that enhances lens flare removal by reconstructing off-frame light sources. Our approach combines a multi-task regression module with LoRA-fine-tuned diffusion models to produce realistic, physically consistent results.
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation
We present See, Point, Fly (SPF), a training-free framework for aerial vision-and-language navigation. By leveraging vision-language models and reformulating navigation as a 2D spatial grounding task, SPF enables universal unmanned aerial navigation without task-specific training.
AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting
We introduce AuraFusion360, a reference-based 360° scene inpainting method with three key innovations: depth-aware occlusion identification, Adaptive Guided Depth Diffusion for zero-shot point placement, and SDEdit-based enhancement for multi-view coherence.
SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes
We present SpectroMotion, the first 3D Gaussian Splatting method capable of reconstructing photorealistic dynamic specular scenes. By combining 3DGS with physically-based rendering and deformation fields, we achieve high-quality synthesis of challenging real-world dynamic reflective surfaces.
BoostMVSNeRFs: Boosting MVS-based NeRFs to Generalizable View Synthesis in Large-scale Scenes
We present BoostMVSNeRFs, a method that enhances rendering quality for MVS-based NeRFs in large-scale scenes. Our approach addresses key limitations including restricted viewport coverage and artifacts from limited input views, enabling generalizable view synthesis in complex environments.

Service

Misc.

Beyond research, I’m passionate about staying active through badminton and hip-hop dance. I also enjoy capturing moments through photography.

Music-wise, I’m into Taiwanese indie and hip-hop, frequently listening to artists like Gummy B, 草東沒有派對 (No Party For Cao Dong), and 國蛋 GorDoN.