Wonseok Oh

I am a first-year M.S. student in Electrical and Computer Engineering (computer vision track) at the University of Michigan.

I am currently a research intern under the guidance of Professor Andrew Owens, focusing on multi-modal learning using sound.

Previously, I was a research assistant at ETRI, Daejeon, South Korea, working with Kimin Yun and Yongju Lee in the summers of 2021 and 2022.
I received a B.S. in Computer Science and Engineering from Korea University, along with a B.E. in the Software Technology and Enterprise Program and in Chemical and Biological Engineering, in 2022 (triple major!).

My research interests are Computer Vision, Machine Learning, and Robotics. I'm most interested in computational creation using generative models and robot perception.

I'm looking for a summer internship and a lab to work with starting in the fall semester of 2025 :)

Recent Projects

EECS556 Image Processing (Instructor: Liyue Shen)

  1. Enhancing Multi-View Optical Illusion Generation with Latent Diffusion Models and Traditional Image Processing.
    Improve the perception of multi-view optical illusions by combining latent diffusion models with traditional image processing techniques. Our approach generates high-quality images from limited data and provides diverse transformations, including rotations and flips, to enhance the details usually obscured in 2D images.

EECS504 Foundations of Computer Vision (Instructor: Jason Corso)
Selected as one of the best projects

  1. Domain transfer of sketched facial images into realistic facial images to prevent crime (Vision Detectives).
    Improve forensic methods by generating detailed images of criminals from sketches using the pSp model. Then refine these images by applying specific characteristics through InstructPix2Pix. Evaluate the approach with real sketches.

EECS598 Action and Perception (Instructor: Stella Yu)

  1. Robotic Adaptation Strategies: From Simulation to Real-World Execution with RMA.
    Enhanced and evaluated Rapid Motor Adaptation (RMA) for legged robots. Key components include fine-tuning RMA to better adapt to diverse terrains, comparing phase-1 and phase-2 adaptations, and methodically analyzing performance across a range of environmental challenges in the real world.

EECS542 Advanced Topics in Computer Vision (Instructor: Andrew Owens)

  1. Multimodal with Latent Diffusion Models to Advance Multi-View Optical Illusion Generation.
    Proposed a method that enhances latent diffusion models with multimodal inputs, including sound and text, to generate dynamic multi-view optical illusions. Demonstrated the feasibility of multimodal approaches in enhancing optical illusion generation, as evidenced by superior CLIP scores.

EECS598 Biomedical Imaging (Instructor: Liyue Shen)

  1. Generate Domain Knowledge Diffusion Models employing the Schrödinger Bridge to enhance data representation for rare diseases.
    Generated high-fidelity images for imbalanced medical datasets by maintaining domain integrity with a domain phase loss. The approach overcomes the loss of prior knowledge when transferring between domains, outperforming existing models in unpaired image-to-image translation.

EECS453 Principles of Machine Learning (Instructor: Qing Qu)

  1. Learning Accurate and Parsimonious Point Cloud Representations from Images. Making an efficient point-cloud NeRF.
    Combine the strengths of volumetric neural rendering and deep multi-view stereo, using neural 3D point clouds and features to efficiently model a radiance field, improving both efficiency and visual quality.