About

A computer vision researcher interested in both long-term research and near-term business deployment.

Now: I am focused on human-centric image and video generation technologies at Meta BizAI. Meanwhile, I am building foundation models capable of understanding and generating across diverse modalities.

Previously: I received my Ph.D. in Computer Science from Leibniz Hanover University, advised by Prof. Michael Yang and Prof. Bodo Rosenhahn. I was a senior ML scientist at Picsart and a research intern at Meta and Bosch. I obtained my Master's degree from Leibniz Hanover University and my Bachelor's degree from Hefei University of Technology.

News

  • Jun 2025: Our GenAI solution for Ads was highlighted at Cannes Lions 2025.
  • Mar 2025: Paper on compositional image generation accepted to IJCV.
  • Feb 2025: Paper on virtual try-on accepted to CVPR.

Selected Publications

Generative Models

  • Attribute-Centric Compositional Text-to-Image Generation
    Yuren Cong, Martin Renqiang Min, Li Erran Li, Bodo Rosenhahn, Michael Ying Yang. IJCV, 2025.
  • Learning Flow Fields in Attention for Controllable Person Image Generation
    Zijian Zhou, Shikun Liu, Xiao Han, Haozhe Liu, Kam Woh Ng, Tian Xie, Yuren Cong, Hang Li, Mengmeng Xu, Juan-Manuel Perez-Rua, Aditya Patel, Tao Xiang, Miaojing Shi, Sen He. CVPR, 2025.
  • FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing
    Yuren Cong, Mengmeng Xu, Christian Simon, Shoufa Chen, Jiawei Ren, Yanping Xie, Juan-Manuel Perez-Rua, Bodo Rosenhahn, Tao Xiang, Sen He. ICLR, 2024.
  • GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
    Shoufa Chen, Mengmeng Xu, Jiawei Ren, Yuren Cong, Sen He, Yanping Xie, Animesh Sinha, Ping Luo, Tao Xiang, Juan-Manuel Perez-Rua. CVPR, 2024.

Scene Understanding

  • SPAN: Learning Similarity between Scene Graphs and Images with Transformers
    Yuren Cong, Wentong Liao, Bodo Rosenhahn, Michael Ying Yang. PAMI, 2025.
  • RelTR: Relation Transformer for Scene Graph Generation
    Yuren Cong, Wentong Liao, Bodo Rosenhahn, Michael Ying Yang. PAMI, 2023.
  • Spatial-temporal Transformer for Dynamic Scene Graph Generation
    Yuren Cong, Wentong Liao, Hanno Ackermann, Bodo Rosenhahn, Michael Ying Yang. ICCV, 2021.
  • NODIS: Neural Ordinary Differential Scene Understanding
    Yuren Cong, Hanno Ackermann, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn. ECCV, 2020.

Embodied AI

  • Indoor Scene Change Understanding (SCU): Segment, Describe, and Revert Any Change
    Mariia Khan, Yue Qiu, Yuren Cong, Bodo Rosenhahn, David Suter, Jumana Abu-Khalaf. IROS, 2024.
  • WorldAfford: Affordance Grounding Based on Natural Language Instructions
    Changmao Chen, Yuren Cong, Zhen Kan. ICTAI, 2024.

Contact