I am Shiyao Xu (徐诗瑶), a first-year ELLIS PhD student at the Center for Mind and Brain (CIMeC) and the MHUG group of the University of Trento, supervised by Prof. Paolo Rota and co-supervised by Prof. Gül Varol (ENPC).
My research interests lie in human-centric 3D-aware understanding and generation, specifically vision-language models for temporal video understanding.
2023.04: FINALLY! Our paper "DeSRF: Deformable Stylized Radiance Field" has been accepted to the CVPR 2023 Workshop: Generative Models for Computer Vision. See you in Vancouver, Canada (if my visa is approved)!
2022.10: Received a graduate school scholarship💰!
2022.08: Happy to announce that our paper "Your3dEmoji" has been accepted to SIGGRAPH ASIA 2022 Tech. Comm.!🤪 See you in Korea!
Publications and Preprints
FD-3DGS: Flexible Disentangled 3DGS for Scenes Understanding and Manipulation Shiyao Xu, Junlin Han, Jie Yang.
Got rejected by some conference🥲 It's OK, both life and research will encounter some rejections🥹. You can see it below. [PDF]
We propose FD-3DGS to distill the semantic information into 3D Gaussians and directly manipulate 3D Gaussians using language.
We propose a more efficient method, DeSRF, to stylize the radiance field, which also transfers style information to the geometry according to the input style.
We propose a novel 3D generative model that translates a real-world face image into its corresponding 3D avatar given only a single style example. Our model is 3D-aware and can also edit attributes such as smile and age directly in the 3D domain.
Dynamic Texture Transfer using PatchMatch and Transformers
Guo Pu, Shiyao Xu, Xixin Cao, Zhouhui Lian [PDF] finally available on arXiv... but this is my first project;-)
We propose an automatic method to transfer the dynamic texture of a given video to a still image.
Abstract
How to automatically transfer the dynamic texture of a given video to the target still image is a challenging and ongoing problem.
In this paper, we propose to handle this task via a simple yet effective model that utilizes both PatchMatch and Transformers.
The key idea is to decompose the task of dynamic texture transfer into two stages, where the start frame of the target video with the desired dynamic texture is synthesized in the first stage via a distance map guided texture transfer module based on the PatchMatch algorithm.
Then, in the second stage, the synthesized image is decomposed into structure-agnostic patches, according to which their corresponding subsequent patches can be predicted by exploiting the powerful capability of Transformers equipped with VQ-VAE for processing long discrete sequences.
After getting all those patches, we apply a Gaussian weighted average merging strategy to smoothly assemble them into each frame of the target stylized video. Experimental results demonstrate the effectiveness and superiority of the proposed method in dynamic texture transfer compared to the state of the art.
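To illustrate the merging step mentioned in the abstract, here is a minimal sketch of Gaussian weighted averaging over overlapping patches. It is not the paper's implementation: the function names (`gaussian_window`, `merge_patches`), the square single-channel patches, and the default `sigma = size / 4` are all assumptions chosen for illustration.

```python
import math


def gaussian_window(size, sigma):
    """Return a size x size grid of Gaussian weights peaking at the patch center."""
    center = (size - 1) / 2.0
    return [
        [
            math.exp(-((y - center) ** 2 + (x - center) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]


def merge_patches(patches, positions, height, width, sigma=None):
    """Blend overlapping square patches into a single height x width frame.

    patches   : list of size x size nested lists of floats (one channel)
    positions : list of (top, left) coordinates, one per patch
    sigma     : Gaussian spread; defaults to size / 4 (an illustrative choice)
    """
    size = len(patches[0])
    if sigma is None:
        sigma = size / 4.0
    window = gaussian_window(size, sigma)

    acc = [[0.0] * width for _ in range(height)]   # weighted sum of pixel values
    wsum = [[0.0] * width for _ in range(height)]  # sum of weights per pixel

    for patch, (top, left) in zip(patches, positions):
        for y in range(size):
            for x in range(size):
                w = window[y][x]
                acc[top + y][left + x] += w * patch[y][x]
                wsum[top + y][left + x] += w

    # Normalize by the accumulated weights; uncovered pixels stay at 0.0.
    return [
        [acc[y][x] / wsum[y][x] if wsum[y][x] > 0 else 0.0 for x in range(width)]
        for y in range(height)
    ]


# Two 4x4 patches (all 0.0 and all 1.0) overlapping by two columns.
dark = [[0.0] * 4 for _ in range(4)]
bright = [[1.0] * 4 for _ in range(4)]
frame = merge_patches([dark, bright], [(0, 0), (0, 2)], height=4, width=6)
```

Because each patch contributes most near its own center, the overlap region moves gradually from one patch's values to the other's instead of showing a hard seam.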
Working Experiences
2024.07 - 2024.09: 3D Algorithm Engineer at Math Magic.
2023.07 - 2024.05: Research Scientist at Cybever Inc., Mountain View (remotely).
2023.06 - 2023.10: Research Assistant at Prof. Yebin Liu's group, Tsinghua University.
2021.08 - 2023.07: Research Intern at DAMO Academy, Alibaba Group. Mentored by Lingzhi Li and supervised by Dr. Li Shen.
2021.07 - 2021.08: Machine Learning Intern at Apple Inc., Beijing, China.
Education
2024.09 - : ELLIS PhD student at University of Trento, Italy.
Supervised by Prof. Paolo Rota and Prof. Gül Varol (ENPC).
Working on 3D understanding, especially for human motion and scenes.
2020.09 - 2023.06: M.Sc. at the Wangxuan Institute of Computer Technology (WICT), Peking University, China.
Supervised by Prof. Zhouhui Lian.
Worked on 3D-aware Generation, Style Transfer, Neural Rendering.
Thesis: 3D-aware Style Transfer based on Neural Radiance Field.
2016.09 - 2020.06: B.Eng. at the School of Software, Dalian University of Technology, China.
Majored in Big Data and Machine Learning.