I'm a third-year Ph.D. student at the Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen),
supervised by
Professor Zhizheng Wu. I'm participating in and leading the
development of Amphion, which is an
open-source toolkit for Audio, Music, and Speech Generation.
Before
CUHK-Shenzhen, I received my master's degree (2019-2022) from the Institute of Computing Technology,
Chinese Academy of Sciences (ICT, CAS), researching Fake
News Detection and Fact Checking.
My research interests include:
- Applications: Speech Generation, AI Music & AI for Music, AI for Social Good
- Technologies: Generative Model, Representation Learning
|
2024/08
|
My first paper about singing voice processing got accepted
by IEEE SLT 2024.
|
2024/05
|
I performed research at Meta (California, USA) as an intern during summer 2024.
|
2023/12
|
I made my first attempt at leading the development of a
large-scale open-source project, Amphion.
|
2023/04
|
I was admitted to the Tencent Rhino-Bird Talent Program 腾讯犀牛鸟精英人才计划 (Top 50+ of China).
|
2022/09
|
I entered the Chinese University of Hong Kong (Shenzhen) as a Ph.D. student.
|
2022/06
|
My first paper about music generation got accepted
by ACM MM 2022.
|
2021/11
|
I received the National Graduate Scholarship 2021 (funded by the Ministry of Education of China).
|
2021/01
|
My first paper about fake news detection
got accepted by WWW 2021.
|
2020/01
|
I won third place in the Campus Singer Competition, University of Chinese Academy of Sciences.
|
✍ Call for Collaboration
Our team is broadly interested in
Audio/Speech processing and synthesis, DeepFake detection, and AI + Music.
We publish our work in the top conferences and journals, and deploy research output to products by
collaborating with industry. Our team has open positions for Postdocs, PhD students,
Research Assistants, and visiting research students/researchers (more information can be
seen here). If you would like to join our team,
feel free to contact wuzhizheng [AT] cuhk.edu.cn. You are also always welcome to
discuss any ideas with me.
|
Speech, Music, and Audio Generation
|
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement
Xueyao Zhang, Xiaohui Zhang, Kainan Peng, Zhenyu Tang, Vimal Manohar,
Yingru Liu, Jeff Hwang, Dangna Li, Yuhao Wang, Julian Chan, Yuan Huang, Zhizheng Wu, Mingbo Ma
Preprint /
Poster /
Demo
TL;DR: We propose a versatile zero-shot voice imitation framework, with controllable
timbre and style.
|
SLT 2024
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang*, Liumeng Xue*, Yicheng
Gu*, Yuancheng
Wang*, Jiaqi Li*, Haorui He, Chaoren Wang, Songting Liu, Xi Chen, Junan Zhang, Zihao Fang, Haopeng
Chen, Tze Ying Tang,
Lexiao Zou, Mingxuan Wang, Jun Han, Kai
Chen, Haizhou Li, Zhizheng
Wu (*: Equal Contribution)
Proceedings of the IEEE Spoken Language Technology Workshop 2024
Preprint /
GitHub /
HuggingFace /
OpenXLab
TL;DR: We develop a unified audio generation open-source toolkit.
|
SLT 2024
Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
Xueyao Zhang, Zihao Fang, Yicheng
Gu, Haopeng Chen, Lexiao Zou, Liumeng Xue, Zhizheng Wu
Proceedings of the IEEE Spoken Language Technology Workshop 2024
Preprint /
Code /
Demo /
Pretrained Model /
HuggingFace Space /
OpenXLab App
TL;DR: We propose to utilize multiple content features for singing voice conversion.
|
MM 2022
Structure-Enhanced Pop Music Generation via Harmony-Aware Learning
Xueyao Zhang, Jinchao Zhang, Yao Qiu, Li
Wang, Jie Zhou
Proceedings of the ACM International Conference on Multimedia 2022
PDF /
Preprint /
Code /
Slides /
Demo
TL;DR: We propose to learn harmony for generating form- and texture-enhanced pop
music.
|
Fake News Detection
|
WWW 2021
> 300 citations
Mining Dual Emotion for Fake News Detection
Xueyao Zhang, Juan Cao, Xirong Li, Qiang
Sheng, Lei Zhong, and Kai Shu
Proceedings of the Web Conference 2021
PDF /
Code /
Slides /
Video /
Chinese Video
TL;DR: We leverage both publisher emotion and social emotion for fake news detection.
|
CIKM 2021
Integrating Pattern- and Fact-based Fake News Detection via Model Preference Learning
Qiang Sheng*, Xueyao
Zhang*, Juan
Cao, and Lei Zhong (*: Equal Contribution)
Proceedings of the ACM International Conference on Information and Knowledge Management 2021
PDF /
Poster /
Code /
Chinese Blog
TL;DR: We propose a graph-based model preference learning framework to separately
handle the pattern and fact indicators in fake news detection.
|
ACL 2022
Zoom Out and Observe: News Environment Perception for Fake News Detection
Qiang Sheng, Juan Cao, Xueyao Zhang, Rundong Li, Danding Wang, and Yongchun Zhu
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2022
PDF /
Poster /
Code /
Chinese Video /
Chinese Blog
TL;DR: For the first time, we propose to perceive signals from the news environment for
fake news detection.
|
👉 Full Publications
|
2024/04
|
I was honored to be invited to give a guest lecture, Introduction of Sound, Speech, and Singing
Voice [Slides], at Professor Kanyuan Huang's course at CUHK-Shenzhen (2024/4/24).
|
2024/02
|
I was honored to be invited to give talks, A Comprehensive Guide to Amphion's Singing Voice
Conversion (Amphion的歌声转换指南)
[Slides], at
- Professor Yong Qin's Lab at Nankai
University (2024/01/04),
- Speech Home 语音之家 (2024/01/12),
- BAAI Talk 智源社区 (2024/01/16) [Recording],
- TechBeat 将门创投 (2024/02/07) [Recording],
- Professor Li Liu's course at the Hong Kong University
of
Science and Technology, Guangzhou (2024/2/27).
|
2023/01
|
I drafted a singing voice conversion tutorial at CSC3160
(CUHK-Shenzhen). [Blog]
|
2022/10
|
I received the Best Presentation Award at the Huawei-CUHKSZ NLP/Speech Workshop for presenting my research
work,
Structure-Enhanced Pop Music Generation via Harmony-Aware Learning.
[Slides]
[Award]
|
2022/06
|
I gave a talk at two labs of ICT as an outstanding graduate,
如何做有新意的研究——以“做问题”的视角.
[Chinese Blog]
[Chinese Slides]
The talk was also presented at SDS Early Career Colloquium of CUHK-Shenzhen,
Towards a novel research in a problem-driven way.
[Slides]
|
2022/05
|
My master's thesis defense,
Research on Fake News Detection Based on Emotion (基于情感的虚假新闻检测方法研究).
[Chinese Slides]
|
2022/01
|
I participated in releasing the theme song of WeChat Open Class 2022
as a co-producer, Ru Wei (入微),
which was composed by AI and sung by humans.
|
2021/05
|
I gave a talk about composition at the Pattern Recognition Center of WeChat AI,
How to create a pop song? (一首流行歌是如何创作的).
[Chinese Slides]
|
Meta Platforms, Inc
|
Research Scientist Intern, Generative AI, @Menlo Park, California, USA (2024/05 ~ 2024/09)
#Speech Generation#
|
WeChat, Tencent
|
Research Intern, Pattern Recognition Center of WeChat AI, @Beijing, China (2021/04 ~ 2022/02, 2023/06 ~ 2024/05)
#Music Generation#, #Singing Voice Conversion#
|
Reviewer
|
Conferences:
- ACL Rolling Review
- CSCW 2021
- EMNLP 2021
- ICASSP 2023 (valuable reviewer), 2024, 2025
- ICLR 2025
- ICMC 2023
- MM 2023, 2024
- NCMMSC 2023, 2024
Journals:
- EURASIP Journal on Audio, Speech, and Music Processing
- IEEE Signal Processing Letters (SPL)
- IEEE Transactions on Audio, Speech and Language Processing (TASLP)
- IEEE Transactions on Computational Social Systems (TCSS)
- Information Processing and Management (IP&M)
- Journal of Chinese Information Processing (中文信息学报)
|
Student Volunteer
|
IEEE Spoken Language Technology Workshop 2024
|
Teaching Assistant
|
2017 Fall, Object-Oriented Programming (JAVA), Wuhan University
2022 Fall, CSC3100 Data Structures, CUHK-Shenzhen
2023 Spring, CSC3160/MDS6002 Fundamentals
of Speech and Language Processing, CUHK-Shenzhen. I got the Best Teaching Assistant Award.
2023 Fall, CSC4130 Introduction to
Human-Computer Interaction, CUHK-Shenzhen
2024 Spring, CSC4050 Computing Capstone, CUHK-Shenzhen
|
2022-
|
Ph.D. student in Data Science, supervised by
Professor Zhizheng Wu,
School of Data Science,
The Chinese University of Hong Kong, Shenzhen
|
2019-2022
|
Master in Computer Application Technology (Research-based), supervised by Professor Juan Cao, working closely with
Professor Qiang Sheng,
Institute of Computing Technology, Chinese Academy of
Sciences, University of Chinese
Academy of Sciences
|
2015-2019
|
B.Eng. in Software Engineering,
School of Computer Science,
Wuhan University
|
2012-2015
|
Jiyuan No.1 Middle School of Henan
|
2023
|
Admitted to the Tencent Rhino-Bird Talent Program 腾讯犀牛鸟精英人才计划
(Top 50+ of China)
|
2022
|
Outstanding Graduate, University of Chinese Academy of Sciences & Beijing Municipal Education Commission
(Top 5%)
|
2021
|
National Graduate Scholarship, Ministry of Education of China (Top 0.2%)
|
2020
|
Third place at Campus Singer Competition, University of Chinese Academy of Sciences (Top 3 among over
50,000)
|
2019
|
Outstanding Graduate, Wuhan University (Top 10%)
|
2019
|
Excellent Bachelor Thesis, Wuhan University (Top 5%)
|
2016
|
National Undergraduate Scholarship, Ministry of Education of China (Top 0.2%)
|
2014
|
First prize in the Chinese High School Mathematics League (Top 50 in Henan Province)
|