Xueyao Zhang (张雪遥)

Ph.D. student,
School of Data Science,
The Chinese University of Hong Kong, Shenzhen

E-mail: xueyaozhang [AT] link.cuhk.edu.cn

Curriculum Vitae    /    Google Scholar    /    DBLP    /    GitHub    /    Zhihu

About me

I'm a third-year Ph.D. student at the Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), supervised by Professor Zhizheng Wu. I participate in and lead the development of Amphion, an open-source toolkit for Audio, Music, and Speech Generation. Before CUHK-Shenzhen, I received my master's degree (2019-2022) from the Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS), where I researched Fake News Detection and Fact Checking.

My research interests include:

  • Applications: Speech Generation, AI Music & AI for Music, AI for Social Good
  • Technologies: Generative Model, Representation Learning

Milestones
2024/08 My first paper about singing voice processing was accepted at IEEE SLT 2024.
2024/05 I worked as a research intern at Meta (California, USA) during summer 2024.
2023/12 I made my first attempt at leading the development of a large-scale open-source project, Amphion.
2023/04 I was admitted to the Tencent Rhino-Bird Talent Program 腾讯犀牛鸟精英人才计划 (Top 50+ of China).
2022/09 I entered the Chinese University of Hong Kong, Shenzhen as a Ph.D. student.
2022/06 My first paper about music generation was accepted at ACM MM 2022.
2021/11 I received the National Graduate Scholarship 2021 (funded by the Ministry of Education of China).
2021/01 My first paper about fake news detection was accepted at WWW 2021.
2020/01 I won third place in the Campus Singer Competition, University of Chinese Academy of Sciences.

✍   Call for Collaboration

Our team is broadly interested in Audio/Speech processing and synthesis, DeepFake detection, and AI + Music. We publish our work in top conferences and journals, and deploy research output to products by collaborating with industry. Our team has open positions for Postdocs, PhD students, Research Assistants, and visiting research students/researchers (more information can be seen here). If you would like to join our team, feel free to contact wuzhizheng [AT] cuhk.edu.cn. You are also always welcome to discuss any ideas with me.

Representative Works
Speech, Music, and Audio Generation
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement
Xueyao Zhang, Xiaohui Zhang, Kainan Peng, Zhenyu Tang, Vimal Manohar, Yingru Liu, Jeff Hwang, Dangna Li, Yuhao Wang, Julian Chan, Yuan Huang, Zhizheng Wu, Mingbo Ma
Preprint / Poster / Demo
TL;DR: We propose a versatile zero-shot voice imitation framework, with controllable timbre and style.
SLT 2024   Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang*, Liumeng Xue*, Yicheng Gu*, Yuancheng Wang*, Jiaqi Li*, Haorui He, Chaoren Wang, Songting Liu, Xi Chen, Junan Zhang, Zihao Fang, Haopeng Chen, Tze Ying Tang, Lexiao Zou, Mingxuan Wang, Jun Han, Kai Chen, Haizhou Li, Zhizheng Wu (*: Equal Contribution)
Proceedings of the IEEE Spoken Language Technology Workshop 2024
Preprint / GitHub / HuggingFace / OpenXLab
TL;DR: We develop a unified audio generation open-source toolkit.
SLT 2024   Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
Xueyao Zhang, Zihao Fang, Yicheng Gu, Haopeng Chen, Lexiao Zou, Liumeng Xue, Zhizheng Wu
Proceedings of the IEEE Spoken Language Technology Workshop 2024
Preprint / Code / Demo / Pretrained Model / HuggingFace Space / OpenXLab App
TL;DR: We propose to utilize multiple content features for singing voice conversion.
MM 2022   Structure-Enhanced Pop Music Generation via Harmony-Aware Learning
Xueyao Zhang, Jinchao Zhang, Yao Qiu, Li Wang, Jie Zhou
Proceedings of the ACM International Conference on Multimedia 2022
PDF / Preprint / Code / Slides / Demo
TL;DR: We propose to learn harmony for generating form- and texture-enhanced pop music.
Fake News Detection
WWW 2021   > 300 citations   Mining Dual Emotion for Fake News Detection
Xueyao Zhang, Juan Cao, Xirong Li, Qiang Sheng, Lei Zhong, and Kai Shu
Proceedings of the Web Conference 2021
PDF / Code / Slides / Video / Chinese Video
TL;DR: We leverage both publisher emotion and social emotion for fake news detection.
CIKM 2021   Integrating Pattern- and Fact-based Fake News Detection via Model Preference Learning
Qiang Sheng*, Xueyao Zhang*, Juan Cao, and Lei Zhong (*: Equal Contribution)
Proceedings of the ACM International Conference on Information and Knowledge Management 2021
PDF / Poster / Code / Chinese Blog
TL;DR: We propose a graph-based model preference learning framework to separately handle the pattern and fact indicators in fake news detection.
ACL 2022   Zoom Out and Observe: News Environment Perception for Fake News Detection
Qiang Sheng, Juan Cao, Xueyao Zhang, Rundong Li, Danding Wang, and Yongchun Zhu
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2022
PDF / Poster / Code / Chinese Video / Chinese Blog
TL;DR: For the first time, we propose to perceive signals from the news environment for fake news detection.
👉      Full Publications
Presentations and Talks
2024/04 I was honored to be invited to give a guest lecture, Introduction of Sound, Speech, and Singing Voice [Slides], in Professor Kanyuan Huang's course at CUHK-Shenzhen (2024/04/24).
2024/02 I was honored to be invited to give talks, A Comprehensive Guide to Amphion's Singing Voice Conversion (Amphion的歌声转换指南) [Slides], at
  • Professor Yong Qin's Lab at Nankai University (2024/01/04),
  • Speech Home 语音之家 (2024/01/12),
  • BAAI Talk 智源社区 (2024/01/16) [Recording],
  • TechBeat 将门创投 (2024/02/07) [Recording],
  • Professor Li Liu's course at the Hong Kong University of Science and Technology, Guangzhou (2024/02/27).
2023/01 I drafted a singing voice conversion tutorial at CSC3160 (CUHK-Shenzhen). [Blog]
2022/10 I received the Best Presentation Award at the Huawei-CUHKSZ NLP/Speech Workshop for presenting my research work,
Structure-Enhanced Pop Music Generation via Harmony-Aware Learning. [Slides] [Award]
2022/06 I gave a talk at two labs of ICT as an outstanding graduate,
如何做有新意的研究——以“做问题”的视角. [Chinese Blog] [Chinese Slides]

The talk was also presented at the SDS Early Career Colloquium of CUHK-Shenzhen,
Towards a novel research in a problem-driven way. [Slides]
2022/05 My master's thesis defense,
Research on Fake News Detection Based on Emotion (基于情感的虚假新闻检测方法研究). [Chinese Slides]
2022/01 As a co-producer, I helped release Ru Wei (入微), the theme song of WeChat Open Class 2022, which was composed by AI and sung by humans.
2021/05 I gave a talk about composition at the Pattern Recognition Center of WeChat AI,
How to create a pop song? (一首流行歌是如何创作的). [Chinese Slides]
Professional Experiences
Meta Platforms, Inc. Research Scientist Intern, Generative AI, @Menlo Park, California, USA (2024/05 ~ 2024/09)
#Speech Generation#
WeChat, Tencent Research Intern, Pattern Recognition Center of WeChat AI, @Beijing, China (2021/04 ~ 2022/02, 2023/06 ~ 2024/05)
#Music Generation#, #Singing Voice Conversion#
Services
Reviewer
Conferences:
  • ACL Rolling Review
  • CSCW 2021
  • EMNLP 2021
  • ICASSP 2023 (valuable reviewer), 2024, 2025
  • ICLR 2025
  • ICMC 2023
  • MM 2023, 2024
  • NCMMSC 2023, 2024
Journals:
  • EURASIP Journal on Audio, Speech, and Music Processing
  • IEEE Signal Processing Letters (SPL)
  • IEEE Transactions on Audio, Speech and Language Processing (TASLP)
  • IEEE Transactions on Computational Social Systems (TCSS)
  • Information Processing and Management (IP&M)
  • Journal of Chinese Information Processing (中文信息学报)
Student Volunteer
  • IEEE Spoken Language Technology Workshop 2024
Teaching Assistant
  • 2017 Fall, Object-Oriented Programming (JAVA), Wuhan University
  • 2022 Fall, CSC3100 Data Structures, CUHK-Shenzhen
  • 2023 Spring, CSC3160/MDS6002 Fundamentals of Speech and Language Processing, CUHK-Shenzhen. I got the Best Teaching Assistant Award.
  • 2023 Fall, CSC4130 Introduction to Human-Computer Interaction, CUHK-Shenzhen
  • 2024 Spring, CSC4050 Computing Capstone, CUHK-Shenzhen
Education
2022- Ph.D. student in Data Science, supervised by Professor Zhizheng Wu,
School of Data Science,
The Chinese University of Hong Kong, Shenzhen
2019-2022 Master in Computer Application Technology (Research-based), supervised by Professor Juan Cao, working closely with Professor Qiang Sheng,
Institute of Computing Technology, Chinese Academy of Sciences,
University of Chinese Academy of Sciences
2015-2019 B.Eng. in Software Engineering,
School of Computer Science,
Wuhan University
2012-2015 Jiyuan No.1 Middle School of Henan
Honors and Awards
2023 Admitted to the Tencent Rhino-Bird Talent Program 腾讯犀牛鸟精英人才计划 (Top 50+ of China)
2022 Outstanding Graduate, University of Chinese Academy of Sciences & Beijing Municipal Education Commission (Top 5%)
2021 National Graduate Scholarship, Ministry of Education of China (Top 0.2%)
2020 Third place in the Campus Singer Competition, University of Chinese Academy of Sciences (Top 3 among over 50,000)
2019 Outstanding Graduate, Wuhan University (Top 10%)
2019 Excellent Bachelor Thesis, Wuhan University (Top 5%)
2016 National Undergraduate Scholarship, Ministry of Education of China (Top 0.2%)
2014 First prize in the Chinese High School Mathematics League (Top 50 in Henan Province)
Blogs
#Year-End Summaries (年终总结)#
2023 《2023与兔年:开端、挑战与不经意间》
2022 《2022年:伤痛、时间与起点》
2021 《2021年:收获、决定与困惑》
2020 《去年》
2019 《2019大事记》
2018 《请回答2018》
2017 《2017流水账》
#Philosophical Reflections (哲学思考)#
2020/02/10 《对人工智能艺术创作的哲学思考》
2016/11/24 《为什么有有,没有没有》
#About University (关于大学)#
2019/10/02 《后大学时代》
2018/08/04 《的大学》
#About the Gaokao (关于高考)#
2017/10/20 《后高考时代|感情篇》
2016/05/02 《后高考时代|思辨篇》
2016/04/26 《后高考时代|离别篇》