Publications: Mr Yinghao Ma
Li C, Chen Y, Ji Y, Xu J, Cui Z, Li S, Zhang Y, Tang J et al. (2026). OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs.
Jiang X, Wang Q, Wu J, He X, Xu Z, Ma Y, Piao M, Yang K et al. (2026). AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking.
Ma Z, Yang G, Chen W, Gao Z, Du Y, Li X, Zheng Z, Zhu H et al. (2026). SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing. IEEE Journal of Selected Topics in Signal Processing, vol. PP (99), 1-14.
Li Y, Ma Y, Zhang G, Yuan R, Zhu K, Guo H, Liang Y, Liu J et al. (2025). OmniBench: Towards The Future of Universal Omni-Language Models.
Ma Y, Xia H, Chen W, Taheri T, Chang S, Gao H, Yuan R, Ding M et al. (2025). A Comprehensive Music Interaction Platform for Evaluating Music Generation Models. Conference: DMRN+20 Digital Music Research Network One-day Workshop 2025 (King’s College London (Bush House), London, UK), 16/12/2025.
Ma Y, Li Y, Benetos E, Lin C (2025). Controlled Genre-Specific Music Generation: Fine-Tuning with Predictive Data Mixture Optimization. Conference: DMRN+20 Digital Music Research Network One-day Workshop 2025 (King’s College London (Bush House), London, UK), 16/12/2025.
Taheri T, Ma Y, Benetos E (2025). SAR-LM: Symbolic Audio Reasoning with Large Language Models. Conference: DMRN+20 Digital Music Research Network One-day Workshop 2025 (King’s College London (Bush House), London, UK), 16/12/2025.
Tang X, Lei X, Zhu C, Chen S, Yuan R, Li Y, Oh C, Zhang G et al. (2025). AutoMV: An Automatic Multi-Agent System for Music Video Generation.
Taheri T, Ma Y, Benetos E (2025). SAR-LM: Symbolic Audio Reasoning with Large Language Models.
Ma Y, Li S, Yu J, Benetos E, Maezawa A (2025). CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following. Conference: 26th International Society for Music Information Retrieval Conference (ISMIR) (Daejeon, Korea), 21/09/2025 to 25/09/2025.
Yuan R, Lin H, Guo S, Zhang G, Pan J, Zang Y, Liu H, Liang Y et al. (2025). YuE: Scaling Open Foundation Models for Long-Form Music Generation.
Ma Y, Li S, Yu J, Benetos E, Maezawa A (2025). CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following.
Ma Z, Ma Y, Zhu Y, Yang C, Chao Y-W, Xu R, Chen W, Chen Y et al. (2025). MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix.
Xue L, Zhou Z, Pan J, Li Z, Fan S, Ma Y, Cheng S, Yang D et al. (2025). Audio-FLAN: A Preliminary Release.
Qu X, Bai Y, Ma Y, Zhou Z, Lo KM, Liu J, Yuan R, Min L et al. (2024). MuPT: A Generative Symbolic Music Pretrained Transformer.
Yuan R, Lin H, Wang Y, Tian Z, Wu S, Shen T, Zhang G, Wu Y et al. (2024). ChatMusician: Understanding and Generating Music Intrinsically with LLM. Conference: 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) (Bangkok, Thailand), 11/08/2024 to 16/08/2024.
Zhuo L, Yuan R, Pan J, Ma Y, Li Y, Zhang G, Liu S, Dannenberg R et al. (2024). LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT.
Li Y, Yuan R, Zhang G, Ma Y, Chen X, Yin H, Xiao C, Lin C et al. (2024). MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training. Conference: International Conference on Learning Representations (ICLR) (Vienna, Austria), 07/05/2024 to 11/05/2024.
Deng Q, Yang Q, Yuan R, Huang Y, Wang Y, Liu X, Tian Z, Pan J et al. (2024). ComposerX: Multi-Agent Symbolic Music Composition with LLMs.
Li D, Ma Y, Wei W, Kong Q, Wu Y, Che M, Xia F, Benetos E et al. (2024). MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model with Multi-Task Finetuning. Conference: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 00, 521-525.
Deng Z, Ma Y, Liu Y, Guo R, Zhang G, Chen W, Huang W, Benetos E (2024). MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response. Conference: Findings of the Association for Computational Linguistics: NAACL 2024, 3643-3655.
Yuan R, Ma Y, Li Y, Zhang G, Chen X, Yin H, Zhuo L, Liu Y et al. (2023). MARBLE: Music Audio Representation Benchmark for Universal Evaluation.
Li D, Ma Y, Wei W, Kong Q, Wu Y, Che M, Xia F, Benetos E et al. (2023). MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model With Multi-Task Finetuning.
Deng Z, Ma Y, Liu Y, Guo R, Zhang G, Chen W, Huang W, Benetos E (2023). MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response.
Ma Y, Yuan R, Li Y, Zhang G, Chen X, Yin H, Lin C, Benetos E et al. (2023). On the Effectiveness of Speech Self-supervised Learning for Music.
Miller J, Lewis D, Guo Z, Li Y, Ma Y, Vahidi C, Boon H, Wolstanholme L et al. (2022). DMRN+17: Digital Music Research Network One-day Workshop 2022. Conference: DMRN+17: Digital Music Research Network One-day Workshop 2022 (Queen Mary University of London), 20/12/2022.
Li Y, Yuan R, Zhang G, Ma Y, Lin C, Chen X, Ragni A, Yin H et al. (2022). Large-Scale Pretrained Model for Self-Supervised Music Audio Representation Learning. Conference: DMRN+17: Digital Music Research Network One-day Workshop 2022 (London, UK), 20/12/2022.
Li Y, Yuan R, Zhang G, Ma Y, Lin C, Chen X, Ragni A, Yin H et al. (2022). MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning.