I am a Research Scientist at Adobe Research. I received a PhD in Computer Science and a Master's degree in Music from the University of California San Diego (UCSD).
Before that, I earned my Bachelor's degree in Software Engineering at Fudan University. Previously, I interned at Adobe, Mitsubishi Electric Research Laboratories (MERL), Apple, and other companies, focusing on music and general audio research.
My main interest lies in Audio Representation Learning and Generation and their application to downstream tasks. This is an interdisciplinary area spanning music, general audio, and technology:
- Audio Representation Learning
- Language-Audio Modeling
- Audio Understanding
- Audio Source Separation
- Music Information Retrieval
- Music Source Separation
- Chord Recognition
- Singing Melody Extraction
- Music Recommendation
- Music Generation
- Multitrack Music Generation
- Interactive Music Improvisation and Performance
- Controllability in Music Generation
Highlighted projects related to the above research that I lead or serve as a main contributor to:
- FLAM: Frame-wise Language-Audio Modeling
- CLAP: Contrastive Language-Audio Pretraining
- HTS-AT: Hierarchical Audio Transformer
- BACHI: Chord Recognition on Symbolic Music
- POP909-CL: POP909 with human annotations
- POP909: A Dataset for Pop Music Arrangement
- Music SketchNet: Controllable Algorithmic Music Composition
- TONet: A Singing Melody Extraction Framework
- Choral Music Separation
I was the website maintainer of New Interfaces for Musical Expression (NIME). My alias "Knut" comes from one of my favorite composers, Knut Nystedt.