-
Evaluating ASR-LLM Setups for Japanese Speech Recognition with Multipass Augmented Generative Error Correction
Peer-reviewed
Yuka Ko, Sheng Li, Chao-Han Huck Yang, Tatsuya Kawahara
Proc. IEEE-ICASSP
May 2026
-
Expressive Voice Conversion with Controllable Emotional Intensity
Peer-reviewed
Nannan Teng, Ying Hu, Zhijian Ou, Sheng Li
Proc. IEEE-ICASSP
May 2026
-
What Should Automated Vehicles Communicate to Human Drivers? Prioritizing External Human-Machine Interface Information Based on the Four-Sides Model
Di Zhou, Guanghui Zhang, Tianqi Peng, Sheng Li
International Journal of Human-Computer Interaction
March 2026
-
Unified multi-prototype network with pretrained swin transformer for visual and audio open set recognition
Peer-reviewed
Haiyan Yang, Sheng Li, Juncheng Li, Jun Shi, Jun Wang
Signal, Image and Video Processing
20(1)
January 2026
-
Emotion-aware Speech Translation Correction with Large Language Models
Peer-reviewed
Zhengdong Yang, Sheng Li, Chenhui Chu
Journal of Natural Language Processing
2026
-
Speech Foundation Bench for Robotic and EdgeAI systems
Peer-reviewed
Sheng Li, Takahiro Shinozaki
Proc. IEEE-ICASSP demo 2026.
2026
-
Emotional Text-To-Speech Based on Mutual-Information-Guided Emotion-Timbre Disentanglement
Jianing Yang, Sheng Li, Takahiro Shinozaki, Yuki Saito, Hiroshi Saruwatari
APSIPA ASC 2025
December 2025
-
LatentSpeech: Latent Diffusion for Text-To-Speech Generation
Invited
Peer-reviewed
Haowei Lou, Hye young Paik, Pari Delir Haghighi, Sheng Li, Wen Hu, Lina Yao
Proc. RO-MAN
December 2025
-
Towards Emotion Co-regulation with LLM-powered Socially Assistive Robots: Integrating LLM Prompts and Robotic Behaviors to Support Parent-Neurodivergent Child Dyads
Peer-reviewed
Jing Li, Felix Schijve, Sheng Li, Yuye Yang, Jun Hu, Emilia Barakova
Proc. IROS
December 2025
-
Language-Aware Prompt Tuning for Parameter-Efficient Seamless Language Expansion in Multilingual ASR
Peer-reviewed
Hongli Yang, S. Li, Hao Huang, Ayiduosi Tuohan, Yizhou Peng
Proc. Interspeech
December 2025
-
End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
Peer-reviewed
Zhao Ren, Rathi Adarshi Rammohan, Kevin Scheck, Sheng Li, Tanja Schultz
International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
December 2025
-
Bandwidth Extension System for Throat Microphone Speech Reconstruction
Peer-reviewed
Yu Xu, Xiaokai Qin, Tianyu Fan, Eng Siong Chng, Sheng Li, Nobuaki Minematsu, Daisuke Saito
Proc. IEEE-ICME
December 2025
-
Generative Error Correction for Emotion-aware Speech-to-text Translation
Peer-reviewed
Zhengdong Yang, Sheng Li, Chenhui Chu
Proc. ACL (findings)
December 2025
-
SIQ: Exterminating Speech Intelligence Quotient Cross Cognitive Levels in Voice Understanding Large Language Models
Peer-reviewed
Zhen Wan, Chao-Han Huck Yang, Yahan Yu, Jinchuan Tian, Sheng Li, Ke Hu, Zhehuai Chen, Shinji Watanabe, Fei Cheng, Chenhui Chu, Sadao Kurohashi
Proc. ACL (long main)
December 2025
-
Simple and Effective Content Encoder for Singing Voice Conversion via Dimension Reduction
Peer-reviewed
Wangjin Zhou, Tianjiao Du, Chenglin Xu, S. Li, Yi Zhao, Tatsuya Kawahara
Proc. Interspeech
December 2025
-
Adapting Whisper for Parameter-efficient Code-Switching Speech Recognition via Soft Prompt Tuning
Peer-reviewed
Hongli Yang, Yizhou Peng, Hao Huang, S. Li
Proc. Interspeech
December 2025
-
Designing an LLM-powered Social Robot for Supporting Emotion Regulation In Parent-Child Dyads
Jing Li, Sheng Li, Emilia I. Barakova, Felix Schijve, Jun Hu
Proc. RO-MAN (late breaking)
December 2025
-
Designing an LLM-powered Social Robot for Supporting Emotion Regulation In Parent-Child Dyads
Peer-reviewed
Jing Li, Felix Schijve, Sheng Li, Emilia Barakova, Jun Hu
Interactive AI for Preventive Health (IAI4PH) 2025
December 2025
-
Empowering Māori Automatic Speech Recognition through EMD-Based Augmentation
Chengxi Lei, Sheng Li, Satwinder Singh, Feng Hou, Huia Jahnke, Ruili Wang
22nd Pacific Rim International Conference on Artificial Intelligence (PRICAI 2025)
November 2025
-
Extending Whisper for Emotion Prediction Using Word-level Pseudo Labels
Peer-reviewed
Chin Yuen Kwok, Sheng Li, Jia Qi Yip, Chenhui Chu, Tatsuya Kawahara, Eng Siong Chng
IEEE-ICASSP
March 2025
-
Similarity-based accent recognition with continuous and discrete self-supervised speech representations
Peer-reviewed
Jun-You Wang, Sheng Li, Li-An Lu, Sydney Chia-Chun Kao, Jyh-Shing Roger Jang
IEEE-ICASSP
March 2025
-
Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding
Peer-reviewed
Jiliang Hu, Zuchao Li, Mengjia Shen, Haojun Ai, Sheng Li, Jun Zhang
IEEE-ICASSP
March 2025
-
RAG-Boost: Retrieval-Augmented Generation Enhanced LLM-based Speech Recognition
Peer-reviewed
Pengcheng Wang, Sheng Li, Takahiro Shinozaki
Interspeech2025 MLC-SLM Challenge workshop
2025
-
CoVoGER: A Multilingual Multitask Benchmark for Speech-to-text Generative Error Correction with Large Language Models
Peer-reviewed
Zhengdong Yang, Zhen Wan, Sheng Li, Chao-Han Huck Yang, Chenhui Chu
Proc. EMNLP (long main)
2025
-
Collaborative Transformer Prototype Network With Pretrained Contrastive Language-Audio Encoder for Open Set Audio Recognition
Peer-reviewed
Haiyan Yang, Jun Wang, Sheng Li, Di Zhou, Xingwei Chen, Juncheng Li, Yufeng Hua, Jun Shi
IEEE Transactions on Signal Processing
73
4748-4763
2025
-
Cross-Lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition
Peer-reviewed
Zhengdong Yang, Qianying Liu, Sheng Li, Fei Cheng, Chenhui Chu
IEEE Transactions on Audio, Speech and Language Processing
1-13
2025
-
SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models
Zhen Wan, Chao-Han Huck Yang, Yahan Yu, Jinchuan Tian, Sheng Li, Ke Hu, Zhehuai Chen, Shinji Watanabe, Fei Cheng, Chenhui Chu, Sadao Kurohashi
ACL (1)
30381-30398
2025
-
Multi-Domain Dialogue State Tracking with Large Language Model Rationale and Disentangled Domain-Slot Attention
Peer-reviewed
Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
IEEE Transactions on Audio, Speech and Language Processing
1-14
2025
-
Neural TTS-Based Dynamic Data Augmentation for Improved Speech Separation
Peer-reviewed
Kai Wang, Cuicui Zhu, Lili Yin, Sheng Li, Madina Mansurova, Hao Huang
IEEE Transactions on Audio, Speech and Language Processing
33
2457-2470
2025
-
A Two-Stage LoRA Strategy for Expanding Language Capabilities in Multilingual ASR Models
Peer-reviewed
Chin Yuen Kwok, Hexin Liu, Jia Qi Yip, Sheng Li, Eng Siong Chng
IEEE Transactions on Audio, Speech and Language Processing
33
2576-2590
2025
-
Parallel and Limited Data Voice Conversions on Myanmar Language Speech for Spoofed Detection
Peer-reviewed
Hay Mar Soe Naing, Win Pa Pa, Sheng Li
Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops
1-5
December 2024
-
LaMuCo: Large-Scale Multilingual Conversation Speech Recognition Challenge
Peer-reviewed
Qingqing Zhang, Lei Luo, Simin Xu, Yongjing Chen, Chuang Li, Sheng Li, Ruili Wang
Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops
1-3
December 2024
-
Data Selection using Spoken Language Identification for Low-Resource and Zero-Resource Speech Recognition
Peer-reviewed
Jianan Chen, Chenhui Chu, Sheng Li, Tatsuya Kawahara
2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
1-6
December 2024
-
LLM as decoder: Investigating Lattice-based Speech Recognition Hypotheses Rescoring Using LLM
Peer-reviewed
Sheng Li, Yuka Ko, Akinori Ito
2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
1-5
December 2024
-
Low-resource Language Adaptation with Ensemble of PEFT Approaches
Peer-reviewed
Chin Yuen Kwok, Sheng Li, Jia Qi Yip, Eng Siong Chng
2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
1-6
December 2024
-
Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition
Peer-reviewed
Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz
ACM Multimedia Asia 2024
December 2024
-
Enhancing Privacy of Spatiotemporal Federated Learning Against Gradient Inversion Attacks
Peer-reviewed
Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa
Lecture Notes in Computer Science
457-473
October 2024
-
Investigating ASR Error Correction with Large Language Model and Multilingual 1-best Hypotheses
Peer-reviewed
Sheng Li, Chen Chen, Chin Yuen Kwok, Chenhui Chu, Eng Siong Chng, Hisashi Kawai
Interspeech 2024
1315-1319
September 2024
-
Automatic Post-Editing of Speech Recognition System Output Using Large Language Models
Peer-reviewed
Sheng Li, Jiyi Li, Yang Cao
The DASFAA 2024 Workshop
July 2024
-
Revisiting Generative Adversarial Network for Downstream Task of Speech Recognition
Peer-reviewed
Sheng Li, Bei Liu, Jianlong Fu
Proc. IEEE GEM
June 2024
-
Voices of the Himalayas: Benchmarking Speech Recognition Systems for the Tibetan Language
Peer-reviewed
Sheng Li, Jiyi Li, Chenhui Chu
May 2024
-
Reproducibility Companion Paper: Stable Diffusion for Content-Style Disentanglement in Art Analysis
Peer-reviewed
Yankun Wu, Yuta Nakashima, Noa Garcia, Sheng Li, Zhaoyang Zeng
Proceedings of the 2024 International Conference on Multimedia Retrieval
May 2024
-
Enhancing Realism in 3D Facial Animation Using Conformer-Based Generation and Automated Post-Processing
Peer-reviewed
Yi Zhao, Chunyu Qiang, Hao Li, Yulan Hu, Wangjin Zhou, Sheng Li
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
April 2024
-
MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction
Peer-reviewed
Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
April 2024
-
Phantom in the opera: adversarial music attack for robot dialogue system
Invited
Peer-reviewed
Sheng Li, Jiyi Li, Yang Cao
Frontiers in Computer Science
6
February 2024
-
End-to-end Japanese-English Speech-to-text Translation with Spoken-to-Written Style Conversion
Peer-reviewed
Zhengdong Yang, Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
Journal of Natural Language Processing
31(3)
935-957
2024
-
LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement
Peer-reviewed
Zili Qi, Xinhui Hu, Wangjin Zhou, Sheng Li, Hao Wu, Jian Lu, Xinkang Xu
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
December 2023
-
FedCPC: An Effective Federated Contrastive Learning Method for Privacy Preserving Early-Stage Alzheimer's Speech Detection
Peer-reviewed
Wenqing Wei, Zhengdong Yang, Yuan Gao, Jiyi Li, Chenhui Chu, Shogo Okada, Sheng Li
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
December 2023
-
KyotoMOS: An Automatic MOS Scoring System for Speech Synthesis
Invited
Peer-reviewed
Wangjin Zhou, Zhengdong Yang, Sheng Li, Chenhui Chu
ACM Multimedia Asia Workshops
December 2023
-
Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization
Peer-reviewed
Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He
ACM Multimedia Asia 2023
December 2023
-
GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System
Peer-reviewed
Xiaojiao Chen, Sheng Li, Jiyi Li, Yang Cao, Hao Huang, Liang He
ACM Multimedia Asia 2023
December 2023
-
Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network
Peer-reviewed
Nan Li, Longbiao Wang, Meng Ge, Masashi Unoki, Sheng Li, Jianwu Dang
Speech Communication
103024
December 2023
-
Finetuning Pretrained Model with Embedding of Domain and Language Information for ASR of Very Low-Resource Settings
Peer-reviewed
Kak Soky, Sheng Li, Chenhui Chu, Tatsuya Kawahara
International Journal of Asian Language Processing (IJALP)
November 2023
-
Disordered speech recognition considering low resources and abnormal articulation
Yuqin Lin, Longbiao Wang, Jianwu Dang, Sheng Li, Chenchen Ding
Speech Communication
155
103002
November 2023
-
Correction while Recognition: Combining Pretrained Language Model for Taiwan-Accented Speech Recognition
Peer-reviewed
Sheng Li, Jiyi Li
Artificial Neural Networks and Machine Learning – ICANN 2023
389-400
September 2023
-
The Kyoto Speech-to-Speech Translation System for IWSLT 2023
Peer-reviewed
Zhengdong Yang, Shuichiro Shimizu, Wangjin Zhou, Sheng Li, Chenhui Chu
International Conference on Spoken Language Translation (IWSLT)
July 2023
-
Towards Speech Dialogue Translation Mediating Speakers of Different Languages
Peer-reviewed
Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings Volume
July 2023
-
Multi-Domain Dialogue State Tracking with Disentangled Domain-Slot Attention
Peer-reviewed
Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings Volume
July 2023
-
Dialogue State Tracking with Sparse Local Slot Attention
Peer-reviewed
Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
ACL 2023 Workshop on NLP for Conversational AI
July 2023
-
Tendency-and-attention-informed deep learning for ENSO forecasts
Shen Qiao, Cuicui Zhang, Xuefeng Zhang, Kai Zhang, Hao Shi, Sheng Li, Hao Wei
Climate Dynamics
June 2023
-
Development of a Pain Signaling System Using Machine Learning
Peer-reviewed
Helen Korving, Sheng Li, Di Zhou, Paula Sterkenburg, Panos Markopoulos, Emilia Barakova
2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)
June 2023
-
General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition
Peer-reviewed
Chao Tan, Yang Cao, Sheng Li, Masatoshi Yoshikawa
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
June 2023
-
Hierarchical Softmax for End-To-End Low-Resource Multilingual Speech Recognition
Peer-reviewed
Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
June 2023
-
Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-Based Speech Recognition of Low-Resource Language
Peer-reviewed
Kak Soky, Sheng Li, Chenhui Chu, Tatsuya Kawahara
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
June 2023
-
Speakeraugment: Data Augmentation for Generalizable Source Separation via Speaker Parameter Manipulation
Peer-reviewed
Kai Wang, Yuhang Yang, Hao Huang, Ying Hu, Sheng Li
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
June 2023
-
Speech-Text Based Multi-Modal Training with Bidirectional Attention for Improved Speech Recognition
Peer-reviewed
Yuhang Yang, Haihua Xu, Hao Huang, Eng Siong Chng, Sheng Li
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
June 2023
-
An End-to-End Chinese and Japanese Bilingual Speech Recognition Systems with Shared Character Decomposition
Peer-reviewed
Sheng Li, Jiyi Li, Qianying Liu, Zhuo Gong
Communications in Computer and Information Science
493-503
April 2023
-
Investigating Effective Domain Adaptation Method for Speaker Verification Task
Peer-reviewed
Guangxing Li, Wangjin Zhou, Sheng Li, Yi Zhao, Jichen Yang, Hao Huang
Communications in Computer and Information Science
517-527
April 2023
-
GhostVec: Directly Extracting Speaker Embedding from End-to-End Speech Recognition Model Using Adversarial Examples
Peer-reviewed
Xiaojiao Chen, Sheng Li, Hao Huang
Communications in Computer and Information Science
482-492
April 2023
-
SpecMNet: Spectrum Mend Network for Monaural Speech Enhancement
Peer-reviewed
Cunhang Fan, Hongmei Zhang, Jiangyan Yi, Zhao Lv, Jianhua Tao, Taihao Li, Guanxiong Pei, Xiaopei Wu, Sheng Li
Applied Acoustics
194(108792)
December 2022
-
Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling
Peer-reviewed
Siqing Qin, Longbiao Wang, Sheng Li, Jianwu Dang, Lixin Pan
EURASIP Journal on Audio, Speech, and Music Processing
2022(1)
1-10
December 2022
-
Can We Train a Language Model Inside an End-to-End ASR Model? - Investigating Effective Implicit Language Modeling
Peer-reviewed
Zhuo Gong, Daisuke Saito, Sheng Li, Hisashi Kawai, Nobuaki Minematsu
Proceedings of the Second Workshop on When Creative AI Meets Conversational AI
42-47
December 2022
-
Subband-based Spectrogram Fusion for Speech Enhancement by Combining Mapping and Masking Approaches
Peer-reviewed
Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara
2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
November 2022
-
Nict-Tib1: A Public Speech Corpus Of Lhasa Dialect For Benchmarking Tibetan Language Speech Recognition Systems
Peer-reviewed
Kak Soky, Zhuo Gong, Sheng Li
2022 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)
November 2022
-
Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction
Peer-reviewed
Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara
Interspeech 2022
September 2022
-
Multi-Domain Dialogue State Tracking with Top-k Slot Self Attention
Peer-reviewed
Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki
In Proc. SIGdial Meeting Discourse & Dialogue
September 2022
-
Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism
Peer-reviewed
Kak Soky, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara
in Proc. INTERSPEECH
September 2022
-
Augmented Adversarial Self-Supervised Learning for Early-Stage Alzheimer's Speech Detection
Peer-reviewed
Longfei Yang, Wenqing Wei, Sheng Li, Jiyi Li, Takahiro Shinozaki
in Proc. INTERSPEECH
September 2022
-
Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection
Peer-reviewed
Kai Li, Sheng Li, Xugang Lu, Masato Akagi, Meng Liu, Lin Zhang, Chang Zeng, Longbiao Wang, Jianwu Dang, Masashi Unoki
in Proc. INTERSPEECH
September 2022
-
Fusion of Self-supervised Learned Models for MOS Prediction
Peer-reviewed
Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, Yi Zhao
in Proc. INTERSPEECH
September 2022
-
Finer-grained Modeling units-based Meta-Learning for Low-resource Tibetan Speech Recognition
Peer-reviewed
Siqing Qin, Longbiao Wang, Sheng Li, Yuqin Lin, Jianwu Dang
in Proc. INTERSPEECH
September 2022
-
Global Signal-to-noise Ratio Estimation Based on Multi-subband Processing Using Convolutional Neural Network
Peer-reviewed
Nan Li, Meng Ge, Longbiao Wang, Masashi Unoki, Sheng Li, Jianwu Dang
in Proc. INTERSPEECH
September 2022
-
Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network
Peer-reviewed
Kai Li, Xugang Lu, Masato Akagi, Jianwu Dang, Sheng Li, Masashi Unoki
2022 30th European Signal Processing Conference (EUSIPCO)
379-383
August 2022
-
TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies
Invited
Peer-reviewed
Kak Soky, Masato Mimura, Tatsuya Kawahara, Chenhui Chu, Sheng Li, Chenchen Ding, Sethserey Sam
International Journal of Asian Language Processing
31(03n04)
July 2022
-
Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model
Peer-reviewed
Zhuo Gong, Daisuke Saito, Longfei Yang, Takahiro Shinozaki, Sheng Li, Hisashi Kawai, Nobuaki Minematsu
The Speaker and Language Recognition Workshop (Odyssey 2022)
June 2022
-
Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection
Peer-reviewed
Sheng Li, Jiyi Li, Qianying Liu, Zhuo Gong
In Proc. LREC (Language Resources and Evaluation Conference)
7291-7297
June 2022
-
Mining Hard Samples Locally And Globally For Improved Speech Separation
Peer-reviewed
Kai Wang, Yizhou Peng, Hao Huang, Ying Hu, Sheng Li
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
May 2022
-
Compressing Transformer-Based ASR Model by Task-Driven Loss and Attention-Based Multi-Level Feature Distillation
Peer-reviewed
Yongjie Lv, Longbiao Wang, Meng Ge, Sheng Li, Chenchen Ding, Lixin Pan, Yuguang Wang, Jianwu Dang, Kiyoshi Honda
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
May 2022
-
Cross-Lingual Transfer Learning for End-to-End Speech Translation
Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
Journal of Natural Language Processing
29(2)
611-637
2022
-
Adversarial Attack and Defense on Deep Neural Network-based Voice Processing Systems: An Overview
Peer-reviewed
Xiaojiao Chen, Sheng Li, Hao Huang
Applied Sciences, Special Issues of Machine Speech Communication, 2021.
December 2021
-
Spectrograms Fusion-based End-to-End Robust Automatic Speech Recognition
Peer-reviewed
Hao Shi, Longbiao Wang, Sheng Li, Cunhang Fan, Jianwu Dang, Tatsuya Kawahara
In Proc. APSIPA ASC
December 2021
-
Multilingual Approach to Joint Speech and Accent Recognition with DNN-HMM Framework
Peer-reviewed
Yizhou Peng, Jicheng Zhang, Haobo Zhang, Haihua Xu, Hao Huang, Sheng Li, Eng Siong Chng
In Proc. APSIPA ASC
December 2021
-
On the Use of Speaker Information for Automatic Speech Recognition in Speaker-imbalanced Corpora
Peer-reviewed
Kak Soky, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara
In Proc. APSIPA ASC
December 2021
-
Khmer Speech Translation Corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC)
Peer-reviewed
Kak Soky, Masato Mimura, Tatsuya Kawahara, Sheng Li, Chenchen Ding, Chenhui Chu, Sethserey Sam
in Proc. O-COCOSDA
December 2021
-
An End-to-End Dialect Identification System with Transfer Learning from a Multilingual Automatic Speech Recognition Model
Peer-reviewed
Ding Wang, Shuaishuai Ye, Xinhui Hu, Sheng Li, Xinkang Xu
Interspeech 2021
3266-3270
August 2021
-
End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain
Peer-reviewed
Kai Wang, Hao Huang, Ying Hu, Zhihua Huang, Sheng Li
Interspeech 2021
3046-3050
August 2021
-
Robust Voice Activity Detection Using a Masked Auditory Encoder Based Convolutional Neural Network
Nan Li, Longbiao Wang, Masashi Unoki, Sheng Li, Rui Wang, Meng Ge, Jianwu Dang
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
June 2021
-
An Investigation of Using Hybrid Modeling Units for Improving End-to-End Speech Recognition System
Shunfei Chen, Xinhui Hu, Sheng Li, Xinkang Xu
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
June 2021
-
Encoder-Decoder Based Pitch Tracking and Joint Model Training for Mandarin Tone Classification
Peer-reviewed
Hao Huang, Kai Wang, Ying Hu, Sheng Li
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
June 2021
-
Phantom in the Opera: Effective Adversarial Music Attack on Keyword Spotting Systems
Heran Zhang, Sheng Li, Xingjun Ma, Yi Zhao, Yang Cao, Tatsuya Kawahara
IEEE-SLT2021
2021
-
Simultaneous Progressive Filtering-Based Monaural Speech Enhancement
Peer-reviewed
Haoran Yin, Hao Shi, Longbiao Wang, Luya Qiang, Sheng Li, Meng Ge, Gaoyan Zhang, Jianwu Dang
Communications in Computer and Information Science
213-221
2021
-
Exploring Effective Speech Representation via ASR for High-Quality End-to-End Multispeaker TTS
Peer-reviewed
Dawei Liu, Longbiao Wang, Sheng Li, Haoyu Li, Chenchen Ding, Ju Zhang, Jianwu Dang
Communications in Computer and Information Science
110-118
2021
-
Speech Dereverberation Based on Scale-Aware Mean Square Error Loss
Peer-reviewed
Luya Qiang, Hao Shi, Meng Ge, Haoran Yin, Nan Li, Longbiao Wang, Sheng Li, Jianwu Dang
Communications in Computer and Information Science
55-63
2021
-
VOIS: The First Speech Therapy App Specifically Designed for Myanmar Hearing-Impaired Children
Aye Thida, Nway Nway Han, Sheinn Thawtar Oo, Sheng Li, Chenchen Ding
2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)
November 2020
-
Compensation on x-vector for Short Utterance Spoken Language Identification
Peng Shen, Xugang Lu, Komei Sugiura, Sheng Li, Hisashi Kawai
Odyssey 2020 The Speaker and Language Recognition Workshop
47-52
November 2020
-
Joint Training End-to-End Speech Recognition Systems with Speaker Attributes
Sheng Li, Xugang Lu, Raj Dabre, Peng Shen, Hisashi Kawai
Odyssey 2020 The Speaker and Language Recognition Workshop
November 2020
-
Voice-Indistinguishability -- Protecting Voiceprint with Differential Privacy under an Untrusted Server
Yaowei Han, Yang Cao, Sheng Li, Qiang Ma, Masatoshi Yoshikawa
Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security
2125-2127
October 2020
-
Singing Voice Extraction with Attention-Based Spectrograms Fusion
Hao Shi, Longbiao Wang, Sheng Li, Chenchen Ding, Meng Ge, Nan Li, Jianwu Dang, Hiroshi Seki
Interspeech 2020
October 2020
-
Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription
Yuqin Lin, Longbiao Wang, Sheng Li, Jianwu Dang, Chenchen Ding
Interspeech 2020
October 2020
-
Voice-Indistinguishability: Protecting Voiceprint In Privacy-Preserving Speech Data Release
Yaowei Han, Sheng Li, Yang Cao, Qiang Ma, Masatoshi Yoshikawa
2020 IEEE International Conference on Multimedia and Expo (ICME)
July 2020
-
Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation
Hao Shi, Longbiao Wang, Meng Ge, Sheng Li, Jianwu Dang
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
May 2020
-
End-to-End Articulatory Modeling for Dysarthric Articulatory Attribute Detection
Yuqin Lin, Longbiao Wang, Jianwu Dang, Sheng Li, Chenchen Ding
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
May 2020
-
Automatic speech recognition
Xugang Lu, Sheng Li, Masakiyo Fujimoto
SpringerBriefs in Computer Science
21-38
2020
-
Investigation of Effectively Synthesizing Code-Switched Speech Using Highly Imbalanced Mix-Lingual Data
Shaotong Guo, Longbiao Wang, Sheng Li, Ju Zhang, Cheng Gong, Yuguang Wang, Jianwu Dang, Kiyoshi Honda
Neural Information Processing
36-47
2020
-
Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification
Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai
IEEE/ACM Transactions on Audio, Speech, and Language Processing
28
2674-2683
2020
-
Multi-lingual transformer training for Khmer automatic speech recognition
Peer-reviewed
Kak Soky, Sheng Li, Tatsuya Kawahara, Sopheap Seng
2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
1893-1896
November 2019
-
Effective Training End-to-End ASR systems for Low-resource Lhasa Dialect of Tibetan Language
Lixin Pan, Sheng Li, Longbiao Wang, Jianwu Dang
2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
November 2019
-
Class-Wise Centroid Distance Metric Learning for Acoustic Event Detection
Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai
Interspeech 2019
September 2019
-
Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation
Sheng Li, Raj Dabre, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai
Interspeech 2019
September 2019
-
End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition
Sheng Li, Chenchen Ding, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai
Interspeech 2019
September 2019
-
Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese
Sheng Li, Xugang Lu, Chenchen Ding, Peng Shen, Tatsuya Kawahara, Hisashi Kawai
Interspeech 2019
September 2019
-
Interactive learning of teacher-student model for short utterance spoken language identification
Peer-reviewed
P. Shen, X. Lu, S. Li, H. Kawai
Proc. IEEE-ICASSP
2019
-
Investigation of Sequence-Level Knowledge Distillation Methods for CTC Acoustic Models
Peer-reviewed
Ryoichi Takashima, Sheng Li, Hisashi Kawai
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
6156-6160
2019
-
Feature Representation of Short Utterances based on Knowledge Distillation for Spoken Language Identification
Peer-reviewed
P. Shen, X. Lu, S. Li, H. Kawai
Proc. INTERSPEECH
2018
-
Improving CTC-based Acoustic Model with Very Deep Residual Time-delay Neural Networks
Peer-reviewed
Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai
19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Vols 1-6
3708-3712
2018
-
CTC Loss Function with a Unit-Level Ambiguity Penalty
Peer-reviewed
Ryoichi Takashima, Sheng Li, Hisashi Kawai
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
5909-5913
2018
-
An Investigation of a Knowledge Distillation Method for CTC Acoustic Models
Peer-reviewed
Ryoichi Takashima, Sheng Li, Hisashi Kawai
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
5809-5813
2018
-
Temporal Attentive Pooling for Acoustic Event Detection
Peer-reviewed
X. Lu, P. Shen, S. Li, Y. Tsao, H. Kawai
Proc. INTERSPEECH
2018
-
Improving Very Deep Time-Delay Neural Network with Vertical-Attention for Effectively Training CTC-Based ASR Systems
Peer-reviewed
Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai
2018 IEEE Workshop on Spoken Language Technology (SLT 2018)
77-83
2018
-
Incremental Training and Constructing the Very Deep Convolutional Residual Network Acoustic Models
Peer-reviewed
Sheng Li, Xugang Lu, Peng Shen, Ryoichi Takashima, Tatsuya Kawahara, Hisashi Kawai
2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
222-227
2017
-
Semi-Supervised Ensemble DNN Acoustic Model Training
Peer-reviewed
Sheng Li, Xugang Lu, Shinsuke Sakai, Masato Mimura, Tatsuya Kawahara
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
5270-5274
2017
-
Conditional generative adversarial nets classifier for spoken language identification
Peer-reviewed
Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2017-
2814-2818
2017
-
Semi-Supervised Acoustic Model Training by Discriminative Data Selection From Multiple ASR Systems' Hypotheses
Peer-reviewed
Sheng Li, Yuya Akita, Tatsuya Kawahara
IEEE/ACM Transactions on Audio, Speech, and Language Processing
24(9)
1524-1534
September 2016
-
Confidence Estimation for Speech Recognition Systems using Conditional Random Fields Trained with Partially Annotated Data
Peer-reviewed
Sheng Li, Xugang Lu, Shinsuke Mori, Yuya Akita, Tatsuya Kawahara
2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)
2016
-
Data Selection from Multiple ASR Systems' Hypotheses for Unsupervised Acoustic Model Training
Peer-reviewed
Sheng Li, Yuya Akita, Tatsuya Kawahara
2016 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings
5875-5879
2016
-
Automatic Lecture Transcription Based on Discriminative Data Selection for Lightly Supervised Acoustic Model Training
Peer-reviewed
Sheng Li, Yuya Akita, Tatsuya Kawahara
IEICE Transactions on Information and Systems
E98D(8)
1545-1552
August 2015
-
Discriminative Data Selection for Lightly Supervised Training of Acoustic Model using Closed Caption Texts
Peer-reviewed
Sheng Li, Yuya Akita, Tatsuya Kawahara
16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5
3526-3530
2015
-
Ensemble Speaker Modeling using Speaker Adaptive Training Deep Neural Network for Speaker Adaptation
Peer-reviewed
Sheng Li, Xugang Lu, Yuya Akita, Tatsuya Kawahara
16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5
2892-2896
2015
-
Corpus and Transcription System of Chinese Lecture Room
Peer-reviewed
Sheng Li, Yuya Akita, Tatsuya Kawahara
2014 9th International Symposium on Chinese Spoken Language Processing (ISCSLP)
442-445
2014
-
Phoneme-level articulatory animation in pronunciation training
Peer-reviewed
Lan Wang, Hui Chen, Sheng Li, Helen M. Meng
Speech Communication
54(7)
845-856
September 2012
-
Cross Linguistic Comparison of Mandarin and English EMA Articulatory Data
Peer-reviewed
Sheng Li, Lan Wang
13th Annual Conference of the International Speech Communication Association 2012 (INTERSPEECH 2012), Vols 1-3
902-905
2012
-
The Phoneme-level Articulator Dynamics for Pronunciation Animation
Peer-reviewed
Sheng Li, Lan Wang, En Qi
Proc. IALP
2011
-
IELS: A Computer-aided Pronunciation Training System for Undergraduate Students
Peer-reviewed
Jinyu Chen, Lan Wang, Chongguo Li, Jin Hu, Sheng Li
ICETC
2010