Researcher Details - 李　勝 (LI Sheng)

リ　シェン

李　勝

LI SHENG

Affiliation

Assistant Professor, School of Engineering

Contact

Homepage

https://halspeech.github.io/index-modern-jp.html

Also known as

李勝

Profile

Sheng LI received his BS and ME degrees in 2006 and 2009, respectively, from Nanjing University, Nanjing, China, and his Ph.D. in 2016 from Kyoto University, Kyoto, Japan. From 2009 to 2012, he worked at the joint lab of the Chinese University of Hong Kong and Shenzhen City, researching speech technology-assisted language learning. From 2016 to 2017, he was a researcher at Kyoto University, studying speech recognition systems for humanoid robots. From 2017 to February 2025, he was a researcher at the National Institute of Information and Communications Technology (NICT) in Kyoto, Japan, working on speech-to-speech translation. In March 2025, he became an assistant professor at the Institute of Science Tokyo. Since April 2026, he has been employed by both the Institute of Science Tokyo and Kyoto University as an assistant professor working on speech recognition. He is also a visiting scientist at RIKEN.

He has served as a workshop/special session co-organizer and session chair at Interspeech 2020, COLING 2022, Odyssey 2022, ACM Multimedia Asia 2023/2024, RO-MAN 2025, IROS 2025, and ICASSP 2024/2026. He is a member of the Acoustical Society of Japan (ASJ) and the International Speech Communication Association (ISCA), and a senior member of IEEE. He is currently a member of the APSIPA Speech, Language, and Audio (SLA) Technical Committee and of the Applied Signal Processing Systems Technical Committee (ASPS TC) of the IEEE Signal Processing Society (SPS).

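Research interests:
Research and development of next-generation speech translation, speech recognition, and speech synthesis technologies

Security-aware speech processing

Robot audition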

https://search.star.titech.ac.jp/titech-ss/pursuer.act?event=outside&key_t2r2Rid=CTT100930321&lang=en

https://educ.titech.ac.jp/ict/faculty/ (faculty name: listed under ら行, the "ra" row)

https://youtu.be/pP6YtlSVqlM

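Degree

Ph.D. in Informatics (Kyoto University, March 2016)

Research keywords

Speech recognition/translation
Computer-assisted language learning (CALL) using media processing technology
Multimodal speech processing
Security-aware speech processing
Large language models (speech, text)

Research fields

Informatics / Perceptual information processing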

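Education

Kyoto University, Graduate School of Informatics, Department of Intelligence Science and Technology, doctoral program

October 2012 - March 2016

Nanjing University, joint program of the Chinese Academy of Sciences, the Chinese University of Hong Kong, and Nanjing University, master's program

September 2007 - July 2009

Nanjing University (formerly National Central University, renamed Nanjing University in 1949; one of China's selective "C7" universities; CSRank 2025 roughly equal to Kyoto University), School of Engineering, computer science course (Science)

July 2002 - July 2006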

Career

Kyoto University, Program-Specific Assistant Professor

April 2026

RIKEN, Visiting Scientist

October 2025

Institute of Science Tokyo, Assistant Professor

March 2025 - present

Country: Japan

Eindhoven University of Technology (TU/e), visiting assistant professor

November 2024

Country: Netherlands

Nanyang Technological University, visiting researcher

February 2024 - March 2024

Country: Singapore

Kyoto University, master's program advisor

December 2021 - March 2023

National Institute of Information and Communications Technology (NICT), Advanced Speech Technology Laboratory (ASTL), tenure-track researcher

2020 - February 2025

University of Oxford, Department of Computer Science, visiting researcher

April 2019 - May 2019

National Institute of Information and Communications Technology (NICT), Advanced Speech Technology Laboratory (ASTL), researcher

2017 - 2019

Kyoto University, Speech Media Laboratory, researcher

April 2016 - December 2016

Sogou/Sohu Pinyin Input Method [company, Beijing, China], researcher

April 2012 - September 2012

Joint research institute of the Chinese University of Hong Kong and Shenzhen City [Shenzhen, Guangdong, China], researcher (computer-assisted language learning)

July 2009 - April 2012

Academic society memberships

APNNS (Asia Pacific Neural Network Society)

December 2023 - present

ACM (Association for Computing Machinery)

IEEE/IEEE-SPS/IEEE-RAS

ISCA (International Speech Communication Association)

ASJ (Acoustical Society of Japan)

SIG-CSLP (Chinese Spoken Language Processing)

APSIPA (Asia Pacific Signal and Information Processing Association)

Committee memberships

JSAI, Co-organizer of OS

June 2026

Organization type: academic society

IEEE ICASSP2026 meta reviewer

January 2026

Organization type: academic society

APSIPA Speech, Language, and Audio (SLA) Technical Committee (till 2026)

2026

Organization type: academic society

IEEE IROS2025 session chair

October 2025

Organization type: academic society

IEEE RO-MAN2025 Co-organizer of special session

September 2025

Organization type: academic society

IEEE senior member

April 2025 - present

Organization type: academic society

IEEE Signal Processing Society (SPS) Applied Signal Processing Systems Technical Committee (ASPS TC)

January 2025 - January 2027

Organization type: academic society

Co-organizer of ACM Multimedia Asia 2024 workshop: Multimodal, Multilingual and Multitask Modeling Technologies for Oriental Languages (M3Oriental)

December 2024

Session Chair of DASFAA2024

July 2024

Publicity Chair of ACM Multimedia Asia 2024

June 2024 - December 2024

Organization type: academic society

Session Chair of IEEE-ICASSP2024

April 2024

Organization type: academic society

Co-organizer of ACM Multimedia Asia 2023 workshop: Multimodal, Multilingual and Multitask Modeling Technologies for Oriental Languages (M3Oriental)

December 2023

Session Chair of ICANN 2023

September 2023

Area Chair of APSIPA ASC 2023

July 2023

Area Chair of EMNLP 2023

July 2023

Co-organizer of COLING 2022 workshop: When Creative AI Meets Conversational AI (CAI + CAI = CAI^2)

October 2022

Organization type: academic society

Session Chair for Speaker Odyssey2022 (Evaluation and Benchmarking Session)

June 2022

Organization type: academic society

Session Chair for INTERSPEECH2020 (Topics of ASR I)

October 2020

Organization type: academic society

Co-organizer of INTERSPEECH2020 SLIMTS (Spoken Language Interaction for Mobile Transportation System) workshop

October 2020

Organization type: academic society

Papers

Casting Everything to Online API Services? A Survey of Integrating Localized Speech Recognition Models in Robotic Systems [Peer-reviewed]

Sheng Li, Jing Li, Felix Schijve, Jun Hu, Emilia Barakova

International Conference on Social Robotics (ICSR), July 2026

Role: first author, corresponding author; Language: English; Publication type: research paper (international conference proceedings)

Evaluating ASR-LLM Setups for Japanese Speech Recognition with Multipass Augmented Generative Error Correction [Peer-reviewed]

Yuka Ko, Sheng Li, Chao-Han Huck Yang, Tatsuya Kawahara

Proc. IEEE-ICASSP, May 2026

Language: English; Publication type: research paper (international conference proceedings)

Expressive Voice Conversion with Controllable Emotional Intensity [Peer-reviewed]

Nannan Teng, Ying Hu, Zhijian Ou, Sheng Li

Proc. IEEE-ICASSP, May 2026

Role: last author; Language: English; Publication type: research paper (international conference proceedings)

What Should Automated Vehicles Communicate to Human Drivers? Prioritizing External Human-Machine Interface Information Based on the Four-Sides Model

Di Zhou, Guanghui Zhang, Tianqi Peng, Sheng Li

International Journal of Human–Computer Interaction, 1 - 25, April 2026

Role: last author, corresponding author; Publication type: research paper (journal)

DOI: 10.1080/10447318.2026.2647134

Unified multi-prototype network with pretrained swin transformer for visual and audio open set recognition [Peer-reviewed]

Haiyan Yang, Sheng Li, Juncheng Li, Jun Shi, Jun Wang

Signal, Image and Video Processing 20 (1), January 2026

Language: English; Publication type: research paper (journal)

DOI: 10.1007/s11760-025-04968-x

Other link: https://link.springer.com/article/10.1007/s11760-025-04968-x

Emotion-aware Speech Translation Correction with Large Language Models [Peer-reviewed]

Zhengdong Yang, Sheng Li, Chenhui Chu

Journal of Natural Language Processing, 2026

Language: English; Publication type: research paper (journal)

Speech Foundation Bench for Robotic and EdgeAI systems [Peer-reviewed]

Sheng Li, Takahiro Shinozaki

Proc. IEEE-ICASSP demo, 2026

Emotional Text-To-Speech Based on Mutual-Information-Guided Emotion-Timbre Disentanglement

Jianing Yang, Sheng Li, Takahiro Shinozaki, Yuki Saito, Hiroshi Saruwatari

APSIPA ASC 2025, December 2025

Language: English; Publication type: research paper (international conference proceedings)

LatentSpeech: Latent Diffusion for Text-To-Speech Generation [Invited, peer-reviewed]

Haowei Lou, Hye young Paik, Pari Delir Haghighi, Sheng Li, Wen Hu, Lina Yao

Proc. RO-MAN, December 2025

Language: English; Publication type: research paper (international conference proceedings)

Towards Emotion Co-regulation with LLM-powered Socially Assistive Robots: Integrating LLM Prompts and Robotic Behaviors to Support Parent-Neurodivergent Child Dyads [Peer-reviewed]

Jing Li, Felix Schijve, Sheng Li, Yuye Yang, Jun Hu, Emilia Barakova

Proc. IROS, December 2025

Language: English; Publication type: research paper (international conference proceedings)

Language-Aware Prompt Tuning for Parameter-Efficient Seamless Language Expansion in Multilingual ASR [Peer-reviewed]

Hongli Yang, S. Li, Hao Huang, Ayiduosi Tuohan, Yizhou Peng

Proc. Interspeech, December 2025

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/Interspeech.2025-1875

End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning [Peer-reviewed]

Zhao Ren, Rathi Adarshi Rammohan, Kevin Scheck, Sheng Li, Tanja Schultz

International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), December 2025

Language: English; Publication type: research paper (international conference proceedings)

Bandwidth Extension System for Throat Microphone Speech Reconstruction [Peer-reviewed]

Yu Xu, Xiaokai Qin, Tianyu Fan, Eng Siong Chng, Sheng Li, Nobuaki Minematsu, Daisuke Saito

Proc. IEEE-ICME, December 2025

Language: English; Publication type: research paper (international conference proceedings)

Generative Error Correction for Emotion-aware Speech-to-text Translation [Peer-reviewed]

Zhengdong Yang, Sheng Li, Chenhui Chu

Proc. ACL (findings), December 2025

Language: English; Publication type: research paper (international conference proceedings)

SIQ: Exterminating Speech Intelligence Quotient Cross Cognitive Levels in Voice Understanding Large Language Models [Peer-reviewed]

Zhen Wan, Chao-Han Huck Yang, Yahan Yu, Jinchuan Tian, Sheng Li, Ke Hu, Zhehuai Chen, Shinji Watanabe, Fei Cheng, Chenhui Chu, Sadao Kurohashi

Proc. ACL (long main), December 2025

Language: English; Publication type: research paper (international conference proceedings)

Simple and Effective Content Encoder for Singing Voice Conversion via Dimension Reduction [Peer-reviewed]

Wangjin Zhou, Tianjiao Du, Chenglin Xu, S. Li, Yi Zhao, Tatsuya Kawahara

Proc. Interspeech, December 2025

Language: English; Publication type: research paper (international conference proceedings)

Adapting Whisper for Parameter-efficient Code-Switching Speech Recognition via Soft Prompt Tuning [Peer-reviewed]

Hongli Yang, Yizhou Peng, Hao Huang, S. Li

Proc. Interspeech, December 2025

Language: English; Publication type: research paper (international conference proceedings)

Designing an LLM-powered Social Robot for Supporting Emotion Regulation In Parent-Child Dyads

Jing Li, Sheng Li, Emilia I. Barakova, Felix Schijve, Jun Hu

Proc. RO-MAN (late breaking), December 2025

DOI: 10.48550/arXiv.2507.10427

Designing an LLM-powered Social Robot for Supporting Emotion Regulation In Parent-Child Dyads [Peer-reviewed]

Jing Li, Felix Schijve, Sheng Li, Emilia Barakova, Jun Hu

Interactive AI for Preventive Health (IAI4PH) 2025, December 2025

Language: English; Publication type: research paper (international conference proceedings)

Empowering Māori Automatic Speech Recognition through EMD-Based Augmentation

Chengxi Lei, Sheng Li, Satwinder Singh, Feng Hou, Huia Jahnke, Ruili Wang

22nd Pacific Rim International Conference on Artificial Intelligence (PRICAI 2025), November 2025

Language: English; Publication type: research paper (international conference proceedings)

Spolacq-GDS: A generative dialogue simulator for spoken interaction learning [Peer-reviewed]

Taisei Awashima, Renon Toyosaki, Koki Mikuriya, Kota Kawakita, Sheng Li, Takahiro Shinozaki

The Journal of the Acoustical Society of America 158 (4_Supplement), A260 - A260, October 2025

Language: English; Publication type: research paper (journal)

DOI: 10.1121/10.0040820

Extending Whisper for Emotion Prediction Using Word-level Pseudo Labels [Peer-reviewed]

Chin Yuen Kwok, Sheng Li, Jia Qi Yip, Chenhui Chu, Tatsuya Kawahara, Eng Siong Chng

IEEE-ICASSP, March 2025

Language: English; Publication type: research paper (international conference proceedings)

Similarity-based accent recognition with continuous and discrete self-supervised speech representations [Peer-reviewed]

Jun-You Wang, Sheng Li, Li-An Lu, Sydney Chia-Chun Kao, Jyh-Shing Roger Jang

IEEE-ICASSP, March 2025

Language: English; Publication type: research paper (international conference proceedings)

Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding [Peer-reviewed]

Jiliang Hu, Zuchao Li, Mengjia Shen, Haojun Ai, Sheng Li, Jun Zhang

IEEE-ICASSP, March 2025

Language: English; Publication type: research paper (international conference proceedings)

RAG-Boost: Retrieval-Augmented Generation Enhanced LLM-based Speech Recognition [Peer-reviewed]

Pengcheng Wang, Sheng Li, Takahiro Shinozaki

Interspeech2025 MLC-SLM Challenge workshop, 2025

DOI: 10.48550/arXiv.2508.14048

CoVoGER: A Multilingual Multitask Benchmark for Speech-to-text Generative Error Correction with Large Language Models [Peer-reviewed]

Zhengdong Yang, Zhen Wan, Sheng Li, Chao-Han Huck Yang, Chenhui Chu

Proc. EMNLP (long main), 2025

Language: English; Publication type: research paper (international conference proceedings)

Collaborative Transformer Prototype Network With Pretrained Contrastive Language-Audio Encoder for Open Set Audio Recognition [Peer-reviewed]

Haiyan Yang, Jun Wang, Sheng Li, Di Zhou, Xingwei Chen, Juncheng Li, Yufeng Hua, Jun Shi

IEEE Transactions on Signal Processing 73, 4748 - 4763, 2025

Language: English; Publication type: research paper (journal)

DOI: 10.1109/tsp.2025.3616585

Cross-Lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition [Peer-reviewed]

Zhengdong Yang, Qianying Liu, Sheng Li, Fei Cheng, Chenhui Chu

IEEE Transactions on Audio, Speech and Language Processing, 1 - 13, 2025

Language: English; Publication type: research paper (journal)

DOI: 10.1109/taslpro.2025.3617233

SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models.

Zhen Wan, Chao-Han Huck Yang, Yahan Yu, Jinchuan Tian, Sheng Li, Ke Hu, Zhehuai Chen, Shinji Watanabe, Fei Cheng, Chenhui Chu, Sadao Kurohashi

ACL (1), 30381 - 30398, 2025

Publication type: research paper (international conference proceedings)

Other link: https://dblp.uni-trier.de/rec/conf/acl/2025-1

Multi-Domain Dialogue State Tracking with Large Language Model Rationale and Disentangled Domain-Slot Attention [Peer-reviewed]

Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki

IEEE Transactions on Audio, Speech and Language Processing, 1 - 14, 2025

Publication type: research paper (journal); Publisher: Institute of Electrical and Electronics Engineers (IEEE)

DOI: 10.1109/taslpro.2025.3604650

Neural TTS-Based Dynamic Data Augmentation for Improved Speech Separation [Peer-reviewed]

Kai Wang, Cuicui Zhu, Lili Yin, Sheng Li, Madina Mansurova, Hao Huang

IEEE Transactions on Audio, Speech and Language Processing 33, 2457 - 2470, 2025

Language: English; Publication type: research paper (journal)

DOI: 10.1109/taslpro.2025.3578779

A Two-Stage LoRA Strategy for Expanding Language Capabilities in Multilingual ASR Models [Peer-reviewed]

Chin Yuen Kwok, Hexin Liu, Jia Qi Yip, Sheng Li, Eng Siong Chng

IEEE Transactions on Audio, Speech and Language Processing 33, 2576 - 2590, 2025

Language: English; Publication type: research paper (journal)

DOI: 10.1109/taslpro.2025.3578752

Parallel and Limited Data Voice Conversions on Myanmar Language Speech for Spoofed Detection [Peer-reviewed]

Hay Mar Soe Naing, Win Pa Pa, Sheng Li

Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops, 1 - 5, December 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1145/3700410.3702120

LaMuCo: Large-Scale Multilingual Conversation Speech Recognition Challenge [Peer-reviewed]

Qingqing Zhang, Lei Luo, Simin Xu, Yongjing Chen, Chuang Li, Sheng Li, Ruili Wang

Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops, 1 - 3, December 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1145/3700410.3702135

Data Selection using Spoken Language Identification for Low-Resource and Zero-Resource Speech Recognition [Peer-reviewed]

Jianan Chen, Chenhui Chu, Sheng Li, Tatsuya Kawahara

2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1 - 6, December 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/apsipaasc63619.2025.10848811

LLM as decoder: Investigating Lattice-based Speech Recognition Hypotheses Rescoring Using LLM [Peer-reviewed]

Sheng Li, Yuka Ko, Akinori Ito

2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1 - 5, December 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/apsipaasc63619.2025.10848752

Low-resource Language Adaptation with Ensemble of PEFT Approaches [Peer-reviewed]

Chin Yuen Kwok, Sheng Li, Jia Qi Yip, Eng Siong Chng

2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1 - 6, December 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/apsipaasc63619.2025.10848814

Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition [Peer-reviewed]

Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz

ACM Multimedia Asia 2024, December 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1145/3696409.3700187

Enhancing Privacy of Spatiotemporal Federated Learning Against Gradient Inversion Attacks [Peer-reviewed]

Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa

Lecture Notes in Computer Science, 457 - 473, October 2024

Language: English; Publication type: paper in a collection (book)

DOI: 10.1007/978-981-97-5552-3_31

Investigating ASR Error Correction with Large Language Model and Multilingual 1-best Hypotheses [Peer-reviewed]

Sheng Li, Chen Chen, Chin Yuen Kwok, Chenhui Chu, Eng Siong Chng, Hisashi Kawai

Interspeech 2024, 1315 - 1319, September 2024

Role: first author, corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/interspeech.2024-368

Automatic Post-Editing of Speech Recognition System Output Using Large Language Models [Peer-reviewed]

Sheng Li, Jiyi Li, Yang Cao

The DASFAA 2024 Workshop, July 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1007/978-981-96-0914-7_12

Revisiting Generative Adversarial Network for Downstream Task of Speech Recognition [Peer-reviewed]

Sheng Li, Bei Liu, Jianlong Fu

Proc. IEEE GEM, June 2024

Language: English; Publication type: research paper (international conference proceedings)

Voices of the Himalayas: Benchmarking Speech Recognition Systems for the Tibetan Language [Peer-reviewed]

Sheng Li, Jiyi Li, Chenhui Chu

May 2024

Language: English; Publication type: research paper (journal)

DOI: 10.1142/s2717554524500012

Reproducibility Companion Paper: Stable Diffusion for Content-Style Disentanglement in Art Analysis [Peer-reviewed]

Yankun Wu, Yuta Nakashima, Noa Garcia, Sheng Li, Zhaoyang Zeng

Proceedings of the 2024 International Conference on Multimedia Retrieval, May 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1145/3652583.3658372

Enhancing Realism in 3D Facial Animation Using Conformer-Based Generation and Automated Post-Processing [Peer-reviewed]

Yi Zhao, Chunyu Qiang, Hao Li, Yulan Hu, Wangjin Zhou, Sheng Li

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/icassp48485.2024.10447526

MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction [Peer-reviewed]

Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Kawahara Tatsuya

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2024

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/icassp48485.2024.10446041

Phantom in the opera: adversarial music attack for robot dialogue system [Invited, peer-reviewed]

Sheng Li, Jiyi Li, Yang Cao

Frontiers in Computer Science 6, February 2024

Language: English; Publication type: research paper (journal)

DOI: 10.3389/fcomp.2024.1355975

End-to-end Japanese-English Speech-to-text Translation with Spoken-to-Written Style Conversion [Peer-reviewed]

Zhengdong Yang, Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi

Journal of Natural Language Processing 31 (3), 935 - 957, 2024

Language: English; Publication type: research paper (journal)

DOI: 10.5715/jnlp.31.935

LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement [Peer-reviewed]

Zili Qi, Xinhui Hu, Wangjin Zhou, Sheng Li, Hao Wu, Jian Lu, Xinkang Xu

2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), December 2023

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/asru57964.2023.10389788

FedCPC: An Effective Federated Contrastive Learning Method for Privacy Preserving Early-Stage Alzheimers Speech Detection [Peer-reviewed]

Wenqing Wei, Zhengdong Yang, Yuan Gao, Jiyi Li, Chenhui Chu, Shogo Okada, Sheng Li

2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), December 2023

Role: last author, corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/asru57964.2023.10389690

KyotoMOS: An Automatic MOS Scoring System for Speech Synthesis [Invited, peer-reviewed]

Wangjin Zhou, Zhengdong Yang, Sheng Li, Chenhui Chu

ACM Multimedia Asia Workshops, December 2023

Role: corresponding author; Language: English; Publication type: research paper (workshop or symposium materials, etc.)

DOI: 10.1145/3611380.3628562

Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization [Peer-reviewed]

Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He

ACM Multimedia Asia 2023, December 2023

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1145/3595916.3626366

GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System [Peer-reviewed]

Xiaojiao Chen, Sheng Li, Jiyi Li, Yang Cao, Hao Huang, Liang He

ACM Multimedia Asia 2023, December 2023

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1145/3595916.3626367

Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network [Peer-reviewed]

Nan Li, Longbiao Wang, Meng Ge, Masashi Unoki, Sheng Li, Jianwu Dang

Speech Communication, 103024 - 103024, December 2023

Language: English; Publication type: research paper (journal)

DOI: 10.1016/j.specom.2023.103024

Finetuning Pretrained Model with Embedding of Domain and Language Information for ASR of Very Low-Resource Settings [Peer-reviewed]

Soky Kak, Sheng Li, Chenhui Chu, Tatsuya Kawahara

International Journal of Asian Language Processing (IJALP), November 2023

Language: English; Publication type: research paper (journal)

DOI: 10.1142/S2717554523500248

Disordered speech recognition considering low resources and abnormal articulation

Yuqin Lin, Longbiao Wang, Jianwu Dang, Sheng Li, Chenchen Ding

Speech Communication 155, 103002 - 103002, November 2023

Publication type: research paper (journal); Publisher: Elsevier BV

DOI: 10.1016/j.specom.2023.103002

Correction while Recognition: Combining Pretrained Language Model for Taiwan-Accented Speech Recognition [Peer-reviewed]

Sheng Li, Jiyi Li

Artificial Neural Networks and Machine Learning – ICANN 2023, 389 - 400, September 2023

Role: first author; Language: English; Publication type: paper in a collection (book)

DOI: 10.1007/978-3-031-44195-0_32

The Kyoto Speech-to-Speech Translation System for IWSLT 2023 [Peer-reviewed]

Zhengdong Yang, Shuichiro Shimizu, Wangjin Zhou, Sheng Li, Chenhui Chu

International Conference on Spoken Language Translation (IWSLT), July 2023

Language: English; Publication type: research paper (international conference proceedings)

Towards Speech Dialogue Translation Mediating Speakers of Different Languages [Peer-reviewed]

Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi

In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings Volume, July 2023

Language: English; Publication type: research paper (international conference proceedings)

Multi-Domain Dialogue State Tracking with Disentangled Domain-Slot Attention [Peer-reviewed]

Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki

In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023): Findings Volume, July 2023

Language: English; Publication type: research paper (international conference proceedings)

Dialogue State Tracking with Sparse Local Slot Attention [Peer-reviewed]

Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki

ACL 2023 Workshop on NLP for Conversational AI, July 2023

Language: English; Publication type: research paper (international conference proceedings)

Tendency-and-attention-informed deep learning for ENSO forecasts

Shen Qiao, Cuicui Zhang, Xuefeng Zhang, Kai Zhang, Hao Shi, Sheng Li, Hao Wei

Climate Dynamics, June 2023

Language: English; Publication type: research paper (journal)

DOI: 10.1007/s00382-023-06854-z

Other link: https://link.springer.com/article/10.1007/s00382-023-06854-z/fulltext.html

Development of a Pain Signaling System Using Machine Learning [Peer-reviewed]

Helen Korving, Sheng Li, Di Zhou, Paula Sterkenburg, Panos Markopoulos, Emilia Barakova

2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), June 2023

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/icasspw59220.2023.10193643

General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition [Peer-reviewed]

Chao Tan, Yang Cao, Sheng Li, Masatoshi Yoshikawa

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), June 2023

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/icassp49357.2023.10096844

Hierarchical Softmax for End-To-End Low-Resource Multilingual Speech Recognition [Peer-reviewed]

Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), June 2023

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/icassp49357.2023.10095133

Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-Based Speech Recognition of Low-Resource Language [Peer-reviewed]

Kak Soky, Sheng Li, Chenhui Chu, Tatsuya Kawahara

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), June 2023

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/icassp49357.2023.10095644

Speakeraugment: Data Augmentation for Generalizable Source Separation via Speaker Parameter Manipulation [Peer-reviewed]

Kai Wang, Yuhang Yang, Hao Huang, Ying Hu, Sheng Li

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), June 2023

Publication type: research paper (international conference proceedings)

DOI: 10.1109/icassp49357.2023.10094767

Speech-Text Based Multi-Modal Training with Bidirectional Attention for Improved Speech Recognition [Peer-reviewed]

Yuhang Yang, Haihua Xu, Hao Huang, Eng Siong Chng, Sheng Li

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), June 2023

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/icassp49357.2023.10096726

An End-to-End Chinese and Japanese Bilingual Speech Recognition Systems with Shared Character Decomposition [Peer-reviewed]

Sheng Li, Jiyi Li, Qianying Liu, Zhuo Gong

Communications in Computer and Information Science, 493 - 503, April 2023

Role: first author, corresponding author; Language: English; Publication type: paper in a collection (book)

DOI: 10.1007/978-981-99-1645-0_41

Investigating Effective Domain Adaptation Method for Speaker Verification Task [Peer-reviewed]

Guangxing Li, Wangjin Zhou, Sheng Li, Yi Zhao, Jichen Yang, Hao Huang

Communications in Computer and Information Science, 517 - 527, April 2023

Language: English; Publication type: paper in a collection (book)

DOI: 10.1007/978-981-99-1645-0_43

GhostVec: Directly Extracting Speaker Embedding from End-to-End Speech Recognition Model Using Adversarial Examples [Peer-reviewed]

Xiaojiao Chen, Sheng Li, Hao Huang

Communications in Computer and Information Science, 482 - 492, April 2023

Role: first author; Publication type: paper in a collection (book)

DOI: 10.1007/978-981-99-1645-0_40

SpecMNet: Spectrum Mend Network for Monaural Speech Enhancement [Peer-reviewed]

Cunhang Fan, Hongmei Zhang, Jiangyan Yi, Zhao Lv, Jianhua Tao, Taihao Li, Guanxiong Pei, Xiaopei Wu, Sheng Li

Applied Acoustics 194 (108792), December 2022

Language: English; Publication type: research paper (journal)

DOI: 10.1016/j.apacoust.2022.108792

Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling [Peer-reviewed]

Siqing Qin, Longbiao Wang, Sheng Li, Jianwu Dang, Lixin Pan

EURASIP Journal on Audio, Speech, and Music Processing 2022 (1), 1 - 10, December 2022

Language: English; Publication type: research paper (journal)

DOI: 10.1186/s13636-021-00233-4

Other link: https://link.springer.com/article/10.1186/s13636-021-00233-4/fulltext.html

Can We Train a Language Model Inside an End-to-End ASR Model? - Investigating Effective Implicit Language Modeling [Peer-reviewed]

Zhuo Gong, Saito Daisuke, Sheng Li, Hisashi Kawai, Minematsu Nobuaki

Proceedings of the Second Workshop on When Creative AI Meets Conversational AI, 42 - 47, December 2022

Language: English; Publication type: research paper (international conference proceedings)

Subband-based Spectrogram Fusion for Speech Enhancement by Combining Mapping and Masking Approaches [Peer-reviewed]

Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara

2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), November 2022

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.23919/apsipaasc55919.2022.9979930

Nict-Tib1: A Public Speech Corpus Of Lhasa Dialect For Benchmarking Tibetan Language Speech Recognition Systems [Peer-reviewed]

Kak Soky, Zhuo Gong, Sheng Li

2022 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), November 2022

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.1109/o-cocosda202257103.2022.9997917

Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction [Peer-reviewed]

Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara

Interspeech 2022, September 2022

Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/interspeech.2022-11268

Multi-Domain Dialogue State Tracking with Top-k Slot Self Attention [Peer-reviewed]

Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki

In Proc. SIGdial Meeting Discourse & Dialogue, September 2022

Role: first author; Language: English; Publication type: research paper (international conference proceedings)

Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism [Peer-reviewed]

Kak Soky, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara

in Proc. INTERSPEECH, September 2022

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/Interspeech.2022-343

Augmented Adversarial Self-Supervised Learning for Early-Stage Alzheimer's Speech Detection [Peer-reviewed]

Longfei Yang, Wenqing Wei, Sheng Li, Jiyi Li, Takahiro Shinozaki

in Proc. INTERSPEECH, September 2022

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/Interspeech.2022-943

Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection [Peer-reviewed]

Kai Li, Sheng Li, Xugang Lu, Masato Akagi, Meng Liu, Lin Zhang, Chang Zeng, Longbiao Wang, Jianwu Dang, Masashi Unoki

in Proc. INTERSPEECH, September 2022

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/Interspeech.2022-10088

Fusion of Self-supervised Learned Models for MOS Prediction [Peer-reviewed]

Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, Yi Zhao

in Proc. INTERSPEECH, September 2022

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/Interspeech.2022-10262

Finer-grained Modeling units-based Meta-Learning for Low-resource Tibetan Speech Recognition [Peer-reviewed]

Siqing Qin, Longbiao Wang, Sheng Li, Yuqin Lin, Jianwu Dang

in Proc. INTERSPEECH, September 2022

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/Interspeech.2022-10015

Global Signal-to-noise Ratio Estimation Based on Multi-subband Processing Using Convolutional Neural Network [Peer-reviewed]

Nan LI, Meng Ge, Longbiao Wang, Masashi Unoki, Sheng Li, Jianwu Dang

in Proc. INTERSPEECH, September 2022

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/Interspeech.2022-154

Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network [Peer-reviewed]

Kai Li, Xugang Lu, Masato Akagi, Jianwu Dang, Sheng Li, Masashi Unoki

2022 30th European Signal Processing Conference (EUSIPCO), 379 - 383, August 2022

Role: corresponding author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.23919/eusipco55093.2022.9909649

TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies [Invited, peer-reviewed]

Kak Soky, Masato Mimura, Tatsuya Kawahara, Chenhui Chu, Sheng Li, Chenchen Ding, Sethserey Sam

International Journal of Asian Language Processing 31 (03n04), July 2022

Role: first author; Language: English; Publication type: research paper (journal)

DOI: 10.1142/s2717554522500072

Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model [Peer-reviewed]

Zhuo Gong, Daisuke Saito, Longfei Yang, Takahiro Shinozaki, Sheng Li, Hisashi Kawai, Nobuaki Minematsu

The Speaker and Language Recognition Workshop (Odyssey 2022), June 2022

Role: first author; Language: English; Publication type: research paper (international conference proceedings)

DOI: 10.21437/odyssey.2022-58

Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection [Peer-reviewed]

Sheng Li, Jiyi Li, Qianying Liu, Zhuo Gong

In Proc. LREC (Language Resources and Evaluation Conference), 7291 - 7297, June 2022

Role: first author, corresponding author; Language: English; Publication type: research paper (international conference proceedings)

researchmap
Mining Hard Samples Locally And Globally For Improved Speech Separation 査読

Kai Wang, Yizhou Peng, Hao Huang, Ying Hu, Sheng Li

ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022年5月

　詳細を見る

担当区分：筆頭著者記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/icassp43922.2022.9747797

researchmap
Compressing Transformer-Based ASR Model by Task-Driven Loss and Attention-Based Multi-Level Feature Distillation 査読

Yongjie Lv, Longbiao Wang, Meng Ge, Sheng Li, Chenchen Ding, Lixin Pan, Yuguang Wang, Jianwu Dang, Kiyoshi Honda

ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022年5月

　詳細を見る

担当区分：筆頭著者,　責任著者記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/icassp43922.2022.9746113

researchmap
Cross-Lingual Transfer Learning for End-to-End Speech Translation

Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi

自然言語処理 29 ( 2 ) 611 - 637 2022年

　詳細を見る

掲載種別：研究論文（学術雑誌）

DOI： 10.5715/jnlp.29.611

researchmap
Adversarial Attack and Defense on Deep Neural Network-based Voice Processing Systems: An Overview 査読

Xiaojiao Chen, Sheng Li, Hao Huang

Applied Sciences, Special Issues of Machine Speech Communication 2021年12月

　詳細を見る

担当区分：責任著者記述言語：英語掲載種別：研究論文（学術雑誌）

researchmap
Khmer Speech Translation Corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC) 査読

Kak Soky, Masato Mimura, Tatsuya Kawahara, Sheng Li, Chenchen Ding, Chenhui Chu, Sethserey Sam

in Proc. O-COCOSDA 2021年12月

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

researchmap
On the Use of Speaker Information for Automatic Speech Recognition in Speaker-imbalanced Corpora 査読

Kak Soky, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara

In Proc. APSIPA ASC 2021年12月

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

researchmap
Spectrograms Fusion-based End-to-End Robust Automatic Speech Recognition 査読

Hao Shi, Longbiao Wang, Sheng Li, Cunhang Fan, Jianwu Dang, Tatsuya Kawahara

In Proc. APSIPA ASC 2021年12月

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

researchmap
Multilingual Approach to Joint Speech and Accent Recognition with DNN-HMM Framework 査読

Yizhou Peng, Jicheng Zhang, Haobo Zhang, Haihua Xu, Hao Huang, Sheng Li, Eng Siong Chng

In Proc. APSIPA ASC 2021年12月

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

researchmap
An End-to-End Dialect Identification System with Transfer Learning from a Multilingual Automatic Speech Recognition Model 査読

Ding Wang, Shuaishuai Ye, Xinhui Hu, Sheng Li, Xinkang Xu

Interspeech 2021 3266 - 3270 2021年8月

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.21437/interspeech.2021-374

researchmap
End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain 査読

Kai Wang, Hao Huang, Ying Hu, Zhihua Huang, Sheng Li

Interspeech 2021 3046 - 3050 2021年8月

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.21437/interspeech.2021-504

researchmap
Robust Voice Activity Detection Using a Masked Auditory Encoder Based Convolutional Neural Network

Nan Li, Longbiao Wang, Masashi Unoki, Sheng Li, Rui Wang, Meng Ge, Jianwu Dang

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021年6月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：IEEE

DOI： 10.1109/icassp39728.2021.9415045

researchmap
An Investigation of Using Hybrid Modeling Units for Improving End-to-End Speech Recognition System

Shunfei Chen, Xinhui Hu, Sheng Li, Xinkang Xu

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021年6月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：IEEE

DOI： 10.1109/icassp39728.2021.9414598

researchmap
Encoder-Decoder Based Pitch Tracking and Joint Model Training for Mandarin Tone Classification 査読

Hao Huang, Kai Wang, Ying Hu, Sheng Li

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021年6月

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/icassp39728.2021.9413888

researchmap
Phantom in the Opera: Effective Adversarial Music Attack on Keyword Spotting Systems.

Heran Zhang, Sheng Li, Xingjun Ma, Yi Zhao, Yang Cao, Tatsuya Kawahara

IEEE-SLT2021 2021年

　詳細を見る

担当区分：責任著者

researchmap
Simultaneous Progressive Filtering-Based Monaural Speech Enhancement 査読

Haoran Yin, Hao Shi, Longbiao Wang, Luya Qiang, Sheng Li, Meng Ge, Gaoyan Zhang, Jianwu Dang

Communications in Computer and Information Science 213 - 221 2021年

　詳細を見る

記述言語：英語掲載種別：論文集(書籍)内論文

DOI： 10.1007/978-3-030-92307-5_25

researchmap
Exploring Effective Speech Representation via ASR for High-Quality End-to-End Multispeaker TTS 査読

Dawei Liu, Longbiao Wang, Sheng Li, Haoyu Li, Chenchen Ding, Ju Zhang, Jianwu Dang

Communications in Computer and Information Science 110 - 118 2021年

　詳細を見る

担当区分：責任著者記述言語：英語掲載種別：論文集(書籍)内論文

DOI： 10.1007/978-3-030-92310-5_13

researchmap
Speech Dereverberation Based on Scale-Aware Mean Square Error Loss 査読

Luya Qiang, Hao Shi, Meng Ge, Haoran Yin, Nan Li, Longbiao Wang, Sheng Li, Jianwu Dang

Communications in Computer and Information Science 55 - 63 2021年

　詳細を見る

記述言語：英語掲載種別：論文集(書籍)内論文

DOI： 10.1007/978-3-030-92307-5_7

researchmap
VOIS: The First Speech Therapy App Specifically Designed for Myanmar Hearing-Impaired Children

Aye Thida, Nway Nway Han, Sheinn Thawtar Oo, Sheng Li, Chenchen Ding

2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA) 2020年11月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：IEEE

DOI： 10.1109/o-cocosda50338.2020.9295024

researchmap
Compensation on x-vector for Short Utterance Spoken Language Identification

Peng Shen, Xugang Lu, Komei Sugiura, Sheng Li, Hisashi Kawai

Odyssey 2020 The Speaker and Language Recognition Workshop 47 - 52 2020年11月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ISCA

DOI： 10.21437/odyssey.2020-7

researchmap

その他リンク： https://dblp.uni-trier.de/db/conf/odyssey/odyssey2020.html#ShenLS0K20
Joint Training End-to-End Speech Recognition Systems with Speaker Attributes

Sheng Li, Xugang Lu, Raj Dabre, Peng Shen, Hisashi Kawai

Odyssey 2020 The Speaker and Language Recognition Workshop 2020年11月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ISCA

DOI： 10.21437/odyssey.2020-54

researchmap
Voice-Indistinguishability -- Protecting Voiceprint with Differential Privacy under an Untrusted Server

Yaowei Han, Yang Cao, Sheng Li, Qiang Ma, Masatoshi Yoshikawa

Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security 2125 - 2127 2020年10月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ACM

DOI： 10.1145/3372297.3420025

researchmap
Singing Voice Extraction with Attention-Based Spectrograms Fusion

Hao Shi, Longbiao Wang, Sheng Li, Chenchen Ding, Meng Ge, Nan Li, Jianwu Dang, Hiroshi Seki

Interspeech 2020 2020年10月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ISCA

DOI： 10.21437/interspeech.2020-1043

researchmap
Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription

Yuqin Lin, Longbiao Wang, Sheng Li, Jianwu Dang, Chenchen Ding

Interspeech 2020 2020年10月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ISCA

DOI： 10.21437/interspeech.2020-1755

researchmap
Voice-Indistinguishability: Protecting Voiceprint In Privacy-Preserving Speech Data Release

Yaowei Han, Sheng Li, Yang Cao, Qiang Ma, Masatoshi Yoshikawa

2020 IEEE International Conference on Multimedia and Expo (ICME) 2020年7月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：IEEE

DOI： 10.1109/icme46284.2020.9102875

researchmap
End-to-End Articulatory Modeling for Dysarthric Articulatory Attribute Detection

Yuqin Lin, Longbiao Wang, Jianwu Dang, Sheng Li, Chenchen Ding

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020年5月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：IEEE

DOI： 10.1109/icassp40776.2020.9054233

researchmap
Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation

Hao Shi, Longbiao Wang, Meng Ge, Sheng Li, Jianwu Dang

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020年5月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：IEEE

DOI： 10.1109/icassp40776.2020.9054661

researchmap
Automatic speech recognition

Xugang Lu, Sheng Li, Masakiyo Fujimoto

SpringerBriefs in Computer Science 21 - 38 2020年

　詳細を見る

記述言語：英語掲載種別：論文集(書籍)内論文出版者・発行元：Springer

DOI： 10.1007/978-981-15-0595-9_2

Scopus

researchmap
Investigation of Effectively Synthesizing Code-Switched Speech Using Highly Imbalanced Mix-Lingual Data

Shaotong Guo, Longbiao Wang, Sheng Li, Ju Zhang, Cheng Gong, Yuguang Wang, Jianwu Dang, Kiyoshi Honda

Neural Information Processing 36 - 47 2020年

　詳細を見る

掲載種別：論文集(書籍)内論文出版者・発行元：Springer International Publishing

DOI： 10.1007/978-3-030-63830-6_4

researchmap
Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification

Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai

IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 2674 - 2683 2020年

　詳細を見る

掲載種別：研究論文（学術雑誌）出版者・発行元：Institute of Electrical and Electronics Engineers (IEEE)

DOI： 10.1109/taslp.2020.3023627

researchmap
Effective Training End-to-End ASR systems for Low-resource Lhasa Dialect of Tibetan Language

Lixin Pan, Sheng Li, Longbiao Wang, Jianwu Dang

2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2019年11月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：IEEE

DOI： 10.1109/apsipaasc47483.2019.9023100

researchmap
Multi-lingual transformer training for khmer automatic speech recognition 査読

Kak Soky, Sheng Li, Tatsuya Kawahara, Sopheap Seng

2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019 1893 - 1896 2019年11月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/APSIPAASC47483.2019.9023137

Scopus

researchmap
Class-Wise Centroid Distance Metric Learning for Acoustic Event Detection

Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai

Interspeech 2019 2019年9月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ISCA

DOI： 10.21437/interspeech.2019-2271

researchmap
Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation

Sheng Li, Raj Dabre, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai

Interspeech 2019 2019年9月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ISCA

DOI： 10.21437/interspeech.2019-2112

researchmap
End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition

Sheng Li, Chenchen Ding, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai

Interspeech 2019 2019年9月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ISCA

DOI： 10.21437/interspeech.2019-2092

researchmap
Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese

Sheng Li, Xugang Lu, Chenchen Ding, Peng Shen, Tatsuya Kawahara, Hisashi Kawai

Interspeech 2019 2019年9月

　詳細を見る

掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：ISCA

DOI： 10.21437/interspeech.2019-2104

researchmap
Interactive learning of teacher-student model for short utterance spoken language identification. 査読

P.Shen, X.Lu, S. Li, H.Kawai

Proc. IEEE-ICASSP 2019年

　詳細を見る

DOI： 10.1109/icassp.2019.8683371

researchmap
INVESTIGATION OF SEQUENCE-LEVEL KNOWLEDGE DISTILLATION METHODS FOR CTC ACOUSTIC MODELS 査読

Ryoichi Takashima, Sheng Li, Hisashi Kawai

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) 6156 - 6160 2019年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/ICASSP.2019.8682671

Web of Science

researchmap
Feature Representation of Short Utterances based on Knowledge Distillation for Spoken Language Identification. 査読

P.Shen, X.Lu, S. Li, H.Kawai

Proc. INTERSPEECH 2018年

　詳細を見る

researchmap
Improving CTC-based Acoustic Model with Very Deep Residual Time-delay Neural Networks 査読

Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6 3708 - 3712 2018年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.21437/Interspeech.2018-1475

Web of Science

researchmap
CTC LOSS FUNCTION WITH A UNIT-LEVEL AMBIGUITY PENALTY 査読

Ryoichi Takashima, Sheng Li, Hisashi Kawai

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) 5909 - 5913 2018年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

Web of Science

researchmap
AN INVESTIGATION OF A KNOWLEDGE DISTILLATION METHOD FOR CTC ACOUSTIC MODELS 査読

Ryoichi Takashima, Sheng Li, Hisashi Kawai

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) 5809 - 5813 2018年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

Web of Science

researchmap
Temporal Attentive Pooling for Acoustic Event Detection. 査読

X.Lu, P.Shen, S. Li, Y.Tsao, H.Kawai

Proc. INTERSPEECH 2018年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

researchmap
IMPROVING VERY DEEP TIME-DELAY NEURAL NETWORK WITH VERTICAL-ATTENTION FOR EFFECTIVELY TRAINING CTC-BASED ASR SYSTEMS 査読

Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai

2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018) 77 - 83 2018年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/slt.2018.8639675

Web of Science

researchmap
SEMI-SUPERVISED ENSEMBLE DNN ACOUSTIC MODEL TRAINING 査読

Sheng Li, Xugang Lu, Shinsuke Sakai, Masato Mimura, Tatsuya Kawahara

2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) 5270 - 5274 2017年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/ICASSP.2017.7953162

Web of Science

researchmap
Conditional generative adversarial nets classifier for spoken language identification 査読

Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017- 2814 - 2818 2017年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）出版者・発行元：International Speech Communication Association

DOI： 10.21437/Interspeech.2017-553

Scopus

researchmap
INCREMENTAL TRAINING AND CONSTRUCTING THE VERY DEEP CONVOLUTIONAL RESIDUAL NETWORK ACOUSTIC MODELS 査読

Sheng Li, Xugang Lu, Peng Shen, Ryoichi Takashima, Tatsuya Kawahara, Hisashi Kawai

2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) 222 - 227 2017年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/ASRU.2017.8268939

Web of Science

researchmap
Semi-Supervised Acoustic Model Training by Discriminative Data Selection From Multiple ASR Systems' Hypotheses 査読

Sheng Li, Yuya Akita, Tatsuya Kawahara

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING 24 ( 9 ) 1524 - 1534 2016年9月

　詳細を見る

記述言語：英語掲載種別：研究論文（学術雑誌）

DOI： 10.1109/TASLP.2016.2562505

Web of Science

researchmap
Data Selection from Multiple ASR Systems' Hypotheses for Unsupervised Acoustic Model Training 査読

Sheng Li, Yuya Akita, Tatsuya Kawahara

2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS 5875 - 5879 2016年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/ICASSP.2016.7472804

Web of Science

researchmap
Confidence Estimation for Speech Recognition Systems using Conditional Random Fields Trained with Partially Annotated Data 査読

Sheng Li, Xugang Lu, Shinsuke Mori, Yuya Akita, Tatsuya Kawahara

2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) 2016年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/ISCSLP.2016.7918419

Web of Science

researchmap
Automatic Lecture Transcription Based on Discriminative Data Selection for Lightly Supervised Acoustic Model Training 査読

Sheng Li, Yuya Akita, Tatsuya Kawahara

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E98D ( 8 ) 1545 - 1552 2015年8月

　詳細を見る

記述言語：英語掲載種別：研究論文（学術雑誌）

DOI： 10.1587/transinf.2015EDP7047

Web of Science

researchmap
Ensemble Speaker Modeling using Speaker Adaptive Training Deep Neural Network for Speaker Adaptation 査読

Sheng Li, Xugang Lu, Yuya Akita, Tatsuya Kawahara

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 2892 - 2896 2015年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

Web of Science

researchmap
Discriminative Data Selection for Lightly Supervised Training of Acoustic Model using Closed Caption Texts 査読

Sheng Li, Yuya Akita, Tatsuya Kawahara

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 3526 - 3530 2015年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

Web of Science

researchmap
Corpus and Transcription System of Chinese Lecture Room 査読

Sheng Li, Yuya Akita, Tatsuya Kawahara

2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) 442 - 445 2014年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/ISCSLP.2014.6936595

Web of Science

researchmap
Phoneme-level articulatory animation in pronunciation training 査読

Lan Wang, Hui Chen, Sheng Li, Helen M. Meng

SPEECH COMMUNICATION 54 ( 7 ) 845 - 856 2012年9月

　詳細を見る

記述言語：英語掲載種別：研究論文（学術雑誌）

DOI： 10.1016/j.specom.2012.02.003

Web of Science

researchmap
Cross Linguistic Comparison of Mandarin and English EMA Articulatory Data 査読

Sheng Li, Lan Wang

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 902 - 905 2012年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

Web of Science

researchmap
The Phoneme-level Articulator Dynamics for Pronunciation Animation 査読

Sheng Li, Lan Wang, En Qi

Proc. IALP 2011年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/IALP.2011.13

researchmap
IELS: A Computer-aided Pronunciation Training System for Undergraduate Students 査読

Jinyu Chen, Lan Wang, Chongguo Li, Jin Hu, Sheng Li

ICETC 2010年

　詳細を見る

記述言語：英語掲載種別：研究論文（国際会議プロシーディングス）

DOI： 10.1109/ICETC.2010.5529236

researchmap

書籍等出版物

Voices of the Himalayas: Investigation of Speech Recognition Technology for the Tibetan Language

Sheng Li（担当：単著）

2023年2月（ ISBN:9784904020289 ）

　詳細を見る

researchmap
Bridging Eurasia: Multilingual Speech Recognition for Silkroad

Sheng Li（担当：単著）

2023年1月（ ISBN:9784904020296 ）

　詳細を見る

researchmap
Phantom in the Opera: The Vulnerabilities of Speech-based Artificial Intelligence Systems

Sheng Li（担当：単著）

2022年11月（ ISBN:9784904020265 ）

　詳細を見る

researchmap
Automatic speech recognition: Speech-to-Speech Translation

X. Lu, S. Li, M. Fujimoto（担当：共著範囲: Chapter 3.3.2: From Shallow to Deep and Very Deep. Chapter 3.3.3: End-to-End and CTC models.）

Springer Singapore 2020年

　詳細を見る

researchmap

MISC

Evaluating Tibetan ASR with Segmented Word Error Rate: Beyond Character-Level Metrics

Jacob Moore, Sheng Li, Paula Lauren

TechRxiv 2026年2月

　詳細を見る

記述言語：英語

DOI： 10.36227/techrxiv.177102186.63648582/v1

researchmap
End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning.

Zhao Ren, Rathi Adarshi Rammohan, Kevin Scheck, Sheng Li, Tanja Schultz

2025年12月

　詳細を見る

DOI： 10.48550/arXiv.2507.07806

researchmap
Emotional Text-To-Speech Based on Mutual-Information-Guided Emotion-Timbre Disentanglement

Jianing Yang, Sheng Li, Takahiro Shinozaki, Yuki Saito, Hiroshi Saruwatari

arXiv 2025年10月

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2510.01722

researchmap
Language-Aware Prompt Tuning for Parameter-Efficient Seamless Language Expansion in Multilingual ASR

Hongli Yang, Sheng Li, Hao Huang, Ayiduosi Tuohan, Yizhou Peng

arXiv 2025年7月

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2506.21577

researchmap
Adapting Whisper for Parameter-efficient Code-Switching Speech Recognition via Soft Prompt Tuning

Hongli Yang, Yizhou Peng, Hao Huang, Sheng Li

2025年7月

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2506.21576

researchmap
Generalized Multilingual Text-to-Speech Generation with Language-Aware Style Adaptation

Haowei Lou, Hye-young Paik, Sheng Li, Wen Hu, Lina Yao

arXiv preprint arXiv:2504.08274 2025年4月

　詳細を見る

記述言語：英語

researchmap
Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition

Zhengdong Yang, Qianying Liu, Sheng Li, Fei Cheng, Chenhui Chu

arXiv 2025年1月

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2501.17615

researchmap
Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding

Jiliang Hu, Zuchao Li, Mengjia Shen, Haojun Ai, Sheng Li, Jun Zhang

arXiv 2025年1月

　詳細を見る

記述言語：英語掲載種別：研究発表ペーパー・要旨（国際会議）

DOI： 10.48550/arXiv.2501.07329

researchmap
Multi-Prototype Network with Swin Transformer for Open Set Recognition

Jun Wang, Haiyan Yang, Sheng Li, Di Zhou, Xingwei Chen, Juncheng Li, Yufeng Hua, Jun Shi

SSRN 2025年

　詳細を見る

記述言語：英語掲載種別：記事・総説・解説・論説等（学術雑誌）

DOI： 10.2139/ssrn.5134636

researchmap
A Unified Speech LLM for Diarization and Speech Recognition in Multilingual Conversations

Phurich Saengthong, Boonnithi Jiaramaneepini, Sheng Li, Manabu Okumura, Takahiro Shinozaki

arXiv 2025年

　詳細を見る

DOI： 10.48550/arXiv.2507.02927

researchmap
Towards Emotion Co-regulation with LLM-powered Socially Assistive Robots: Integrating LLM Prompts and Robotic Behaviors to Support Parent-Neurodivergent Child Dyads

Jing Li, Felix Schijve, Sheng Li, Yuye Yang, Jun Hu, Emilia Barakova

arXiv 2025年

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2507.10427

researchmap
Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction

Yuka Ko, Sheng Li, Chao-Han Huck Yang, Tatsuya Kawahara

arXiv 2024年12月

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2408.16180

researchmap
Extracting Spatiotemporal Data from Gradients with Large Language Models

Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa

arXiv 2024年10月

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2410.16121

researchmap
Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition

Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz

arXiv 2024年10月

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2410.13221

researchmap
Enhancing Privacy of Spatiotemporal Federated Learning against Gradient Inversion Attacks

Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa

arXiv 2024年7月

　詳細を見る

記述言語：英語

DOI： 10.48550/arXiv.2407.08529

researchmap
MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction

Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara

2024年1月

　詳細を見る

記述言語：英語掲載種別：研究発表ペーパー・要旨（国際会議）

DOI： 10.48550/arXiv.2401.13249

researchmap
End-to-End Speech-to-Speech Translation toolkit

Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li

ACM Multimedia Asia 2023 workshop released toolkit 2023年12月

　詳細を見る

researchmap
FedCPC: An Effective Federated Contrastive Learning Method for Privacy Preserving Early-Stage Alzheimer's Speech Detection

Wenqing Wei, Zhengdong Yang, Yuan Gao, Jiyi Li, Chenhui Chu, Shogo Okada, Sheng Li

2023年11月

　詳細を見る

担当区分：最終著者,　責任著者記述言語：英語

DOI： 10.48550/arXiv.2311.13043

researchmap
LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement

Zili Qi, Xinhui Hu, Wangjin Zhou, Sheng Li, Hao Wu, Jian Lu, Xinkang Xu

2023年11月

　詳細を見る

DOI： 10.48550/arXiv.2311.10656

researchmap
Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization

Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He

2023年11月

　詳細を見る

DOI： 10.48550/arXiv.2311.10664

researchmap
GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System

Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He

2023年11月

　詳細を見る

DOI： 10.48550/arXiv.2311.10689

researchmap
Towards Speech Dialogue Translation Mediating Speakers of Different Languages

Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi

arXiv:2305.09210 2023年5月

　詳細を見る

記述言語：英語

researchmap
Robust Voice Activity Detection Using an Auditory-Inspired Masked Modulation Encoder Based Convolutional Attention Network

Nan Li, Longbiao Wang, Meng Ge, Masashi Unoki, Sheng Li, Jianwu Dang

2023年

　詳細を見る

記述言語：英語

DOI： 10.2139/ssrn.4557926

researchmap
Speech-text based multi-modal training with bidirectional attention for improved speech recognition

Yuhang Yang, Haihua Xu, Hao Huang, Eng Siong Chng, Sheng Li

arXiv:2211.00325 2022年10月

　詳細を見る

researchmap
Tendency-and-Attention-Informed Deep Learning for ENSO Forecasts

Shen Qiao, Cuicui Zhang, Xuefeng Zhang, Kai Zhang, Hao Shi, Sheng Li, Hao Wei

2022年6月

　詳細を見る

出版者・発行元：Research Square Platform LLC

Abstract

Deep learning has been acknowledged as an increasingly important technology for ENSO forecasts. The most cutting-edge deep learning algorithm is developed based on Convolutional Neural Network (CNN), which can achieve a multi-year (about 17-month-lead) forecast and has conquered the ‘spring forecast barrier’ problem. However, this group of methods is still challenged by several critical issues. First, they usually utilize the global sea surface temperature (SST) fields as inputs without considering the specific contributions of various oceanic regions to ENSO forecasts. Consequently, they cannot effectively investigate the role of the ‘teleconnection’ mechanism among different oceans (Indian, Pacific, and Atlantic Oceans) and different ocean parts (the tropical and non-tropical regions), especially in the forecast of extreme ENSO events. Second, existing methods mainly utilize the discrete monthly SST fields for ENSO forecasts without investigating the rate of change between adjacent months, which also provides important information for predicting the variation tendency. To solve these problems, this paper develops a Tendency-and-Attention-Informed Deep Residual Network (TA-DRN) for multi-year ENSO forecasts. The contributions of different oceanic regions can be learned by a spatial attention module, while the variation tendency between the previous and current months can be captured by the first- and second-order differences of SST fields. Informed by these two modules, the performance of TA-DRN can be improved significantly, especially in predicting extreme El Niño and La Niña events.

DOI： 10.21203/rs.3.rs-1733575/v1

researchmap

その他リンク： https://www.researchsquare.com/article/rs-1733575/v1.html
Fusion of Self-supervised Learned Models for MOS Prediction

Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, Yi Zhao

CoRR abs/2204.04855 2022年4月

　詳細を見る

担当区分：責任著者

researchmap
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition.

Qianying Liu, Yuhang Yang, Zhuo Gong, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Sadao Kurohashi

CoRR abs/2204.03855 2022年4月

　詳細を見る

担当区分：責任著者

researchmap
Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release

Yaowei Han, Sheng Li, Yang Cao, Qiang Ma, Masatoshi Yoshikawa

CoRR abs/2004.07442 2020年6月

　詳細を見る

担当区分：筆頭著者

researchmap
Deep progressive multi-scale attention for acoustic event classification

Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai

CoRR abs/1912.12011 2019年4月

　詳細を見る

担当区分：筆頭著者

researchmap

講演・口頭発表等

大規模言語モデルの統合による音声認識システムの改善 招待

李勝

NICT Open House 2024 2024年6月

　詳細を見る

開催年月日： 2024年6月

記述言語：日本語

researchmap
Diversity-driven Semi-supervised Ensemble DNN Acoustic Model Training (音声)

LI Sheng, LU Xugang, SAKAI Shinsuke, KAWAHARA Tatsuya

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 2016年8月 電子情報通信学会

　詳細を見る

開催年月日： 2016年8月

記述言語：英語

researchmap
Discriminative Data Selection from Multiple ASR Systems' Hypotheses for Unsupervised Acoustic Model Training (音声) -- (第17回音声言語シンポジウム)

LI SHENG, AKITA YUYA, KAWAHARA TATSUYA

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 2015年12月 電子情報通信学会

　詳細を見る

開催年月日： 2015年12月

記述言語：英語

researchmap
相互情報量最小化による感情・音色の分離に基づく感情的音声合成

楊家寧, 李勝, 篠崎隆宏, 齋藤佑樹, 猿渡洋

日本音響学会研究発表会講演論文集, 秋季 2025年10月

　詳細を見る

記述言語：英語会議種別：口頭発表（一般）

researchmap
RAG-Boost: Retrieval-Augmented Generation Enhanced Speech Recognition in LLM-based Spoken Dialogue Systems

王鵬程, 李勝, 篠崎隆宏

日本音響学会研究発表会講演論文集, 秋季 2025年10月

　詳細を見る

researchmap
Application of the RFID based audio service in regional navigation system

S. Li, C. Li

Bulletin of Advanced Technology Research 2009年

　詳細を見る

researchmap
The Phoneme-level Articulator Dynamics for 3D Pronunciation Animation for Chinese

S. Li, K. Luo, L. Wang

Bulletin of Advanced Technology Research 2011年

　詳細を見る

researchmap
Phoneme-level articulatory animation in pronunciation training using EMA data

李勝

Speech Synthesis Lab., Tsinghua University, host: Prof. Zhiyong Wu. 2012年

　詳細を見る

researchmap
Vocal Tract Length Normalization for Chinese Spontaneous Speech Recognition

李勝

Technical report (Kyoto University) 2013年

　詳細を見る

researchmap
Multi-lingual transformer training for Khmer automatic speech recognition

K. Soky, S. Li, T. Kawahara, S. Seng

Interspeech 2020 Satellite Workshop (SLIMTS2020). (abstract paper)

　詳細を見る

researchmap
Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release

Y. Han, S. Li, Y. Cao, Q. Ma, M. Yoshikawa

Interspeech 2020 Satellite Workshop (SLIMTS2020). (abstract paper) (invited report)

　詳細を見る

researchmap
Automatic Transcription of Chinese Spoken Lectures

S. Li, M. Mimura, T. Kawahara

Acoustical Society of Japan, autumn 2013年

　詳細を見る

researchmap
DNN-based Acoustic Modeling and Decoding for Chinese Spontaneous Speech Recognition with HTK

李勝

Technical report (Kyoto University) 2014年

　詳細を見る

researchmap
Lightly-supervised training and confidence estimation by using CRF classifiers

李勝

Speech and Cognition Lab., Tianjin University, host: Prof. Jianwu Dang and Prof. Kiyoshi Honda. 2014年

　詳細を見る

researchmap
Effective combination of multiple ASR hypotheses with CRF-based classifiers

S. Li, Y. Akita, T. Kawahara

Acoustical Society of Japan, autumn 2015年

　詳細を見る

researchmap
Discriminative data selection from multiple ASR systems' hypotheses for unsupervised acoustic model training

S. Li, Y. Akita, T. Kawahara

IPSJ SIG-SLP-109-8 2015年

　詳細を見る

researchmap
Data Selection Assisted by Caption to Improve Acoustic Modeling for Lecture Transcription

S. Li, Y. Akita, T. Kawahara

Acoustical Society of Japan, spring 2014年

　詳細を見る

researchmap
Classifier-based data selection for lightly-supervised training of acoustic model for lecture transcription

S. Li, Y. Akita, T. Kawahara

IPSJ SIG-SLP-102-4 2014年

　詳細を見る

researchmap
Unsupervised Training of Deep Neural Network Acoustic Models for Lecture Transcriptions

S. Li, Y. Akita, T. Kawahara

Acoustical Society of Japan, autumn 2014年

　詳細を見る

researchmap
Incorporating divergences from hypotheses of multiple ASR systems to improve unsupervised acoustic model training

S. Li, Y. Akita, T. Kawahara

Acoustical Society of Japan 2015年

　詳細を見る

researchmap
Diversity-driven Semi-supervised Ensemble DNN Acoustic Model Training

S. Li, X. Lu, S. Sakai, T. Kawahara

Acoustical Society of Japan, autumn 2016年

　詳細を見る

researchmap
Very deep convolutional residual network acoustic models for Japanese lecture transcription

S. Li, X. Lu, P. Shen, H. Kawai

Acoustical Society of Japan, autumn 2017

researchmap
cGAN-classifier: Conditional Generative Adversarial Nets for Classification

P. Shen, X. Lu, S. Li, H. Kawai

Acoustical Society of Japan, autumn 2017

researchmap
A Study of Knowledge Distillation Methods for CTC Acoustic Models

R. Takashima, S. Li, H. Kawai

Acoustical Society of Japan, spring 2018

researchmap
Short utterance-based spoken language identification

P. Shen, X. Lu, S. Li, H. Kawai

Acoustical Society of Japan, autumn 2018

researchmap
Training CTC and LFMMI-based TDNN with CNTK

李勝

NICT internal report, 2018

researchmap
A Study of Sequence-Level Knowledge Distillation for CTC Acoustic Models

高島遼一, 李勝, 河井恒

IPSJ SIG-SLP, 2018

researchmap
An Empirical Comparison of Sequence Training Methods for the Very Deep Time-delay Neural Network

S. Li, X. Lu, R. Takashima, P. Shen, H. Kawai

Acoustical Society of Japan, autumn 2018

researchmap
Improving CTC-based acoustic model with very deep residual neural network

S. Li, X. Lu, R. Takashima, P. Shen, H. Kawai

Acoustical Society of Japan, spring 2018

researchmap
Research on End-to-End Speech Recognition Technology

李勝

情報通信フェア2019 (ICT Fair 2019), September 2019

researchmap
End-to-End Articulatory Attribute Modeling for Low-resource Multilingual Speech Recognition

S. Li, C. Ding, X. Lu, P. Shen and H. Kawai

Acoustical Society of Japan, spring 2020

Presentation type: Oral presentation (general)

researchmap
Joint Training End-to-End Systems for Speech and Speaker Recognition with Speaker Attributes

S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai

Acoustical Society of Japan, spring 2020

Presentation type: Oral presentation (general)

researchmap
Improvement of x-vector for short utterance spoken language identification

P. Shen, X. Lu, K. Sugiura, S. Li, H. Kawai

Acoustical Society of Japan, spring 2020

Presentation type: Oral presentation (general)

researchmap
Investigation of multi-domain training for speech recognition

P. Shen, X. Lu, S. Li, H. Kawai

Acoustical Society of Japan, spring, March 2019

researchmap
Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release (Invited)

Y. Han, S. Li, Y. Cao, Q. Ma, M. Yoshikawa

INTERSPEECH 2020 Satellite Workshop (SLIMTS2020) (invited report), October 2020

Language: English; Presentation type: Oral presentation (invited/special)

researchmap
A Mixture of Character and Word End-to-End System for Keyword Spotting (Invited)

H. Zhang, S. Ueno, M. Mimura, S. Li, W. Zhang, T. Kawahara

INTERSPEECH 2020 Satellite Workshop (SLIMTS2020) (full paper), September 2020

Language: English; Presentation type: Oral presentation (invited/special)

researchmap
Investigation of Effectively Synthesizing Code-switched Speech Using Highly Imbalanced Mix-lingual Data and Mask Embedding

S. Guo, L. Wang, S. Li, J. Zhang, C. Gong, Y. Wang, J. Dang, K. Honda

INTERSPEECH 2020 Satellite Workshop (SLIMTS2020), September 2020

Language: English; Presentation type: Oral presentation (general)

researchmap
Multi-lingual transformer training for Khmer automatic speech recognition (Invited)

K. Soky, S. Li, T. Kawahara, S. Seng

INTERSPEECH 2020 Satellite Workshop (SLIMTS2020), September 2020

Language: English; Presentation type: Oral presentation (invited/special)

researchmap
System Description for Voice Privacy Challenge (Kyoto Team).

Y. Han, S. Li, Y. Cao, M. Yoshikawa

In special session of INTERSPEECH2020 (VoicePrivacy challenge 2020), September 2020

Language: English; Presentation type: Oral presentation (general)

researchmap
Description of End-to-End Dialect Identification System (accepted in INTERSPEECH2021)

Ding Wang, Shuaishuai Ye, Xinhui Hu, Sheng Li, Xinkang Xu

In special session of INTERSPEECH2021 (OLR2020 challenge), September 2021

Language: English; Presentation type: Poster presentation

researchmap
Adversarial Attack and Defense on Deep Neural Network-based Voice Processing Systems: An Overview

Xiaojiao Chen, Sheng Li, Hao Huang

NCMMSC2021, October 2021

Language: English; Presentation type: Oral presentation (general)

researchmap
System description of Alzheimer's disease early detection (Silk-road team, short speech track)

Wenqing Wei, Rui Wong, Sheng Li, Yachao Guo, Hao Huang

Alzheimer's disease detection challenge (NCMMSC2021), October 2021

Language: English; Presentation type: Oral presentation (general)

researchmap
System description of joint speech and accent recognition (published in APSIPA ASC, 2021)

Y. Peng, J. Zhang, H. Zhang, H. Xu, H. Huang, S. Li, E.S. Chng

In Interspeech2020 Accented English Speech Recognition (AESR) Challenge, 2020. December 2021

Language: English; Presentation type: Poster presentation

researchmap
End-to-End Speech Translation with Cross-lingual Transfer Learning

S Shimizu, C Chu, S Li, S Kurohashi

NLP2021, 2021

researchmap
Comparison of End-to-End Models for Joint Speaker and Speech Recognition

K Soky, S Li, M Mimura, C Chu, T Kawahara

IEICE-SP, 2021

researchmap
The RoyalFlush (NICT) System Description for AP21-OLR Challenge (Invited)

Ding Wang, Shuaishuai Ye, Xinhui Hu, Sheng Li

AP21-OLR Challenge, January 2022

Language: English; Presentation type: Oral presentation (invited/special)

researchmap
The System Description for VoiceMOS Challenge 2022 (main/ood tasks)

2022

researchmap
System Description for the CN-Celeb Speaker Recognition Challenge 2022

Guangxing Li, Wangjin Zhou, Sheng Li, Yi Zhao, Hao Huang, Jichen Yang

CNSRC (the CN-Celeb Speaker Recognition Challenge), Speaker Odyssey 2022, June 2022

Language: English; Presentation type: Oral presentation (general)

researchmap
Study on Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network

Li Kai, Xugang Lu, Masato Akagi, Jianwu Dang, Sheng Li, Unoki Masashi

IEICE Tech. Rep. (信学技報), August 2022

Language: English; Presentation type: Oral presentation (general)

researchmap
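Speech Dialogue Translation Mediating Conversations Between Speakers of Different Languages

清水周一郎, 褚晨翚, 李勝, 黒橋禎夫

The 29th Annual Meeting of the Association for Natural Language Processing (NLP2023), March 2023

Language: Japanese; Presentation type: Oral presentation (general)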

researchmap
Towards Security-aware Speech Recognition System (Invited)

Sheng Li

NECTEC-NICT joint seminar, August 2023

Language: English; Presentation type: Oral presentation (invited/special)

researchmap
Cross-lingual Mapping for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition

Zhengdong Yang, Qianying Liu, Sheng Li, Chenhui Chu, Fei Cheng, Sadao Kurohashi

ASJ 2023 autumn, September 2023

Language: English; Presentation type: Poster presentation

researchmap
Correction while Recognition: Combining Pretrained Language Model for Taiwan-accented Speech Recognition (Invited)

Sheng Li

Joint Seminar with NECTEC Language Understand Group, November 2023

Language: English; Presentation type: Oral presentation (invited/special)

researchmap
System Description for the VoicePrivacy Challenge 2022

Xiaojiao Chen, Guangxing Li, Wangjin Zhou, Sheng Li, Yang Cao, Hao Huang, Yi Zhao

VoicePrivacy Challenge 2022, September 2022

Language: English; Presentation type: Oral presentation (general)

researchmap
VoicePrivacy Challenge: System description

X. Chen, G. Li, H. Huang, W. Zhou, Y. Cao, S. Li, Y. Zhao

VoicePrivacy 2022 Challenge Workshop (Interspeech2022), September 2022

Language: English; Presentation type: Oral presentation (general)

researchmap
Domain and Language Adaptation of Large-scale Pretrained Model for Speech Recognition of Low-resource Language

Kak Soky, Sheng Li, Chenhui Chu, Tatsuya Kawahara

IEICE Tech. Rep. (信学技報), December 2022

researchmap
Self-Supervised Learning MOS Prediction with Listener Enhancement (Invited)

Sheng Li

VoiceMOS mini workshop, November 2023

Language: English; Presentation type: Oral presentation (invited/special)

researchmap
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition (Invited)

Zhengdong Yang

ICT-innovation 2023 (Kyoto Univ.), February 2024

Language: English; Presentation type: Public lecture, seminar, tutorial, course, lecture, etc.

researchmap
Investigating effective methods for combining large language model with speech recognition system

李勝, 楊正東, 周汪勁, 褚晨翚, 河井恒

The 151st Meeting of the Acoustical Society of Japan (Spring 2024), March 2024

Language: English; Presentation type: Poster presentation

researchmap
Combining Large Language Model with Speech Recognition System in Low-resource Settings

李勝, 楊正東, 周汪勁, 褚晨翚, Chen Chen, Chng Eng Siong, 河井恒

The 30th Annual Meeting of the Association for Natural Language Processing, March 2024

Presentation type: Poster presentation

researchmap
Enhancing Multi-Step Reasoning in Language Models with Synthetic Math Data Augmentation (HP_Fighters team)

Jieqing Mei, Jiyi Li, Qianying Liu, Sheng Li

NLP2025 Workshop: Fine-Tuning Techniques and Evaluation of Large Language Models, March 2025

Language: Japanese; Presentation type: Oral presentation (general)

researchmap
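CEFR-J Level Estimation of English Learners' Speech Using Large Language Models

篠﨑隆宏, 佐藤秋太朗, 李勝

CEFR-J 2025 International Symposium, March 2025

Language: Japanese; Presentation type: Oral presentation (general)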

researchmap
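A Multilingual Benchmark for Generative Error Correction in Speech Recognition and Speech Translation

Zhengdong Yang, Zhen Wan, Sheng Li, Chao-Han Huck Yang, Chenhui Chu

The 32nd Annual Meeting of the Association for Natural Language Processing, March 2026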

researchmap
Multilingual Retrieval-Augmented Generation Enhanced LLM-based Speech Recognition

王鵬程, 李勝, 篠崎隆宏

The 155th Meeting of the Acoustical Society of Japan (Spring 2026), March 2026

Presentation type: Oral presentation (general)

researchmap
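Style Control for Language-Model-Based Speech Synthesis Using an Instruction Reconstruction Method

Zhu Shiao, Li Sheng, 篠崎隆宏

The 155th Meeting of the Acoustical Society of Japan (Spring 2026), March 2026

Presentation type: Oral presentation (general)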

researchmap

Industrial Property Rights

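Learning method

李勝, ルーシュガン, 高島遼一, 沈鵬, 河井恒

Applicant: National Institute of Information and Communications Technology (NICT)

Application number: 特願2017-236626  Filing date: December 2017

Publication number: 特開2019-105899  Publication date: June 2019

Patent/registration number: 特許6979203  Registration date: November 2021

Right holder: National Institute of Information and Communications Technology (NICT)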

researchmap
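Learning system, method, and neural network model for time-series information

高島遼一, 李勝, 河井恒

Application number: 特願2018-044134

Patent/registration number: 特許7070894  Registration date: May 2022

Right holder: National Institute of Information and Communications Technology (NICT)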

researchmap
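Speech recognition system, speech recognition method, and trained model

李勝, ルーシュガン, 高島遼一, 沈鵬, 河井恒

Application number: 特願2018-044491

Patent/registration number: 特許7109771  Registration date: July 2022

Right holder: National Institute of Information and Communications Technology (NICT)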

researchmap
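Classifier, trained model, and learning method

李勝, ルーシュガン, 高島遼一, 沈鵬, 河井恒

Application number: 特願2018-142418

Patent/registration number: 特許7209330  Registration date: January 2023

Right holder: National Institute of Information and Communications Technology (NICT)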

researchmap
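Method and apparatus for training a language identification model, and computer program therefor

沈鵬, ルーシュガン, 李勝, 河井恒

Application number: 特願2019-086005

Patent/registration number: 特許7282363  Registration date: May 2023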

researchmap
Inference device, inference program, and learning method

李勝, ルーシュガン, 丁塵辰, 河原達也, 河井恒

Application number: 特願2019-163555

Patent/registration number: 特許7385900  Registration date: November 2023

researchmap
Inference device and learning method for the inference device

李勝, ルーシュガン, 河井恒

Application number: 特願2020-059962

Patent/registration number: 特許7423056  Registration date: January 2024

researchmap

Works

HSoftmax: Hierarchical Softmax (https://github.com/Derek-Gong/hsoftmax/)

Zhuo Gong, Qianying Liu, Sheng Li, Zhengdong Yang, Yuhang Yang

2020

Work type: Software

researchmap
https://openslr.org/158/

researchmap
Julius decoder with EESEN CTC acoustic model

researchmap
Julius decoder with Kaldi acoustic model

researchmap
Julius decoder with Kaldi feature extractor

researchmap
VTLN for Julius/HTK acoustic model

researchmap
Julius for speech foundation models

https://github.com/halspeech/julius-speech-foundation-model

researchmap
foundation models for Tibetan language

researchmap
online speech recognition module for Erica the human robot

researchmap
very deep residual time-delay neural network (TDNN) with LFMMI objective implemented with MS-CNTK

researchmap

Awards

FY2025 Grant

March 2026, The Telecommunications Advancement Foundation

researchmap
Commendation and research grant funded by the School of Engineering common budget

November 2025, Institute of Science Tokyo

researchmap
Next Generation Star

October 2025, IEEE IROS2025 https://youtu.be/pP6YtlSVqlM

researchmap
IES SYPA Award

October 2025, IEEE IROS2025

Sheng Li

researchmap
Best Reviewer

August 2025, IEEE RO-MAN2025

Sheng Li

researchmap
task1: speech recognition error correction using LLM

December 2024, SLT2024 grand challenge LLM GER

researchmap
Top 2 in one track

December 2023, ICASSP2024 ICMC-ASR (In-Car Multi-Channel Automatic Speech Recognition) Challenge

researchmap
1st place in one track in ASRU2023 special session: VoiceMOS challenge

December 2023

researchmap
IEEE-SPS grant for IEEE-ICASSP2023 oral presentation (Co-supervised PhD student Qianying Liu)

May 2023, IEEE Signal Processing Society

researchmap
1st place in 6 indexes (total 16) of Main/OOD tracks in INTERSPEECH2022 special session: VoiceMOS challenge

2022

researchmap
3rd/4th place in constrained/unconstrained resource multilingual ASR tracks of OLR2021 challenge

December 2021, Oriental Language Recognition Challenge 2021

researchmap
Supervised student (Soky Kak) received a best student paper nomination

November 2021, O-COCOSDA2021

researchmap
Outstanding Performance Commendation, Excellence Award (group)

June 2021, National Institute of Information and Communications Technology (NICT)

researchmap
Travel Grant

September 2020, ISCA: Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription

Supervised student Yuqin Lin

researchmap
Travel Grant

September 2020, ISCA: Singing Voice Extraction with Attention based Spectrograms Fusion

Supervised student Hao Shi

researchmap
ICME 2020 best student paper nomination, selected as journal paper in IEEE Trans Multimedia (TMM)

July 2020

researchmap
FY2020 International Expansion Fund (top score among new proposals)

May 2020, National Institute of Information and Communications Technology (NICT)

researchmap
Awarded a grant as a tenure-track researcher (only 3 recipients in FY2019)

2019, National Institute of Information and Communications Technology (NICT)

researchmap
The 34th Telecom System Technology Student Award

2018, The Telecommunications Advancement Foundation

李勝

researchmap
2012-2016 Full exemption from admission and tuition fees

March 2016, Kyoto University

researchmap
Paper nominated as IEEE/ACM Trans. Audio, Speech & Language Process. cover

2016

李勝

researchmap
IBM travel grant to attend the Interspeech conference in Portland

2012, IBM Research

李勝

researchmap
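Admission as a Japanese Government scholarship student recommended by Kyoto University (special allocation)

2012, Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan

李勝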

researchmap
Outstanding Staff Award

2011, Chinese Academy of Sciences

李勝

researchmap
Creative Project Award, Hong Kong Young Entrepreneurs Program

2011

李勝

researchmap
Encouragement Scholarship (勵志奨学金)

2004, Nanjing University

李勝

researchmap
香港陳蔭川財団 Scholarship for Outstanding New University Students

2002

李勝

researchmap
Second Prize, Chemistry Olympiad; Third Prize, Biology Olympiad

2002, Jiangsu Province, China

李勝

researchmap

Research Projects (Joint Research and Competitive Funding)

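LLM-Enhanced Spoken Language Processing for Next-Generation Robots and Edge AI

Project/area number: JPMJBY25F6  April 2026 - March 2031

Japan Science and Technology Agency (JST), Program for Developing Young Researchers and Doctoral Students in National Strategic Fields (BOOST)

Role: Principal investigator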

researchmap
Large Language Model Enhancement

April 2024

Tohoku University-NICT Matching Research

Role: Principal investigator

researchmap
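Creating Foundational Technology for Speech Dialogue Translation That Accurately Conveys Intent

April 2023 - April 2028

Japan Society for the Promotion of Science (JSPS), Grants-in-Aid for Scientific Research (KAKEN), Grant-in-Aid for Scientific Research (B)

Role: Co-investigator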

researchmap
M3OLR: Towards Effective Multilingual, Multimodal and Multitask Oriental Low-resourced Language Speech Recognition

April 2023 - April 2026

Japan Society for the Promotion of Science (JSPS), Grants-in-Aid for Scientific Research (KAKEN), Grant-in-Aid for Scientific Research (C)

Role: Principal investigator

researchmap
Detecting Spoofing in Automatic Speaker Recognition

April 2023 - April 2024

ICT Virtual Organization of ASEAN Institutes and NICT (ASEAN IVO)

Role: Co-investigator

researchmap
Bridging Eurasia from Sea -- Multilingual Speech Recognition for Maritime Silkroad

2022 - 2024

NICT international funding

Role: Principal investigator

researchmap
Phantom in the Opera -- the Vulnerabilities of Speech Interface for Robotic Dialogue System

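April 2021 - April 2023

Japan Society for the Promotion of Science (JSPS), Grants-in-Aid for Scientific Research (KAKEN), Grant-in-Aid for Early-Career Scientists

李勝

Role: Principal investigator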

researchmap
Advanced Multilingual End-to-End Speech Recognition

April 2020 - April 2022

National Institute of Information and Communications Technology (NICT), NICT tenure-track start-up funding

李勝

Role: Principal investigator

researchmap
Bridging Eurasia -- Multilingual Speech Recognition for Silkroad

April 2020 - April 2022

National Institute of Information and Communications Technology (NICT), NICT international funding

李勝

Role: Principal investigator

researchmap
Speaker De-identification with Provable Privacy in Speech Data Release

April 2020 - April 2021

NII Open Collaborative Research

Role: Collaborating researcher

researchmap
Next generation multilingual End-to-End speech recognition (from G30 to G200)

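October 2019 - March 2021

Japan Society for the Promotion of Science (JSPS), Grants-in-Aid for Scientific Research (KAKEN), Grant-in-Aid for Research Activity Start-up

李勝

Role: Principal investigator; Funding type: Competitive funding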

researchmap

Other

Journal Reviewing

[1] IEEE/ACM Trans. Audio, Speech & Language Process.
[2] Computer Speech and Language
[3] Speech Communication
[4] IEICE transactions, letters
[5] APSIPA transactions
[6] Applied Acoustics
[7] Transactions on Asian and Low-Resource Language Information Processing (TALLIP)
[8] Digital Signal Processing
[9] Behaviour & Information Technology
[10] EURASIP Journal on Audio, Speech, and Music Processing

researchmap
International Conference Reviewing

[1] ICASSP-2021/2022/2023/2024/2025/2026 (meta reviewer), INTERSPEECH-2015/2018/2019/2020/2021/2022/2023/2024/2025/2026, SLT-2022/2024, ASRU-2023/2025
[2] APSIPA-2019/2020/2021/2022/2023/2024/2025, IJCNN-2023/2024/2026, ICONIP2023
[3] BC_VCC-2020 (Blizzard Challenge and Voice Conversion Challenge 2020)
[4] ACL-2017/2018/2020/2021/2022/2023/2024/2025/2026, EACL-2020/2022/2026(loresmt), NAACL-HLT-2016/2018/2019/2021
[5] IJCNLP-2017, EMNLP-IJCNLP-2019, EMNLP-2020/2021/2022, AACL-IJCNLP-2020/2022/2023/2025, COLING-2018/2022, SIGDIAL-2024
[6] NLP-2022/2023/2024, IALP-2023/2024
[7] AAAI-2019, ICLR-2021/2024, NeurIPS-2022/2023, ICML-2023/2024
[8] IROS-2019/2025/2026, Ubiquitous Robots (UR)-2020/2026, IEEE-ROMAN 2023/2025/2026
[9] ICME-2020/2021/2022/2023(main+workshop)/2024, ACM Multimedia 2021/2022/2023, ACM Multimedia Asia 2023, MMM 2023
[10] PAKDD-2023, DASFAA-2024, ACM ICMR 2024

researchmap