Faculty Profiles - Nakadai Kazuhiro

Nakadai Kazuhiro

Organization

School of Engineering Professor

Homepage

http://www.ra.sc.e.titech.ac.jp/

Profile

Kazuhiro Nakadai received a B.E. in electrical engineering in 1993, an M.E. in information engineering in 1995, and a Ph.D. in electrical engineering in 2003, all from the University of Tokyo. He worked for Nippon Telegraph and Telephone as a system engineer for four years, from 1995 to 1999. He then worked as a researcher on the Kitano Symbiotic Systems Project, ERATO, JST, from 1999 to 2003. He is currently a principal researcher at Honda Research Institute Japan Co., Ltd. He has held concurrent positions at Tokyo Institute of Technology as a visiting associate professor from 2006 to 2010, a visiting professor from 2011 to 2017, and a specially-appointed professor since July 2017. He also held a concurrent position as a guest professor at Waseda University from 2011 to 2018. His research interests include AI, robotics, signal processing, computational auditory scene analysis, multi-modal integration, and robot audition. He served as an executive board member of JSAI from 2015 to 2016 and of RSJ from 2017 to 2018. He is a member of IPSJ, ASJ, HIS, ISCA, ACM, and IEEE.

News & Topics

Listening drone helps find victims needing rescue in disasters

2017/12/22

Languages： English

As part of the ImPACT Tough Robotics Challenge Program, an initiative of the Cabinet Office of Japan, a Japanese research group has developed the world's first system able to detect acoustic signals, such as the voices of victims needing rescue, even when the victims are hard to find or are in places where cameras cannot be used. The system combines three technological elements: microphone array technology serving as the robot's ears, an interface for visualizing invisible sounds, and a microphone array that can be easily attached to a drone, even in rainy weather.
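Drone listens closely to locate people needing rescue: technology developed for faster rescue when disasters strike

2017/12/08

Languages： Japanese

Rescue by robots such as drones has relied mainly on visual methods such as cameras. By refining how sound is captured to reduce noise, the system detects sounds such as the voices of people under rubble. The result is an all-weather system that can be used for fast, efficient rescue. It can detect sound in the dark, in noisy conditions, and in places that cannot be seen.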

Degree

Ph.D. (The University of Tokyo)

Research Interests

Robot Audition
Computational Auditory Scene Analysis
Acoustic Signal Processing
Robotics
Artificial Intelligence

Research Areas

Informatics / Intelligent robotics / Robot Audition
Informatics / Intelligent informatics / Computational Auditory Scene Analysis
Informatics / Human interface and interaction / HMI, HRI
Informatics / Software / OSS

Education

Graduate School of Engineering, The University of Tokyo Information Engineering

1993.4 - 1995.3

The University of Tokyo Faculty of Engineering Department of Electrical and Electronics Engineering

1991.4 - 1993.3

The University of Tokyo School of Arts and Sciences Natural Sciences I

1989.4 - 1991.3

Research History

Institute of Science Tokyo Dept. of Systems and Control Engineering, School of Engineering Professor Ph.D.

2024.10

Country：Japan

Tokyo Institute of Technology Department of Systems and Control Engineering, School of Engineering Professor Ph.D.

2022.4 - 2022.9

Country：Japan

Tokyo Institute of Technology School of Engineering, Department of Systems and Control Engineering Specially-appointed Professor

2016.4 - 2022.3

Waseda University School of Creative Science and Engineering Guest Professor

2011.4 - 2018.3

Tokyo Institute of Technology Graduate School of Information Science and Engineering Adjunct Associate Professor -> Adjunct Professor (2012)

2006.4 - 2016.3

Honda Research Inst. Japan Co., Ltd. Principal Scientist

2003.5 - 2022.3

JST ERATO Kitano Symbiotic Systems Project Researcher

1999.7 - 2003.4

NTT Comware Employee

1997.9 - 1999.6

Nippon Telegraph and Telephone Corporation Employee

1995.4 - 1999.6

Professional Memberships

ISCA
THE ROBOTICS SOCIETY OF JAPAN
THE JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE
IEEE
HUMAN INTERFACE SOCIETY
ACOUSTICAL SOCIETY OF JAPAN
ACM
INFORMATION PROCESSING SOCIETY OF JAPAN

Committee Memberships

RSJ executive committee member

2025.3 - 2027.3

Committee type：Academic society

JSAI executive committee member

2024.7 - 2026.6

Committee type：Academic society

Board member, The Robotics Society of Japan

2017.4 - 2019.3

Committee type：Academic society

Board member, The Japanese Society for Artificial Intelligence

2015.7 - 2017.6

Committee type：Academic society

Papers

What Do Neural Networks Learn for TDOA Estimation? A Cross-Architecture Probing Study.

Yaozhong Kang, Jiang Wang, Runwu Shi, Takeshi Ashizawa, Benjamin Yen 0001, Kazuhiro Nakadai

CoRR abs/2606.22020 2026.6

Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2606.22020

Fast-SDE: Efficient Single-Microphone Sound Source Distance Estimation in Reverberant Environments.

Jiang Wang, Runwu Shi, Yaozhong Kang, Benjamin Yen 0001, Takeshi Ashizawa, Kazuhiro Nakadai

CoRR abs/2606.12339 2026.6

Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2606.12339

Ecologically-Constrained Task Arithmetic for Multi-Taxa Bioacoustic Classifiers Without Shared Data.

Ragib Amin Nihal, Benjamin Yen 0001, Runwu Shi, Takeshi Ashizawa, Kazuhiro Nakadai

CoRR abs/2605.03914 2026.5

Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2605.03914

The Talking Robot: Distortion-Robust Acoustic Models for Robot-Robot Communication.

Hanlong Li, Karishma Kamalahasan, Jiahui Li, Kazuhiro Nakadai, Shreyas Kousik

CoRR abs/2603.07072 2026.3

Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2603.07072

Unsupervised Single-Channel Audio Separation with Diffusion Source Priors.

Runwu Shi, Chang Li, Jiang Wang, Rui Zhang, Nabeela Khan, Benjamin Yen 0001, Takeshi Ashizawa, Kazuhiro Nakadai

AAAI 25348 - 25356 2026

Publishing type：Research paper (international conference proceedings)

DOI： 10.1609/aaai.v40i30.39728

Other Link： https://dblp.org/db/conf/aaai/aaai2026.html#ShiLWZKYAN26
Pattern Enhanced Multi-Turn Jailbreaking: Exploiting Structural Vulnerabilities in Large Language Models.

Ragib Amin Nihal, Rui Wen, Kazuhiro Nakadai, Jun Sakuma

ACL (Findings) 22123 - 22174 2026

Publishing type：Research paper (international conference proceedings)

Other Link： https://dblp.org/rec/conf/acl/2026f
Single-Microphone-Based Sound Source Localization for Mobile Robots in Reverberant Environments. Reviewed International coauthorship International journal

Jiang Wang, Runwu Shi, Benjamin Yen 0001, He Kong, Kazuhiro Nakadai

CoRR abs/2506.16173 2025.6

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2506.16173

Observability-Aware Active Calibration of Multi-Sensor Extrinsics for Ground Robots via Online Trajectory Optimization. Reviewed International coauthorship

Jiang Wang, Yaozhong Kang, Linya Fu, Kazuhiro Nakadai, He Kong

CoRR abs/2506.13420 2025.6

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2506.13420

Report on the Joint SIG Meeting 2024 (SIGAIs 2024)

馬場雪乃, 松井藤五郎, 中臺一博, 坂地泰紀

Journal of the Japanese Society for Artificial Intelligence 40 ( 3 ) 426 - 433 2025.5

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jjsai.40.3_426

Other Link： https://ndlsearch.ndl.go.jp/books/R000000004-I034116187
Multilingual Gloss-free Sign Language Translation: Towards Building a Sign Language Foundation Model. Reviewed

Sihan Tan, Taro Miyazaki, Kazuhiro Nakadai

CoRR abs/2505.24355 2025.5

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2505.24355

Single-Channel Target Speech Extraction Utilizing Distance and Room Clues. Reviewed

Runwu Shi, Zirui Lin, Benjamin Yen 0001, Jiang Wang, Ragib Amin Nihal, Kazuhiro Nakadai

CoRR abs/2505.14433 2025.5

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2505.14433

An Efficient GPU-based Implementation for Noise Robust Sound Source Localization. Reviewed

Zirui Lin, Masayuki Takigahira, Naoya Terakado, Haris Gulzar, Monikka Roslianna Busto, Takeharu Eda, Katsutoshi Itoyama, Kazuhiro Nakadai, Hideharu Amano

CoRR abs/2504.03373 2025.4

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2504.03373

Weakly Supervised Multiple Instance Learning for Whale Call Detection and Localization in Long-Duration Passive Acoustic Monitoring. Reviewed

Ragib Amin Nihal, Benjamin Yen 0001, Runwu Shi, Kazuhiro Nakadai

CoRR abs/2502.20838 2025.2

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2502.20838

Improvement in Sign Language Translation Using Text CTC Alignment. Reviewed

Sihan Tan, Taro Miyazaki, Nabeela Khan, Kazuhiro Nakadai

COLING 3255 - 3266 2025

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Association for Computational Linguistics

Other Link： https://dblp.uni-trier.de/rec/conf/coling/2025
MultiGAU: Real Time Sign Language Generation Using Multimodal Gated Attention. Reviewed

Nabeela Khan, Bowen Wu 0002, Carlos Toshinori Ishi, Kazuhiro Nakadai

IEA/AIE (1) 149 - 160 2025

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Springer

DOI： 10.1007/978-981-96-8889-0_13

Other Link： https://dblp.uni-trier.de/db/conf/ieaaie/ieaaie2025-1.html#KhanWIN25
Distance Based Single-Channel Target Speech Extraction. Reviewed

Runwu Shi, Benjamin Yen 0001, Kazuhiro Nakadai

ICASSP 1 - 5 2025

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ICASSP49660.2025.10887680

Other Link： https://dblp.uni-trier.de/db/conf/icassp/icassp2025.html#Shi0N25
Swarm Active Audition with Robots and Drones: Real-World Performance Validation.

Kazuhiro Nakadai, Kotaro Hoshiba, Benjamin Yen 0001, Makoto Kumon, Yoko Sasaki

IROS 6107 - 6112 2025

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS60139.2025.11247372

Other Link： https://dblp.org/db/conf/iros/iros2025.html#NakadaiHYKS25
Single-Microphone-Based Sound Source Localization for Mobile Robots in Reverberant Environments.

Jiang Wang, Runwu Shi, Benjamin Yen 0001, He Kong 0001, Kazuhiro Nakadai

IROS 6135 - 6140 2025

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS60139.2025.11246992

Other Link： https://dblp.org/db/conf/iros/iros2025.html#WangSYKN25
Towards Online Sign Language Expression for Real-Time Human-Robot Interaction.

Nabeela Khan, Sihan Tan, Kazuhiro Nakadai

RO-MAN 1123 - 1128 2025

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/RO-MAN63969.2025.11217908

Other Link： https://dblp.org/db/conf/ro-man/ro-man2025.html#KhanTN25
Multi-Speaker Localization Based on Von Mises-Bernoulli Vivit.

Haruto Yokota, Benjamin Yen 0001, Kazuhiro Nakadai

EUSIPCO 241 - 245 2025

Publishing type：Research paper (international conference proceedings)

Other Link： https://dblp.org/rec/conf/eusipco/2025
From Blurry to Brilliant Detection: YOLO-Based Aerial Object Detection with Super Resolution.

Ragib Amin Nihal, Benjamin Yen 0001, Takeshi Ashizawa, Katsutoshi Itoyama, Kazuhiro Nakadai

APSIPA 1922 - 1927 2025

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/APSIPAASC65261.2025.11249079

Other Link： https://dblp.org/db/conf/apsipa/apsipa2025.html#Nihal0AIN25
SignFlow: End-to-End Sign Language Generation for One-to-Many Modeling using Conditional Flow Matching.

Nabeela Khan, Bowen Wu 0002, Sihan Tan, Carlos Toshinori Ishi, Kazuhiro Nakadai

ICMI 173 - 180 2025

Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/3716553.3750765

Other Link： https://dblp.org/db/conf/icmi/icmi2025.html#Khan0TIN25
Single-Channel Target Speech Extraction Utilizing Distance and Room Clues.

Runwu Shi, Zirui Lin, Benjamin Yen 0001, Jiang Wang, Ragib Amin Nihal, Kazuhiro Nakadai

EUSIPCO 481 - 485 2025

Publishing type：Research paper (international conference proceedings)

Other Link： https://dblp.org/rec/conf/eusipco/2025
Dialect Identification Using Resource-Efficient Fine-Tuning Approaches.

Zirui Lin, Haris Gulzar, Monnika Roslianna Busto, Akiko Masaki, Takeharu Eda, Kazuhiro Nakadai

APSIPA 670 - 675 2025

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/APSIPAASC65261.2025.11249367

Other Link： https://dblp.org/db/conf/apsipa/apsipa2025.html#LinGBMEN25
Multilingual Gloss-free Sign Language Translation: Towards Building a Sign Language Foundation Model.

Sihan Tan, Taro Miyazaki, Kazuhiro Nakadai

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 553 - 561 2025

Publishing type：Research paper (international conference proceedings) Publisher：Association for Computational Linguistics

DOI： 10.18653/v1/2025.acl-short.43

Other Link： https://dblp.org/db/conf/acl/acl2025-2.html#TanMN25
An Invitation to the SIG on AI Challenges

植村渉, 干場功太郎, 鈴木麗璽, 中臺一博, 光永法明

JSAI Type 2 SIG Technical Reports 2024 ( Challenge-066 ) 03 2024.12

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jsaisigtwo.2024.challenge-066_03

A study of methods for ground surface material estimation from drone rotor noise in outdoor environments, and design of a windscreen for microphone arrays

矢野翼, Yen Benjamin, 中臺一博

JSAI Type 2 SIG Technical Reports 2024 ( Challenge-066 ) 13 2024.12

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jsaisigtwo.2024.challenge-066_13

A method for discriminating background flow in small-region moving object detection

西田健次, 中臺一博, 糸山克寿

JSAI Type 2 SIG Technical Reports 2024 ( Challenge-066 ) 10 2024.12

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jsaisigtwo.2024.challenge-066_10

Improving the noise robustness of speech recognition using speech enhancement and noise features

大﨑崇博, 周藤唯, 中臺一博

JSAI Type 2 SIG Technical Reports 2024 ( Challenge-066 ) 01 2024.12

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jsaisigtwo.2024.challenge-066_01

A swarm active audition system based on cooperation between multiple drones and robots

中臺一博, 公文誠, 佐々木洋子, 干場功太郎, Yen Benjamin

JSAI Type 2 SIG Technical Reports 2024 ( Challenge-066 ) 11 2024.12

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jsaisigtwo.2024.challenge-066_11

Offline speaker diarization using semi-supervised learning of speaker information

阿坂脩平, Yen Benjamin, 糸山克寿, 中臺一博

JSAI Type 2 SIG Technical Reports 2024 ( Challenge-066 ) 04 2024.12

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jsaisigtwo.2024.challenge-066_04

Swarm Active Audition System with Robots and Drones for a Search and Rescue Task Reviewed

Kazuhiro Nakadai, Makoto Kumon, Yoko Sasaki, Kotaro Hoshiba, Benjamin Yen

2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 1 - 6 2024.12

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/apsipaasc63619.2025.10848937

Can all variations within the unified mask-based beamformer framework achieve identical peak extraction performance? Reviewed

Atsuo Hiroe, Katsutoshi Itoyama, Kazuhiro Nakadai

EURASIP Journal on Audio, Speech, and Music Processing 2024 ( 1 ) 66 - 66 2024.12

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1186/s13636-024-00387-x

A review of deep learning-based approaches to sign language processing. Reviewed

Sihan Tan, Nabeela Khan, Zhaoyi An, Yoshitaka Ando, Rei Kawakami, Kazuhiro Nakadai

Advanced Robotics 38 ( 23 ) 1649 - 1667 2024.12

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2024.2442721

Special issue on robot and human interactive communication (Part II). Reviewed International coauthorship

Kazuhiro Nakadai, Emilia I. Barakova, Ki-Uk Kyung

Advanced Robotics 38 ( 23 ) 1647 - 1648 2024.12

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2024.2440161

Special issue on robot and human interactive communication. Reviewed International coauthorship

Kazuhiro Nakadai, Emilia I. Barakova, Ki-Uk Kyung

Advanced Robotics 38 ( 19-20 ) 1349 - 1350 2024.10

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2024.2410825

Online adaptation of fourier series-based acoustic transfer function model and its application to sound source localization and separation Reviewed

Yui Sudo, Masayuki Takigahira, Hideo Tsuru, Kazuhiro Nakadai, Hirofumi Nakajima

Advanced Robotics 38 ( 19-20 ) 1351 - 1363 2024.7

Language：English Publishing type：Research paper (scientific journal) Publisher：Informa UK Limited

DOI： 10.1080/01691864.2024.2379384

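A study of a one-to-one interaction model based on active inference

木村駿希, 中臺一博, 仁科繁明, 糸山克寿

Proceedings of the 86th IPSJ National Convention 2024 ( 1 ) 193 - 194 2024.3

Language：Japanese Publisher：Information Processing Society of Japan

Active inference is a theoretical approach by which an organism infers and selects optimal actions while estimating unknown states in its environment. In this study, we applied active inference to a model of emotion estimation and utterance selection based on verbal interaction between people. Specifically, for a parent-child interaction in which the parent intends to make the child clean a room, we examined how active inference drives the utterance selection of a child who wants to get by with minimal cleaning while avoiding being scolded as much as possible. Treating the parent's emotional state, as seen from the child, as the unknown state, we used active inference to select utterances that steer its estimate toward a state desirable for the child. We implemented the parent-child interaction based on the proposed utterance selection model and evaluated it in simulation experiments, confirming that the child can select appropriate responses to the parent's remarks and reach the desired state while reducing the estimation error of the parent's state.
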
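A study of a speech enhancement method based on distance-based time-frequency mask estimation

石井遼平, 中臺一博, 糸山克寿

Proceedings of the 86th IPSJ National Convention 2024 ( 1 ) 361 - 362 2024.3

Language：Japanese Publisher：Information Processing Society of Japan

In meetings, several people gather and talk, so even when each speaker is recorded with a microphone placed near the mouth, the recording contains other speakers' voices in addition to the target speaker's. This makes the target speaker's voice hard to hear in the recording and hinders uses such as preparing minutes. To solve this problem, this paper proposes a speech enhancement method that extracts only the nearby speaker's voice from a monaural recording, using a time-frequency mask estimated by deep learning. Evaluation with PESQ and STOI, which correlate with human auditory perception, demonstrated the effectiveness of the proposed method.
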
LCMV-based Scan-and-Sum Beamforming for Region Source Extraction. Reviewed

Aoto Yasue, Benjamin Yen 0001, Katsutoshi Itoyama, Kazuhiro Nakadai

APSIPA 1 - 6 2024

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/APSIPAASC63619.2025.10848984

Other Link： https://dblp.uni-trier.de/db/conf/apsipa/apsipa2024.html#Yasue0IN24
UAV-Enhanced Combination to Application: Comprehensive Analysis and Benchmarking of a Human Detection Dataset for Disaster Scenarios. Reviewed

Ragib Amin Nihal, Benjamin Yen 0001, Katsutoshi Itoyama, Kazuhiro Nakadai

ICPR (14) 145 - 162 2024

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-031-78341-8_10

Other Link： https://dblp.uni-trier.de/db/conf/icpr/icpr2024-14.html#NihalYIN24
A Video Vision Transformer for Sound Source Localization. Reviewed

Haruto Yokota, Mert Bozkurtlar, Benjamin Yen 0001, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

32nd European Signal Processing Conference(EUSIPCO) 106 - 110 2024

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

Other Link： https://dblp.uni-trier.de/rec/conf/eusipco/2024
Improving Noise Robustness of Automatic Speech Recognition with Speech Enhancement and Adapters Reviewed

大崎崇博, 周藤唯, 糸山克寿, 西田健次, 中臺一博

Journal of the Robotics Society of Japan 42 ( 9 ) 2024

Language：Japanese

Improving Impressions of Response Delay in AI-based Spoken Dialogue Systems. Reviewed

Shuhei Asaka, Katsutoshi Itoyama, Kazuhiro Nakadai

33rd IEEE International Conference on Robot and Human Interactive Communication(RO-MAN) 1416 - 1421 2024

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/RO-MAN60168.2024.10731216

Other Link： https://dblp.uni-trier.de/db/conf/ro-man/ro-man2024.html#AsakaIN24
Performance Improvement and Acceleration of Surface Source Extraction based on Multiple Constraint MVDR Beamforming and Woodbury Matrix Identity

安江蒼人, 糸山克寿, 西田健次, 中臺一博

Journal of the Robotics Society of Japan 42 ( 6 ) 2024

Language：Japanese

Drone audition: implementation of an indoor multi-drone system for sound source tracking. Reviewed

Benjamin Yen 0001, Kazuhiro Nakadai

APSIPA 1 - 6 2024

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/APSIPAASC63619.2025.10848928

Other Link： https://dblp.uni-trier.de/db/conf/apsipa/apsipa2024.html#0001N24
Drone audition: dataset and methods for ground surface material classification using drone noise in outdoor environment. Reviewed

Tsubasa Yano, Benjamin Yen 0001, Kazuhiro Nakadai

APSIPA 1 - 6 2024

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/APSIPAASC63619.2025.10848914

Other Link： https://dblp.uni-trier.de/db/conf/apsipa/apsipa2024.html#Yano0N24
Implementation of a Robot Operation System-based network for sound source localization using multiple drones. Reviewed

Takumi Yamamoto, Kotaro Hoshiba, Benjamin Yen 0001, Kazuhiro Nakadai

APSIPA 1 - 6 2024

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/APSIPAASC63619.2025.10849321

Other Link： https://dblp.uni-trier.de/db/conf/apsipa/apsipa2024.html#YamamotoH0N24
Bird Vocalization Embedding Extraction Using Self-Supervised Disentangled Representation Learning.

Runwu Shi, Katsutoshi Itoyama, Kazuhiro Nakadai

CoRR abs/2412.20146 2024

Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2412.20146

Distance Based Single-Channel Target Speech Extraction.

Runwu Shi, Benjamin Yen 0001, Kazuhiro Nakadai

CoRR abs/2412.20144 2024

Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2412.20144

FPGA-based Low Power Acceleration of HARK Sound Source Localization. Reviewed

Zirui Lin, Katsutoshi Itoyama, Kazuhiro Nakadai, Hideharu Amano

COOL CHIPS 1 - 6 2024

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/COOLCHIPS61292.2024.10531180

Other Link： https://dblp.uni-trier.de/db/conf/coolchips/coolchips2024.html#LinINA24
Improving Noise Robustness of Automatic Speech Recognition Based on a Parallel Adapter Model with Near-Identity Initialization. Reviewed

Takahiro Osaki, Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

IEA/AIE 2023 ( Challenge-063 ) 454 - 466 2024

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Springer

DOI： 10.1007/978-981-97-4677-4_37

Other Link： https://dblp.uni-trier.de/db/conf/ieaaie/ieaaie2024.html#OsakiSINN24
Real Time Sound Source Localization Using von-Mises ResNet. Reviewed International coauthorship

Mert Bozkurtlar, Benjamin Yen 0001, Katsutoshi Itoyama, Kazuhiro Nakadai

SII 466 - 471 2024

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII58957.2024.10417224

Other Link： https://dblp.uni-trier.de/db/conf/sii/sii2024.html#BozkurtlarYIN24
From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution. Reviewed

Ragib Amin Nihal, Benjamin Yen 0001, Katsutoshi Itoyama, Kazuhiro Nakadai

CoRR abs/2401.14661 2024

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2401.14661

SLAM-based Joint Calibration of Multiple Asynchronous Microphone Arrays and Sound Source Localization. Reviewed

Jiang Wang, Yuanzheng He, Daobilige Su, Katsutoshi Itoyama, Kazuhiro Nakadai, Junfeng Wu 0001, Shoudong Huang, Youfu Li 0001, He Kong

CoRR abs/2405.19813 2024

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2405.19813

Can all variations within the unified mask-based beamformer framework achieve identical peak extraction performance? Reviewed

Atsuo Hiroe, Katsutoshi Itoyama, Kazuhiro Nakadai

CoRR abs/2407.15310 2024

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2407.15310

UAV-Enhanced Combination to Application: Comprehensive Analysis and Benchmarking of a Human Detection Dataset for Disaster Scenarios. Reviewed

Ragib Amin Nihal, Benjamin Yen 0001, Katsutoshi Itoyama, Kazuhiro Nakadai

CoRR abs/2408.04922 2024

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2408.04922

SLAM-Based Joint Calibration of Multiple Asynchronous Microphone Arrays and Sound Source Localization. Reviewed International coauthorship

Jiang Wang, Yuanzheng He, Daobilige Su, Katsutoshi Itoyama, Kazuhiro Nakadai, Junfeng Wu 0001, Shoudong Huang, Youfu Li 0001, He Kong

IEEE Trans. Robotics 40 4024 - 4044 2024

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1109/TRO.2024.3410456

Improvement in Sign Language Translation Using Text CTC Alignment.

Tan Sihan, Taro Miyazaki, Khan Nabeela Khanum, Kazuhiro Nakadai

CoRR abs/2412.09014 2024

Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2412.09014

Monitoring the courtship flight trajectory of Latham's snipe (Gallinago hardwickii) using microphone arrays Reviewed

Shiho Matsubayashi, Hideki Osaka, Reiji Suzuki, Kazuhiro Nakadai, Hiroshi G. Okuno

Ecology and Evolution 13 ( 4 ) 2023.3

Language：English Publishing type：Research paper (scientific journal) Publisher：Wiley

DOI： 10.1002/ece3.9938

Estimating the Soundscape Structure and Dynamics of Forest Bird Vocalizations in an Azimuth-Elevation Space Using a Microphone Array Reviewed

Reiji Suzuki, Koichiro Hayashi, Hideki Osaka, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

Applied Sciences 13 ( 6 ) 3607 - 3607 2023.3

Language：English Publishing type：Research paper (scientific journal) Publisher：MDPI AG

Songbirds are one of the study targets for both bioacoustic and ecoacoustic research. In this paper, we discuss the applicability of robot audition techniques to understand the dynamics of forest bird vocalizations in a soundscape measured in azimuth and elevation angles with a single 16-channel microphone array, using HARK and HARKBird. First, we evaluated the accuracy in estimating the azimuth and elevation angles of bird vocalizations replayed from a loudspeaker on a tree, 6.55 m above the height of the array, from different horizontal distances in a forest. The results showed that the localization error of azimuth and elevation angle was equal to or less than 5 degrees and 15 degrees, respectively, in most of cases when the horizontal distance from the array was equal to or less than 35 m. We then conducted a field observation of vocalizations to monitor birds in a forest. The results showed that the system can successfully detect how birds use the soundscape horizontally and vertically. This can contribute to bioacoustic and ecoacoustic research, including behavioral observations and study of biodiversity.

DOI： 10.3390/app13063607

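Environmental sound separation by transfer learning using deep blind source separation

合澤隆拓, 坂東宜昭, 糸山克寿, 西田健次, 中臺一博

Proceedings of the 85th IPSJ National Convention 2023 ( 1 ) 435 - 436 2023.2

Language：Japanese Publisher：Information Processing Society of Japan

In acoustic scene understanding, sound source separation in crowded environments is one of the foundational technologies for environmental sound recognition. Unlike speech, environmental sounds have diverse spectral structures, which makes it difficult to train in advance a model that can adapt to every environment. In this study, we perform unsupervised transfer learning to the target environment using neural full-rank spatial covariance analysis, a nonlinear blind source separation method. Even when the mixture contains unknown source signals absent from the pre-training data, the separation model can be trained to improve from the mixture alone, based on a marginal likelihood function that represents the spatial information of the multichannel signal. Using numerically mixed environmental sounds, we confirmed the effectiveness of unsupervised transfer learning on the inference data.
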
Extracting Bird Vocalizations from a Complex Natural Soundscape in Forests Using Robot Audition Techniques Reviewed

Reiji Suzuki, Shinji Sumitani, Zachary Harlow, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

2023 IEEE/SICE International Symposium on System Integration (SII) 1 - 6 2023.1

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/sii55687.2023.10039198

Audio-Visual Class Association Based on Two-stage Self-supervised Contrastive Learning towards Robust Scene Analysis Reviewed

Kei Suzuki, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2023 IEEE/SICE International Symposium on System Integration (SII) 1 - 6 2023.1

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/sii55687.2023.10039379

Metric-Based Multimodal Meta-Learning for Human Movement Identification Via Footstep Recognition Reviewed

Muhammad Shakeel, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2023 IEEE/SICE International Symposium on System Integration (SII) abs/2111.07979 1 - 8 2023.1

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/sii55687.2023.10039089

An Ensemble Method for Multiple Speech Enhancement Using Deep Learning Reviewed

Masahiko Fujita, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2023 IEEE/SICE International Symposium on System Integration (SII) 1 - 6 2023.1

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/sii55687.2023.10039167

FPGA based Power-Efficient Edge Server to Accelerate Speech Interface for Socially Assistive Robotics Reviewed

Haris Gulzar, Muhammad Shakeel, Katsutoshi Itoyama, Kazuhiro Nakadai, Kenji Nishida, Hideharu Amano, Takeharu Eda

2023 IEEE/SICE International Symposium on System Integration (SII) 1 - 6 2023.1

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/sii55687.2023.10039093

Reconstruction of Depth Scenes Based on Echolocation Reviewed

Hidehiko Kishinami, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2023 IEEE/SICE International Symposium on System Integration (SII) 1 - 6 2023.1

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/sii55687.2023.10039271

Is the Ideal Ratio Mask Really the Best? - Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers. Reviewed

Atsuo Hiroe, Katsutoshi Itoyama, Kazuhiro Nakadai

CoRR abs/2309.12065 2023

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2309.12065

researchmap
Unsupervised Domain Adaptation of Universal Source Separation Based on Neural Full-Rank Spatial Covariance Analysis. Reviewed

Takahiro Aizawa, Yoshiaki Bando, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai, Masaki Onishi

MLSP 1 - 6 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/MLSP55844.2023.10285999

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/mlsp/mlsp2023.html#AizawaBINNO23
Is the Ideal Ratio Mask Really the Best? - Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers. Reviewed

Atsuo Hiroe, Katsutoshi Itoyama, Kazuhiro Nakadai

APSIPA ASC 1843 - 1850 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/APSIPAASC58517.2023.10317440

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/apsipa/apsipa2023.html#HiroeIN23
Low power implementation of Geometric High-order Decorrelation-based Source Separation on an FPGA board. Reviewed

Ziquan Qin, Kaijie Wei, Hideharu Amano, Kazuhiro Nakadai

IEEE Symposium in Low-Power and High-Speed Chips 1 - 6 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/COOLCHIPS57690.2023.10121954

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/coolchips/coolchips2023.html#QinWAN23
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation. Reviewed

Yui Sudo, Kazuya Hata, Kazuhiro Nakadai

INTERSPEECH 491 - 495 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ISCA

DOI： 10.21437/Interspeech.2023-1320

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/interspeech/interspeech2023.html#SudoHN23
miniStreamer: Enhancing Small Conformer with Chunked-Context Masking for Streaming ASR Applications on the Edge. Reviewed

Haris Gulzar, Monikka Roslianna Busto, Takeharu Eda, Katsutoshi Itoyama, Kazuhiro Nakadai

INTERSPEECH 3277 - 3281 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ISCA

DOI： 10.21437/Interspeech.2023-1162

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/interspeech/interspeech2023.html#GulzarBEIN23
Improving Sign Language Understanding Introducing Label Smoothing. Reviewed

Tan Sihan, Khan Nabeela Khanum, Katsutoshi Itoyama, Kazuhiro Nakadai

RO-MAN 113 - 118 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/RO-MAN57019.2023.10309531

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/ro-man/ro-man2023.html#SihanKIN23
Online Adaptation of Fourier Series Based Acoustic Transfer Function Model to Improve Sound Source Localization and Separation. Reviewed

Yui Sudo, Masayuki Takigahira, Hideo Tsuru, Kazuhiro Nakadai, Hirofumi Nakajima

RO-MAN 2023 ( Challenge-063 ) 2058 - 2063 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/RO-MAN57019.2023.10309550

J-GLOBAL

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/ro-man/ro-man2023.html#SudoTTNN23
Observability Analysis of Graph SLAM-Based Joint Calibration of Multiple Microphone Arrays and Sound Source Localization. Reviewed

Yuanzheng He, Jiang Wang, Daobilige Su, Kazuhiro Nakadai, Junfeng Wu 0001, Shoudong Huang, Youfu Li 0001, He Kong

IEEE/SICE International Symposium on System Integration(SII) 1 - 8 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII55687.2023.10039204

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/sii/sii2023.html#HeWSNWHLK23
Assessment of Simultaneous Calibration for Positions, Orientations, and Time Offsets in Multiple Microphone Arrays Systems. Reviewed

Chishio Sugiyama, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

IEEE/SICE International Symposium on System Integration(SII) 1 - 6 2023

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII55687.2023.10039440

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/sii/sii2023.html#SugiyamaINN23
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation. Reviewed

Yui Sudo, Kazuya Hata, Kazuhiro Nakadai

CoRR abs/2305.17846 2023

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2305.17846

researchmap
Evaluation of Distant Speech Recognition Based on Deep Blind Source Separation and Transfer Learning

合澤隆拓, 坂東宜昭, 糸山克寿, 西田健次, 中臺一博

人工知能学会第二種研究会資料 2022 ( Challenge-061 ) 09 2022.11

　More details

Language：Japanese Publisher：一般社団法人人工知能学会

DOI： 10.11517/jsaisigtwo.2022.challenge-061_09

CiNii Research

J-GLOBAL

researchmap
Outdoor evaluation of sound source localization for drone groups using microphone arrays Reviewed

Taiki Yamada, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 9296 - 9301 2022.10

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/iros47612.2022.9982039

researchmap
Spotforming by NMF Using Multiple Microphone Arrays Reviewed

Yasuhiro Kagimoto, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 9253 - 9258 2022.10

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/iros47612.2022.9981808

researchmap
Weakly-Supervised Neural Full-Rank Spatial Covariance Analysis for a Front-End System of Distant Speech Recognition Reviewed

Yoshiaki Bando, Takahiro Aizawa, Katsutoshi Itoyama, Kazuhiro Nakadai

Interspeech 2022 3824 - 3828 2022.9

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ISCA

DOI： 10.21437/interspeech.2022-11077

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/interspeech/interspeech2022.html#BandoAIN22
Empirical Sampling from Latent Utterance-wise Evidence Model for Missing Data ASR based on Neural Encoder-Decoder Model Reviewed

Ryu Takeda, Yui Sudo, Kazuhiro Nakadai, Kazunori Komatani

Interspeech 2022 3789 - 3793 2022.9

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ISCA

DOI： 10.21437/interspeech.2022-576

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/interspeech/interspeech2022.html#TakedaSNK22
Auditory Survey of Endangered Eurasian Bittern Using Microphone Arrays and Robot Audition Reviewed

Shiho Matsubayashi, Kazuhiro Nakadai, Reiji Suzuki, Tatsuya Ura, Makoto Hasebe, Hiroshi G. Okuno

Frontiers in Robotics and AI 9 854572 - 854572 2022.4

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.3389/frobt.2022.854572

researchmap

Other Link： https://dblp.uni-trier.de/db/journals/firai/firai9.html#MatsubayashiNSU22
Visualizing soundscapes and quantifying interspecific interactions in forest animal vocalizations using robot audition technology

2022 ( 1 ) 475 - 476 2022.2

　More details

Language：English

CiNii Books

CiNii Research

researchmap
A Study of an Ensemble Method for Multiple Speech Enhancement Processes Using Deep Learning

藤田雅彦, 糸山克寿, 西田健次, 中臺一博

第84回全国大会講演論文集 2022 ( 1 ) 337 - 338 2022.2

　More details

Language：Japanese Publisher：情報処理学会

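This paper proposes a deep-learning-based ensemble learning method for the time-frequency masks generated by multiple speech enhancement processes. Because the proposed method uses time-frequency masks generated from multiple criteria, it can cope with a variety of environmental noise. In an evaluation experiment, the ensemble time-frequency mask obtained by the proposed method was applied to beamforming to enhance speech; the proposed method outperformed existing methods, demonstrating the effectiveness of deep-learning-based ensembling.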

CiNii Books

CiNii Research

researchmap
Visual Scene Reconstruction based on Echolocation with a Generative Adversarial Network Reviewed

岸波華彦, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会誌 40 ( 4 ) 351 - 354 2022

　More details

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.40.351

J-GLOBAL

researchmap
Evaluation of a Speech Enhancement Method Combining Ensemble Time-Frequency Masking and Beamforming Reviewed

藤田雅彦, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会誌 40 ( 7 ) 631 - 634 2022

　More details

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.40.631

J-GLOBAL

researchmap
An FPGA off-loading of HARK sound source localization. Reviewed

Zhongyang Hou, Kaijie Wei, Hideharu Amano, Kazuhiro Nakadai

2022 Tenth International Symposium on Computing and Networking(CANDARW) 236 - 240 2022

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/CANDARW57323.2022.00057

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/ic-nc/candar2022w.html#HouWAN22
Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection. Reviewed

Yui Sudo, Muhammad Shakeel 0001, Kazuhiro Nakadai, Jiatong Shi, Shinji Watanabe 0001

Interspeech 2022(INTERSPEECH) 2022 ( Challenge-061 ) 4641 - 4645 2022

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ISCA

DOI： 10.21437/Interspeech.2022-11216

J-GLOBAL

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/interspeech/interspeech2022.html#Sudo0NS022
Observability Analysis of Graph SLAM-Based Joint Calibration of Multiple Microphone Arrays and Sound Source Localization. Reviewed

Yuanzheng He, Jiang Wang, Daobilige Su, Kazuhiro Nakadai, Junfeng Wu 0001, Shoudong Huang, Youfu Li 0001, He Kong

CoRR abs/2210.05600 2022

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2210.05600

researchmap
Multichannel environmental sound segmentation: with separately trained spectral and spatial features Reviewed

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Applied Intelligence 51 ( 11 ) 8245 - 8259 2021.11

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：Springer

DOI： 10.1007/s10489-021-02314-5

Scopus

researchmap
CASE: CNN Acceleration for Speech-Classification in Edge-Computing Reviewed

Haris Gulzar, Muhammad Shakeel, Kenji Nishida, Katsutoshi Itoyama, Kazuhiro Nakadai, Hideharu Amano

2021 IEEE Cloud Summit (Cloud Summit) 2021.10

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ieeecloudsummit52029.2021.00018

researchmap
Assessment of sound source tracking using multiple drones equipped with multiple microphone arrays Reviewed

Taiki Yamada, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

International Journal of Environmental Research and Public Health 18 ( 17 ) 2021.9

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：MDPI

DOI： 10.3390/ijerph18179039

Scopus

PubMed

researchmap
Assessment of von Mises-Bernoulli Deep Neural Network in Sound Source Localization Reviewed

Katsutoshi Itoyama, Yoshiya Morimoto, Shungo Masaki, Ryosuke Kojima, Kenji Nishida, Kazuhiro Nakadai

Interspeech 2021 2152 - 2156 2021.8

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ISCA

DOI： 10.21437/interspeech.2021-1050

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/interspeech/interspeech2021.html#ItoyamaMMKNN21
Simultaneous Calibration of Positions, Orientations, and Time Offsets, Among Multiple Microphone Arrays Reviewed

Chishio Sugiyama, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2021 IEEE International Conference on Autonomous Systems (ICAS) 2021.8

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/icas49788.2021.9551166

researchmap
Non-Invasive Monitoring of the Spatio-Temporal Dynamics of Vocalizations among Songbirds in a Semi Free-Flight Environment Using Robot Audition Techniques Reviewed

Shinji Sumitani, Reiji Suzuki, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

Birds 2 ( 2 ) 158 - 172 2021.4

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：MDPI AG

To understand the social interactions among songbirds, extracting the timing, position, and acoustic properties of their vocalizations is essential. We propose a framework for automatic and fine-scale extraction of spatial-spectral-temporal patterns of bird vocalizations in a densely populated environment. For this purpose, we used robot audition techniques to integrate information (i.e., the timing, direction of arrival, and separated sound of localized sources) from multiple microphone arrays (array of arrays) deployed in an environment, which is non-invasive. As a proof of concept of this framework, we examined the ability of the method to extract active vocalizations of multiple Zebra Finches in an outdoor mesh tent as a realistic situation in which they could fly and vocalize freely. We found that localization results of vocalizations reflected the arrangements of landmark spots in the environment such as nests or perches and some vocalizations were localized at non-landmark positions. We also classified their vocalizations as either songs or calls by using a simple method based on the tempo and length of the separated sounds, as an example of the use of the information obtained from the framework. Our proposed approach has great potential to understand their social interactions and the semantics or functions of their vocalizations considering the spatial relationships, although detailed understanding of the interaction would require analysis of more long-term recordings.

DOI： 10.3390/birds2020012

researchmap
Detecting earthquakes: a novel deep learning-based approach for effective disaster response Reviewed

Muhammad Shakeel, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Applied Intelligence 51 ( 11 ) 8305 - 8315 2021.4

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：Springer Science and Business Media LLC

DOI： 10.1007/s10489-021-02285-7

researchmap

Other Link： http://link.springer.com/article/10.1007/s10489-021-02285-7/fulltext.html
A Study of a Speech Enhancement Method Using Ensemble Time-Frequency Masks

藤田雅彦, 糸山克寿, 西田健次, 中臺一博

第83回全国大会講演論文集 2021 ( 1 ) 235 - 236 2021.3

　More details

Language：Japanese Publisher：情報処理学会

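This paper reports a beamforming-based speech enhancement method that uses ensemble time-frequency masks. Conventional time-frequency-mask-based speech enhancement methods estimate the mask from a single cue, and so do not fully exploit the features in the input signal that are key to speech enhancement. We therefore propose an ensemble time-frequency mask method that integrates multiple time-frequency masks estimated from different cues to improve the robustness of the processing. We evaluated the proposed method on the CHiME3 corpus using PESQ and STOI, which correlate with human auditory perception. On both metrics, the proposed method outperformed the existing method without ensembling, demonstrating its effectiveness.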

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00214813/
A Study of Simultaneous Optimization of Synchronization and Position and Orientation Estimation for Multiple Microphone Arrays

杉山地塩, 糸山克寿, 西田健次, 中臺一博

第83回全国大会講演論文集 2021 ( 1 ) 363 - 364 2021.3

　More details

Language：Japanese Publisher：情報処理学会

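This paper addresses the problem of estimating microphone array positions and orientations, as well as sound source positions, from signals observed by multiple microphone arrays. Conventional methods assume that the microphone arrays are synchronized, and the search space of the estimation problem is large and prone to local optima, which severely restricts their application to real problems. To solve this, we introduced time offsets between microphone arrays and designed an integrated cost function that simultaneously optimizes positions, orientations, and time offsets. As a result, we were able to construct a method that, even with multiple unsynchronized microphone arrays, is less prone to local optima and more accurate than the conventional method.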

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00215170/
Investigation of Node Pruning Criteria for Neural Networks Model Compression with Non-Linear Function and Non-Uniform Network Topology Reviewed

Kazuhiro Nakadai, Yosuke Fukumoto, Ryu Takeda

2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings 117 - 124 2021.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical and Electronics Engineers Inc.

DOI： 10.1109/SLT48900.2021.9383593

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/slt/slt2021.html#NakadaiFT21
Visualizing Directional Soundscapes of Bird Vocalizations Using Robot Audition Techniques Reviewed

Reiji Suzuki, Hao Zhao, Shinji Sumitani, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

2021 IEEE/SICE International Symposium on System Integration, SII 2021 487 - 492 2021.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical and Electronics Engineers Inc.

DOI： 10.1109/IEEECONF49454.2021.9382639

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/sii/sii2021.html#SuzukiZSMANO21
Observing Nocturnal Birds Using Localization Techniques Reviewed

Shiho Matsubayashi, Fumiyuki Saito, Reiji Suzuki, Kazuhiro Nakadai, Hiroshi G. Okuno

2021 IEEE/SICE International Symposium on System Integration, SII 2021 493 - 498 2021.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical and Electronics Engineers Inc.

DOI： 10.1109/IEEECONF49454.2021.9382665

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/sii/sii2021.html#MatsubayashiSSN21
Sound Source Tracking Using Integrated Direction Likelihood for Drones with Microphone Arrays Reviewed

Taiki Yamada, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2021 IEEE/SICE International Symposium on System Integration (SII) 394 - 399 2021.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ieeeconf49454.2021.9382619

researchmap
Assessment of a Beamforming Implementation Developed for Surface Sound Source Separation Reviewed

Zhi Zhong, Muhammad Shakeel, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2021 IEEE/SICE International Symposium on System Integration (SII) 369 - 374 2021.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ieeeconf49454.2021.9382648

researchmap
Multi-channel Environmental Sound Segmentation utilizing Sound Source Localization and Separation U-Net Reviewed

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2021 IEEE/SICE International Symposium on System Integration (SII) 382 - 387 2021.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ieeeconf49454.2021.9382730

researchmap
EMC: Earthquake Magnitudes Classification on Seismic Signals via Convolutional Recurrent Networks Reviewed

Muhammad Shakeel, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

2021 IEEE/SICE International Symposium on System Integration (SII) 388 - 393 2021.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ieeeconf49454.2021.9382696

researchmap
Fully-Online Always-Adaptation of Transfer Functions and Its Application to Sound Source Localization and Separation. Reviewed

Kazuhiro Nakadai, Masayuki Takigahira, Yusuke Kawai, Hirofumi Nakajima

IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS) 2100 - 2105 2021

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IROS51168.2021.9636631

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/iros/iros2021.html#NakadaiTKN21
Two-Dimensional Environment Recognition by Audible Sound with Weighted Likelihood Function and Standing Wave

岸波華彦, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会誌 39 ( 3 ) 271 - 274 2021

　More details

Publishing type：Research paper (scientific journal) Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.39.271

J-GLOBAL

researchmap
Spatial Normalization to Reduce Positional Complexity in Direction-aided Supervised Binaural Sound Source Separation. Reviewed

Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani

APSIPA ASC 248 - 253 2021

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

researchmap

Other Link： https://dblp.uni-trier.de/rec/conf/apsipa/2021
Proposal and Evaluation of Spatial Sound Source Separation using NMF with Multiple Microphone Arrays Reviewed

鍵本泰宏, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会誌 39 ( 7 ) 2021

　More details

Language：Japanese

J-GLOBAL

researchmap
Age Classification of Evacuees at Times of Disaster Using a Vibration Sensor

Toru Yamashita, Futoshi Asano, Kazuhiro Nakadai

2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings 184 - 188 2020.12

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical and Electronics Engineers Inc.

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/rec/conf/apsipa/2020
Synchronization of microphones based on rank minimization of warped spectrum for asynchronous distributed recording

Katsutoshi Itoyama, Kazuhiro Nakadai

IEEE International Conference on Intelligent Robots and Systems 4842 - 4847 2020.10

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical and Electronics Engineers Inc.

DOI： 10.1109/IROS45743.2020.9341584

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/iros/iros2020.html#ItoyamaN20
Sound event aware environmental sound segmentation with Mask U-Net

Y. Sudo, K. Itoyama, K. Nishida, K. Nakadai

Advanced Robotics 34 ( 20 ) 1280 - 1290 2020.10

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：Robotics Society of Japan

DOI： 10.1080/01691864.2020.1829040

Scopus

researchmap
Recognition of non-manual content in continuous Japanese sign language

Heike Brock, Iva Farag, Kazuhiro Nakadai

Sensors (Switzerland) 20 ( 19 ) 1 - 21 2020.10

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：MDPI AG

DOI： 10.3390/s20195621

Scopus

PubMed

researchmap

Other Link： https://dblp.uni-trier.de/db/journals/sensors/sensors20.html#BrockFN20
Robot Audition and Computational Auditory Scene Analysis

Kazuhiro Nakadai, Hiroshi G. Okuno

Advanced Intelligent Systems 2 ( 9 ) 2000050 - 2000050 2020.9

　More details

Publishing type：Research paper (scientific journal) Publisher：Wiley

DOI： 10.1002/aisy.202000050

researchmap

Other Link： https://onlinelibrary.wiley.com/doi/full-xml/10.1002/aisy.202000050
Multi-hop wireless command and telemetry communication system for remote operation of robots with extending operation area beyond line-of-sight using 920 MHz/169 MHz Reviewed

Toshinori Kagawa, Fumie Ono, Lin Shan, Ryu Miura, Kazuhiro Nakadai, Kotaro Hoshiba, Makoto Kumon, Hiroshi G. Okuno, Shin Kato, Fumihide Kojima

Advanced Robotics 34 ( 11 ) 756 - 766 2020.6

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：Informa UK Limited

DOI： 10.1080/01691864.2020.1760934

Scopus

researchmap
A Spatial Filter Design for Surface Sound Source Separation

2020 ( 1 ) 189 - 190 2020.2

　More details

Language：English

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00205118/
Moving Sound Source Tracking by Integrating Likelihood Distributions Using Multiple Microphone Arrays

山田泰基, 糸山克寿, 西田健次, 中臺一博

第82回全国大会講演論文集 2020 ( 1 ) 191 - 192 2020.2

　More details

Language：Japanese Publisher：情報処理学会

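In recent years, sound source position estimation using multiple microphone arrays has been actively studied. In particular, methods have been reported that compute sound source position likelihoods from the array processing of multiple microphone arrays and estimate the position of stationary sources, but applying them to moving sources to perform sound source tracking has not been sufficiently studied. In this paper, we perform sound source tracking by estimating the dynamics of a moving source from the distribution of sound source position likelihoods. Estimating the source dynamics in addition to sequential source position estimation is expected to reduce the source position tracking error. We evaluate the effectiveness of the proposed method through simulation using an existing database.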

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00205119/
Performance Evaluation of a Source Separation Method Based on Non-negative Matrix Factorization with Binary Masks under Onset Time Deviations

日下湧太, 糸山克寿, 西田健次, 中臺一博

第82回全国大会講演論文集 2020 ( 1 ) 361 - 362 2020.2

　More details

Language：Japanese Publisher：情報処理学会

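This paper evaluates how separation accuracy changes when the input onset times contain timing deviations, in a source separation method based on non-negative matrix factorization with binary masks that uses the onset times of the target source as prior information. For separating a specific source from a monaural audio signal composed of multiple instruments, methods that use prior information about the target source are the mainstream, and we previously proposed a source separation method that uses the onset times of the target source as prior information that users can create easily. In previous reports, the onset times given to the proposed method were limited to ideal conditions created from MIDI or annotations. In this report, we modeled the timing deviations that occur when humans create onset times, and used this model to run source separation simulations and evaluate separation accuracy.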

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00205203/
Evaluation of an Environmental Sound Caption Corpus Created Using Crowdsourcing

岩月道生, 糸山克寿, 西田健次, 中臺一博

第82回全国大会講演論文集 2020 ( 1 ) 201 - 202 2020.2

　More details

Language：Japanese Publisher：情報処理学会

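One form of sound environment understanding is building a caption generation system that describes the sound environment of an audio signal in natural language. Building a machine-learning-based caption generation system for audio signals requires an environmental sound caption dataset containing many pairs of environmental sounds and their corresponding captions. In this paper, we evaluated the environmental sound caption corpus that Iwatsuki et al. previously created through crowdsourced annotation, by training an RNN-based deep learning model on it.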

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00205123/
Design and Implementation of Real-Time Visualization of Sound Source Positions by Drone Audition Reviewed

Mizuho Wakabayashi, Kai Washizaka, Kotaro Hoshiba, Kazuhiro Nakadai, Hiroshi G. Okuno, Makoto Kumon

2020 IEEE/SICE International Symposium on System Integration (SII) 1 814 - 819 2020.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII46433.2020.9025940

CiNii Research

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/sii/sii2020.html#WakabayashiWHNO20
Soundscape Analysis of Bird Songs in Forests Using Microphone Arrays. Reviewed

Shinji Sumitani, Reiji Suzuki, Takemi Morimatsu, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

2020 IEEE/SICE International Symposium on System Integration(SII) 634 - 639 2020.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII46433.2020.9026267

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/sii/sii2020.html#SumitaniSMMANO20
Multi-channel environmental sound segmentation Reviewed

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration (SII2020) 820 - 825 2020.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII46433.2020.9025963

researchmap
Design and assessment of a scan-and-sum beamformer for surface sound source separation Reviewed

Zhi Zhong, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration (SII2020) 808 - 813 2020.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII46433.2020.9025981

researchmap
Audio-visual 3D reconstruction framework for dynamic scenes Reviewed

Takashi Konno, Kenji Nishida, Katsutoshi Itoyama, Kazuhiro Nakadai

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration (SII2020) 802 - 807 2020.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII46433.2020.9025812

researchmap
Sound source tracking by drones with microphone arrays Reviewed

Taiki Yamada, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration (SII2020) 796 - 801 2020.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII46433.2020.9026185

researchmap
Sound source localization based on von-Mises-Bernoulli deep neural network Reviewed

Kazuhiro Nakadai, Shungo Masaki, Ryosuke Kojima, Osamu Sugiyama, Katsutoshi Itoyama, Kenji Nishida

2020 IEEE/SICE International Symposium on System Integration (SII2020) 658 - 663 2020.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII46433.2020.9025880

researchmap
Calibration of a microphone array based on stochastic model of microphone position and sound source spectrum

段雄啓, 糸山克寿, 西田健次, 中臺一博

人工知能学会AIチャレンジ研究会(Web) 57th 2020

　More details

J-GLOBAL

researchmap
3D Sound Source Tracking for Drones Using Direction Likelihood Integration

山田泰基, 糸山克寿, 西田健次, 中臺一博

人工知能学会AIチャレンジ研究会(Web) 57th 2020

　More details

J-GLOBAL

researchmap
Onset-informed Source Separation using Non-negative Matrix Factorization with Binary Masks

日下湧太, 糸山克寿, 西田健次, 中臺一博

人工知能学会AIチャレンジ研究会(Web) 57th 2020

　More details

J-GLOBAL

researchmap
HARK middleware: Middleware for open-sourced robot audition software HARK

木下智義, 中臺一博

人工知能学会AIチャレンジ研究会(Web) 57th 2020

　More details

J-GLOBAL

researchmap
Inter-channel Synchronization Based on Relaxation of Rank Minimization of Warped Spectrum

糸山克寿, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 38th 2020

　More details

J-GLOBAL

researchmap
Model Compression of Non-uniform Neural Networks with Non-Linear Functions Based on Node Pruning

中臺一博, 福本陽典, 武田龍

人工知能学会AIチャレンジ研究会(Web) 57th 2020

　More details

J-GLOBAL

researchmap
A Fourier series based Data compression model for Acoustic transfer function.

Yoshiaki Asahara, Kohich Matsuda, Hirofumi Nakajima, Kazuhiro Nakadai

2020 IEEE/SICE International Symposium on System Integration(SII) 664 - 668 2020

　More details

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII46433.2020.9026238

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/sii/sii2020.html#AsaharaMNN20
Learning Three-dimensional Skeleton Data from Sign Language Video.

Heike Brock, Felix Law, Kazuhiro Nakadai, Yuji Nagashima

ACM Transactions on Intelligent Systems and Technology 11 ( 3 ) 30 - 24 2020

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.1145/3377552

researchmap
Fine-scale observations of spatio-spectro-temporal dynamics of bird vocalizations using robot audition techniques

Shinji Sumitani, Reiji Suzuki, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

Remote Sensing in Ecology and Conservation 2020

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.1002/rse2.152

Scopus

researchmap
Detection of Ball Spin Direction using Hitting Sound in Tennis

Naoki Yamamoto, Kenji Nishida, Katsutoshi Itoyama, Kazuhiro Nakadai

Proceedings of the 8th International Conference on Sport Sciences Research and Technology Support 57th 30 - 37 2020

　More details

Publishing type：Research paper (international conference proceedings) Publisher：SCITEPRESS - Science and Technology Publications

DOI： 10.5220/0010107600300037

J-GLOBAL

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/icsports/icsports2020.html#YamamotoNIN20
Calibration of a Microphone Array Based on a Probabilistic Model of Microphone Positions

Katsuhiro Dan, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices 614 - 625 2020

　More details

Publishing type：Part of collection (book) Publisher：Springer International Publishing

DOI： 10.1007/978-3-030-55789-8_53

researchmap
Reactive Chameleon: A Method to Mimic Conversation Partner's Body Sway for a Robot.

Ryosuke Hasumoto, Kazuhiro Nakadai, Michita Imai

International Journal of Social Robotics 12 ( 1 ) 239 - 258 2020

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.1007/s12369-019-00557-4

researchmap
鳴き声で追う夜行性鳥類：ロボット聴覚技術の応用実例 Invited Reviewed

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

景観生態学 24 ( 1・2 ) 104 - 105 2019.12

Language：Japanese

Report on receiving the poster award

researchmap
Environmental sound segmentation utilizing mask U-Net Reviewed

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019) 5340 - 5345 2019.11

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IROS40897.2019.8967954

researchmap
Improvement of DOA estimation by using quaternion output in sound event localization and detection Reviewed

Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Proceedings of the 2019 DCASE Workshop 244 - 247 2019.10

Language：English Publishing type：Research paper (international conference proceedings)

researchmap

Other Link： https://dblp.uni-trier.de/rec/conf/dcase/2019
AI チャレンジ研究会（Challenge）

光永法明, 植村渉, 鈴木麗璽, 干場功太郎, 中臺一博

人工知能 34 ( 5 ) 635 - 638 2019.9

Language：Japanese Publisher：一般社団法人人工知能学会

DOI： 10.11517/jjsai.34.5_635

CiNii Books

CiNii Research

researchmap

Other Link： https://ndlsearch.ndl.go.jp/books/R000000004-I029969740
Acoustic simulation in dynamic environments for robot audition

Zhaofeng Zhang, Kazuhiro Nakadai, Hirofumi Nakajima, Naoaki Sumida

European Signal Processing Conference 2019- 1 - 5 2019.9

Language：English Publishing type：Research paper (international conference proceedings) Publisher：European Signal Processing Conference, EUSIPCO

DOI： 10.23919/EUSIPCO.2019.8902609

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/eusipco/eusipco2019.html#ZhangNNS19
Special issue on robot and human interactive communication Reviewed

Kazuhiro Nakadai, Emilia Barakova, Michita Imai, Tetsunari Inamura

Advanced Robotics 33 ( 7-8 ) 307 - 308 2019.8

Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2019.1652953

researchmap
von Mises - Bernoulli RBMを用いた音源定位の検討

正木俊伍, 杉山治, 小島諒介, 中臺一博, 糸山克寿, 西田健次

第81回全国大会講演論文集 2019 ( 1 ) 555 - 556 2019.2

Language：Japanese Publisher：情報処理学会

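This paper examines a method that uses phase-difference information directly as input when training a neural network for sound source localization. Commonly used sound source localization methods, such as microphone array signal processing, rely on phase-difference information as an important cue. However, because phase differences are represented by a periodic function, they cannot be handled by Bernoulli-Bernoulli or Gaussian-Bernoulli RBMs (restricted Boltzmann machines), which assume that the input is a 0/1 signal or follows a Gaussian distribution. We therefore propose a sound source localization model that extends the Bernoulli-Bernoulli RBM to a von Mises-Bernoulli RBM so that phase information can be input directly. Preliminary experiments showed that the proposed method can localize sound sources even when phase-difference information is used as input.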

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00196911/
Listen and Tell: 深層学習を用いた音響シーンのキャプション生成

岩月道生, 周藤唯, 糸山克寿, 西田健次, 中臺一博

第81回全国大会講演論文集 2019 ( 1 ) 407 - 408 2019.2

Language：Japanese Publisher：情報処理学会

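This paper examines a method for automatically generating captions for environmental acoustic signals. Caption generation for images is known as show and tell, and many studies apply deep learning to it. Acoustic signals, by contrast, are one-dimensional time series in which each sound event has variable length, so methods designed for images are difficult to apply directly. We therefore propose a listen & tell method that 1) converts the acoustic signal into images by representing it as multiple time-segmented spectrograms and 2) uses an RNN to handle variable-length time-series signals. Based on the proposed method, we built a model that captions the type and timing of sounds and confirmed its effectiveness using synthetic data.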

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00196838/
マルコフ連鎖に基づくマスク付きNMFを用いた特定音源の分離

日下湧太, 糸山克寿, 西田健次, 中臺一博

第81回全国大会講演論文集 2019 ( 1 ) 419 - 420 2019.2

Language：Japanese Publisher：情報処理学会

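This paper proposes a method for separating only a specific instrument sound from a music acoustic signal by non-negative matrix factorization (NMF) with a binary mask based on a Markov chain. In NMF-based source separation, bases and instruments do not necessarily correspond one to one. NMF that pre-trains the bases on training sounds of the instrument has been proposed to address this, but preparing the training sounds takes considerable effort. The proposed method separates a specific instrument sound without training sounds by specifying part of the instrument's onset information and automatically estimating the newly introduced binary mask. Preliminary experiments confirmed that giving onsets as prior information allows only the specific instrument sound to be separated.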

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00196844/
視聴覚統合による三次元構造復元に関する検討

紺野隆志, 西田健次, 糸山克寿, 中臺一博

第81回全国大会講演論文集 2019 ( 1 ) 207 - 208 2019.2

Language：Japanese Publisher：情報処理学会

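This paper proposes a three-dimensional reconstruction algorithm for dynamic environments, based on Structure from Motion (SfM) using sound and images. SfM normally assumes a static environment without dynamic objects, so it cannot reconstruct dynamic objects in three dimensions. Dynamic objects often emit sound through their motion or vibration. To solve this problem, we propose Audio-Visual SfM, which uses sound information obtained by sound source localization. Evaluation experiments show that the proposed method appropriately reconstructs regions containing dynamic objects.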

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00196734/
マイクロホンと音源位置に関する確率モデルに基づくマイクロホンアレイのキャリブレーションの検討

段雄啓, 糸山克寿, 西田健次, 中臺一博

第81回全国大会講演論文集 2019 ( 1 ) 553 - 554 2019.2

Language：Japanese Publisher：情報処理学会

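This paper describes a method for estimating the actual positions of the microphones in a microphone array from observed signals, in order to calibrate the discrepancy between their given and actual positions, which is one cause of degraded performance in microphone array signal processing such as sound source localization and sound source separation. The proposed method defines a probabilistic model of microphone position as the combination of a prior probability defined from the given positions and a likelihood function defined from the observed signals and the actual positions, and estimates the actual microphone positions by maximum a posteriori estimation. In an evaluation of position estimation accuracy by numerical simulation, using multiple sound sources with different arrival angles substantially improved the accuracy.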

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00196910/
複数のマイクロホンアレイを搭載した複数のUAVによる移動音源の三次元追跡

山田泰基, Daniel Gabriel, 糸山克寿, 西田健次, 中臺一博

第81回全国大会講演論文集 2019 ( 1 ) 115 - 116 2019.2

Language：Japanese Publisher：情報処理学会

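This study examines a method for estimating the three-dimensional trajectory of a moving sound source using multiple microphone arrays mounted on multiple moving vehicles. A single microphone array can estimate only the direction of a sound source, and estimating its position is difficult. Using multiple microphone arrays enables position estimation based on triangulation, and using multiple moving vehicles is expected to enable robust estimation of the moving source's trajectory. Candidate points for the source position are computed from the source directions obtained by each microphone array, and the trajectory is estimated by applying an Unscented Kalman Filter while weighting each candidate point. Verification by numerical simulation confirmed that the estimation error of the proposed method was 0.1 m or less.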

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00196078/
Special issue on robot and human interactive communication.

Kazuhiro Nakadai, Emilia I. Barakova, Michita Imai, Tetsunari Inamura

Advanced Robotics 33 ( 15-16 ) 699 - 699 2019

Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2019.1652953

researchmap
2D sound source position estimation using microphone arrays and its application to a VR-based bird song analysis system. Reviewed

Daniel Gabriel, Ryosuke Kojima, Kotaro Hoshiba, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

Advanced Robotics 33 ( 7-8 ) 403 - 414 2019

Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2019.1598491

researchmap
An Integrated Framework for Field Recording, Localization, Classification and Annotation of Birdsongs Using Robot Audition Techniques - Harkbird 2.0. Reviewed

Shinji Sumitani, R. Suzuki, Naoaki Chiba, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi Gitchang Okuno

IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019 8246 - 8250 2019

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ICASSP.2019.8683743

researchmap
Design and assessment of multiple-sound source localization using microphone arrays. Reviewed

Daniel Gabriel, Ryosuke Kojima, Kotaro Hoshiba, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

IEEE/SICE International Symposium on System Integration, SII 2019, Paris, France, January 14-16, 2019 199 - 204 2019

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII.2019.8700368

researchmap
Close Sound Source Localization incorporating Semi-Supervised Variational Bayesian NMF. Reviewed

Makoto Kumon, Kai Washizaki, Kazuhiro Nakadai

IEEE/SICE International Symposium on System Integration, SII 2019, Paris, France, January 14-16, 2019 313 - 318 2019

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/SII.2019.8700459

researchmap
Recent R&D Technologies and Future Prospective of Flying Robot in Tough Robotics Challenge. Reviewed

Kenzo Nonami, Kotaro Hoshiba, Kazuhiro Nakadai, Makoto Kumon, Hiroshi G. Okuno, Yasutada Tanabe, Koichi Yonezawa, Hiroshi Tokutake, Satoshi Suzuki, Kohei Yamaguchi, Shigeru Sunada, Takeshi Takaki, Toshiyuki Nakata, Ryusuke Noda, Hao Liu, Satoshi Tadokoro

Disaster Robotics - Results from the ImPACT Tough Robotics Challenge 128 77 - 142 2019

Publishing type：Part of collection (book) Publisher：Springer

DOI： 10.1007/978-3-030-05321-5_3

Scopus

researchmap
CNN-based Multichannel End-to-End Speech Recognition for Everyday Home Environments.

Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya Ogata

27th European Signal Processing Conference(EUSIPCO) abs/1811.02735 1 - 5 2019

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.23919/EUSIPCO.2019.8902524

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/eusipco/eusipco2019.html#YaltaWHNO19
Weakly-Supervised Deep Recurrent Neural Networks for Basic Dance Step Generation.

Nelson Yalta, Shinji Watanabe, Kazuhiro Nakadai, Tetsuya Ogata

International Joint Conference on Neural Networks(IJCNN) abs/1807.01126 1 - 8 2019

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IJCNN.2019.8851872

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/ijcnn/ijcnn2019.html#YaltaWNO19
The 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2018)

Inamura Tetsunari, Nakadai Kazuhiro

Journal of the Robotics Society of Japan 37 ( 1 ) 69 - 69 2019

Language：Japanese Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.37.69

CiNii Research

researchmap
音で知るフクロウの営巣活動と巣立ち:定位技術を活用した鳥類観測実例

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2019 2019

J-GLOBAL

researchmap
鳥類集団の音声コミュニケーション理解のための半野外音源定位環境の構築と予備的調査

炭谷晋司, 鈴木麗璽, 和多和宏, 有田隆也, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2019 2019

J-GLOBAL

researchmap
振動センサと重回帰分析を用いた歩行速度の推定に関する検討

椿順, 浅野太, 中臺一博

電子情報通信学会大会講演論文集(CD-ROM) 2019 2019

J-GLOBAL

researchmap
地上音源の位置推定を行うドローン聴覚システムのための分散処理環境の開発

公文誠, 中臺一博, 干場功太郎, 奥乃博, 加川敏規, 三浦龍

人工知能学会AIチャレンジ研究会 52nd 2018.12

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

J-GLOBAL

researchmap
単一の振動センサを用いた歩行方向推定

尾崎翔, 浅野太, 中臺一博

電子情報通信学会論文誌 A(Web) J101-A ( 6 ) 137‐149 (WEB ONLY) 2018.6

Language：Japanese

J-GLOBAL

researchmap
Evaluation of 2D bird localization algorithm using microphone arrays

2018 ( 1 ) 381 - 382 2018.3

Language：English

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00188613/
可聴音を用いた周波数自動選択に基づく距離推定法の検討

高尾麻衣子, 干場功太郎, 中臺一博

第80回全国大会講演論文集 2018 ( 1 ) 383 - 384 2018.3

Language：Japanese Publisher：情報処理学会

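Environment understanding is a research field that builds technologies for systems and robots to grasp their surroundings, and it can serve various purposes such as autonomous driving and disaster rescue robots. As a first step, this study examines a method for active distance measurement that uses acoustic signals without causing discomfort to people. Existing distance measurement methods based on acoustic signals suffer from insufficient resolution and low noise tolerance for narrowband signals; to solve these problems, this paper proposes a method that automatically sets a weight and a likelihood for each frequency. Distance measurement experiments using real recorded data showed that the proposed method is effective in terms of noise robustness and measurement accuracy compared with the CSP method, the most noise-robust method.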

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00188614/
Quad-directional LSTMを用いた音楽音響信号修復とその評価

谷口亮輔, 干場功太郎, 中臺一博

第80回全国大会講演論文集 2018 ( 1 ) 171 - 172 2018.3

Language：Japanese Publisher：情報処理学会

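This paper proposes a method for restoring music acoustic signals using LSTM (Long Short-Term Memory) and evaluates its restoration performance on actual signal losses. When LSTM is applied as is, the high-frequency range, where information is relatively sparse, is not learned sufficiently and restoration performance degrades. We attempted to solve this problem by applying a frequency filter that emphasizes the high-frequency range of the input signal. As an extension, we also proposed using QLSTM (Quad-directional LSTM), which can take into account sequence information along the frequency direction as well as the time direction. Applying these methods to actual losses and evaluating them confirmed that the proposed method restores signals in finer detail than a standard LSTM.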

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00188512/
合同研究会2017 開催報告

小林一郎, 加藤恒昭, 上田康晴, 中臺一博

人工知能 33 ( 2 ) 223 - 230 2018.3

Language：Japanese Publisher：一般社団法人人工知能学会

DOI： 10.11517/jjsai.33.2_223

CiNii Books

CiNii Research

researchmap

Other Link： https://ndlsearch.ndl.go.jp/books/R000000004-I028890972
A spatial-cue-based probabilistic model for bird song scene analysis Reviewed

Ryosuke Kojima, Reiji Suzuki, Osamu Sugiyama, Kotaro Hoshiba, Kazuhiro Nakadai

Proceedings - 2017 International Conference on Data Science and Advanced Analytics, DSAA 2017 2018- 395 - 404 2018.1

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical and Electronics Engineers Inc.

DOI： 10.1109/DSAA.2017.34

Scopus

researchmap
A spatiotemporal analysis of acoustic interactions between great reed warblers (Acrocephalus arundinaceus) using microphone arrays and robot audition software HARK Reviewed

Reiji Suzuki, Shiho Matsubayashi, Fumiyuki Saito, Tatsuyoshi Murate, Tomohisa Masuda, Koichi Yamamoto, Ryosuke Kojima, Kazuhiro Nakadai, Hiroshi G. Okuno

Ecology and Evolution 8 ( 1 ) 812 - 825 2018.1

Language：English Publishing type：Research paper (scientific journal) Publisher：John Wiley and Sons Ltd

DOI： 10.1002/ece3.3645

Scopus

PubMed

researchmap
特集「2016 年度研究会優秀賞受賞論文紹介」にあたって

中臺一博, 小林一郎

人工知能 33 ( 1 ) 55 - 56 2018.1

Language：Japanese Publisher：一般社団法人人工知能学会

DOI： 10.11517/jjsai.33.1_55

CiNii Research

researchmap
Extracting the Relationship between the Spatial Distribution and Types of Bird Vocalizations Using Robot Audition System HARK. Reviewed

Shinji Sumitani, Reiji Suzuki, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018, Madrid, Spain, October 1-5, 2018 2485 - 2490 2018

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IROS.2018.8594130

researchmap
HARK-Bird-Box: A Portable Real-time Bird Song Scene Analysis System. Reviewed

Ryosuke Kojima, Osamu Sugiyama, Kotaro Hoshiba, Reiji Suzuki, Kazuhiro Nakadai

2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018, Madrid, Spain, October 1-5, 2018 2497 - 2502 2018

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IROS.2018.8594070

researchmap
Multi-timescale Feature-extraction Architecture of Deep Neural Networks for Acoustic Model Training from Raw Speech Signal. Reviewed

Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani

2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018, Madrid, Spain, October 1-5, 2018 2503 - 2510 2018

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IROS.2018.8593925

researchmap
Data-driven development of Virtual Sign Language Communication Agents. Reviewed

Agathe Balayn, Heike Brock, Kazuhiro Nakadai

27th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2018, Nanjing, China, August 27-31, 2018 370 - 377 2018

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ROMAN.2018.8525717

researchmap
To animate or anime-te?: Investigating sign avatar comprehensibility. Reviewed

Heike Brock, Shigeaki Nishina, Kazuhiro Nakadai

Proceedings of the 18th International Conference on Intelligent Virtual Agents, IVA 2018, Sydney, NSW, Australia, November 05-08, 2018 331 - 332 2018

Publishing type：Research paper (international conference proceedings) Publisher：ACM

DOI： 10.1145/3267851.3267864

researchmap
Assessment of MUSIC-Based Noise-Robust Sound Source Localization with Active Frequency Range Filtering. Reviewed

Kotaro Hoshiba, Kazuhiro Nakadai, Makoto Kumon, Hiroshi G. Okuno

Journal of Robotics and Mechatronics 30 ( 3 ) 426 - 435 2018

Publishing type：Research paper (scientific journal)

DOI： 10.20965/jrm.2018.p0426

researchmap
Weakly Supervised Deep Recurrent Neural Networks for Basic Dance Step Generation. Reviewed

Nelson Yalta, Shinji Watanabe, Kazuhiro Nakadai, Tetsuya Ogata

CoRR abs/1807.01126 2018

researchmap
Signal Restoration based on Bi-directional LSTM with Spectral Filtering for Robot Audition. Reviewed

Ryosuke Taniguchi, Kotaro Hoshiba, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

27th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2018, Nanjing, China, August 27-31, 2018 955 - 960 2018

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ROMAN.2018.8525793

researchmap
CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments. Reviewed

Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya Ogata

CoRR abs/1811.02735 2018

researchmap
Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms. Reviewed

Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Tatsuya Kawahara, Hiroshi G. Okuno

IEEE ACM Trans. Audio Speech Lang. Process. 26 ( 2 ) 215 - 230 2018

Publishing type：Research paper (scientific journal)

DOI： 10.1109/TASLP.2017.2772340

researchmap
Deep JSLC: A Multimodal Corpus Collection for Data-driven Generation of Japanese Sign Language Expressions. Reviewed

Heike Brock, Kazuhiro Nakadai

Proceedings of the Eleventh International Conference on Language Resources and Evaluation, LREC 2018, Miyazaki, Japan, May 7-12, 2018. 2018

Publishing type：Research paper (international conference proceedings) Publisher：European Language Resources Association (ELRA)

researchmap
The 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2017)

Nakadai Kazuhiro, Shibata Tomohiro

Journal of the Robotics Society of Japan 36 ( 2 ) 145 - 145 2018

Language：Japanese Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.36.145

CiNii Research

researchmap
Synchronization of multiple A/D converters based on spectral stretch

ITOYAMA Katsutoshi, NAKADAI Kazuhiro

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2018 2P1-K05 2018

Language：Japanese Publisher：The Japan Society of Mechanical Engineers

We present a new method for synchronizing multiple independent A/D converters for microphone array processing. Since the spectrum obtained in each channel becomes a stretched version of the source spectrum, we construct a probabilistic generative model of spectrum stretch. Synchronization is realized by solving the inverse problem of the probabilistic generative model.

DOI： 10.1299/jsmermd.2018.2p1-k05

CiNii Research

researchmap
ロボットが聴く夜の鳥

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

第52回人工知能学会 AIチャレンジ研究会予稿集 52(4) 15 - 20 2018

Publishing type：Research paper (scientific journal)

CiNii Research

researchmap
The 30th IEEE/RSJ International Conference on Intelligent Systems and Robots (IROS 2017)

Nakadai Kazuhiro, Shibata Tomohiro

Journal of the Robotics Society of Japan 36 ( 1 ) 53 - 55 2018

Language：Japanese Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.36.53

CiNii Research

researchmap
Development of microphone-array-embedded UAV for search and rescue task Reviewed

Kazuhiro Nakadai, Makoto Kumon, Hiroshi G. Okuno, Kotaro Hoshiba, Mizuho Wakabayashi, Kai Washizaki, Takahiro Ishiki, Daniel Gabriel, Yoshiaki Bando, Takayuki Morito, Ryosuke Kojima, Osamu Sugiyama

IEEE International Conference on Intelligent Robots and Systems 2017- 5985 - 5990 2017.12

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical and Electronics Engineers Inc.

DOI： 10.1109/IROS.2017.8206494

Scopus

researchmap
Design of UAV-Embedded Microphone Array System for Sound Source Localization in Outdoor Environments Reviewed

Kotaro Hoshiba, Kai Washizaki, Mizuho Wakabayashi, Takahiro Ishiki, Makoto Kumon, Yoshiaki Bando, Daniel Gabriel, Kazuhiro Nakadai, Hiroshi G. Okuno

Sensors 17 ( 11 ) 2535 2017.11

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.3390/s17112535

Web of Science

PubMed

researchmap
Acoustic model training based on node-wise weight boundary model for fast and small-footprint deep neural networks Reviewed

Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani

Computer Speech and Language 46 461 - 480 2017.11

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.csl.2017.02.002

Web of Science

researchmap
Iterative Outlier Removal Method Using In-Cluster Variance Changes in Multi-Microphone Array Sound Source Localization.

2017 ( 1 ) 229 - 230 2017.3

Language：English

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00180795/
LSTMによる音楽音響信号の修復法の提案-周波数フィルタ導入による学習データ量削減の検討-

谷口亮輔, 小島諒介, 干場功太郎, 中臺一博

第79回全国大会講演論文集 2017 ( 1 ) 133 - 134 2017.3

Language：Japanese Publisher：情報処理学会

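This paper reports on music acoustic signal restoration using LSTM, a deep learning method. Deep learning generally requires a large amount of data to train a high-performance model. When deep learning is applied to music acoustic signal restoration with little training data, restoration performance degrades in the high-frequency range, where information is relatively sparse. To solve this problem, we propose weighting the input signal along the frequency direction during training by applying a frequency filter. A preliminary study confirmed that the proposed method is effective even with a small amount of training data.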

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00180751/
Bird song scene analysis using a spatial-cue-based probabilistic model Reviewed

Ryosuke Kojima, Osamu Sugiyama, Kotaro Hoshiba, Kazuhiro Nakadai, Reiji Suzuki, Charles E. Taylor

Journal of Robotics and Mechatronics 29 ( 1 ) 236 - 246 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0236

Scopus

researchmap
Special issue on robot audition technologies Reviewed

Hiroshi G. Okuno, Kazuhiro Nakadai

Journal of Robotics and Mechatronics 29 ( 1 ) 15 - 15 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0015

Scopus

researchmap
Sound source localization using deep learning models Reviewed

Nelson Yalta, Kazuhiro Nakadai, Tetsuya Ogata

Journal of Robotics and Mechatronics 29 ( 1 ) 37 - 48 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0037

Scopus

researchmap
Psychologically-inspired audio-visual speech recognition using coarse speech recognition and missing feature theory Reviewed

Kazuhiro Nakadai, Tomoaki Koiwa

Journal of Robotics and Mechatronics 29 ( 1 ) 105 - 113 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0105

Scopus

researchmap
Ego-noise suppression for robots based on semi-blind infinite non-negative matrix factorization Reviewed

Kazuhiro Nakadai, Taiki Tezuka, Takami Yoshida

Journal of Robotics and Mechatronics 29 ( 1 ) 114 - 124 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0114

Scopus

researchmap
Design and assessment of sound source localization system with a UAV-embedded microphone array Reviewed

Kotaro Hoshiba, Osamu Sugiyama, Akihide Nagamine, Ryosuke Kojima, Makoto Kumon, Kazuhiro Nakadai

Journal of Robotics and Mechatronics 29 ( 1 ) 154 - 167 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0154

Scopus

researchmap
Outdoor sound source detection using a quadcopter with microphone array Reviewed

Takuma Ohata, Keisuke Nakamura, Akihide Nagamine, Takeshi Mizumoto, Takayuki Ishizaki, Ryosuke Kojima, Osamu Sugiyama, Kazuhiro Nakadai

Journal of Robotics and Mechatronics 29 ( 1 ) 177 - 187 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0177

Scopus

researchmap
Outdoor acoustic event identification with DNN using a quadrotor-embedded microphone array Reviewed

Osamu Sugiyama, Satoshi Uemura, Akihide Nagamine, Ryosuke Kojima, Keisuke Nakamura, Kazuhiro Nakadai

Journal of Robotics and Mechatronics 29 ( 1 ) 188 - 197 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0188

Scopus

researchmap
Harkbird: Exploring acoustic interactions in bird communities using a microphone array Reviewed

Reiji Suzuki, Shiho Matsubayashi, Richard W. Hedley, Kazuhiro Nakadai, Hiroshi G. Okuno

Journal of Robotics and Mechatronics 29 ( 1 ) 213 - 223 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0213

Scopus

researchmap
Acoustic monitoring of the great reed warbler using multiple microphone arrays and robot audition Reviewed

Shiho Matsubayashi, Reiji Suzuki, Fumiyuki Saito, Tatsuyoshi Murate, Tomohisa Masuda, Koichi Yamamoto, Ryosuke Kojima, Kazuhiro Nakadai, Hiroshi G. Okuno

Journal of Robotics and Mechatronics 29 ( 1 ) 224 - 235 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0224

Scopus

researchmap
Development, deployment and applications of robot audition open source software HARK Reviewed

Kazuhiro Nakadai, Hiroshi G. Okuno, Takeshi Mizumoto

Journal of Robotics and Mechatronics 29 ( 1 ) 16 - 25 2017.2

Language：English Publishing type：Research paper (scientific journal) Publisher：Fuji Technology Press

DOI： 10.20965/jrm.2017.p0016

Scopus

researchmap
HARKBird: Exploring acoustic interactions in bird communities using a microphone array Reviewed

Journal of Robotics and Mechatronics 27 ( 1 ) 224 - 235 2017.2

Language：English Publishing type：Research paper (scientific journal)

researchmap
Node Pruning Based on Entropy of Weights and Node Activity for Small-Footprint Acoustic Model Based on Deep Neural Networks. Reviewed

Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani

Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017 1636 - 1640 2017

Publishing type：Research paper (international conference proceedings) Publisher：ISCA

DOI： 10.1016/j.csl.2017.02.002

researchmap
Swarm of micro-quadrocopters for consensus-based sound source localization Reviewed

L. Sinapayen, K. Nakamura, K. Nakadai, H. Takahashi, T. Kinoshita

Advanced Robotics 31 ( 12 ) 624 - 633 2017

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2017.1310632

Web of Science

researchmap
Variational Bayesian multi-channel robust NMF for human-voice enhancement with a deformable and partially-occluded microphone array Reviewed

Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno

European Signal Processing Conference 2016- 1018 - 1022 2016.11

Language：English Publishing type：Research paper (international conference proceedings) Publisher：European Signal Processing Conference, EUSIPCO

DOI： 10.1109/EUSIPCO.2016.7760402

Scopus

researchmap
Online simultaneous localization and mapping of multiple sound sources and asynchronous microphone arrays Reviewed

Kouhei Sekiguchi, Yoshiaki Bando, Keisuke Nakamura, Kazuhiro Nakadai, Katsutoshi Itoyama, Kazuyoshi Yoshii

Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2016) 1973 - 1979 2016.10

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IROS.2016.7759311

researchmap
会話内非言語音声情報抽出のための音響特徴量の検討

柴田健作, 中村圭佑, 中臺一博

第78回全国大会講演論文集 2016 ( 1 ) 539 - 540 2016.3

Language：Japanese Publisher：情報処理学会

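In speech processing, speech recognition and natural language processing are studied as ways to make computers understand human conversation. These studies assume that the input consists only of linguistic sounds, which makes it difficult to understand natural conversation containing non-linguistic sounds such as laughter and throat clearing. This study therefore examines the extraction of non-linguistic speech information from conversational speech. Because non-linguistic sounds do not necessarily have the harmonic structure that is prominent in speech signals, speech features such as MFCC have difficulty representing them flexibly. This paper accordingly examines acoustic features, feature extraction methods, and learning methods for extracting non-linguistic speech information.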

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00162755/
Construction of Japanese Audio-Visual Emotion Database and Its Application in Emotion Recognition.

Nurul Lubis, Randy Gomez, Sakriani Sakti, Keisuke Nakamura, Koichiro Yoshino, Satoshi Nakamura, Kazuhiro Nakadai

Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016(LREC) 2016

Publishing type：Research paper (international conference proceedings) Publisher：European Language Resources Association (ELRA)

researchmap

Other Link： https://dblp.uni-trier.de/conf/lrec/2016
Wind-induced noise reduction using a small two-channel microphone array

Sakata Naoto, Murakami Tetsuro, Nakajima Hirofumi, Nakadai Kazuhiro

The Journal of the Acoustical Society of Japan 72 ( 12 ) 739 - 748 2016

Language：Japanese Publisher：一般社団法人日本音響学会

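Wind noise is generally non-stationary, and processing based on correlation at the waveform level has rarely been attempted. In this paper, we recorded wind noise that is correlated between channels using a microphone with two closely spaced channels, and conducted two experiments: correlation analysis and wind noise reduction. Analyzing the correlation of the amplitude, power, and complex signals with the coherence function showed a correlation of 0.3 to 0.8 at 125 Hz and below for every quantity. Exploiting this correlation, we reduced wind noise with two types of linear beamformers and observed a power reduction of about 3 to 10 dB at 125 Hz and below. We also compared the kurtosis ratio of the conventional method (power spectral subtraction) and the proposed method, and confirmed that the proposed method is superior to the conventional method in sound quality.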

DOI： 10.20697/jasj.72.12_739

CiNii Books

researchmap
Localizing Bird Songs Using an Open Source Robot Audition System with a Microphone Array. Reviewed

Reiji Suzuki, Shiho Matsubayashi, Kazuhiro Nakadai, Hiroshi G. Okuno

Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016 2626 - 2630 2016

DOI： 10.21437/Interspeech.2016-782

Web of Science

researchmap
Variational Bayesian multi-channel robust NMF for human-voice enhancement with a deformable and partially-occluded microphone array. Reviewed

Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno

24th European Signal Processing Conference, EUSIPCO 2016, Budapest, Hungary, August 29 - September 2, 2016 1018 - 1022 2016

DOI： 10.1109/EUSIPCO.2016.7760402

Web of Science

researchmap
Multimodal Scene Understanding Framework and Its Application to Cooking Recognition. Reviewed

Ryosuke Kojima, Osamu Sugiyama, Kazuhiro Nakadai

Applied Artificial Intelligence 30 ( 3 ) 181 - 200 2016

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/08839514.2016.1156461

researchmap
Partially Shared Deep Neural Network in sound source separation and identification using a UAV-embedded microphone array. Reviewed

Takayuki Morito, Osamu Sugiyama, Ryosuke Kojima, Kazuhiro Nakadai

2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016, Daejeon, South Korea, October 9-14, 2016 1299 - 1304 2016

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2016.7759215

researchmap
Reduction of Computational Cost Using Two-Stage Deep Neural Network for Training for Denoising and Sound Source Identification. Reviewed

Takayuki Morito, Osamu Sugiyama, Satoshi Uemura, Ryosuke Kojima, Kazuhiro Nakadai

Trends in Applied Knowledge-Based Systems and Data Science - 29th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2016, Morioka, Japan, August 2-4, 2016, Proceedings 562 - 573 2016

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-319-42007-3_49

researchmap
Leveraging Phantom Signals for Improved Voice-based Human-Robot Interaction Reviewed

Randy Gomez, Yurii Vasylkiv, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai

2016 25TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN) 30 - 35 2016

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ROMAN.2016.7745087

Web of Science

researchmap
Robust Sound Source Mapping using Three-layered Selective Audio Rays for Mobile Robots Reviewed

Daobilige Su, Keisuke Nakamura, Kazuhiro Nakadai, Jaime Valls Miro

2016 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2016) 2771 - 2777 2016

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Partially Shared Deep Neural Network for Sound Source Identification

MORITO Takayuki, SUGIYAMA Osamu, KOJIMA Ryosuke, NAKADAI Kazuhiro

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2016 1A1-09b4 2016

Language：Japanese Publisher：The Japan Society of Mechanical Engineers

This paper addresses a Deep Neural Network (DNN) for Sound Source Identification (SSI) of acoustic signals recorded with a microphone array embedded in an Unmanned Aerial Vehicle (UAV), with the aim of detecting people's voices quickly and over a wide area in a disaster situation. It is well known that training an SSI-DNN requires a huge dataset to achieve good performance, but building such a dataset is often unrealistic owing to the cost of human annotation. We therefore propose Partially Shared Deep Neural Network (PS-DNN) training, which uses noise-suppressed acoustic signals that can be obtained automatically, in addition to human-annotated label data. This results in more accurate SSI when training data are scarce.

DOI： 10.1299/jsmermd.2016.1a1-09b4

CiNii Research

researchmap
A method for constructing a free text database for read speech that takes phoneme balance into account

松永寛之, 橋本直矢, 佐々木一磨, 中臺一博, 尾形哲也

人工知能学会全国大会論文集 JSAI2016 1E52 - 1E52 2016

Language：Japanese Publisher：一般社団法人人工知能学会

DOI： 10.11517/pjsai.jsai2016.0_1e52

CiNii Research

researchmap
Simultaneous Optimization of Acoustic Event Detection and Identification with a UAV-embedded Microphone Array

SUGIYAMA Osamu, UEMURA Satoshi, KOJIMA Ryosuke, NAKADAI Kazuhiro

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2016 1A1-09b6 2016

Language：Japanese Publisher：The Japan Society of Mechanical Engineers

This paper addresses acoustic event detection and identification. Both functions should be integrated into a UAV system for applications such as human detection in a disaster situation, so neither should be considered in isolation. We thus propose simultaneous optimization of sound source detection and identification. Preliminary results showed that the proposed optimization can improve total system performance compared with the case in which only one of these functions is optimized.

DOI： 10.1299/jsmermd.2016.1a1-09b6

CiNii Research

researchmap
Human-voice enhancement based on online RPCA for a hose-shaped rescue robot with a microphone array Reviewed

Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno

2015 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR) 1 - 6 2015.10

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ssrr.2015.7442949

Web of Science

researchmap
Microphone-accelerometer based 3D posture estimation for a hose-shaped rescue robot. Reviewed

Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno

2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2015, Hamburg, Germany, September 28 - October 2, 2015 2015-December 5580 - 5586 2015.9

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2015.7354168

Web of Science

researchmap
Audio-visual speech recognition using deep learning Reviewed

Kuniaki Noda, Yuki Yamaguchi, Kazuhiro Nakadai, Hiroshi G. Okuno, Tetsuya Ogata

Appl. Intell. 42 ( 4 ) 722 - 737 2015.6

Publishing type：Research paper (scientific journal)

DOI： 10.1007/s10489-014-0629-7

Web of Science

Scopus

researchmap
Robot audition: Its rise and perspectives Reviewed

Hiroshi G. Okuno, Kazuhiro Nakadai

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015.4

Publisher：Institute of Electrical and Electronics Engineers (IEEE)

DOI： 10.1109/icassp.2015.7179045

researchmap
Proposal and evaluation of a noise suppression and blind source separation method using a Deep Neural Network

橋本直矢, 野田邦昭, 中臺一博, 尾形哲也

第77回全国大会講演論文集 2015 ( 1 ) 115 - 116 2015.3

Language：Japanese Publisher：情報処理学会

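Sound source separation has conventionally relied on methods such as independent component analysis, but because the separation filter is then a linear mapping, its performance is limited. This study proposes using a Deep Neural Network (DNN), which can approximate arbitrary nonlinear mappings, as the model for both the separation filter and noise suppression. In the proposed model, the DNN is trained with multi-channel mel filterbank features of mixed speech signals recorded with a microphone array as input and the acoustic features of the target source as output, thereby modeling the separation filter. Evaluation experiments that varied conditions such as the DNN structure and acoustic features, including the number of hidden layers and the signal-to-noise ratio, confirmed that the proposed method outperforms the conventional method in most cases.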

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00164218/
Posture estimation of hose-shaped robot by using active microphone array Reviewed

Yoshiaki Bando, Takuma Otsuka, Takeshi Mizumoto, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Hiroshi G. Okuno

Advanced Robotics 29 ( 1 ) 35 - 49 2015.1

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2014.981291

Web of Science

researchmap
Improved sound source localization in horizontal plane for binaural robot audition. Reviewed

Ui-Hyun Kim, Kazuhiro Nakadai, Hiroshi G. Okuno

Appl. Intell. 42 ( 1 ) 63 - 74 2015.1

Publishing type：Research paper (scientific journal)

DOI： 10.1007/s10489-014-0544-y

Web of Science

CiNii Research

researchmap
Outdoor Acoustic Event Identification using Sound Source Separation and Deep Learning with a Quadrotor-Embedded Microphone Array

Uemura Satoshi, Sugiyama Osamu, Kojima Ryosuke, Nakadai Kazuhiro

The Abstracts of the international conference on advanced mechatronics : toward evolutionary fusion of IT and mechatronics : ICAM 2015 329 - 330 2015

Language：English Publisher：The Japan Society of Mechanical Engineers

We present acoustic event identification that integrates sound source separation and deep learning based on a convolutional neural network, applied to extremely noisy acoustic signals captured with a 16-channel microphone array embedded in an Unmanned Aerial Vehicle (UAV). We showed that the proposed method can correctly identify over 98% of sound sources in a 10-class classification task using 16-channel sound data recorded with a microphone array embedded in a quadrotor.

DOI： 10.1299/jsmeicam.2015.6.329

CiNii Books

researchmap
Beat Tracking for Interactive Dancing Robots Reviewed

João Lobato Oliveira, Gökhan Ince, Keisuke Nakamura, Kazuhiro Nakadai, Hiroshi G. Okuno, Fabien Gouyon, Luís Paulo Reis

Int. J. Human. Robot. 12 ( 04 ) 1550023 2015

Publisher：World Scientific Pub Co Pte Lt

DOI： 10.1142/s0219843615500231

researchmap
Audio-visual scene understanding utilizing text information for a cooking support robot Reviewed

Ryosuke Kojima, Osamu Sugiyama, Kazuhiro Nakadai

Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on 2015

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Scene understanding based on sound and text information for a cooking support robot Reviewed

Ryosuke Kojima, Osamu Sugiyama, Kazuhiro Nakadai

Current Approaches in Applied Artificial Intelligence: 28th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2015, Seoul, South Korea 2015

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Multimodal scene understanding using CNN and hierarchical HMM for a cooking support robot

Ryosuke Kojima, Osamu Sugiyama, Kazuhiro Nakadai

Machine Learning Summer School 2015

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Interactive interface to optimize sound source localization based on microphone array with coarse-to-fine tuning for humanoids Reviewed

Osamu Sugiyama, Ryosuke Kojima, Kazuhiro Nakadai

Humanoid Robots (Humanoids), 2015 IEEE-RAS 15th International Conference on 2015

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Interactive Interface to Optimize Sound Source Localization with HARK Reviewed

Osamu Sugiyama, Ryosuke Kojima, Kazuhiro Nakadai

Current Approaches in Applied Artificial Intelligence: 28th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2015, Seoul, South Korea 2015

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Acoustic model training based on node-wise weight boundary model increasing speed of discrete neural networks. Reviewed

Ryu Takeda, Kazunori Komatani, Kazuhiro Nakadai

2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015 52 - 58 2015

Publisher：IEEE

DOI： 10.1109/ASRU.2015.7404773

researchmap
Compensating changes in speaker position for improved voice-based human-robot communication. Reviewed

Randy Gomez, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai

15th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2015, Seoul, South Korea, November 3-5, 2015 977 - 982 2015

Publisher：IEEE

DOI： 10.1109/HUMANOIDS.2015.7363488

researchmap
Temporal smearing compensation in reverberant environment for speech-based human-robot interaction. Reviewed

Randy Gomez, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai

IEEE International Conference on Robotics and Automation, ICRA 2015, Seattle, WA, USA, 26-30 May, 2015 3347 - 3353 2015

Publisher：IEEE

DOI： 10.1109/ICRA.2015.7139661

researchmap
Dereverberation for active human-robot communication robust to speaker's face orientation. Reviewed

Randy Gomez, Levko Ivanchuk, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai

INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015 180 - 184 2015

Publisher：ISCA

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/interspeech/interspeech2015.html#conf/interspeech/GomezINMN15
Utilizing visual cues in robot audition for sound source discrimination in speech-based human-robot communication. Reviewed

Randy Gomez, Levko Ivanchuk, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai

2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2015, Hamburg, Germany, September 28 - October 2, 2015 4216 - 4222 2015

Publisher：IEEE

DOI： 10.1109/IROS.2015.7353974

researchmap
Robot-Audition-based Human-Machine Interface for a Car. Reviewed

Kazuhiro Nakadai, Takeshi Mizumoto, Keisuke Nakamura

2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2015, Hamburg, Germany, September 28 - October 2, 2015 6129 - 6136 2015

Publisher：IEEE

DOI： 10.1109/IROS.2015.7354250

researchmap
Optimized Wavelet-domain Filtering Under Noisy and Reverberant Conditions Reviewed

R. Gomez, T. Kawahara, K. Nakadai

APSIPA Trans. Signal & Information Process. 4 ( e3 ) 1 - 12 2015

Language：English Publishing type：Research paper (scientific journal)

researchmap
Prevention of accomplishing synchronous multi-modal human-robot cooperation by using visual rhythms Reviewed

Kenta Yonekura, Chyon Hae Kim, Kazuhiro Nakadai, Hiroshi Tsujino, Kazuhito Yokoi

ADVANCED ROBOTICS 29 ( 14 ) 901 - 912 2015

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2015.1031280

Web of Science

researchmap
Robot Audition Based Acoustic Event Identification Using a Bayesian Model Considering Spectral and Temporal Uncertainties Reviewed

Keisuke Nakamura, Kazuhiro Nakadai

2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) 4840 - 4845 2015

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Sound Source Separation for Robot Audition using Deep Learning Reviewed

Kuniaki Noda, Naoya Hashimoto, Kazuhiro Nakadai, Tetsuya Ogata

2015 IEEE-RAS 15TH INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS (HUMANOIDS) 389 - 394 2015

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Interactive Sound Source Localization using Robot Audition for Tablet Devices Reviewed

Keisuke Nakamura, Lana Sinapayen, Kazuhiro Nakadai

2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) 6137 - 6142 2015

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
A Case Study of An Automatic Volume Control Interface for A Telepresence System Reviewed

Masaaki Takahashi, Masa Ogata, Michita Imai, Keisuke Nakamura, Kazuhiro Nakadai

2015 24TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN) 517 - 522 2015

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
On-the-spot Calibration of Microphone Array Transfer Functions for Robot Audition Reviewed

Keisuke Nakamura, Surya Ambrose, Kazuhiro Nakadai

2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) 3354 - 3359 2015

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Erratum : A real-time super-resolution robot audition system that improves the robustness of simultaneous speech recognition (Advanced Robotics (2013) 27 (933-945) DOI: 10.1080/01691864.2013.797139) Reviewed

Nakamura K, Nakadai K, Okuno H.G

Advanced Robotics 28 ( 19 ) 1329 2014.10

DOI： 10.1080/01691864.2014.943342

Web of Science

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
A sound-based online method for estimating the time-varying posture of a hose-shaped robot Reviewed

Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno

2014 IEEE International Symposium on Safety, Security, and Rescue Robotics (2014) 1 - 6 2014.10

Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical and Electronics Engineers (IEEE)

DOI： 10.1109/ssrr.2014.7017665

researchmap
Sound Source Orientation Estimation Based on an Orientation-Extended Beamformer Reviewed

Hirofumi Nakajima, Keiko Kikuchi, Kazuhiro Nakadai, Yutaka Kaneda

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E97A ( 9 ) 1875 - 1883 2014.9

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1587/transfun.E97.A.1875

Web of Science

researchmap
Making a robot dance to diverse musical genre in noisy environments. Reviewed

João Lobato Oliveira, Keisuke Nakamura, Thibault Langlois, Fabien Gouyon, Kazuhiro Nakadai, Angelica Lim, Luís Paulo Reis, Hiroshi G. Okuno

2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, September 14-18, 2014 1896 - 1901 2014.9

DOI： 10.1109/IROS.2014.6942812

Web of Science

researchmap
Sound Source Localization with an Autonomous Swarm of Quadrocopters Reviewed

Lana Sinapayen, Keisuke Nakamura, Kazuhiro Nakadai, Hideyuki Takahashi, Tetsuo Kinoshita

Proc. of the workshop on Modular and Swarm Systems — from Nature to Robotics of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2014) 2014.9

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Multi-agent based Sound Source Localization with Multicopters Reviewed

Lana Sinapayen, Keisuke Nakamura, Kazuhiro Nakadai, Hideyuki Takahashi, Tetsuo Kinoshita

Proc. of International Conference on Smart Technologies for Energy, Information and Communication 2014 (IC-STEIC2014) 95 - 102 2014.8

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Online calibration of microphone arrays and its application to robot audition systems

中臺一博

日本音響学会誌 70 397 - 402 2014

Publishing type：Research paper (scientific journal)

CiNii Research

researchmap
Auditory-aware Navigation for Mobile Robots based on Reflection-robust Sound Source Localization and Visual SLAM Reviewed

Gautam Narang, Keisuke Nakamura, Kazuhiro Nakadai

2014 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC) 4021 - 4026 2014

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Sound annotation tool for multidirectional sounds based on spatial information extracted by HARK robot audition software. Reviewed

Osamu Sugiyama, Katsutoshi Itoyama, Kazuhiro Nakadai, Hiroshi G. Okuno

2014 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2014, San Diego, CA, USA, October 5-8, 2014 2014-January ( January ) 2335 - 2340 2014

DOI： 10.1109/smc.2014.6974275

Web of Science

Scopus

researchmap
Lipreading using convolutional neural network. Reviewed

Kuniaki Noda, Yuki Yamaguchi, Kazuhiro Nakadai, Hiroshi G. Okuno, Tetsuya Ogata

INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014 1 1149 - 1153 2014

Publishing type：Research paper (international conference proceedings)

Web of Science

Scopus

CiNii Research

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/interspeech/interspeech2014.html#conf/interspeech/NodaYNOO14
IMPROVED HANDS-FREE AUTOMATIC SPEECH RECOGNITION IN REVERBERANT ENVIRONMENT CONDITION Reviewed

Randy Gomez, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai

2014 4TH JOINT WORKSHOP ON HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS (HSCMA) 67 - 71 2014

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/HSCMA.2014.6843253

Web of Science

researchmap
Speech-based Human-Robot Interaction Robust to Acoustic Reflections in Real Environment Reviewed

Randy Gomez, Koji Inoue, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai

2014 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2014) 1367 - 1373 2014

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2014.6942735

Web of Science

researchmap
Improvement in Outdoor Sound Source Detection Using a Quadrotor-Embedded Microphone Array Reviewed

Takuma Ohata, Keisuke Nakamura, Takeshi Mizumoto, Tezuka Taiki, Kazuhiro Nakadai

2014 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2014) 1902 - 1907 2014

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2014.6942813

Web of Science

researchmap
Ego-motion Noise Suppression for Robots Based on Semi-Blind Infinite Non-negative Matrix Factorization Reviewed

Taiki Tezuka, Takami Yoshida, Kazuhiro Nakadai

2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) 1 6293 - 6298 2014

Language：English Publishing type：Research paper (scientific journal)

Web of Science

CiNii Research

researchmap
Noise correlation matrix estimation for improving sound source localization by multirotor UAV. Reviewed

Koutarou Furukawa, Keita Okutani, Kohei Nagira, Takuma Otsuka, Katsutoshi Itoyama, Kazuhiro Nakadai, Hiroshi G. Okuno

2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, November 3-7, 2013 3943 - 3948 2013.11

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2013.6696920

Web of Science

researchmap
Posture estimation of hose-shaped robot using microphone array localization. Reviewed

Yoshiaki Bando, Takeshi Mizumoto, Katsutoshi Itoyama, Kazuhiro Nakadai, Hiroshi G. Okuno

2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, November 3-7, 2013 3446 - 3451 2013.11

DOI： 10.1109/IROS.2013.6696847

Web of Science

researchmap
A real-time super-resolution robot audition system that improves the robustness of simultaneous speech recognition Reviewed

Nakamura K, Nakadai K, Okuno H.G

Advanced Robotics 27 ( 12 ) 933 - 945 2013.8

DOI： 10.1080/01691864.2013.797139

Web of Science

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Hands-free human-robot communication robust to speaker's radial position. Reviewed

Randy Gomez, Keisuke Nakamura, Kazuhiro Nakadai, Ui-Hyun Kim,Array, Tatsuya Kawahara

2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, May 6-10, 2013 4329 - 4334 2013.5

DOI： 10.1109/ICRA.2013.6631190

Web of Science

researchmap
Merging Viewpoints of User and Avatar in Telecommunication Using Image and Sound Projector

石井健太郎, 谷口祐司, 大澤博隆, 中臺一博, 今井倫太

情報処理学会論文誌 54 ( 4 ) 1413 - 1421 2013.4

Language：Japanese

This paper discusses findings on the viewpoint of an avatar-controlling user, based on experiments with an implemented telecommunication system named PROT AVATAR. Communication using an avatar with facial expressions is useful when a user wants to express emotions. In addition, our system moves the avatar toward the nearest visible location to the target, a location that is not obvious to the avatar controller. With our system, the avatar controller can easily refer to something remotely. However, the words of an avatar controller may sometimes not be intuitive for an avatar viewer, because the avatar controller does not necessarily share the viewpoint of the avatar. We designed automatic and semi-automatic methods for controlling the avatar and conducted an experiment to compare the two. The results showed that semi-automatic control elicited more intuitive utterances for an avatar viewer than fully automatic control, and they have design implications for telecommunication systems.

CiNii Books

researchmap
Development of a sound source localization system for assisting group conversation Reviewed

Mihoko Otake, Myagmarbayar Nergui, Seong-Eun Moon, Kentaro Takagi, Tsutomu Kamashima, Kazuhiro Nakadai

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8102 ( 1 ) 532 - 539 2013

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-642-40852-6-54

Scopus

researchmap
Footstep detection and classification using distributed microphones Reviewed

Kazuhiro Nakadai, Yuta Fujii, Shigeki Sugano

International Workshop on Image Analysis for Multimedia Interactive Services 2013

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/WIAMIS.2013.6616127

Scopus

researchmap
Improved Sound Source Localization and Front-Back Disambiguation for Humanoid Robots with Two Ears. Reviewed

Ui-Hyun Kim, Kazuhiro Nakadai, Hiroshi G. Okuno

Recent Trends in Applied Artificial Intelligence, 26th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2013, Amsterdam, The Netherlands, June 17-21, 2013. Proceedings 7906 LNAI 282 - 291 2013

Publisher：Springer

DOI： 10.1007/978-3-642-38577-3_29

researchmap
Mitigating the effects of reverberation for effective human-robot interaction in the real world. Reviewed

Randy Gomez, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai

13th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2013, Atlanta, GA, USA, October 15-17, 2013 177 - 182 2013

Publisher：IEEE

DOI： 10.1109/HUMANOIDS.2013.7029973

researchmap
Real-time Super-resolution Three-dimensional Sound Source Localization for Robots Reviewed

Keisuke Nakamura, Randy Gomez, Kazuhiro Nakadai

2013 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) 3949 - 3954 2013

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Dereverberation Robust to Speaker's Azimuthal Orientation in Multi-channel Human-Robot Communication Reviewed

Randy Gomez, Keisuke Nakamura, Kazuhiro Nakadai

2013 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) 3439 - 3445 2013

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
ROBUSTNESS TO SPEAKER POSITION IN DISTANT-TALKING AUTOMATIC SPEECH RECOGNITION Reviewed

Randy Gomez, Keisuke Nakamura, Kazuhiro Nakadai

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) 7034 - 7038 2013

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Sound source localization using joint bayesian estimation with a hierarchical noise model Reviewed

Futoshi Asano, Hideki Asoh, Kazuhiro Nakadai

IEEE Transactions on Audio, Speech and Language Processing 21 ( 9 ) 1953 - 1965 2013

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1109/TASL.2013.2263140

Scopus

researchmap
Improvement of audio-visual score following in robot ensemble with human guitarist. Reviewed

Tatsuhiko Itohara, Kazuhiro Nakadai, Tetsuya Ogata, Hiroshi G. Okuno

12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), Osaka, Japan, November 29 - Dec. 1, 2012 574 - 579 2012.11

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/HUMANOIDS.2012.6651577

Web of Science

Scopus

researchmap
Live assessment of beat tracking for robot audition. Reviewed

João Lobato Oliveira, Gökhan Ince, Keisuke Nakamura, Kazuhiro Nakadai, Hiroshi G. Okuno, Luís Paulo Reis, Fabien Gouyon

2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2012, Vilamoura, Algarve, Portugal, October 7-12, 2012 992 - 997 2012.10

DOI： 10.1109/IROS.2012.6386100

Web of Science

researchmap
An active audition framework for auditory-driven HRI: Application to interactive robot dancing Reviewed

Joao Lobato Oliveira, Gokhan Ince, Keisuke Nakamura, Kazuhiro Nakadai, Hiroshi G. Okuno, Luis Paulo Reis, Fabien Gouyon

2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication 2012.9

Publisher：Institute of Electrical and Electronics Engineers (IEEE)

DOI： 10.1109/roman.2012.6343892

researchmap
Intelligent Human Tracking Based on Multimodal Integration

NAKAMURA Keisuke, NAKADAI Kazuhiro, ASANO Futoshi, NAKAJIMA Hirofumi, INCE Gokhan

計測自動制御学会論文集 = Transactions of the Society of Instrument and Control Engineers 48 ( 6 ) 349 - 358 2012.6

Language：Japanese Publisher：計測自動制御学会

DOI： 10.9746/sicetr.48.349

CiNii Books

CiNii Research

researchmap
Efficient Blind Dereverberation and Echo Cancellation Based on Independent Component Analysis for Actual Acoustic Signals Reviewed

Ryu Takeda, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

NEURAL COMPUTATION 24 ( 1 ) 234 - 272 2012.1

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1162/NECO_a_00219

Web of Science

PubMed

researchmap
Robot audition for dynamic environments Reviewed

Kazuhiro Nakadai, Gokhan Ince, Keisuke Nakamura, Hirofumi Nakajima

2012 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2012 125 - 130 2012

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ICSPCC.2012.6335729

Scopus

researchmap
A Role of Multi-modal Rhythms in Physical Interaction and Cooperation

Kenta Yonekura, Chyon Hae Kim, Kazuhiro Nakadai, Hiroshi Tsujino, Shigeki Sugano

EURASIP Journal on Audio, Speech, and Music Processing 2012

Language：English

DOI： 10.1186/1687-4722-2012-12

researchmap
Multi-party Human-Robot Interaction with Distant-Talking Speech Recognition Reviewed

Randy Gomez, Tatsuya Kawahara, Keisuke Nakamura, Kazuhiro Nakadai

HRI'12: PROCEEDINGS OF THE SEVENTH ANNUAL ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION 439 - 446 2012

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Online Audio Beat Tracking for a Dancing Robot in the Presence of Ego-Motion Noise in a Real Environment Reviewed

Joao Lobato Oliveira, Goekhan Ince, Keisuke Nakamura, Kazuhiro Nakadai

2012 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) 403 - 408 2012

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Active Audio-Visual Integration for Voice Activity Detection based on a Causal Bayesian Network Reviewed

Takami Yoshida, Kazuhiro Nakadai

2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids) 370 - 375 2012

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Online Learning for Template-based Multi-channel Ego Noise Estimation Reviewed

Goekhan Ince, Kazuhiro Nakadai, Keisuke Nakamura

2012 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) 3284 - 3289 2012

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Outdoor Auditory Scene Analysis Using a Moving Microphone Array Embedded in a Quadrocopter Reviewed

Keita Okutani, Takami Yoshida, Keisuke Nakamura, Kazuhiro Nakadai

2012 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) 3290 - 3295 2012

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
SLAM-based Online Calibration for Asynchronous Microphone Array Reviewed

Hiroki Miura, Takami Yoshida, Keisuke Nakamura, Kazuhiro Nakadai

ADVANCED ROBOTICS 26 ( 17 ) 1941 - 1965 2012

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2012.728690

Web of Science

researchmap
Audio-Visual Voice Activity Detection Based on an Utterance State Transition Model Reviewed

Takami Yoshida, Kazuhiro Nakadai

ADVANCED ROBOTICS 26 ( 10 ) 1183 - 1201 2012

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1080/01691864.2012.687152

Web of Science

researchmap
Real-time Super-resolution Sound Source Localization for Robots Reviewed

Keisuke Nakamura, Kazuhiro Nakadai, Goekhan Ince

2012 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) 694 - 699 2012

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Sound source localization in spatially colored noise using a hierarchical Bayesian model Reviewed

Futoshi Asano, Hideki Asoh, Kazuhiro Nakadai

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 193 - 196 2012

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ICASSP.2012.6287850

Scopus

researchmap
Ego noise cancellation of a robot using missing feature masks Reviewed

Goekhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura

APPLIED INTELLIGENCE 34 ( 3 ) 360 - 371 2011.6

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1007/s10489-011-0285-0

Web of Science

researchmap
Deployment of HARK, Open Source Software for Robot Audition (Special Feature: Japanese Software Taking on the World)

Kazuhiro Nakadai, Hiroshi G. Okuno

IPSJ Digital Practice 2 ( 2 ) 133 - 140 2011.6

　More details

Language：Japanese Publisher：Information Processing Society of Japan

CiNii Books

researchmap
Design and implementation of selectable sound separation on the Texai telepresence system using HARK. Reviewed

Takeshi Mizumoto, Kazuhiro Nakadai, Takami Yoshida, Ryu Takeda, Takuma Otsuka, Toru Takahashi, Array

IEEE International Conference on Robotics and Automation, ICRA 2011, Shanghai, China, 9-13 May 2011 2130 - 2137 2011.5

　More details

DOI： 10.1109/ICRA.2011.5979849

Web of Science

researchmap
Robot audition: Missing feature theory approach and active audition Reviewed

Okuno H.G, Nakadai K, Kim H.-D

Springer Tracts in Advanced Robotics 70 ( STAR ) 227 - 244 2011

　More details

DOI： 10.1007/978-3-642-19457-3_14

Web of Science

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Real-Time Audio-to-Score Alignment Using Particle Filter for Coplayer Music Robots. Reviewed

Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Array,Array

EURASIP J. Adv. Sig. Proc. 2011 2011

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.1155/2011/384651

Web of Science

Scopus

CiNii Research

researchmap
Robust intonation pattern classification in human robot interaction Reviewed

Martin Heckmann, Kazuhiro Nakadai, Hirofumi Nakajima

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 3144 - + 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Whole Body Motion Noise Cancellation of a Robot for Improved Automatic Speech Recognition Reviewed

Goekhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura

ADVANCED ROBOTICS 25 ( 11-12 ) 1405 - 1426 2011

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1163/016918611X579448

Web of Science

researchmap
SLAM-based Online Calibration of Asynchronous Microphone Array for Robot Audition Reviewed

Hiroaki Miura, Takami Yoshida, Keisuke Nakamura, Kazuhiro Nakadai

2011 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 524 - 529 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Restoration of Clipped Audio Signal Using Recursive Vector Projection Reviewed

Shin Miura, Hirofumi Nakajima, Shigeki Miyabe, Shoji Makino, Takeshi Yamada, Kazuhiro Nakadai

2011 IEEE REGION 10 CONFERENCE TENCON 2011 394 - 397 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Incremental Learning for Ego Noise Estimation of a Robot Reviewed

Goekhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Jun-ichi Imura, Keisuke Nakamura, Hirofumi Nakajima

2011 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 131 - 136 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
HARK based Real-time Single Pane 3D Auditory Scene Visualizer Empowered by Speech Arrow Reviewed

Zheng Gong, Kazuhiro Nakadai, Hirofumi Nakajima, Ichiro Hagiwara

2011 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 530 - 535 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
CORRELATION MATRIX INTERPOLATION IN SOUND SOURCE LOCALIZATION FOR A ROBOT Reviewed

Keisuke Nakamura, Kazuhiro Nakadai, Hirofumi Nakajima, Goekhan Ince

2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING 4324 - 4327 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Bayesian Extension of MUSIC for Sound Source Localization and Tracking. Reviewed

Takuma Otsuka, Kazuhiro Nakadai, Tetsuya Ogata, Hiroshi G. Okuno

INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011 3109 - 3112 2011

　More details

Publishing type：Research paper (international conference proceedings)

Web of Science

Scopus

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/interspeech/interspeech2011.html#conf/interspeech/OtsukaNOO11
Incremental Bayesian Audio-to-Score Alignment with Flexible Harmonic Structure Models. Reviewed

Takuma Otsuka, Kazuhiro Nakadai, Tetsuya Ogata, Hiroshi G. Okuno

Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011, Miami, Florida, USA, October 24-28, 2011 525 - 530 2011

　More details

Publishing type：Research paper (international conference proceedings) Publisher：University of Miami

Scopus

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/ismir/ismir2011.html#conf/ismir/OtsukaNOO11
Assessment of Single-channel Ego Noise Estimation Methods Reviewed

Goekhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Jun-ichi Imura, Keisuke Nakamura, Hirofumi Nakajima

2011 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 106 - 111 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Intelligent Sound Source Localization and Its Application to Multimodal Human Tracking Reviewed

Keisuke Nakamura, Kazuhiro Nakadai, Futoshi Asano, Goekhan Ince

2011 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 143 - 148 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Assessment of General Applicability of Ego Noise Estimation - Applications to Automatic Speech Recognition and Sound Source Localization Reviewed

Goekhan Ince, Keisuke Nakamura, Futoshi Asano, Hirofumi Nakajima, Kazuhiro Nakadai

2011 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Rhythmic Reference of a Human while a Rope Turning Task Reviewed

Kenta Yonekura, Chyon Hae Kim, Kazuhiro Nakadai, Hiroshi Tsujino, Shigeki Sugano

PROCEEDINGS OF THE 6TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTIONS (HRI 2011) 289 - 290 2011

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
A multi-expert model for dialogue and behavior control of conversational robots and agents. Reviewed

Mikio Nakano, Yuji Hasegawa, Kotaro Funakoshi, Johane Takeuchi, Toyotaka Torii, Kazuhiro Nakadai, Naoyuki Kanda, Kazunori Komatani, Array,Array

Knowl.-Based Syst. 24 ( 2 ) 248 - 256 2011

　More details

DOI： 10.1016/j.knosys.2010.08.004

researchmap
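A Study of a Speech Recognition System Using Two-Layered Audio-Visual Information Integration for Robot Audition Reviewed

Kazuhiro Nakadai, Hiroshi G. Okuno

Journal of the Robotics Society of Japan 28 ( 8 ) 56 - 63 2011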

　More details

Publishing type：Research paper (scientific journal)

CiNii Research

researchmap

Other Link： https://kaken.nii.ac.jp/grant/KAKENHI-PUBLICLY-21013030/
Audio-Visual Speech Recognition System for Robots Based on Two-layered Audio-Visual Integration Framework

YOSHIDA Takami, NAKADAI Kazuhiro, OKUNO Hiroshi G

Journal of the Robotics Society of Japan 28 ( 8 ) 970 - 977 2010.10

　More details

Language：Japanese Publisher：The Robotics Society of Japan

Noise-robust Automatic Speech Recognition (ASR) is essential for robots that are expected to communicate with humans in daily environments. In such environments, Voice Activity Detection (VAD) performance becomes poor, and ASR performance deteriorates due to noise and VAD failures. Humans are said to cope with these problems by using visual information such as lip reading to improve speech recognition performance. We therefore propose a two-layered audio-visual (AV) integration framework for VAD and ASR. The framework includes three crucial methods. The first is Audio-Visual Voice Activity Detection (AV-VAD) based on a Bayesian network. The second is a new lip-related visual feature that is robust against visual noise. The third is microphone array processing to improve the Signal-to-Noise Ratio (SNR) of the input signal. We implemented a prototype audio-visual speech recognition system based on the proposed framework using HARK, our robot audition system. Through voice activity detection and speech recognition experiments, we showed the effectiveness of audio-visual integration, microphone array processing, and their combination for VAD and ASR. Preliminary results show that our system improves ASR results by 20 and 9.7 points with and without microphone array processing, respectively, and also improves robustness against several auditory and visual noise conditions.

DOI： 10.7210/jrsj.28.970

CiNii Books

CiNii Research

researchmap
PROT — An embodied agent for intelligible and user-friendly human-robot interaction Reviewed

R Fujimura, K Nakadai, M Imai, R Ohmura

2010 IEEE/RSJ International Conference on Intelligent Robots and Systems 2010.10

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/iros.2010.5649116

researchmap
Wave acoustic numerical simulation for sound source orientation estimation

595 - 598 2010.9

　More details

Language：Japanese Publishing type：Research paper (scientific journal)

researchmap
Sound source orientation estimation using wave acoustic numerical simulation

Proceedings of the Annual Conference of the Robotics Society of Japan 1H2 - 2 2010.9

　More details

Language：Japanese Publishing type：Research paper (scientific journal)

researchmap
Blind Source Separation With Parameter-Free Adaptive Step-Size Method for Robot Audition Reviewed

Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa, Hiroshi Tsujino

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING 18 ( 6 ) 1476 - 1485 2010.8

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1109/TASL.2009.2035219

Web of Science

researchmap
Special Interest Group on AI Challenge (Special Issue: Comprehensive Guide to JSAI SIGs)

NAKADAI Kazuhiro, MITSUNAGA Noriaki

Journal of the Japanese Society for Artificial Intelligence 25 ( 4 ) 545 - 546 2010.7

　More details

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jjsai.25.4_545

CiNii Books

CiNii Research

researchmap
Upper-limit evaluation of robot audition based on ICA-BSS in multi-source, barge-in and highly reverberant conditions Reviewed

Ryu Takeda, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G Okuno

2010 IEEE International Conference on Robotics and Automation 2010.5

　More details

Publisher：Institute of Electrical & Electronics Engineers (IEEE)

DOI： 10.1109/robot.2010.5509891

researchmap
Improvement in listening capability for humanoid robot HRP-2 Reviewed

Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G Okuno

2010 IEEE International Conference on Robotics and Automation 2010.5

　More details

Publisher：Institute of Electrical & Electronics Engineers (IEEE)

DOI： 10.1109/robot.2010.5509830

researchmap
Soft missing-feature mask generation for robot audition. Reviewed

Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata,Array

Paladyn 1 ( 1 ) 37 - 47 2010.1

　More details

Publisher：Walter de Gruyter GmbH

DOI： 10.2478/s13230-010-0005-1

researchmap
Voice-awareness control for a humanoid robot consistent with its body posture and movements. Reviewed

Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata,Array

Paladyn 1 ( 1 ) 80 - 88 2010.1

　More details

Publisher：Walter de Gruyter GmbH

DOI： 10.2478/s13230-010-0009-x

researchmap
Design and Implementation of Robot Audition System 'HARK' — Open Source Software for Listening to Three Simultaneous Speakers Reviewed

Kazuhiro Nakadai, Toru Takahashi, Hiroshi G. Okuno, Hirofumi Nakajima, Yuji Hasegawa, Hiroshi Tsujino

Advanced Robotics 24 ( 5-6 ) 739 - 761 2010.1

　More details

Publishing type：Research paper (scientific journal) Publisher：Informa UK Limited

DOI： 10.1163/016918610x493561

CiNii Research

researchmap
Applying geometric source separation for improved pitch extraction in human-robot interaction

Martin Heckmann, Claudius Gläser, Frank Joublin, Kazuhiro Nakadai

Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 2602 - 2605 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：International Speech Communication Association

Scopus

researchmap
Audio-Visual Speech Recognition System for Robots Based on Two-Layered Audio-Visual Integration Framework

Yoshida Takami, Nakadai Kazuhiro, Okuno Hiroshi G

Journal of the Robotics Society of Japan 28 ( 8 ) 970 - 977 2010

　More details

Language：Japanese Publisher：The Robotics Society of Japan

Noise-robust Automatic Speech Recognition (ASR) is essential for robots that are expected to communicate with humans in daily environments. In such environments, Voice Activity Detection (VAD) performance becomes poor, and ASR performance deteriorates due to noise and VAD failures. Humans are said to cope with these problems by using visual information such as lip reading to improve speech recognition performance. We therefore propose a two-layered audio-visual (AV) integration framework for VAD and ASR. The framework includes three crucial methods. The first is Audio-Visual Voice Activity Detection (AV-VAD) based on a Bayesian network. The second is a new lip-related visual feature that is robust against visual noise. The third is microphone array processing to improve the Signal-to-Noise Ratio (SNR) of the input signal. We implemented a prototype audio-visual speech recognition system based on the proposed framework using HARK, our robot audition system. Through voice activity detection and speech recognition experiments, we showed the effectiveness of audio-visual integration, microphone array processing, and their combination for VAD and ASR. Preliminary results show that our system improves ASR results by 20 and 9.7 points with and without microphone array processing, respectively, and also improves robustness against several auditory and visual noise conditions.

DOI： 10.7210/jrsj.28.970

CiNii Research

researchmap
Design and Implementation of Two-level Synchronization for Interactive Music Robot. Reviewed

Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, Atlanta, Georgia, USA, July 11-15, 2010 2 1238 - 1244 2010

　More details

Publishing type：Research paper (international conference proceedings) Publisher：AAAI Press

Scopus

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/aaai/aaai2010.html#conf/aaai/OtsukaNTKOO10
An improvement in automatic speech recognition using soft missing feature masks for robot audition. Reviewed

Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Array,Array

2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, October 18-22, 2010, Taipei, Taiwan 964 - 969 2010

　More details

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2010.5650540

Web of Science

Scopus

researchmap
Two-layered audio-visual speech recognition for robots in noisy environments. Reviewed

Takami Yoshida, Kazuhiro Nakadai, Array

2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, October 18-22, 2010, Taipei, Taiwan 988 - 993 2010

　More details

DOI： 10.1109/IROS.2010.5651205

Web of Science

Scopus

researchmap
Human-robot ensemble between robot thereminist and human percussionist using coupled oscillator model. Reviewed

Takeshi Mizumoto, Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Array,Array

2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, October 18-22, 2010, Taipei, Taiwan 1957 - 1963 2010

　More details

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2010.5650364

Web of Science

Scopus

researchmap
An Improvement in Audio-Visual Voice Activity Detection for Automatic Speech Recognition. Reviewed

Takami Yoshida, Kazuhiro Nakadai, Array

Trends in Applied Intelligent Systems - 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010, Cordoba, Spain, June 1-4, 2010, Proceedings, Part I 6096 LNAI ( PART 1 ) 51 - 61 2010

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.1007/978-3-642-13022-9_6

Web of Science

Scopus

CiNii Research

researchmap
Music-Ensemble Robot That Is Capable of Playing the Theremin While Listening to the Accompanied Music. Reviewed

Takuma Otsuka, Takeshi Mizumoto, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Array,Array

Trends in Applied Intelligent Systems - 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010, Cordoba, Spain, June 1-4, 2010, Proceedings, Part I 6096 LNAI ( PART 1 ) 102 - 112 2010

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.1007/978-3-642-13022-9_11

Web of Science

Scopus

CiNii Research

researchmap
Speedup and performance improvement of ICA-based robot audition by parallel and resampling-based block-wise processing. Reviewed

Ryu Takeda, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Array,Array

2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, October 18-22, 2010, Taipei, Taiwan 1949 - 1956 2010

　More details

DOI： 10.1109/IROS.2010.5652757

Web of Science

Scopus

researchmap
Multi-talker Speech Recognition under Ego-motion Noise using Missing Feature Theory Reviewed

Goekhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura

IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 982 - 987 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
A Robust Speech Recognition System against the Ego Noise of a Robot Reviewed

Goekhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 2070 - + 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Robust Ego Noise Suppression of a Robot Reviewed

Gokhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-Ichi Imura

TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT I, PROCEEDINGS 6096 62 - + 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

CiNii Research

researchmap
A Hybrid Framework for Ego Noise Cancellation of a Robot Reviewed

Goekhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Yuji Hasegawa, Hiroshi Tsujino, Jun-ichi Imura

2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) 3623 - 3628 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Two-Layered Audio-Visual Integration in Voice Activity Detection and Automatic Speech Recognition for Robots Reviewed

Takami Yoshida, Kazuhiro Nakadai

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 2710 - 2713 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
An Easily-configurable Robot Audition System using Histogram-based Recursive Level Estimation Reviewed

Hirofumi Nakajima, Goekhan Ince, Kazuhiro Nakadai, Yuji Hasegawa

IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 958 - 963 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Sound Source Separation and Automatic Speech Recognition for Moving Sources Reviewed

Kazuhiro Nakadai, Hirofumi Nakajima, Goekhan Ince, Yuji Hasegawa

IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010) 976 - 981 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Correlation matrix estimation by an optimally controlled recursive average method and its application to blind source separation Reviewed

Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa, Hiroshi Tsujino

Acoustical Science and Technology 31 ( 3 ) 205 - 212 2010

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1250/ast.31.205

Scopus

researchmap
3D sound field recording and reproducing system including sound source orientation Reviewed

Toshimasa Suzuki, Hirofumi Nakajima, Hideo Tsuru, Takayuki Arai, Kazuhiro Nakadai

2010 4th International Universal Communication Symposium, IUCS 2010 - Proceedings 215 - 220 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IUCS.2010.5666221

Scopus

researchmap
Pitch extraction in human-robot interaction Reviewed

Martin Heckmann, Frank Joublin, Kazuhiro Nakadai

IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings 1482 - 1487 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2010.5649882

Scopus

researchmap
Robust hands-free automatic speech recognition for human-machine interaction Reviewed

Randy Gomez, Tatsuya Kawahara, Kazuhiro Nakadai

2010 10th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2010 138 - 143 2010

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ICHR.2010.5686828

Scopus

researchmap
Movable projection avatar system "Remy" for helping remote collaboration handling real objects

FUJIMURA Ryota, GUO BIN, OHMURA Ren, NAKADAI Kazuhiro, IMAI Michita

Journal of Japan Society for Fuzzy Theory and Intelligent Informatics 21 ( 5 ) 701 - 712 2009.10

　More details

Language：Japanese Publisher：Japan Society for Fuzzy Theory and Intelligent Informatics

In this paper, we describe a movable projection avatar system named “Remy”. Remy aims to support communication in which a remote user shares the other party's actual environment. Many existing studies of remote communication systems have three issues. First, few systems consider sharing the other party's actual environment. Second, even in systems that do share it, users are distracted by some of the devices. Third, many existing systems do not consider nonverbal communication. Remy addresses these three issues by projecting a two-dimensional avatar onto the actual environment. Evaluations of Remy from the local user's perspective showed that Remy can solve the three issues and enhance the quality of communication.

DOI： 10.3156/jsoft.21.701

CiNii Books

researchmap
Step-size parameter adaptation of multi-channel semi-blind ICA with piecewise linear model for barge-in-able robot audition Reviewed

Ryu Takeda, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

2009 IEEE/RSJ International Conference on Intelligent Robots and Systems 2009.10

　More details

Publisher：Institute of Electrical & Electronics Engineers (IEEE)

DOI： 10.1109/iros.2009.5354527

researchmap
Incremental polyphonic audio to score alignment using beat tracking for singer robots Reviewed

Takuma Otsuka, Toru Takahashi, Hiroshi G. Okuno, Kazunori Komatani, Tetsuya Ogata, Kazumasa Murata, Kazuhiro Nakadai

2009 IEEE/RSJ International Conference on Intelligent Robots and Systems 2289 - 2296 2009.10

　More details

Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical & Electronics Engineers (IEEE)

DOI： 10.1109/iros.2009.5354637

Scopus

researchmap
Missing-feature-theory-based robust simultaneous speech recognition system with non-clean speech acoustic model Reviewed

Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

2009 IEEE/RSJ International Conference on Intelligent Robots and Systems 2730 - 2735 2009.10

　More details

Publishing type：Research paper (international conference proceedings) Publisher：Institute of Electrical & Electronics Engineers (IEEE)

DOI： 10.1109/iros.2009.5354201

Scopus

researchmap
Sound Source Separation Adaptable to Environmental Changes for Robot Audition

NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, TSUJINO Hiroshi

Journal of the Robotics Society of Japan 27 ( 7 ) 774 - 781 2009.9

　More details

Language：Japanese Publisher：The Robotics Society of Japan

This paper describes a novel sound source separation method for a robot that needs to cope with dynamically changing noises in the real world. A sound source separation method, Geometric Source Separation (GSS), is promising because it achieves high separation performance without requiring a high computational cost. However, GSS has several issues when applied to real-world applications such as robot audition systems that are used in dynamically changing environments. To improve performance in such environments, we propose two techniques. One is Adaptive Step-size control (AS), which adaptively sets the step size to the optimum value. The other is Optima Controlled Recursive Average, which improves the precision of the estimated separation matrix and thus achieves high separation performance. We evaluated GSS with and without the proposed methods using an 8-channel microphone array embedded in Honda ASIMO. Experimental results showed that the proposed methods improved GSS performance in dynamically changing environments.

DOI： 10.7210/jrsj.27.774

CiNii Books

researchmap
Musical Beat-Tracking for Robots and Its Application to A Music Robot

MURATA Kazumasa, NAKADAI Kazuhiro, TAKEDA Ryu, OKUNO Hiroshi G., HASEGAWA Yuji, TSUJINO Hiroshi

Journal of the Robotics Society of Japan 27 ( 7 ) 793 - 801 2009.9

　More details

Language：Japanese Publisher：The Robotics Society of Japan

Human-robot interaction through music in real environments is essential for robots, because such robots can entertain people. To deal with real music signals using the robot's own ears, we propose a beat-tracking algorithm for a robot based on semi-blind independent component analysis (SB-ICA) and spectro-temporal pattern matching (STPM). SB-ICA suppresses self-generated sounds such as singing or scatting, which heavily affect beat tracking due to their periodicity. STPM provides quick adaptation to beat changes because it can use a shorter matching window than conventional beat-tracking methods based on self-correlation functions. We thus developed a music robot that steps, sings, and scats in time with musical beats based on the proposed beat-tracking method. Experimental results using the music robot showed highly noise-robust beat tracking even when the robot was singing or scatting, and quick adaptation to beat changes such as human clapping whose tempo is constantly changing.

DOI： 10.7210/jrsj.27.793

CiNii Books

researchmap
Robot Audition based on Multiple-Input Independent Component Analysis for Recognizing Barge-In Speech under Reverberation

TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G.

Journal of the Robotics Society of Japan 27 ( 7 ) 782 - 792 2009.9

　More details

Language：Japanese Publisher：The Robotics Society of Japan

This paper presents a new method based on independent component analysis (ICA) for enhancing a target source and suppressing other interfering sound sources, assuming that the latter are known. The method can provide a barge-in-able robot audition system in a reverberant environment; that is, the user can talk to the robot at any time, even while the robot is speaking. Our method separates and dereverberates the user's speech and the robot's speech by using Multiple Input ICA. The critical issue for real-time processing is to reduce the computational complexity of Multiple Input ICA to linear order in the reverberation time, which has not been proposed so far. We attain this by exploiting the independence relationship between late observed signals and late speech signals. Experimental results show that 1) the computational complexity of our method is lower than that of the naïve Multiple Input ICA method, and that 2) our method improves the word correctness of automatic speech recognition under barge-in and reverberant situations, by at most 40 points for a reverberation time of 240 ms and 30 points for 670 ms.

DOI： 10.7210/jrsj.27.782

CiNii Books

researchmap
The acoustic simulation of directivity by modeling the shape of a sound source with a finite difference method in time domain

SUZUKI Toshimasa, NAKAJIMA Hirofumi, ARAI Takayuki, NAKADAI Kazuhiro, HASEGAWA Yuji

821 - 824 2009.9

　More details

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：Acoustical Society of Japan

researchmap
Frontiers of Music Information Processing Technologies: Real-time Music Information Processing for Music Robots

OKUNO Hiroshi G, NAKADAI Kazuhiro, OHTSUKA Takuma

IPSJ Magazine 50 ( 8 ) 729 - 734 2009.8

　More details

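Language：Japanese Publishing type：Research paper (scientific journal) Publisher：Information Processing Society of Japan

Setting a music robot that behaves in time with musical rhythm as the goal brings the challenges of music information processing into view.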

CiNii Books

CiNii Research

researchmap
ICA-based efficient blind dereverberation and echo cancellation method for barge-in-able robot audition. Reviewed

Ryu Takeda, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Array,Array

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009, 19-24 April 2009, Taipei, Taiwan 3677 - 3680 2009.4

　More details

DOI： 10.1109/ICASSP.2009.4960424

Web of Science

researchmap
Ego Noise Suppression of a Robot Using Template Subtraction Reviewed

Goekhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Yuji Hasegawa, Hiroshi Tsujino, Jun-ichi Imura

2009 IEEE-RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 199 - 204 2009

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Automatic estimation of reverberation time with robot speech to improve ICA-based robot audition Reviewed

Ryu Takeda, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

2009 9th IEEE-RAS International Conference on Humanoid Robots 2009

Publisher：Institute of Electrical & Electronics Engineers (IEEE)

DOI： 10.1109/ichr.2009.5379572

researchmap
Automatic speech recognition improved by two-layered audio-visual integration for robot audition Reviewed

Takami Yoshida, Kazuhiro Nakadai, Hiroshi G. Okuno

2009 9th IEEE-RAS International Conference on Humanoid Robots 2009

Publisher：Institute of Electrical & Electronics Engineers (IEEE)

DOI： 10.1109/ichr.2009.5379586

researchmap
Voice quality manipulation for humanoid robots consistent with their head movements. Reviewed

Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, et al.

9th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2009, Paris, France, December 7-10, 2009 405 - 410 2009

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ICHR.2009.5379569

Scopus

researchmap
Intelligent Sound Source Localization for Dynamic Environments Reviewed

Keisuke Nakamura, Kazuhiro Nakadai, Futoshi Asano, Yuji Hasegawa, Hiroshi Tsujino

2009 IEEE-RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 664 - 669 2009

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Real-time sound source orientation estimation using a 96 channel microphone array Reviewed

Hirofumi Nakajima, Keiko Kikuchi, Toru Daigo, Yutaka Kaneda, Kazuhiro Nakadai, Yuji Hasegawa

2009 IEEE-RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 676 - 683 2009

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
SOUND SOURCE SEPARATION OF MOVING SPEAKERS FOR ROBOT AUDITION Reviewed

Kazuhiro Nakadai, Hirofumi Nakajima, Yuji Hasegawa, Hiroshi Tsujino

2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS 3685 - 3688 2009

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Robot Audition using an Adaptive Filter Based on Independent Component Analysis

TAKEDA Ryu, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G.

Journal of the Robotics Society of Japan 26 ( 6 ) 529 - 536 2008.8

Language：Japanese Publisher：The Robotics Society of Japan

This paper describes a new adaptive filter algorithm based on independent component analysis (ICA) for enhancing a target sound and suppressing other interfering sounds that are known. The technique can provide barge-in-capable robot audition systems by utilizing known sound source signals such as the robot's own speech. Unlike a conventional ICA-based method, we use a time-frequency domain convolution model to cope with sound reflections. Experimental results showed that our method outperformed both the conventional ICA-based method and the well-known adaptive filter algorithm called Normalized Least Mean Squares (NLMS).
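For orientation, the normalized LMS baseline named in this abstract can be sketched roughly as follows. This is a generic textbook-style canceller for a known reference signal, not the ICA-based method the paper proposes; the function name `nlms`, the two-tap echo path, the step size, and the synthetic signals are all illustrative assumptions.

```python
import random

def nlms(reference, observed, taps=2, mu=0.5, eps=1e-8):
    """Adapt FIR weights so that the filtered reference tracks the observed signal."""
    w = [0.0] * taps
    residual = []
    for t in range(len(observed)):
        # Most recent `taps` reference samples (zero-padded at the start).
        x = [reference[t - k] if t - k >= 0 else 0.0 for k in range(taps)]
        estimate = sum(wi * xi for wi, xi in zip(w, x))
        e = observed[t] - estimate          # what remains after cancellation
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        residual.append(e)
    return w, residual

# Synthetic demo: the "known" signal reaches the microphone through a
# hypothetical two-tap echo path [0.5, 0.3], with no other sound present.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
observed = [0.5 * reference[t] + (0.3 * reference[t - 1] if t > 0 else 0.0)
            for t in range(len(reference))]
weights, residual = nlms(reference, observed)
```

In this noiseless toy setting the adapted weights approach the assumed echo path, and the residual is what a recognizer would receive after the known signal has been cancelled.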

DOI： 10.7210/jrsj.26.529

CiNii Books

researchmap
Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation Reviewed

Takeda R, Nakadai K, Komatani K, Ogata T, Okuno H.G

2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 1718 - 1723 2008

DOI： 10.1109/IROS.2008.4650799

Web of Science

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
A Robot Uses Its Own Microphone to Synchronize Its Steps to Musical Beats While Scatting and Singing Reviewed

Kazumasa Murata, Kazuhiro Nakadai, Kazuyoshi Yoshii, Ryu Takeda, Toyotaka Torii, Hiroshi G. Okuno, Yuji Hasegawa, Hiroshi Tsujino

2008 IEEE/RSJ INTERNATIONAL CONFERENCE ON ROBOTS AND INTELLIGENT SYSTEMS, VOLS 1-3, CONFERENCE PROCEEDINGS 2459 - + 2008

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2008.4650596

Web of Science

researchmap
High performance sound source separation adaptable to environmental changes for robot audition Reviewed

Hirofumi Nakajima, Kazuhiro Nakadai, Yuuji Hasegawa, Hiroshi Tsujino

2008 IEEE/RSJ INTERNATIONAL CONFERENCE ON ROBOTS AND INTELLIGENT SYSTEMS, VOLS 1-3, CONFERENCE PROCEEDINGS 2165 - 2171 2008

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2008.4650597

Web of Science

researchmap
Computational auditory scene analysis and its application to robot audition Reviewed

Hiroshi G. Okuno, Kazuhiro Nakadai

2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS 125 - + 2008

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
A robot referee for rock-paper-scissors sound games Reviewed

Kazuhiro Nakadai, Shunichi Yamamoto, Hiroshi G. Okuno, Hirofumi Nakajima, Yuji Hasegawa, Hiroshi Tsujino

2008 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-9 3469 - + 2008

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Adaptive step-size parameter control for real-world blind source separation Reviewed

Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa, Hiroshi Tsujino

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 149 - 152 2008

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ICASSP.2008.4517568

Web of Science

researchmap
Moving sound source extraction by time-variant beamforming Reviewed

Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa, Hiroshi Tsujino

NEW FRONTIERS IN ARTIFICIAL INTELLIGENCE 4914 47 - 53 2008

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-540-78197-4_6

Web of Science

researchmap
A portable robot audition software system for multiple simultaneous speech signals Reviewed

Okuno H.G, Yamamoto S, Nakadai K, Valin J.-M, Ogata T, Komatani K

Proceedings - European Conference on Noise Control 123 ( 5 ) 483 - 488 2008

Publishing type：Research paper (international conference proceedings) Publisher：Acoustical Society of America (ASA)

DOI： 10.1121/1.2932825

Scopus

CiNii Research

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
A Robot Singer with Music Recognition Based on Real-Time Beat Tracking Reviewed

Kazumasa Murata, Kazuhiro Nakadai, Kazuyoshi Yoshii, Ryu Takeda, Toyotaka Torii, Hiroshi G. Okuno, Yuji Hasegawa, Hiroshi Tsujino

ISMIR 2008, 9th International Conference on Music Information Retrieval, Drexel University, Philadelphia, PA, USA, September 14-18, 2008 199 - 204 2008

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
An open source software system for robot audition HARK and its evaluation. Reviewed

Kazuhiro Nakadai, [author missing], Hirofumi Nakajima, Yuji Hasegawa, Hiroshi Tsujino

8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008, Daejeon, South Korea, December 1-3, 2008 561 - 566 2008

DOI： 10.1109/ICHR.2008.4756031

Web of Science

researchmap
Soft missing-feature mask generation for simultaneous speech recognition system in robots. Reviewed

Toru Takahashi, Shun'ichi Yamamoto, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

INTERSPEECH 2008, 9th Annual Conference of the International Speech Communication Association, Brisbane, Australia, September 22-26, 2008 992 - 995 2008

Publishing type：Research paper (international conference proceedings)

Web of Science

Scopus

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/interspeech/interspeech2008.html#conf/interspeech/TakahashiYNKOO08
Noise Robust Automatic Speech Recognition Method for the Robot with Motor Noise using Missing Feature Theory

NISHIMURA Yoshitaka, ISHIZUKA Mitsuru, NAKADAI Kazuhiro, NAKANO Mikio, TSUJINO Hiroshi

Journal of the Robotics Society of Japan 25 ( 8 ) 1189 - 1198 2007.11

Language：Japanese Publisher：The Robotics Society of Japan

Automatic speech recognition (ASR) is essential for human-humanoid communication. One of the main problems with ASR by a humanoid is that the humanoid inevitably generates motor noise. This noise is easily captured by the humanoid's microphones because the noise sources are closer to the microphones than the target speech source. Thus, the signal-to-noise ratio (SNR) of the input speech becomes quite low (sometimes less than 0 dB). However, it is possible to estimate this noise by using information on the humanoid's motions and gestures. This paper proposes a method to improve ASR for a humanoid with motor noise by utilizing its motion/gesture information. The method consists of noise suppression and missing-feature-theory-based ASR (MFT-ASR). The proposed noise suppression technique is based on spectral subtraction, and white noise is added to blur the distortion caused by suppression. MFT-ASR improves ASR by masking unreliable acoustic features in the input sound. The motion/gesture information is used to identify the unreliable acoustic features. Furthermore, we also evaluated the method with the acoustic model adaptation technique called MLLR (Maximum Likelihood Linear Regression), using unsupervised MLLR for the adaptation. We evaluated the proposed method through recognition of speech recorded with Honda ASIMO in a reverberant room. The noise data contained 34 kinds of noise: motor noise without motion, gesture noise, walking noise, and other kinds of noise. The experimental results show that the proposed method outperforms the conventional multi-condition training technique.
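As a rough illustration of the two ingredients named in this abstract, spectral subtraction and a missing-feature mask, the following minimal sketch works on per-bin power values. It is not the authors' implementation: the function names, the spectral floor, the SNR threshold, and the idea of a pre-recorded noise template selected from motion information are assumptions made for the example, and the white-noise addition step is omitted.

```python
import math

def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Subtract a noise power estimate per bin, keeping a small spectral floor."""
    return [max(y - n, floor * y) for y, n in zip(noisy_power, noise_power)]

def missing_feature_mask(noisy_power, noise_power, threshold_db=0.0):
    """Return 1 for bins judged reliable and 0 for bins to be masked."""
    mask = []
    for y, n in zip(noisy_power, noise_power):
        residual = max(y - n, 1e-12)                       # estimated speech power
        snr_db = 10.0 * math.log10(residual / max(n, 1e-12))
        mask.append(1 if snr_db > threshold_db else 0)
    return mask

# Hypothetical three-bin frame; the template stands in for a motor-noise
# estimate chosen from the robot's current motion or gesture.
noisy = [10.0, 2.0, 5.0]
noise_template = [1.0, 1.9, 5.0]
enhanced = spectral_subtraction(noisy, noise_template)
mask = missing_feature_mask(noisy, noise_template)
```

Only the first bin, where the estimated speech power clearly exceeds the noise template, is kept as reliable; a missing-feature recognizer would ignore or down-weight the other two.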

DOI： 10.7210/jrsj.25.1189

CiNii Books

researchmap
Tracking of Multiple Sound Sources by Integration of Robot-Embedded and In-Room Microphone Arrays

NAKADAI Kazuhiro, NAKAJIMA Hirofumi, MURASE Masamitsu, OKUNO Hiroshi G, HASEGAWA Yuji, TSUJINO Hiroshi

Journal of the Robotics Society of Japan 25 ( 6 ) 979 - 989 2007.9

Language：Japanese Publisher：The Robotics Society of Japan

Real-time and robust sound source tracking is an important function for a robot operating in a daily environment, because the robot should recognize where a sound event such as speech, music, or another environmental sound originates. This paper addresses real-time sound source tracking by spatial integration of an in-room microphone array (IRMA) and a robot-embedded microphone array (REMA). The IRMA system consists of a 64-channel microphone array attached to the walls. It localizes multiple sound sources based on weighted delay-and-sum beamforming on a 2D plane. The REMA system localizes multiple sound sources in azimuth using eight microphones attached to a robot's head on a rotational table. A particle filter integrates their localization results to track multiple sound sources. The experimental results show that particle-filter-based integration improved the accuracy and robustness of sound source tracking even when the robot's head was rotating.
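The weighted delay-and-sum localization mentioned in this abstract can be pictured with a small time-domain sketch: each candidate position implies a set of per-microphone delays, and the candidate whose delays align the channels best yields the largest output power. The function names, the integer-sample delays, the two-microphone setup, and the candidate labels are assumptions for illustration; the sketch does not cover the 64-channel system or the particle-filter integration.

```python
def delay_and_sum_power(signals, delays, weights=None):
    """Average output power of a weighted delay-and-sum beamformer.

    signals: one list of samples per microphone (equal lengths).
    delays:  non-negative integer sample delays, one per microphone, that
             would align the channels for a candidate source position.
    """
    n_mics = len(signals)
    if weights is None:
        weights = [1.0 / n_mics] * n_mics
    usable = len(signals[0]) - max(delays)   # samples available after shifting
    power = 0.0
    for t in range(usable):
        out = sum(w * sig[t + d] for w, sig, d in zip(weights, signals, delays))
        power += out * out
    return power / usable

def localize(signals, candidates):
    """Return the candidate label whose steering delays give the most power."""
    return max(candidates,
               key=lambda name: delay_and_sum_power(signals, candidates[name]))

# Toy scene: the second microphone hears the source two samples later.
source = [0.0, 1.0, 0.0, -1.0] * 8
mic0 = source
mic1 = [0.0, 0.0] + source[:-2]
candidates = {"A": [0, 2], "B": [0, 0]}      # hypothetical positions
best = localize([mic0, mic1], candidates)
```

Here candidate "A" compensates the two-sample lag, so the channels add coherently, while candidate "B" leaves them out of phase and they largely cancel.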

DOI： 10.7210/jrsj.25.979

CiNii Books

researchmap
Robust Recognition of Simultaneous Speech by a Mobile Robot Reviewed

Jean-Marc Valin, Shun'ichi Yamamoto, Jean Rouat, Francois Michaud, Kazuhiro Nakadai, Hiroshi G. Okuno

IEEE Trans. Robot. 23 ( 4 ) 742 2007.8

Publisher：Institute of Electrical & Electronics Engineers (IEEE)

DOI： 10.1109/tro.2007.900612

researchmap
Real-World Auditory Scene Analysis by Information Integration : Sound Source Tracking by Integration of Multiple Microphone Arrays

NAKADAI Kazuhiro

Journal of the Society of Instrument and Control Engineers 46 ( 6 ) 427 - 433 2007.6

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：The Society of Instrument and Control Engineers

DOI： 10.11499/sicejl1962.46.427

CiNii Books

CiNii Research

researchmap
Robust Domain Selection Using Dialogue History in Multi-domain Spoken Dialogue Systems

KANDA NAOYUKI, KOMATANI KAZUNORI, NAKANO MIKIO, NAKADAI KAZUHIRO, TSUJINO HIROSHI, OGATA TETSUYA, OKUNO HIROSHI G

IPSJ journal 48 ( 5 ) 1980 - 1989 2007.5

Language：Japanese Publisher：Information Processing Society of Japan

We have developed a robust domain selection method using dialogue history in multi-domain spoken dialogue systems. We define domain selection as a classification problem among (I) the domain in the previous turn, (II) the domain in which the N-best speech recognition results can be accepted with the highest recognition score, and (III) other domains. We constructed a classifier by decision tree learning with dialogue data. We implemented a multi-domain spoken dialogue system with 5 domains and collected dialogue data from 10 subjects. The experimental results showed that our method reduced domain selection errors by 16.2% compared with a conventional method using speech recognition likelihoods only.

CiNii Books

researchmap
A recording and playback system that visualizes the sound environment

吉田雅敏, 海尻聡, 山本俊一, 中臺一博, 駒谷和範, 尾形哲也, 奥乃博

Proceedings of the 69th National Convention 2007 ( 1 ) 563 - 564 2007.3

Language：Japanese Publisher：Information Processing Society of Japan

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00173375/
Implementation and Evaluation of Sound Source Separation Filter on Dynamically Reconfigurable Processor

KUROTAKI Shunsuke, SUZUKI Noriaki, NAKADAI Kazuhiro, OKUNO Hiroshi G, AMANO Hideharu

The IEICE transactions on information and systems 90 ( 3 ) 897 - 907 2007.3

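Language：Japanese Publishing type：Research paper (scientific journal) Publisher：The Institute of Electronics, Information and Communication Engineers

In recent years, many robots that coexist with humans have appeared. Speech recognition is required for these robots to interact with humans through language, but conventional speech recognition methods target a single sound source, so recognition accuracy drops markedly when several people speak simultaneously or when there is ambient noise. Speech recognition in real environments therefore requires sound source separation as preprocessing, extracting only the speech signal of interest from the mixture. Real-time sound source separation incurs a large computational cost, whereas autonomous robots face strict limits on power consumption and system size, so an implementation on a general-purpose processor is not practical. In this study, we implemented sound source separation on DRP-1, a dynamically reconfigurable processor from NEC Electronics, aiming at a system suitable for mounting on a robot. Experiments showed that the sound source separation filter on the DRP achieves accurate separation in real time, with a low area cost and lower power consumption than conventional devices such as FPGAs, while delivering the required performance.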

CiNii Books

CiNii Research

researchmap
Simultaneous Speech Recognition Based on Automatic Missing Feature Mask Generation by Integrating Sound Source Separation

YAMAMOTO Shunichi, VALIN Jean-Marc, NAKADAI Kazuhiro, NAKANO Mikio, TSUJINO Hiroshi, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

Journal of the Robotics Society of Japan 25 ( 1 ) 92 - 102 2007.1

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：The Robotics Society of Japan

Our goal is to realize a humanoid robot capable of recognizing simultaneous speech. A humanoid robot in a real-world environment usually hears a mixture of sounds, so three capabilities are essential for robot audition: sound source localization, separation, and recognition of the separated sounds. In particular, the interface between sound source separation and speech recognition is important. In this paper, we design such an interface by applying Missing Feature Theory (MFT). In this method, spectral sub-bands distorted by sound source separation are detected in the input speech as missing features. The detected missing features are masked during recognition so that they do not degrade the result. This makes the method more flexible when noise changes dynamically and drastically. The most important issue is how to detect the distorted spectral sub-bands. To address it, we use a speech feature appropriate for MFT-based ASR and develop automatic missing feature mask generation. As the speech feature, we use the Mel-Scale Log Spectral (MSLS) feature instead of the Mel-Frequency Cepstrum Coefficient (MFCC) commonly used for ASR. We present a method of generating the missing feature mask automatically by using information from sound source separation. To evaluate our method, we implemented it on the humanoid robot SIG2 and performed experiments on recognition of three simultaneous isolated words. As a result, our method outperformed conventional ASR with the MSLS feature.

DOI： 10.7210/jrsj.25.92

CiNii Books

CiNii Research

researchmap
A navigation system using ultrasonic directional speaker with rotating base Reviewed

Kentaro Ishii, Yukiko Yamamoto, Michita Imai, Kazuhiro Nakadai

HUMAN INTERFACE AND THE MANAGEMENT OF INFORMATION: INTERACTING IN INFORMATION ENVIRONMENTS, PT 2, PROCEEDINGS 4558 526 - + 2007

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Coarse speech recognition by audio-visual integration based on missing feature theory Reviewed

Tomoaki Koiwa, Kazuhiro Nakadai, Jun-ichi Imura

2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9 1757 - 1762 2007

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2007.4399300

Web of Science

researchmap
A biped robot that keeps steps in time with musical beats while listening to music with its own ears Reviewed

Kazuyoshi Yoshii, Kazuhiro Nakadai, Toyotaka Torii, Yuji Hasegawa, Hiroshi Tsujino, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9 1749 - + 2007

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
The design of phoneme grouping for coarse phoneme recognition Reviewed

Kazuhiro Nakadai, Ryota Sumiya, Mikio Nakano, Koichi Ichige, Yasuo Hirose, Hiroshi Tsujino

NEW TRENDS IN APPLIED ARTIFICIAL INTELLIGENCE, PROCEEDINGS 4570 905 - + 2007

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

CiNii Research

researchmap
Real-World Auditory Scene Analysis by Information Integration:Sound Source Tracking by Integration of Multiple Microphone Arrays

NAKADAI Kazuhiro

Journal of The Society of Instrument and Control Engineers 46 ( 6 ) 427 - 433 2007

Language：Japanese Publisher：The Society of Instrument and Control Engineers

DOI： 10.11499/sicejl1962.46.427

CiNii Research

researchmap
Design and implementation of a robot audition system for automatic speech recognition of simultaneous speech. Reviewed

Shun'ichi Yamamoto, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Kazunori Komatani, Tetsuya Ogata, et al.

IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2007, Kyoto, Japan, December 9-13, 2007 111 - 116 2007

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ASRU.2007.4430093

Scopus

researchmap
Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition Reviewed

Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9 1763 - + 2007

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
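Extraction of moving sound sources by time-variant extended beamforming

中島弘史, 中臺一博, 長谷川雄二, 辻野広司

Proceedings of the Annual Conference of JSAI JSAI07 3C84 - 3C84 2007

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

This paper describes a method for accurately extracting moving sound sources by time-variant extended beamforming, together with application examples. Compared with the conventional method, which discretizes the source position and switches the beamformer (BF) coefficients at each discretized position, the proposed method is effective because it avoids discontinuities when the coefficients are switched.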

DOI： 10.11517/pjsai.jsai07.0_3c84

CiNii Research

researchmap
Sound Source Separation Filter for Robot Audition used by Dynamic Reconfigurable Device, DRP (in Japanese)

中臺一博, 奥乃博

IEICE Transaction on Information and Systems Vol.J90-D, No.3 897 - 907 2007

Publishing type：Research paper (scientific journal)

CiNii Research

researchmap
Improving Location-Based Speech Recognition of Simultaneous Speech Signals by Parameter Optimization with Genetic Algorithm

YAMAMOTO Shunichi, NAKADAI Kazuhiro, NAKANO Mikio, TSUJINO Hiroshi, VALIN Jean-Marc, TAKEDA Ryu, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G.

Human interface 8 ( 2 ) 203 - 212 2006.5

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：Human Interface Society

CiNii Books

CiNii Research

researchmap
Speech Interface For Robot By Using Ultrasonic Directional Speaker

NAKADAI Kazuhiro, TSUJINO Hiroshi

Human interface 8 ( 2 ) 213 - 221 2006.5

Language：Japanese Publisher：Human Interface Society

CiNii Books

CiNii Research

researchmap
Performance evaluation of sound source tracking using a particle filter

村瀬昌満, 中臺一博, 奥乃博

Proceedings of the 68th National Convention 2006 ( 1 ) 329 - 330 2006.3

Language：Japanese

CiNii Books

CiNii Research

researchmap
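Improving the accuracy of domain selection using dialogue history in multi-domain spoken dialogue systems

神田直之, 駒谷和範, 中野幹生, 中臺一博, 辻野広司, 尾形哲也, 奥乃博

Proceedings of the 68th National Convention 2006 ( 1 ) 315 - 316 2006.3

Language：Japanese Publisher：Information Processing Society of Japan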

CiNii Books

CiNii Research

researchmap

Other Link： http://id.ndl.go.jp/bib/7841154
Real-Time Tracking of Multiple Sound Sources by Integration of In-Room and Robot-Embedded Microphone Arrays Reviewed

Kazuhiro Nakadai, Hirofumi Nakajima, Masamitsu Murase, Hiroshi G. Okuno, Yuji Hasegawa, Hiroshi Tsujino

2006 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-12 852 - + 2006

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2006.281737

Web of Science

researchmap
Speech Recognition for a Humanoid with Motor Noise Utilizing Missing Feature Theory. Reviewed

Yoshitaka Nishimura, Mitsuru Ishizuka, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino

2006 6th IEEE-RAS International Conference on Humanoid Robots, Genova, Italy, December 4-6, 2006 26 - 33 2006

Publisher：IEEE

DOI： 10.1109/ICHR.2006.321359

researchmap
A Robot That Can Engage in Both Task-Oriented and Non-Task-Oriented Dialogues. Reviewed

Mikio Nakano, Atsushi Hoshino, Johane Takeuchi, Yuji Hasegawa, Toyotaka Torii, Kazuhiro Nakadai, Kazuhiko Kato, Hiroshi Tsujino

2006 6th IEEE-RAS International Conference on Humanoid Robots, Genova, Italy, December 4-6, 2006 404 - 411 2006

Publisher：IEEE

DOI： 10.1109/ICHR.2006.321304

researchmap
Genetic Algorithm-Based Improvement of Robot Hearing Capabilities in Separating and Recognizing Simultaneous Speech Signals. Reviewed

Shun'ichi Yamamoto, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Ryu Takeda, Kazunori Komatani, Tetsuya Ogata, et al.

Advances in Applied Artificial Intelligence, 19th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2006, Annecy, France, June 27-30, 2006, Proceedings 207 - 217 2006

Publishing type：Research paper (scientific journal) Publisher：Springer

DOI： 10.1007/11779568_24

CiNii Research

researchmap
Leak energy based missing feature mask generation for ICA and GSS and its evaluation with simultaneous speech recognition. Reviewed

Shun'ichi Yamamoto, Ryu Takeda, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, SAPA 2006, Pittsburgh, PA, USA, September 16, 2006 42 - 47 2006

Publisher：ISCA

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/interspeech/sapa2006.html#conf/interspeech/YamamotoTNNTVKOO06
Speech recognition for a robot under its motor noises by selective application of missing feature theory and MLLR. Reviewed

Yoshitaka Nishimura, Mikio Nakano, Kazuhiro Nakadai, Hiroshi Tsujino, Mitsuru Ishizuka

ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, SAPA 2006, Pittsburgh, PA, USA, September 16, 2006 53 - 58 2006

Publisher：ISCA

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/interspeech/sapa2006.html#conf/interspeech/NishimuraNNTI06
Real-Time Robot Audition System That Recognizes Simultaneous Speech in The Real World. Reviewed

Shun'ichi Yamamoto, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Kazunori Komatani, et al.

2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2006, October 9-15, 2006, Beijing, China 5333 - 5338 2006

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IROS.2006.282037

Scopus

researchmap
Recognition of Simultaneous Speech by Estimating Reliability of Separated Signals for Robot Audition. Reviewed

Shun'ichi Yamamoto, Ryu Takeda, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Kazunori Komatani, Tetsuya Ogata, et al.

PRICAI 2006: Trends in Artificial Intelligence, 9th Pacific Rim International Conference on Artificial Intelligence, Guilin, China, August 7-11, 2006, Proceedings 484 - 494 2006

Publishing type：Research paper (scientific journal) Publisher：Springer

DOI： 10.1007/11801603_52

CiNii Research

researchmap
Robust tracking of multiple sound sources by spatial integration of room and robot microphone arrays Reviewed

Nakadai K, Nakajima H, Murase M, Kaijiri S, Yamada K, Nakamura T, Hasegawa Y, Okuno H.G, Tsujino H

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 4 929 - + 2006

Publishing type：Research paper (scientific journal)

Web of Science

Scopus

CiNii Research

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Robust tracking of multiple sound sources by spatial integration of room and robot microphone arrays Reviewed

Nakadai K, Nakajima H, Murase M, Kaijiri S, Yamada K, Nakamura T, Hasegawa Y, Okuno H.G, Tsujino H

2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13 4 4599 - 4602 2006

Web of Science

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Multi-Domain Spoken Dialogue System with Extensibility and Robustness against Speech Recognition Errors. Reviewed

Kazunori Komatani, Naoyuki Kanda, Mikio Nakano, Kazuhiro Nakadai, Hiroshi Tsujino, Tetsuya Ogata, Hiroshi G. Okuno

Proceedings of the SIGDIAL 2006 Workshop, The 7th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 15-16 July 2006, Sydney, Australia 9 - 17 2006

Publishing type：Research paper (international conference proceedings) Publisher：The Association for Computer Linguistics

DOI： 10.3115/1654595.1654598

Scopus

researchmap

Other Link： http://dblp.uni-trier.de/db/conf/sigdial/sigdial2006.html#conf/sigdial/KomataniKNNTOO06
Missing Feature Theory based Interface Between Sound Source Separation and Automatic Speech Recognition and Applying to Multiple Robots

YAMAMOTO Shunichi, NAKADAI Kazuhiro, TSUJINO Hiroshi, OKUNO Hiroshi G

Journal of the Robotics Society of Japan 23 ( 6 ) 743 - 751 2005.9

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：The Robotics Society of Japan

Robot audition is a critical technology for creating an intelligent robot that operates in daily environments. To realize such a robot audition system, we have designed a missing-feature-theory-based interface between sound source separation and automatic speech recognition (ASR). In this interface, features distorted by speech separation are detected in the input speech as missing features. The detected missing features are masked during recognition to avoid severe deterioration of recognition performance. Using the interface, we developed a robot audition system that recognizes multiple simultaneous speech signals. We also assessed its general applicability by implementing it on three different humanoids, i.e., Honda ASIMO, SIG2, and Replie of Kyoto University, using three simultaneous speech signals as benchmarks, and its general applicability was confirmed. With triphone models and a vocabulary of 200 words, the average word correct rates for three simultaneous speech signals are 79.7%, 78.7%, and 82.7% for ASIMO, SIG2, and Replie, respectively.

DOI： 10.7210/jrsj.23.743

CiNii Books

CiNii Research

researchmap
Towards new human-humanoid communication: Listening during speaking by using ultrasonic directional speaker Reviewed

K Nakadai, H Tsujino

2005 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), VOLS 1-4 1483 - 1488 2005

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
A two-layer model for behavior and dialogue planning in conversational service robots. Reviewed

Mikio Nakano, Yuji Hasegawa, Kazuhiro Nakadai, Takahiro Nakamura, Johane Takeuchi, Toyotaka Torii, Hiroshi Tsujino, Naoyuki Kanda, et al.

2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Edmonton, Alberta, Canada, August 2-6, 2005 3329 - 3335 2005

Publisher：IEEE

DOI： 10.1109/IROS.2005.1545198

researchmap
Implementation of active direction-pass filter on dynamically reconfigurable processor Reviewed

Kurotaki S, Suzuki N, Nakadai K, Okuno H.G, Amano H

2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 515 - 520 2005

Publisher：IEEE

DOI： 10.1109/IROS.2005.1545033

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Enhanced robot speech recognition based on microphone array source separation and missing feature theory Reviewed

Yamamoto S, Valin J.-M, Nakadai K, Rouat J, Michaud F, Ogata T, Okuno H.G

Proceedings - IEEE International Conference on Robotics and Automation 2005 1477 - 1482 2005

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/ROBOT.2005.1570323

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Making a robot recognize three simultaneous sentences in real-time Reviewed

Shun'ichi Yamamoto, Kazuhiro Nakadai, Jean-Marc Valin, Jean Rouat, François Michaud, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Edmonton, Alberta, Canada, August 2-6, 2005 4040 - 4045 2005

Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/IROS.2005.1545094

Scopus

researchmap
Multiple moving speaker tracking by microphone array on mobile robot Reviewed

Murase M, Yamamoto S, Valin J.-M, Nakadai K, Yamada K, Komatani K, Ogata T, Okuno H.G

9th European Conference on Speech Communication and Technology 249 - 252 2005

Publishing type：Research paper (international conference proceedings) Publisher：ISCA

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Sound source tracking with directivity pattern estimation using a 64 ch microphone array Reviewed

Kazuhiro Nakadai, Hirofumi Nakajima, Kentaro Yamada, Yuji Hasegawa, Takahiro Nakamura, Hiroshi Tsujino

2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 196 - 202 2005

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IROS.2005.1544981

Scopus

researchmap
Challenges and current state of robot audition (invited talk)

奥乃博, 中臺一博

Acoustical Society of Japan Spring Meeting, 3-7-7 633 - 636 2005

Publishing type：Research paper (scientific journal)

CiNii Research

researchmap
Sound and Visual Tracking for Humanoid Robot Reviewed

Hiroshi G. Okuno, Kazuhiro Nakadai, Tino Lourens, Hiroaki Kitano

Applied Intelligence 20 ( 3 ) 253 - 266 2004.5

Publishing type：Research paper (scientific journal) Publisher：Springer Science+Business Media

DOI： 10.1023/b:apin.0000021417.62541.e0

CiNii Research

researchmap
ミッシングフィーチャー理論による三話者同時発話認識の向上

山本俊一, 中臺一博, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

第66回全国大会講演論文集 2004 ( 1 ) 285 - 286 2004.3

　More details

Language：Japanese Publisher：情報処理学会

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00169751/
マルチモーダル情報統合によるヒューマノイドロボットの挙動選択

戸田充彦, 中臺一博, 駒谷和範, 尾形哲也, 奥乃博

第66回全国大会講演論文集 2004 ( 1 ) 191 - 192 2004.3

　More details

Language：Japanese Publisher：情報処理学会

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00169704/
Multimodal expression for humanoid robots by integration of human speech mimicking and facial color

Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino

8th International Conference on Spoken Language Processing, ICSLP 2004 2305 - 2308 2004

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：International Speech Communication Association

Scopus

researchmap
Assessment of general applicability of robot audition system by recognizing three simultaneous speeches Reviewed

Yamamoto S, Nakadai K, Tsujino H, Okuno H.G

2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 3 2111 - 2116 2004

　More details

Publishing type：Research paper (scientific journal) Publisher：IEEE

Scopus

CiNii Research

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Computational Auditory Scene Analysis and Its Application to Robot Audition Reviewed

Okuno H.G, Ogata T, Komatani K, Nakadai K

Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004 73 - 80 2004

　More details

Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ICKS.2004.1313411

Scopus

CiNii Research

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Improvement of recognition of simultaneous speech signals using AV integration and scattering theory for humanoid robots Reviewed

Nakadai K, Matsuura D, Okuno H.G, Tsujino H

Speech Communication 44 ( 1-4 ) 97 - 112 2004

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.specom.2004.10.010

Scopus

CiNii Research

J-GLOBAL

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Improvement of robot audition by interfacing sound source separation and automatic speech recognition with missing feature theory Reviewed

Yamamoto S, Nakadai K, Tsujino H, Yokoyama T, Okuno H.G

Proceedings - IEEE International Conference on Robotics and Automation 2004 ( 2 ) 1517 - 1523 2004

　More details

Publishing type：Research paper (scientific journal) Publisher：IEEE

DOI： 10.1109/ROBOT.2004.1308039

Scopus

CiNii Research

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Effects of increasing modalities in recognizing three simultaneous speeches Reviewed

Okuno H.G, Nakadai K, Kitano H

Speech Communication 43 ( 4 ) 347 - 359 2004

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.specom.2004.03.008

Scopus

CiNii Research

J-GLOBAL

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Effect of facial colors on humanoids in emotion recognition using speech

Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino

Proceedings - IEEE International Workshop on Robot and Human Interactive Communication 59 - 64 2004

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Scopus

researchmap
Real-Time Human Tracking by Audio-Visual Integration for Humanoids : Integration of Active Audition and Face Recognition

NAKADAI Kazuhiro, HIDAI Ken-ichi, MIZOGUCHI Hiroshi, OKUNO Hiroshi, KITANO Hiroaki

Journal of the Robotics Society of Japan 21 ( 5 ) 517 - 525 2003.7

　More details

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：一般社団法人日本ロボット学会

This paper describes a real-time human tracking system based on audio-visual integration for the humanoid SIG. The essential idea for real-time, robust tracking is hierarchical integration of multi-modal information. The system creates three kinds of streams: auditory, visual, and associated streams. An auditory stream with sound source direction is formed as a temporal series of events from the audition module, which localizes multiple sound sources and cancels motor noise from a pair of microphones. A visual stream with a face ID and its 3D position is formed as a temporal series of events from the vision module by combining face detection, face identification, and face localization by stereo vision. Auditory and visual streams are associated into an associated stream, a higher-level representation, according to their proximity. Because the associated stream disambiguates partially missing information in auditory or visual streams, the "focus-of-attention" control of SIG works well enough for robust human tracking. These processes are executed in real time with a delay of 200 msec using off-the-shelf PCs distributed via TCP/IP. As a result, robust human tracking is attained even when the person is visually occluded and simultaneous speeches occur.

DOI： 10.7210/jrsj.21.517

CiNii Books

CiNii Research

researchmap
人間に似た外見を持つロボットReplieにおける挙動選択システム

戸田充彦, 山本俊一, 中臺一博, 奥乃博

第65回全国大会講演論文集 2003 ( 1 ) 211 - 212 2003.3

　More details

Language：Japanese Publisher：情報処理学会

CiNii Books

CiNii Research

researchmap

Other Link： http://id.nii.ac.jp/1001/00169249/
Three simultaneous speech recognition by integration of active audition and face recognition for humanoid Reviewed

Kazuhiro Nakadai, Daisuke Matsuura, Hiroshi G. Okuno, Hiroshi Tsujino

8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - INTERSPEECH 2003, Geneva, Switzerland, September 1-4, 2003 2003

　More details

Publisher：ISCA

DOI： 10.1016/j.specom.2004.10.010

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Issues in humanoid audition and sound source localization by active audition

Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

Transactions of the Japanese Society for Artificial Intelligence 18 ( 2 ) 104 - 113 2003

　More details

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：一般社団法人人工知能学会

DOI： 10.1527/tjsai.18.104

Scopus

CiNii Books

CiNii Research

researchmap
Applying Scattering Theory to Robot Audition System: Robust Sound Source Localization and Extraction Reviewed

Nakadai K, Matsuura D, Okuno H.G, Kitano H

IEEE International Conference on Intelligent Robots and Systems 2 1147 - 1152 2003

　More details

Publisher：IEEE

DOI： 10.1109/IROS.2003.1248800

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Improvement of three simultaneous speech recognition by using AV integration and scattering theory for humanoid Reviewed

Kazuhiro Nakadai, Daisuke Matsuura, Hiroshi G. Okuno, Hiroshi Tsujino

AVSP 2003 - International Conference on Audio-Visual Speech Processing, St. Jorioz, France, September 4-7, 2003 44 ( 1-4 ) 157 - 162 2003

　More details

Publisher：ISCA

DOI： 10.1016/j.specom.2004.10.010

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Active audition for humanoid robots that can listen to three simultaneous talkers Reviewed

Hiroshi G. Okuno, Kazuhiro Nakadai

J. Acoust. Soc. Am. 113 ( 4, Pt. 2 of 2 ) 2230 2003

　More details

Publishing type：Research paper (scientific journal) Publisher：Acoustical Society of America (ASA)

DOI： 10.1121/1.4780329

CiNii Research

researchmap
Real-time sound source localization and separation based on active audio-visual integration Reviewed

Okuno H.G, Nakadai K

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2686 118 - 125 2003

　More details

Publishing type：Research paper (scientific journal) Publisher：Springer

DOI： 10.1007/3-540-44868-3_16

Scopus

CiNii Research

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Realizing personality in audio-visually triggered non-verbal behaviors Reviewed

Okuno H.G, Nakadai K, Kitano H

Proceedings - IEEE International Conference on Robotics and Automation 392 - 397 2003

　More details

Publisher：IEEE

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Human-robot non-verbal interaction empowered by real-time auditory and visual multiple-talker tracking Reviewed

Hiroshi G. Okuno, Kazuhiro Nakadai, Ken-ichi Hidai, Hiroshi Mizoguchi, Hiroaki Kitano

Advanced Robotics 17 ( 2 ) 115 - 130 2003

　More details

Language：Japanese

DOI： 10.1163/156855303321165088

CiNii Books

J-GLOBAL

researchmap
Design and implementation of personality of humanoids in human humanoid non-verbal interaction Reviewed

Okuno H.G, Nakadai K, Kitano H

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) 2718 662 - 673 2003

　More details

Publishing type：Research paper (scientific journal) Publisher：Springer

DOI： 10.1007/3-540-45034-3_67

Scopus

CiNii Research

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Robot recognizes three simultaneous speech by active audition Reviewed

Nakadai K, Okuno H.G, Kitano H

Proceedings - IEEE International Conference on Robotics and Automation 1 398 - 405 2003

　More details

Publisher：IEEE

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
ヒューマノイドを対象にした視聴覚統合による実時間人物追跡 : アクティブオーディションと顔認識の統合

中臺一博, 奥乃博

ロボット学会誌 21・5 1333 - 1342 2003

　More details

Publishing type：Research paper (scientific journal)

CiNii Research

researchmap
Real-time sound source localization and separation for robot audition Reviewed

Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

7th International Conference on Spoken Language Processing, ICSLP2002 - INTERSPEECH 2002, Denver, Colorado, USA, September 16-20, 2002 193 - 196 2002

　More details

Publisher：ISCA

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Social interaction of humanoid robot based on audio-visual tracking Reviewed

Okuno H.G, Nakadai K, Kitano H

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2358 725 - 735 2002

　More details

Publisher：Springer

DOI： 10.1007/3-540-48035-8_70

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Realizing audio-visually triggered Eliza-like non-verbal behaviors Reviewed

Okuno H.G, Nakadai K, Kitano H

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2417 552 - 562 2002

　More details

Publisher：Springer

DOI： 10.1007/3-540-45683-x_59

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Sound and Visual Tracking by Active Audition Reviewed

Hiroshi G. Okuno, Kazuhiro Nakadai, Tino Lourens, Hiroaki Kitano

Enabling Society with Information Technology 174 2002

　More details

Publisher：Springer Science + Business Media

DOI： 10.1007/978-4-431-66979-1_17

researchmap
Real-time multiple speaker tracking by multi-modal integration for mobile robots Reviewed

Kazuhiro Nakadai, Ken-ichi Hidai, Hiroshi G. Okuno, Hiroaki Kitano

EUROSPEECH 2001 Scandinavia, 7th European Conference on Speech Communication and Technology, 2nd INTERSPEECH Event, Aalborg, Denmark, September 3-7, 2001 1193 - 1196 2001

　More details

Publisher：ISCA

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
A computational model of monkey grating cells for oriented repetitive alternating patterns Reviewed

Tino Lourens, Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

ESANN 2001, 9th European Symposium on Artificial Neural Networks, Bruges, Belgium, April 25-27, 2001, Proceedings 315 - 322 2001

　More details

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Graph extraction from color images Reviewed

Tino Lourens, Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

ESANN 2001, 9th European Symposium on Artificial Neural Networks, Bruges, Belgium, April 25-27, 2001, Proceedings 329 - 334 2001

　More details

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Real-time auditory and visual multiple-object tracking for humanoids Reviewed

Nakadai K, Hidai K.-I, Mizoguchi H, Okuno H.G, Kitano H

IJCAI International Joint Conference on Artificial Intelligence 1425 - 1432 2001

　More details

Publisher：Morgan Kaufmann

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Sound and visual tracking for humanoid robot Reviewed

Okuno H.G, Nakadai K, Lourens T, Kitano H

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2070 640 - 650 2001

　More details

Publisher：Springer

DOI： 10.1007/3-540-45517-5_71

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Separating three simultaneous speeches with two microphones by integrating auditory and visual processing Reviewed

Hiroshi G. Okuno, Kazuhiro Nakadai, Tino Lourens, Hiroaki Kitano

EUROSPEECH 2001 Scandinavia, 7th European Conference on Speech Communication and Technology, 2nd INTERSPEECH Event, Aalborg, Denmark, September 3-7, 2001 2643 - 2646 2001

　More details

Publisher：ISCA

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Designing a humanoid head for RoboCup challenge Reviewed

Kitano Hiroaki, Okuno Hiroshi G, Nakadai Kazuhiro, Fermin Iris, Sabisch Theo, Nakagawa Yukiko, Matsui Tatsuya

Proceedings of the International Conference on Autonomous Agents 17 - 18 2000

　More details

Publisher：ACM

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
And the Fans Are Going Wild! SIG plus MIKE. Reviewed

Ian Frank, Kumiko Tanaka-Ishii, Hiroshi G. Okuno, Junichi Akita, Yukiko Nakagawa, Kazuaki Maeda, Kazuhiro Nakadai, Hiroaki Kitano

RoboCup 2000: Robot Soccer World Cup IV 139 - 148 2000

　More details

Publisher：Springer

DOI： 10.1007/3-540-45324-5_12

researchmap
Humanoid active audition system improved by the cover acoustics Reviewed

Nakadai K, Okuno H.G, Kitano H

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 1886 LNAI 544 - 554 2000

　More details

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Design and architecture of SIG the humanoid: An experimental platform for integrated perception in RoboCup humanoid challenge Reviewed

Kitano H, Okuno H.G, Nakadai K, Sabisch T, Matsui T

IEEE International Conference on Intelligent Robots and Systems 1 181 - 190 2000

　More details

Publisher：IEEE

Scopus

researchmap

Other Link： http://orcid.org/0000-0002-8704-4318
Chord Recognition Mechanisms in the OPTIMA Processing Architecture for Music Scene Analysis

KASHINO Kunio, NAKADAI Kazuhiro, KINOSHITA Tomoyoshi, TANAKA Hidehiko

The Transactions of the Institute of Electronics, Information and Communication Engineers 79 ( 11 ) 1762 - 1770 1996.11

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

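Automatic transcription has long been studied as a way to recognize performance information from the acoustic signals of musical performances, but its effectiveness has been extremely limited for performances that contain multiple kinds of instrument sounds. This paper therefore treats the recognition of such performances as a music scene analysis problem. Here, music scene analysis means extracting performance information, such as single notes and chords, from the acoustic signal of a performance as a symbolic representation. Recognizing that information integration is indispensable for music scene analysis, we first propose OPTIMA, a processing model for music scene analysis equipped with an information integration mechanism based on a Bayesian network. We then show the effectiveness of the proposed integration mechanism, focusing on the recognition of single notes.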

CiNii Books

CiNii Research

researchmap
音楽情景分析の処理モデルOPTIMAにおける単音の認識

柏野邦夫, 中台一博, 木下智義

電子情報通信学会論文誌. D-2, 情報・システム. 2, パターン処理 = The IEICE transactions on information and systems. Pt. 2 / 電子情報通信学会編 79 ( 11 ) 1751 - 1761 1996.11

　More details

Language：Japanese Publisher：東京 : 電子情報通信学会情報・システムソサイエティ

CiNii Books

CiNii Research

researchmap

Other Link： https://ndlsearch.ndl.go.jp/books/R000000004-I4087193
Chord Recognition Mechanisms in the OPTIMA Processing Architecture for Music Scene Analysis

KASHINO Kunio, KINOSHITA Tomoyoshi, NAKADAI Kazuhiro, TANAKA Hidehiko

The transactions of the Institute of Electronics, Information and Communication Engineers 79 ( 11 ) 1762 - 1770 1996

　More details

Language：Japanese Publisher：一般社団法人電子情報通信学会

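We study music recognition for performances that contain multiple kinds of instrument sounds as a music scene analysis problem. Here, music scene analysis means extracting performance information, such as single notes and chords, from the acoustic signal of a performance as a symbolic representation. We previously proposed OPTIMA, a processing model for music scene analysis equipped with an information integration mechanism based on a Bayesian network. This paper examines the effectiveness of that mechanism, focusing on chord recognition within OPTIMA. In evaluation experiments with sample pieces, compared with chord recognition by bottom-up processing alone, the chord recognition rate improved by 15.6% when statistical information about the single notes that make up chords was integrated, and by 18.7% when statistical information about temporal chord transitions was integrated. These results show the effectiveness of integrating this information in the proposed processing model.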

researchmap
Application of Bayesian Probability Network to Music Scene Analysis Reviewed

Kunio Kashino, Kazuhiro Nakadai, Tomoyoshi Kinoshita, Hidehiko Tanaka

Working Note of the IJCAI-95 Computational Auditory Scene Analysis Workshop 1995

　More details

We propose a process model for hierarchical perceptual sound organization, which recognizes perceptual sounds included in incoming sound signals. We consider perceptual sound organization as a scene analysis problem in the auditory domain. Our current application is a music scene analysis system, which recognizes rhythm, chords, and source-separated musical notes included in incoming music signals. Our process model consists of multiple processing modules and a probability network for information integration. The structure of our model is conceptually based on the blackboard architecture....

researchmap
Organization of Hierarchical Perceptual Sounds: Music Scene Analysis with Autonomous Processing Modules and a Quantitative Information Integration Mechanism Reviewed

Kunio Kashino, Kazuhiro Nakadai, Tomoyoshi Kinoshita, Hidehiko Tanaka

Proc. IJCAI-95 1995

　More details

We propose a process model for hierarchical perceptual sound organization, which recognizes perceptual sounds included in incoming sound signals. We consider perceptual sound organization as a scene analysis problem in the auditory domain. Our model consists of multiple processing modules and a hypothesis network for quantitative integration of multiple sources of information. When input information for each processing module is available, the module rises to process it and asynchronously writes output information to the hypothesis network. On the hypothesis network, individual information...

researchmap

Books

AIの活用と感情に寄り添う音声認識・合成の新展開 Reviewed

伊藤彰則, 森川大輔, 上江洲安史, 鳥谷輝樹, 高野佐代子, 河原達也, 鵜木祐史, 齊藤剛史, 吉村奈津江, 平井重行, 中島佐和子, 大河内直之, 中臺一博, 糸山克寿, 福森隆寛, 周藤唯, 松田裕之, 渡辺光太朗, 白土浩司, 三井祥幹, 鳥居崇, 中川達也, 高橋敏, 加藤集平

エヌ・ティー・エス 2025.4 （ ISBN:9784860439361 ）

　More details

Total pages：1, 7, 254, 6p, 図版5p Language：Japanese

CiNii Books

researchmap
ロボット聴覚の基礎 : 実環境での音源定位・分離技術 Reviewed

中臺一博, 糸山克寿

オーム社 2025.2 （ ISBN:9784274232527 ）

　More details

Total pages：vi, 214p Language：Japanese

CiNii Books

researchmap
感覚デバイス開発―機器が担うヒト感覚の生成・拡張・代替技術 Reviewed

廣瀬通孝, 小柳光正, 石鍋隆宏, 川上徹, 小澤史朗, 八木康史, 長原一, 鏡慎吾, 徐剛, 奥乃博, 中臺一博 (ホンダ・リサーチ・インスティチュート・ジャパン), ほか執筆者

エヌティーエス 2014.11 （ ISBN:4864690642 ）

　More details

Total pages：424 Language：Japanese

ASIN

researchmap

MISC

野鳥の歌分析用マイクロホンアレイの開発とその応用

中臺一博

人工知能学会第二種研究会資料 2024 ( Challenge-064 ) 01 2024.3

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference) Publisher：一般社団法人人工知能学会

DOI： 10.11517/jsaisigtwo.2024.challenge-064_01

CiNii Research

J-GLOBAL

researchmap
LCMVベースのScan-and-Sum Beamformerによる面領域内音源の抽出

安江蒼人, YEN Benjamin, 糸山克寿, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 42nd 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
ガウス過程回帰を用いた音響伝達関数の環境変化適応

藤田侑樹, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 42nd ( Challenge-066 ) 06 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference) Publisher：一般社団法人人工知能学会

DOI： 10.11517/jsaisigtwo.2024.challenge-066_06

CiNii Research

J-GLOBAL

researchmap
Biasing Networkを用いた音声認識の雑音耐性向上

大崎崇博, 周藤唯, 糸山克寿, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 42nd 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
距離学習を用いた話者識別に基づく話者ダイアライゼーションの検討

阿坂脩平, 西田健次, 糸山克寿, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 42nd 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Video Vision Transformerに基づく音源定位の提案

横田遥大, BOZKURTLAR Mert, YEN Benjamin, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 42nd 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
屋外環境下でのドローンのローターノイズによる地表材質推定手法の検討

矢野翼, YEN Benjamin, 糸山克寿, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 42nd 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Detection of small moving objects as rare events in videos

西田健次, 糸山克寿, 中臺一博

人工知能学会第二種研究会資料(Web) 2024 ( Challenge-064 ) 05 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

DOI： 10.11517/jsaisigtwo.2024.challenge-064_05

CiNii Research

J-GLOBAL

researchmap
複数のドローンを用いた音源探査のためのROSネットワークの構築

山本拓実, 干場功太郎, YEN Benjamin, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 42nd 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Improvement in multi-drone sound source tracking considering self and other drone noise

三好智大, 山田泰基, YEN Benjamin, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 25th 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Environmental sound classification using microphones with a drone

野島稔生, 大崎崇博, 矢野翼, YEN Benjamin, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 25th 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Improvement in Target Speech Extraction Using Distance- and Speaker-Based Time-Frequency Masks

田口鐵人, 石井遼平, 大崎崇博, 阿坂脩平, YEN Benjamin, 糸山克寿, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 25th 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
HARK3.6 and Its Application to Active Drone Audition

中臺一博, 公文誠, 佐々木洋子, 干場功太郎, YEN Benjamin, 糸山克寿, 瀧ヶ平将行, 寺門直哉, LIN Zirui, GULZAR Haris, BUSTO Monikka Rosalianna, 江田毅晴, 天野英晴

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 25th 2024

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
ロボット聴覚のための音源定位と深層ブラインド音源分離の統合

合澤隆拓, 坂東宜昭, 糸山克寿, 西田健次, 中臺一博, 大西正輝

日本ロボット学会学術講演会予稿集(CD-ROM) 41st 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
面音源抽出のための複数拘束MVDRビームフォーマーの逐次計算による高速化

安江蒼人, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 41st 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
フォンミーゼス分布に基づく音響伝達関数オンライン適応の向上

藤田侑樹, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 41st 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
音声強調ネットワークとアダプターを用いた音声認識の耐雑音ロバスト性向上

大崎崇博, 周藤唯, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 41st 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Introduction of a Python Package for Robot Audition Open Source Software HARK and its implementation for embedded use

中臺一博, LIN Zirui, 糸山克寿, 瀧ヶ平将行, 寺門直哉, GULZAR Haris, BUSTO Monikka Rosalianna, 江田毅晴, 天野英晴

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 24th 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Towards Natural Spoken Dialogue Systems Based on AI Services

阿坂脩平, 西田健次, 糸山克寿, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 24th 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Ground Surface Material Estimation Using Drone Rotor Noise

矢野翼, 糸山克寿, 西田健次, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 24th 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Improved 3D spatial recognition based on audible sound-based echolocation with a 5-channel microphone array

小林宙輝, 糸山克寿, 西田健次, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 24th 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
気配センシングに向けた磁束密度センサと風速センサを用いた動作検出

川口洋慶, SHAKEEL Muhammad, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 41st 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
Online Adaptation of Fourier series based Lightweight Transfer Function to Improve Sound Source Localization and Separation

周藤唯, 瀧ケ平将行, 中臺一博, 中島弘史

人工知能学会第二種研究会資料(Web) 2023 ( Challenge-063 ) 08 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

DOI： 10.11517/jsaisigtwo.2023.challenge-063_08

CiNii Research

J-GLOBAL

researchmap
Improving Noise Robustness of Automatic Speech Recognition based on a Parallel Adapter Model with Near-Identity Initialization

大崎崇博, 周藤唯, 糸山克寿, 西田健次, 中臺一博

人工知能学会第二種研究会資料(Web) 2023 ( Challenge-063 ) 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
An approach to integrating evolutionary models and field experiments on avian vocalization using trait representations based on generative models

鈴木麗璽, 古山諒, HARLOW Zachary, 中臺一博, 有田隆也

人工知能学会第二種研究会資料(Web) 2023 ( Challenge-063 ) 07 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

DOI： 10.11517/jsaisigtwo.2023.challenge-063_07

CiNii Research

J-GLOBAL

researchmap
鳥類の鳴き声行動の理解に対するロボット聴覚に基づく観測と生成進化モデル

古山諒, 鈴木麗璽, 中臺一博, 有田隆也

日本鳥学会大会講演要旨集 2023 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
鳴き声の音源定位によるシマフクロウの生息位置把握の試み

土門優介, 鈴木祐太郎, 石塚正仁, 内山秀樹, 矢野幹也, 鈴木麗璽, 中臺一博

日本鳥学会大会講演要旨集 2023 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
マイクロホンアレイを用いた渡り鳥の群れの飛行ルート推定

山本悠貴, 鈴木麗璽, 中臺一博, 東信行

日本鳥学会大会講演要旨集 2023 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
一夫一妻制鳥類のリュウキュウコノハズクは交尾声で異性を惹きつけるのか?

金杉尚紀, 澤田明, 佐々木瑠太, 細江隼平, 中臺一博, 高木昌興

日本鳥学会大会講演要旨集 2023 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
ヒバリの求愛飛行実測の試み

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2023 2023

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

J-GLOBAL

researchmap
深層フルランク空間相関分析に基づく遠隔音声認識のフロントエンド

合澤隆拓, 坂東宜昭, 糸山克寿, 西田健次, 中臺一博

第84回全国大会講演論文集 2022 ( 1 ) 285 - 286 2022.2

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

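Robust speech recognition in crowded environments requires a front end that extracts the target source by sound source separation. Given the cost of training, such separation should ideally work without supervision, and methods based on linear probabilistic models, such as the complex angular central Gaussian mixture model and multichannel non-negative matrix factorization, have been proposed. This paper proposes a front end based on neural full-rank spatial covariance analysis (neural FCA), which has higher representational power. Neural FCA is a nonlinear probabilistic model that integrates a full-rank spatial model with a deep source model, and it can achieve finer separation performance than conventional frameworks without supervision. We extend neural FCA as a speech recognition front end for multi-party dialogue and report recognition performance evaluated on mixtures of multiple speakers that include diffuse noise.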

CiNii Books

CiNii Research

researchmap
Integration of Blockwise Streaming Automatic Speech Recognition with Voice Activity Detection International coauthorship

周藤唯, SHAKEEL Muhammad, 中臺一博, SHI Jiatong, 渡部晋二

人工知能学会第二種研究会資料(Web) 2022 ( Challenge-061 ) 10 2022

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

DOI： 10.11517/jsaisigtwo.2022.challenge-061_10

CiNii Research

J-GLOBAL

researchmap
PyHARK: HARK Python package supporting online and offline processing

中臺一博, 瀧ヶ平将行, 糸山克寿

人工知能学会第二種研究会資料(Web) 2022 ( Challenge-061 ) 04 2022

　More details

Language：Japanese

DOI： 10.11517/jsaisigtwo.2022.challenge-061_04

CiNii Research

J-GLOBAL

researchmap
Investigation of a method for detecting small objects from low-resolution images

西田健次, 糸山克寿, 中臺一博

人工知能学会第二種研究会資料(Web) 2022 ( Challenge-061 ) 03 2022

　More details

Language：Japanese

DOI： 10.11517/jsaisigtwo.2022.challenge-061_03

CiNii Research

J-GLOBAL

researchmap
音声に基づくヒクイナの個体数推定と生息地利用状況の可視化

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2022 2022

　More details

J-GLOBAL

researchmap
野外鳥類集団における音声相互作用分析のためのマイクロホンアレイに基づく自動観測の検討

鈴木麗璽, 炭谷晋司, 有田隆也, 松林志保, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2022 2022

　More details

J-GLOBAL

researchmap
ロボット聴覚用音響処理ソフトウェアHARKを用いたサウンドスケープの解析

山本遼, 西田健次, 糸山克寿, 松林志穂, 鈴木麗璽, 中臺一博

日本鳥学会大会講演要旨集 2022 2022

　More details

J-GLOBAL

researchmap
複数マイクロホンアレイのパラメータ同時最適化

杉山地塩, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 40th 2022

　More details

J-GLOBAL

researchmap
音源定位結果の3D可視化とmAPベースの評価指標の提案

山本遼, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 40th 2022

　More details

J-GLOBAL

researchmap
環境イベント識別学習フレームワークの提案とその日本語テキスト入力からの音響シーン生成部の実装

露口弘毅, MUHAMMAD Shakeel, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 40th 2022

　More details

J-GLOBAL

researchmap
アンサンブル時間周波数マスクを用いた複数の音声強調手法の統合

藤田雅彦, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 40th 2022

　More details

J-GLOBAL

researchmap
複数のマイクロホンアレイ搭載ドローンの配置最適化による音源追跡性能の向上

山田泰基, 糸山克寿, 西田健次, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 40th 2022

　More details

J-GLOBAL

researchmap
An Implementation of GHDSS on an FPGA board

QIN Ziquan, WEI Kaijie, 天野英晴, 中臺一博

電子情報通信学会技術研究報告(Web) 122 ( 174(RECONF2022 26-41) ) 2022

　More details

J-GLOBAL

researchmap
Adapting Acoustic Transfer Functions to Environmental Changes with Mode Filter

藤田侑樹, 糸山克寿, 西田健次, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 23rd 2022

　More details

J-GLOBAL

researchmap
2-Dimensional Interpolation of Acoustic Transfer Function and Application for Sound Source Localization

大崎崇博, 糸山克寿, 西田健次, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 23rd 2022

　More details

J-GLOBAL

researchmap
HARK 3.4-Introduction to PyHARK-

中臺一博, 糸山克寿

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 23rd 2022

　More details

J-GLOBAL

researchmap
Extraction of Two Dimensional Area with Expanded Scan-and-Sum Beamforming

安江蒼人, 糸山克寿, 西田健次, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 23rd 2022

　More details

J-GLOBAL

researchmap
Study of drone swarm action planning in multiple sound source tracking

山田泰基, 糸山克寿, 西田健次, 中臺一博

人工知能学会第二種研究会資料(Web) 2022 ( Challenge-061 ) 07 2022

　More details

Language：Japanese

DOI： 10.11517/jsaisigtwo.2022.challenge-061_07

CiNii Research

J-GLOBAL

researchmap
Calibration of microphone array shape with arbitrary sound mixtures as input

糸山克寿, 糸山克寿, 中臺一博

人工知能学会第二種研究会資料(Web) 2022 ( Challenge-061 ) 11 2022

　More details

Language：Japanese

DOI： 10.11517/jsaisigtwo.2022.challenge-061_11

CiNii Research

J-GLOBAL

researchmap
Off-loading of sound localization on an FPGA board

HOU Zhongyang, WEI Kaijie, 天野英晴, 中臺一博

電子情報通信学会技術研究報告(Web) 122 ( 174(RECONF2022 26-41) ) 2022

　More details

J-GLOBAL

researchmap
Soundscape analysis using robot audition open source software HARK

山本遼, 西田健次, 糸山克寿, 中臺一博, 中臺一博

日本生態学会大会講演要旨(Web) 69th 2022

　More details

J-GLOBAL

researchmap
Analysis of acoustic interactions among wild birds based on sound source localization techniques

鈴木麗璽, 炭谷晋司, 松林志保, 有田隆也, 中臺一博, 中臺一博, 奥乃博

日本生態学会大会講演要旨(Web) 69th 2022

　More details

J-GLOBAL

researchmap
野外での鳥類鳴き声観測のためのWebベース録音ユニットと可視化ツールの試作

炭谷晋司, 大和祐介, 鈴木麗璽, 小島諒介, 有田隆也, 中臺一博, 中臺一博, 奥乃博, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 39th 2021

　More details

J-GLOBAL

researchmap
Robot audition approaches to observations of bird vocalizations

鈴木麗璽, 炭谷晋司, 松林志保, 有田隆也, 中臺一博, 中臺一博, 奥乃博, 奥乃博

日本生態学会大会講演要旨(Web) 68th 2021

　More details

J-GLOBAL

researchmap
類似度行列を考慮した野鳥の歌自動識別の検討

山本遼, 中臺一博, 中臺一博, 西田健次, 糸山克寿

日本ロボット学会学術講演会予稿集(CD-ROM) 39th 2021

　More details

J-GLOBAL

researchmap
エコロケーションに基づく視覚シーンの再構成手法の提案と入力特徴量の検討

岸波華彦, 糸山克寿, 西田健次, 中臺一博, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 39th 2021

　More details

J-GLOBAL

researchmap
A playback experiment on songbirds using simulated vocalizations based on a generative model

炭谷晋司, 鈴木麗璽, 有田隆也, 和多和宏, 松林志保, 中臺一博, 中臺一博, 奥乃博

人工知能学会第二種研究会資料(Web) 2021 ( Challenge-058 ) 2021

　More details

J-GLOBAL

researchmap
Improvement of Sound Source Localization and Separation with Fully-Online Always-Adaptation of Transfer Functions

中臺一博, 中臺一博, 瀧ケ平雅行, 河合熊輔, 中島弘史

人工知能学会第二種研究会資料(Web) 2021 ( Challenge-058 ) 2021

　More details

J-GLOBAL

researchmap
Evaluation of spatial source separation using NMF with multiple microphone arrays under reverberation

鍵本泰宏, 糸山克寿, 西田健次, 中臺一博, 中臺一博

人工知能学会第二種研究会資料(Web) 2021 ( Challenge-058 ) 05 2021

　More details

Language：Japanese

DOI： 10.11517/jsaisigtwo.2021.challenge-058_05

CiNii Research

J-GLOBAL

researchmap
A Study of Sound Classification Using Transfer Learning

露口弘毅, 西田健次, 糸山克寿, 中臺一博, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 22nd 2021

　More details

J-GLOBAL

researchmap
Robot Audition 5.0-Evolution & Prospect-

中臺一博, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 22nd 2021

　More details

J-GLOBAL

researchmap
Evaluation of Speech Recognition Performance Improvement by Spotforming

合澤隆拓, 鍵本泰宏, 西田健次, 糸山克寿, 中臺一博, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 22nd 2021

　More details

J-GLOBAL

researchmap
複数マイクロホンアレイの同期および3次元位置・姿勢推定の同時最適化の検討

杉山地塩, 糸山克寿, 西田健次, 中臺一博, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 39th 2021

　More details

J-GLOBAL

researchmap
アンサンブル時間周波数マスクによる音声強調手法の評価

藤田雅彦, 糸山克寿, 西田健次, 中臺一博, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 39th 2021

　More details

J-GLOBAL

researchmap
ヒクイナの鳴き声自動観測の可能性と今後の課題

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 中臺一博, 奥乃博, 奥乃博

日本鳥学会大会講演要旨集 2021 (CD-ROM) 2021

　More details

J-GLOBAL

researchmap
類似度行列による野鳥の歌識別器の検討

山本遼, 中臺一博, 中臺一博, 糸山克寿, 西田健次, 鈴木麗璽, 松林志保

日本鳥学会大会講演要旨集 2021 (CD-ROM) 2021

　More details

J-GLOBAL

researchmap
ロボット聴覚技術に基づく鳥類音声の方位角・仰角に関する音源定位と音風景の観測

鈴木麗璽, 林晃一郎, 大坂英樹, 松林志保, 有田隆也, 中臺一博, 中臺一博, 奥乃博, 奥乃博

日本鳥学会大会講演要旨集 2021 (CD-ROM) 2021

　More details

J-GLOBAL

researchmap
Acoustic monitoring of owl fledglings

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 中臺一博, 奥乃博

景観生態学 25 ( 1 ) 87 - 89 2020.6

　More details

Language：Japanese

CiNii Books

J-GLOBAL

researchmap
Multi-scale approaches to observations of bird vocalizations using robot audition techniques

鈴木麗璽, 炭谷晋司, 松林志保, 有田隆也, 中臺一博, 中臺一博, 奥乃博

日本生態学会大会講演要旨(Web) 67th 2020.3

　More details

Language：Japanese

J-GLOBAL

researchmap
ロボット聴覚からのクロスモーダルへの期待—メディアエクスペリエンス・バーチャル環境基礎

中臺一博

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 119 ( 386 ) 107 - 112 2020.1

　More details

Language：Japanese Publisher：東京 : 電子情報通信学会

CiNii Books

CiNii Research

researchmap

Other Link： https://ndlsearch.ndl.go.jp/books/R000000004-I030249880
ドローン搭載マイクロホンアレイを用いた音源探査の高精度化に向けた静音プロペラの開発

干場功太郎, 野田龍介, 中田敏是, 劉浩, 泉田啓, 中臺一博, 中臺一博, 公文誠, 奥乃博, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 38th 2020

　More details

J-GLOBAL

researchmap
Examination of voice-based sentiment estimation method using facial expression-based sentiment estimation

西田健次, 山田亨, 糸山克寿, 中臺一博, 中臺一博

人工知能学会AIチャレンジ研究会(Web) 57th 2020

　More details

J-GLOBAL

researchmap
重み付け尤度関数と定在波を用いた可聴音による二次元環境認識

岸波華彦, 糸山克寿, 西田健次, 中臺一博, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 38th 2020

　More details

J-GLOBAL

researchmap
テニスの打球音による球種識別の検討

山本修己, 西田健次, 糸山克寿, 中臺一博, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 38th 2020

　More details

J-GLOBAL

researchmap
ロボット聴覚技術の活用による鳥類音声の到来方向に基づく音風景の可視化の検討

鈴木麗璽, ZHAO Hao, 炭谷晋司, 松林志保, 有田隆也, 中臺一博, 中臺一博, 奥乃博, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 38th 2020

　More details

J-GLOBAL

researchmap
複数マイクロホンアレイを用いたNMFによる空間音源分離法の提案と評価

鍵本泰宏, 糸山克寿, 西田健次, 中臺一博, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 38th 2020

　More details

J-GLOBAL

researchmap
環境音情報と画像情報を用いた物体検出による音ラベル付きセグメントの生成

鈴木啓, 糸山克寿, 西田健次, 中臺一博, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 38th 2020

　More details

J-GLOBAL

researchmap
The 31st IEEE/RSJ International Conference on Intelligent Systems and Robots (IROS 2018)

Nakadai Kazuhiro

Journal of the Robotics Society of Japan 37 ( 1 ) 70 - 72 2019

　More details

Language：Japanese Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.37.70

CiNii Books

CiNii Research

researchmap

Other Link： https://ndlsearch.ndl.go.jp/books/R000000004-I029462341
柔軟索状レスキューロボットのための空気噴射音下での単チャネル音声強調

坂東宜昭, 安部祐一, 糸山克寿, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 中臺一博, 奥乃博

日本機械学会ロボティクス・メカトロニクス講演会講演論文集(CD-ROM) 2019 2019

　More details

J-GLOBAL

researchmap
「見えない」鳥を音で追う:定位技術を活用した鳥類観測

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奧乃博

日本景観生態学会大会発表要旨集(Web) 29th 2019

　More details

J-GLOBAL

researchmap
ドローンによる地上音源の位置推定―HARKを用いたドローン聴覚の取り組み―

公文誠, 若林瑞保, 干場功太郎, 中臺一博, 中臺一博, 奥乃博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 19th ROMBUNNO.2E3‐09 2018.12

　More details

Language：Japanese

J-GLOBAL

researchmap
An Experiment of Drone Control and Sensor Data Transmission using 920MHz Multi-hop Wireless Communication System

加川敏規, 小野文枝, SHAN Lin, 三浦龍, 中臺一博, 干場功太郎, 公文誠, 奥乃博, 加藤晋, 児島史秀

電子情報通信学会技術研究報告 118 ( 344(RCC2018 58-106) ) 217‐221 2018.11

　More details

Language：Japanese

J-GLOBAL

researchmap
Fine-scale observations of spatiotemporal dynamics and vocalization type of birdsongs using microphone arrays and unsupervised feature mapping

Reiji Suzuki, Shinji Sumitani, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

Proceedings of the 10th International Conference on Ecological Informatics 72-73 2018.9

　More details

Language：English Publishing type：Research paper, summary (national, other academic conference)

researchmap
Spatial localization of vocalizations of Spotted Towhee (Pipilo maculatus) in playback experiments using robot audition techniques Reviewed

Shinji Sumitani, Reiji Suzuki, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

Proceedings of the 10th International Conference on Ecological Informatics 265 2018.9

　More details

Language：English Publishing type：Research paper, summary (national, other academic conference)

researchmap
音情報を活用したフクロウの歌行動観測の試み

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2018 72 2018.9

　More details

Language：Japanese

J-GLOBAL

researchmap
ロボット聴覚技術に基づく鳥類の歌行動の二次元定位精度改善と次元圧縮に基づく分類支援

炭谷晋司, 鈴木麗璽, 松林志保, 有田隆也, 中臺一博, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2018 73 2018.9

　More details

Language：Japanese

J-GLOBAL

researchmap
マイクロホンアレイを用いた鳥類の歌行動の三次元音源到来方向推定

林晃一郎, 鈴木麗璽, 松林志保, 有田隆也, 小島諒介, 中臺一博, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2018 74 2018.9

　More details

Language：Japanese

J-GLOBAL

researchmap
複数のマイクロホンアレイの遠隔制御に基づく鳥類の歌行動の二次元定位

森松健充, 炭谷晋司, 鈴木麗璽, 松林志保, 有田隆也, 中臺一博, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2018 72 2018.9

　More details

Language：Japanese

J-GLOBAL

researchmap
複数のマイクロホンアレイをネットワーク制御可能な鳥類の歌行動観測システムの構築

森松健充, 炭谷晋司, 鈴木麗璽, 松林志保, 有田隆也, 中臺一博, 中臺一博, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 36th ROMBUNNO.2J2‐03 2018.9

　More details

Language：Japanese

J-GLOBAL

researchmap
音響センサによるサイバー救助犬のパンディングの検出

鈴木拓也, 中臺一博, 中臺一博, 奥乃博, 星達也, 水野直希, 大貫和也, 濱田龍之介, 大野和則, 干場功太郎

日本ロボット学会学術講演会予稿集(CD-ROM) 36th ROMBUNNO.2J2‐05 2018.9

　More details

Language：Japanese

J-GLOBAL

researchmap
マイクロホンアレイを用いた鳥類の３次元音源到来方向推定

林晃一郎, 鈴木麗璽, 松林志保, 有田隆也, 小島諒介, 中臺一博, 奧乃博

日本鳥学会2018年度大会講演要旨集 74 2018.9

　More details

Language：Japanese

researchmap
Understanding relationships between spatial movements and bird song-types using a robot audition system HARK with microphone arrays Reviewed

Shinji Sumitani, Reiji Suzuki, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

Proc. of the 27th International Ornithological Congress 188 2018.8

　More details

Language：English Publishing type：Research paper, summary (national, other academic conference)

researchmap
Acoustic monitoring of the nocturnal owl (Strix uralensis) using microphone arrays and a robot audition system, HARK: A case study in the Ikoma mountains, Japan Reviewed

Shiho Matsubayashi, Fumiyuki Saito, Reiji Suzuki, Kazuhiro Nakadai, Hiroshi G. Okuno

Proc. of the 27th International Ornithological Congress 213 2018.8

　More details

Language：English Publishing type：Research paper, summary (national, other academic conference)

researchmap
Introduction to Sound Source Localization and Separation Software Using Microphone Array Processing

Nakadai Kazuhiro

SYSTEMS, CONTROL AND INFORMATION 62 ( 2 ) 42 - 49 2018.8

　More details

Language：Japanese Publisher：一般社団法人システム制御情報学会

DOI： 10.11509/isciesci.62.2_42

CiNii Books

CiNii Research

researchmap
Understanding ecoacoustic interactions among songbirds as complex systems using robot audition techniques

Reiji Suzuki, Shinji Sumitani, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

Abstract Booklet of EVOSLACE: Workshop on the emergence and evolution of social learning, communication, language and culture in natural and artificial agents in ALIFE2018 22 2018.7

　More details

Language：English Publishing type：Research paper, summary (national, other academic conference)

researchmap
Transition and the current technologies in acoustic signal processing: From the viewpoint of robot audition Reviewed

Nakadai Kazuhiro

THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN 74 ( 7 ) 394 - 400 2018.7

　More details

Language：Japanese Publisher：Acoustical Society of Japan

DOI： 10.20697/jasj.74.7_394

CiNii Books

CiNii Research

J-GLOBAL

researchmap
Field observations of ecoacoustic dynamics of a Japanese bush warbler using an open-source software for robot audition HARK Reviewed

Reiji Suzuki, Shinji Sumitani, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

Journal of Ecoacoustics 2 EYAJ46 2018.6

　More details

Language：English Publishing type：Rapid communication, short report, research note, etc. (scientific journal)

researchmap
Development of Robot Audition to Extreme Environments

奥乃博, 糸山克寿, 中臺一博, 中臺一博, 公文誠, 坂東宜昭, 干場功太郎

システム制御情報学会研究発表講演会講演論文集(CD-ROM) 62nd ROMBUNNO.221‐1 2018.5

　More details

Language：Japanese Publisher：システム制御情報学会

J-GLOBAL

researchmap
ロボット聴覚技術を活用した鳥類の行動観測

鈴木麗璽, 中臺一博, 奥乃博

日本鳥学会誌（フォーラム） 67 ( 1 ) 155-157 2018.5

　More details

Language：Japanese Publishing type：Internal/External technical report, pre-print, etc.

researchmap
ロボット聴覚技術を用いた鳥類の歌行動分析の試み―複数のマイクロホンアレイを用いた二次元リアルタイム歌定位―

鈴木麗璽, 炭谷晋司, 中臺一博, 中臺一博, 奥乃博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 18th ROMBUNNO.1D6‐04 2017.12

　More details

Language：Japanese

J-GLOBAL

researchmap
人間とロボットとの対話環境における対話終了タイミングの検討 (情報ネットワーク)

北川遼, 蓮本諒介, 今井倫太, 中臺一博

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 117 ( 306 ) 31 - 34 2017.11

　More details

Language：Japanese Publisher：電子情報通信学会

CiNii Books

CiNii Research

researchmap
Establishment and Experimental Demonstration of Distant Speech Recognition System for Communication Robot

山本俊一, 住田直亮, 中臺一博

Honda R&D technical review 29 ( 2 ) 110 - 117 2017.10

　More details

Language：Japanese Publisher：本田技術研究所

CiNii Books

CiNii Research

researchmap
マイクロホンアレイを利用したウグイスの歌行動の時空間分析

炭谷晋司, 鈴木麗璽, 有田隆也, 松林志保, 中臺一博, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2017 92 2017.9

　More details

Language：Japanese

J-GLOBAL

researchmap
マイクロフォンアレイを用いた野鳥観測:ソウシチョウの歌行動をめぐる予備的調査報告

松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2017 92 2017.9

　More details

Language：Japanese

J-GLOBAL

researchmap
ロボット聴覚技術を活用した野鳥の歌行動観測・分析ツールHARKBirdの機能強化

千葉尚彬, 炭谷晋司, 松林志保, 鈴木麗璽, 有田隆也, 中臺一博, 中臺一博, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 35th ROMBUNNO.3A3‐03 2017.9

　More details

Language：Japanese

J-GLOBAL

researchmap
UAV搭載マイクロホンアレイを用いた組み込みシステムによる音源探査性能の評価

干場功太郎, 中臺一博, 中臺一博, 公文誠, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 35th ROMBUNNO.3A2‐04 2017.9

　More details

Language：Japanese

J-GLOBAL

researchmap
マルチロータヘリコプタ収録音の音源分離におけるシステムパラメータと分離性能について―GHDSSとBNP‐MAPの比較

鷲崎海, 公文誠, 大塚琢馬, 奥乃博, 干場功太郎, 中臺一博, 中臺一博

日本ロボット学会学術講演会予稿集(CD-ROM) 35th ROMBUNNO.3A2‐05 2017.9

　More details

Language：Japanese

J-GLOBAL

researchmap
災害救助犬の呼吸音と周囲の音を同時に計測するサイバスーツの開発

水野直希, 大貫和也, 星達也, 山口竣平, 濱田龍之介, 大野和則, 中臺一博, 奥乃博, 田所諭

日本ロボット学会学術講演会予稿集(CD-ROM) 35th ROMBUNNO.3A3‐02 2017.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Contributing to a Community of Open Source Software

中臺一博

映像情報メディア学会誌 = The journal of the Institute of Image Information and Television Engineers 71 ( 5 ) 647 - 653 2017.9

　More details

Language：Japanese Publisher：映像情報メディア学会

DOI： 10.3169/itej.71.647

CiNii Books

CiNii Research

researchmap
Investigation of passive acoustic anemometer using wind noise correlation

73 ( 8 ) 472 - 479 2017.8

　More details

Language：Japanese

CiNii Books

researchmap
Field observations and virtual experiences of bird songs in the soundscape using an open-source software for robot audition HARK

Shinji Sumitani, Reiji Suzuki, Takaya Arita, Naren, Shiho Matsubayashi, Kazuhiro Nakadai, Hiroshi G. Okuno

Abstract Book of 4th International Symposium on Acoustic Communication by Animals 116-117 2017.7

　More details

Language：English Publishing type：Rapid communication, short report, research note, etc. (scientific journal)

researchmap
Bird song explorer: 野鳥の歌行動体験のための立体音響に基づく仮想森林アプリケーション

娜仁, 鈴木麗璽, 有田隆也, 中臺一博, 奥乃博

第79回全国大会講演論文集 2017 ( 1 ) 239 - 240 2017.3

　More details

Language：Japanese Publisher：情報処理学会

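We are developing HARKBird, a simple system for observing and analyzing the singing behavior of wild birds using microphone arrays and the robot audition software HARK. Experiencing the observed sound space immersively is expected to have a wide range of uses, from contributing to the understanding of wild bird ecology to education and outreach. In this presentation, we propose an application, built with the Unity game engine, that represents wild birds living and singing in a virtual forest or similar setting in three-dimensional space. Specifically, bird songs that were recorded at several survey sites and then localized and separated are placed and played back in a virtual field with the same timing and direction as in the real environment. The user moves an avatar to search for birds while listening to the songs immersively in 3D audio. Arbitrary songs can also be placed to suit the purpose.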

CiNii Books

CiNii Research

researchmap
マイクロホンアレイ搭載UAVを用いた屋外実環境実時間音源探査

干場功太郎, 若林瑞保, 鷲崎海, 石木隆洋, 公文誠, GABRIEL Daniel, 中臺一博, 中臺一博, 奥乃博

情報処理学会全国大会講演論文集 79th ( 1 ) 1.199‐1.200 2017.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Bird song explorer:野鳥の歌行動体験のための立体音響に基づく仮想森林アプリケーション

NARAN, 鈴木麗璽, 有田隆也, 中臺一博, 中臺一博, 奥乃博

情報処理学会全国大会講演論文集 79th ( 4 ) 4.239‐4.240 2017.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Report of JSAI SigConf 2016

中臺一博, 小林一郎, 和泉潔

人工知能 : 人工知能学会誌 : journal of the Japanese Society for Artificial Intelligence 32 ( 2 ) 297 - 304 2017.3

　More details

Language：Japanese Publisher：人工知能学会 ; 2014-

DOI： 10.11517/jjsai.32.2_297

CiNii Books

CiNii Research

researchmap
A Study on body movements and postures at Human-Robot Interaction using speech and image information

蓮本諒介, 小山大幾, 水本武志, 中村圭佑, 中臺一博, 今井倫太

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 461 ) 19 - 22 2017.2

　More details

Language：Japanese Publisher：電子情報通信学会

CiNii Books

researchmap
A Study on body movements and postures at Human-Robot Interaction using speech and image information

蓮本諒介, 小山大幾, 水本武志, 中村圭佑, 中臺一博, 今井倫太

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 462 ) 19 - 22 2017.2

　More details

Language：Japanese Publisher：電子情報通信学会

CiNii Books

researchmap
外来種ソウシチョウが在来種の歌行動へ与える影響を探る:マイクロフォンアレイを用いた森林性鳥類の観測実例

松林志保, 斉藤史之, 鈴木麗璽, 千葉尚彬, 中臺一博, 中臺一博, 奥乃博

人工知能学会AIチャレンジ研究会(Web) 49th 23‐28 (WEB ONLY) 2017

　More details

Language：Japanese

J-GLOBAL

researchmap
Evaluation of microphone array for sound source localization using UAV

HOSHIBA Kotaro, WASHIZAKI Kai, WAKABAYASHI Mizuho, KUMON Makoto, NAKADAI Kazuhiro

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2017 ( 0 ) 1P1 - R05 2017

　More details

Language：Japanese Publisher：The Japan Society of Mechanical Engineers

Sound source localization using a microphone array embedded on an unmanned aerial vehicle has been studied to detect and localize people who need help in a disaster-stricken area. Because such sound source localization should work in outdoor environments, the design of the microphone array is crucial. We thus developed two types of microphone array: a 16ch two-storied hexagonal array and a 12ch spherical array. These two microphone arrays were evaluated via numerical simulation with discussions on the appropriate design of microphone arrays.

DOI： 10.1299/jsmermd.2017.1P1-R05

researchmap
Real-Time Human-Voice Enhancement for a Hose-Shaped Rescue Robot Based on Multi-Channel Low-Rank Sparse Decomposition

Bando Yoshiaki, Ambe Yuichi, Itoyama Katsutoshi, Konyo Masashi, Tadokoro Satoshi, Nakadai Kazuhiro, Yoshii Kazuyoshi, G. Okuno Hiroshi

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2017 ( 0 ) 1P2 - P05 2017

　More details

Language：Japanese Publisher：The Japan Society of Mechanical Engineers

This paper presents a real-time human-voice enhancement method for a hose-shaped rescue robot based on multi-channel low-rank sparse decomposition. Although microphone arrays equipped on hose-shaped robots are crucial for finding victims under collapsed buildings, human voices captured by the microphone array are contaminated by environment-dependent and non-stationary ego-noise. Our method decomposes multi-channel amplitude spectrograms into sparse and low-rank components (human voice and noise) without any prior training. This decomposition is conducted with a state-space model representing the dynamics of these components in a mini-batch manner. Experimental results show that the performance difference between our method and its offline version is less than 3dB in signal-to-distortion ratio.

DOI： 10.1299/jsmermd.2017.1p2-p05

researchmap
アクティブ周波数レンジフィルタを用いた雑音にロバストな音源定位手法の提案

干場功太郎, 中臺一博, 中臺一博, 公文誠, 奥乃博

人工知能学会AIチャレンジ研究会(Web) 49th 9‐14 (WEB ONLY) 2017

　More details

Language：Japanese

J-GLOBAL

researchmap
HARK2.3の紹介とタフロボティクスチャレンジへの展開

中臺一博, 中臺一博, 中臺一博, 坂東宜昭, 水本武志, 干場功太郎, 小島諒介, 糸山克寿, 杉山治, 公文誠, 奥乃博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 17th ROMBUNNO.3A3‐3 2016.12

　More details

Language：Japanese

J-GLOBAL

researchmap
空間情報を用いた鳥の歌分析 Invited

小島諒介, 杉山治, 干場功太郎, 鈴木麗璽, 中臺一博

第46回AIチャレンジ研究会予稿集 (SIG-Challenge) 046-05 25-31 2016.11

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

researchmap
複数のマイクロホンアレイとロボット聴覚ソフトウエアHARKを用いた野鳥の観測精度の検討 Invited

松林志保, 鈴木麗璽, 小島諒介, 中臺一博

人工知能学会2015年度研究会優秀賞記念講演集 10-15 2016.11

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

researchmap
Semi-Automatic Bird Song Analysis by Spatial-Cue-Based Integration of Sound Source Detection, Localization, Separation, and Identification Reviewed

Ryosuke Kojima, Osamu Sugiyama, Reiji Suzuki, Kazuhiro Nakadai, Charles E. Taylor

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2016) 1287-1292 2016.10

　More details

Language：English Publishing type：Research paper, summary (national, other academic conference)

researchmap
Automatic impulse-response-truncating method affecting less on broadband phase information

72 ( 10 ) 627 - 634 2016.10

　More details

Language：Japanese

CiNii Books

researchmap
マイクロホンアレイを用いた森林性野鳥の定位精度の検証とその応用:歌の空間的な位置およびタイミングから知る複数種の棲み分け

松林志保, 鈴木麗璽, 小島諒介, 中臺一博, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2016 138 2016.9

　More details

Language：Japanese

J-GLOBAL

researchmap
マイクロホンアレイを用いたオオヨシキリのソングポスト定位

鈴木麗璽, 松林志保, 斎藤史之, 村手達佳, 増田智久, 山本晃一, 小島諒介, 中臺一博, 中臺一博, 奥乃博

日本鳥学会大会講演要旨集 2016 151 2016.9

　More details

Language：Japanese

J-GLOBAL

researchmap
音源位置を考慮した音源同定のための確率モデルとその学習

小島諒介, 杉山治, 鈴木麗璽, 中臺一博

第34回日本ロボット学会学術講演会 (RSJ2016)資料 4 pages 2016.9

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

researchmap
変分ベイズ多チャネルRNMFに基づく柔軟索状レスキューロボットのための音声強調

坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 中臺一博, 吉井和佳, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 34th ROMBUNNO.1C2‐04 2016.9

　More details

Language：Japanese

J-GLOBAL

researchmap
変分ベイズ多チャネルロバストNMFに基づくマイクロホンの移動・被覆を許容する音声強調 (音声) -- (オーガナイズドセッション「あらゆる音を対象とした情報処理の実現に向けて」)

坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 河原達也, 奥乃博

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116 ( 189 ) 47 - 52 2016.8

　More details

Language：Japanese Publisher：電子情報通信学会

CiNii Books

researchmap
The Past, the Present, and the Future of Special Interest Groups of JSAI

Journal of the Japanese Society for Artificial Intelligence 31 ( 4 ) 531 - 549 2016.7

　More details

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jjsai.31.4_531

CiNii Books

researchmap
The Past, the Present, and the Future of Special Interest Groups of JSAI

和泉潔, 中臺一博, 栗原聡

人工知能 : 人工知能学会誌 : journal of the Japanese Society for Artificial Intelligence 31 ( 4 ) 531 - 549,530 2016.7

　More details

Language：Japanese Publisher：人工知能学会 ; 2014-

DOI： 10.11517/jjsai.31.4_531

CiNii Books

CiNii Research

researchmap
3D Posture Estimation for a Hose-shaped Rescue Robot using a Microphone and Accelerometer Array

Bando Yoshiaki, Itoyama Katsutoshi, Konyo Masashi, Tadokoro Satoshi, Nakadai Kazuhiro, Yoshii Kazuyoshi, G. Okuno Hiroshi

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2016 ( 0 ) 1A2 - 10a6 2016.6

　More details

Language：Japanese Publisher：一般社団法人日本機械学会

This paper presents an online method that estimates the 3D posture of a hose-shaped rescue robot using a microphone and accelerometer array. Posture (shape) estimation of a self-driving hose-shaped rescue robot is crucial for handling the robot body because the unseen robot posture deforms in narrow spaces under collapsed buildings. The conventional sound-based method, which uses time differences of arrival (TDOAs), works only on a two-dimensional surface and is often hampered by the rubble around the robot. Our method eliminates the outliers of sound-based TDOA measurements and compensates for the lack of posture information with the tilt information measured by accelerometers. Experimental results using a 3-m hose-shaped robot that was deployed in a simple 3D structure demonstrate that our method reduces the errors of initial states to about 20 cm in the 3D space.

DOI： 10.1299/jsmermd.2016.1A2-10a6

J-GLOBAL

researchmap
Development of Robot Audition under Severe Conditions

G. OKUNO Hiroshi, NAKADAI Kazuhiro, KUMON Makoto, ITOYAMA Katsutoshi, YOSHII Kazuyoshi, BANDO Yoshiaki, Sasaki Yoko

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2016 ( 0 ) 1A2 - 09b3 2016.6

　More details

Language：Japanese Publisher：一般社団法人日本機械学会

The ability of robots to listen to several things at once with their own "ears", i.e., robot audition, is critical in improving the performance of search and rescue activities under severe conditions. This paper introduces the "HARK" robot audition open-source software and its capabilities of suppressing ego-noise caused by the robot's own movements, such as motor, propeller and/or flying noise. It then describes three main applications of robot audition: 1) An Unmanned Aerial Vehicle (UAV) with a microphone array to capture sounds can localize a sound source by suppressing ego-noise while hovering, gliding slowly, or gliding fast. It can also recognize a sound source with a CNN. 2) A serpentine robot with a microphone array can estimate its posture by sound. It can also enhance a voice by Online Robust PCA. 3) A robot with a LiDAR and 32-channel microphone can visualize a sound map by superimposing sound source directions on point clouds.

DOI： 10.1299/jsmermd.2016.1a2-09b3

CiNii Research

J-GLOBAL

researchmap
Online Localization of Multiple Sound Sources and Multiple Robots with Asynchronous Microphone Arrays

Sekiguchi Kouhei, Bando Yoshiaki, Nakamura Keisuke, Nakadai Kazuhiro, Itoyama Katsutoshi, Yoshii Kazuyoshi

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2016 1A2-09b5 2016.6

　More details

Language：Japanese Publisher：The Japan Society of Mechanical Engineers

This paper presents an online method for localizing the positions of multiple sound sources and stationary robots and synchronizing microphone arrays attached to those robots. Since each robot can estimate only the directions of sound sources, the two-dimensional source positions can be estimated by triangulation from the source directions estimated by each robot. In addition, mixture signals can be separated accurately by regarding multiple microphone arrays as one big array. To perform these tasks, some methods have been proposed for localizing and synchronizing microphone arrays. These methods, however, assume that only a single sound source exists. To overcome this limitation, we estimate the directions of arrival (DOAs) and separate observed signals to estimate the time differences of arrival (TDOAs) by using microphone array techniques, and integrate the DOAs and TDOAs by using a state-space model. The latent variables are estimated in an online manner with a FastSLAM2.0 algorithm.

DOI： 10.1299/jsmermd.2016.1A2-09b5

CiNii Research

J-GLOBAL

researchmap
Report of JSAI SigConf 2015(Special Interest Group Report)

Izumi Kiyoshi, Nakadai Kazuhiro, Yamakawa Hiroshi

journal of the Japanese Society for Artificial Intelligence 31 ( 2 ) 299 - 304 2016.3

　More details

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

DOI： 10.11517/jjsai.31.2_299

CiNii Books

CiNii Research

researchmap
Computational Creation of Footsteps Illusion Art and Its Practical Applications

Nakadai Kazuhiro, Okuno Hiroshi G, Mizumoto Takeshi, Nakamura Keisuke

シミュレーション = Journal of the Japan Society for Simulation Technology 35 ( 1 ) 32 - 38 2016.3

　More details

Language：Japanese Publisher：小宮山印刷工業

CiNii Books

CiNii Research

researchmap
音源到来方向・時間差を用いた非同期複数マイクロホンアレイ位置のオンライン推定

関口航平, 中村圭佑, 坂東宜昭, 糸山克寿, 吉井和佳, 中臺一博

情報処理学会第78回全国大会 2016 ( 1 ) 483 - 484 2016.3

　More details

Language：Japanese Publishing type：Research paper, summary (national, other academic conference)

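This paper describes a method for estimating the synchronization offsets and positions of multiple asynchronous microphone arrays. Auditory scene recognition techniques such as sound source localization and separation can achieve higher accuracy with multiple robots equipped with microphone arrays than with a single robot. However, microphone array signal processing with multiple robots requires estimating the position of each robot and the synchronization offsets between the microphone arrays. In this paper, the synchronization offsets and positions of the asynchronous microphone arrays are estimated from sound source localization results and phase information estimated separately for each microphone array. We design a state-space model whose latent variables are the positions of the robots and sound sources and the synchronization offsets, and estimate its posterior distribution online.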

CiNii Books

CiNii Research

researchmap
Robust Recognition of Simultaneous Speech By a Mobile Robot

Jean-Marc Valin, Shun'ichi Yamamoto, Jean Rouat, Francois Michaud, Kazuhiro Nakadai, Hiroshi G. Okuno

IEEE Transactions on Robotics, Vol. 23, No. 4, pp. 742-752, 2007 2016.2

　More details

Publishing type：Internal/External technical report, pre-print, etc.

This paper describes a system that gives a mobile robot the ability to perform automatic speech recognition with simultaneous speakers. A microphone array is used along with a real-time implementation of Geometric Source Separation and a post-filter that gives a further reduction of interference from other sources. The post-filter is also used to estimate the reliability of spectral features and compute a missing feature mask. The mask is used in a missing feature theory-based speech recognition system to recognize the speech from simultaneous Japanese speakers in the context of a humanoid robot. Recognition rates are presented for three simultaneous speakers located at 2 meters from the robot. The system was evaluated on a 200 word vocabulary at different azimuths between sources, ranging from 10 to 90 degrees. Compared to the use of the microphone array source separation alone, we demonstrate an average reduction in relative recognition error rate of 24% with the post-filter and of 42% when the missing features approach is combined with the post-filter. We demonstrate the effectiveness of our multi-source microphone array post-filter and the improvement it provides when used in conjunction with the missing features theory.

DOI： 10.1109/TRO.2007.900612

arXiv

researchmap
UAV搭載マイクアレイを用いた高雑音環境下における音イベント検出・識別の並列最適化

杉山治, 小島諒介, 中臺一博, 中臺一博

人工知能学会AIチャレンジ研究会(Web) 46th 32‐36 (WEB ONLY) 2016

　More details

Language：Japanese

J-GLOBAL

researchmap
部分共有アーキテクチャを用いた深層学習ベースの音源同定の検討

森戸隆之, 杉山治, 小島諒介, 中臺一博, 中臺一博

人工知能学会AIチャレンジ研究会(Web) 46th 12‐17 (WEB ONLY) 2016

　More details

Language：Japanese

J-GLOBAL

researchmap
深層学習による多チャネル音響信号に対する音源同定の検討

森戸隆之, 杉山治, 上村知史, 小島諒介, 中臺一博, 中臺一博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 16th ROMBUNNO.2D1‐4 2015.12

　More details

Language：Japanese

J-GLOBAL

researchmap
HARK2.2の新機能とその組込み,SaaSへの展開

中臺一博, 中臺一博, 水本武志, 中村圭佑, 奥乃博

計測自動制御学会システムインテグレーション部門講演会(CD-ROM) 16th ROMBUNNO.2M2‐1 2015.12

　More details

Language：Japanese

J-GLOBAL

researchmap
ロバスト主成分分析を用いた動作雑音抑圧に基づく柔軟索状ロボットのための音声強調

坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 中臺一博, 吉井和佳, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 33rd ROMBUNNO.2D2-05 2015.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Automatic impulse response truncation based on relative amplitude spectrum

中島弘史, 坂田直人, 加科優希, 中臺一博

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 28 208 - 213 2015.8

　More details

Language：Japanese Publisher：[電子情報通信学会]

CiNii Research

J-GLOBAL

researchmap
Wind-induced noise reduction by linear beamforming using a 2-channel microphone

坂田直人, 村上哲郎, 中島弘史, 中臺一博

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 28 359 - 364 2015.8

　More details

Language：Japanese Publisher：[電子情報通信学会]

CiNii Research

J-GLOBAL

researchmap
両耳聴ロボット聴覚ソフトウェアHARK‐BinauralとRaspberry Pi2を用いたヒューマノイドロボットへの適用

坂東宜昭, 金宜鉉, 糸山克寿, 吉井和佳, 中臺一博, 中臺一博, 奥乃博

情報処理学会研究報告(Web) 2015 ( MUS-107 ) VOL.2015-MUS-107,NO.33 (WEB ONLY) 2015.5

　More details

Language：Japanese

J-GLOBAL

researchmap
柔軟索状レスキューロボットのためのロバスト主成分分析を用いた走行雑音抑圧

坂東宜昭, 池宮由楽, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

第77回全国大会講演論文集 2015 ( 1 ) 505 - 506 2015.3

　More details

Language：Japanese

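This paper describes a method for suppressing driving noise for a flexible hose-shaped rescue robot. At disaster sites that are difficult for people to enter (for example, collapsed houses), searching with a rescue robot guided by victims' voices is useful. With ground-traveling robots such as hose-shaped rescue robots, the robot's own driving noise makes victims' voices hard to hear, and because the noise depends on the contact surface, it has been difficult to predict in advance. To solve this problem, this study suppresses driving noise using robust principal component analysis, which can remove recurring frequency components without prior information. The proposed method was evaluated in experiments using recordings obtained by actually operating the robot.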

CiNii Books

researchmap
柔軟索状レスキューロボットのためのロバスト主成分分析を用いた走行雑音抑圧

坂東宜昭, 池宮由楽, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

情報処理学会全国大会講演論文集 77th ( 2 ) 2.505-2.506 2015.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Wind-induced Noise Reduction in Time Domain Using Closely-aligned Two Microphones

SAKATA Naoto, NAKAJIMA Hirofumi, NAKADAI Kazuhiro

IEICE technical report. Signal processing 114 ( 474 ) 1 - 6 2015.3

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

In this report, wind-induced noise reduction in the time domain was investigated using two closely aligned microphones. A linear beamforming filter in the frequency domain, based on time frame decomposition, was applied to signals in the time domain. The filter's ability to reduce wind-induced noise was compared with and without the time frame decomposition. For recorded signals disturbed by wind-induced noise, the reduction improved the signal-to-noise ratio by about 2 to 13 dB. When the filter was composed of some simple delay units, the time frame decomposition strongly influenced the filter's ability to reduce wind-induced noise.

CiNii Books

CiNii Research

researchmap
Wind-induced Noise Reduction in Time Domain Using Closely-aligned Two Microphones

SAKATA Naoto, NAKAJIMA Hirofumi, NAKADAI Kazuhiro

IEICE technical report. Speech 114 ( 475 ) 1 - 6 2015.3

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

In this report, wind-induced noise reduction in the time domain was investigated using two closely spaced microphones. A linear beamforming filter in the frequency domain, based on time-frame decomposition, was applied to time-domain signals. The filter's ability to reduce wind-induced noise was compared with and without time-frame decomposition. The wind-induced noise reduction improved the signal-to-noise ratio by about 2 to 13 dB for recorded signals disturbed by wind-induced noise. When the filter was composed of a few simple delay units, time-frame decomposition strongly influenced its ability to reduce wind-induced noise.

CiNii Books

researchmap
Wind-induced Noise Reduction in Time Domain Using Closely-aligned Two Microphones

SAKATA Naoto, NAKAJIMA Hirofumi, NAKADAI Kazuhiro

Technical report of IEICE. EA 114 ( 473 ) 1 - 6 2015.3

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

In this report, wind-induced noise reduction in the time domain was investigated using two closely spaced microphones. A linear beamforming filter in the frequency domain, based on time-frame decomposition, was applied to time-domain signals. The filter's ability to reduce wind-induced noise was compared with and without time-frame decomposition. The wind-induced noise reduction improved the signal-to-noise ratio by about 2 to 13 dB for recorded signals disturbed by wind-induced noise. When the filter was composed of a few simple delay units, time-frame decomposition strongly influenced its ability to reduce wind-induced noise.

CiNii Books

researchmap
TeleCoBot : A Telepresence system of taking account for conversation environment

TAKAHASHI Masaaki, OGATA Masa, IMAI Michita, NAKAMURA Keisuke, NAKADAI Kazuhiro

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 351 ) 1 - 5 2014.12

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

Telepresence robots have become a popular subject of study as communication tools for remote places. However, telepresence systems cannot accurately convey the user's utterances because they do not consider differences in the sound environment, such as noise. In addition, when talking with several people at a remote place, the user wants to change the speaker volume freely depending on the situation. We therefore propose a telepresence conversation robot named "TeleCoBot". It automatically regulates the volume of the user's utterances according to the distance to the conversation partner and the noise level at the remote place. The user can also change the volume freely depending on the conversation situation. In this paper, we conduct a case study; the results indicated that TeleCoBot's UI should be more effective and enhance the presence.

CiNii Books

researchmap
Deep Neural Networkを用いたマルチモーダル音声認識

野田邦昭, 山口雄紀, 中臺一博, 奥乃博, 尾形哲也

日本ロボット学会学術講演会予稿集(CD-ROM) 32nd ROMBUNNO.1I1-04 2014.9

　More details

Language：Japanese

J-GLOBAL

researchmap
マイクロホンアレイを用いた駆動機構付ホース型ロボットの姿勢推定

坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 32nd ROMBUNNO.1I2-02 2014.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Design and Implementation of Multidirectional Sound Annotation Tool with HARK

SUGIYAMA Osamu, ITOYAMA Katsutoshi, NAKADAI Kazuhiro, OKUNO Hiroshi G

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 114 ( 85 ) 23 - 26 2014.6

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

In this study, we designed and developed a multidirectional sound source annotation tool built on the robot audition software HARK. With the rise of inexpensive microphone array products and HARK, multidirectional sound sources can be recorded and analyzed easily. The combination of a microphone array and the software enables us to separate, localize, and track multidirectional sound sources. Most solutions for accessing this separated sound source information provide clients for interpreting simplified information about the separated sources, but not for directly performing semantic annotation. Our proposed sound annotation tool provides drag-and-drop annotation with a 3D sound source view and annotation autocompletion with an SVM trained on the user's annotation history. These features enable users to perform the annotation task intuitively and confirm its result. We also conducted an evaluation demonstrating the efficiency of annotation done using the tool.

CiNii Books

CiNii Research

researchmap
Deep Neural Networkを用いたマルチモーダル音声認識の為の特徴量学習

山口雄紀, 野田邦昭, 中臺一博, 奥乃博, 尾形哲也

第76回全国大会講演論文集 2014 ( 1 ) 465 - 466 2014.3

　More details

Language：Japanese

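The goal of this study is to design image features for multimodal speech recognition. To improve the accuracy of multimodal speech recognition, an important issue is how to extract, from lip images, information that represents phonemes and visemes, the minimal units of speech recognition. In this study, we built a method that extracts image features from a large number of lip images in a self-organizing manner using a Deep Neural Network (DNN), which has been attracting attention as a new feature learning technique. We verified the obtained image features on an isolated word recognition task and, by analyzing the feature space, also examined their relationship to visemes. We also examined the improvement in noise robustness achieved by audio-visual integration of the obtained image features and speech.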

CiNii Books

researchmap
マイクロホンアレイの位置推定によるホース型ロボットの姿勢推定

坂東宜昭, 大塚琢馬, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 奥乃博

第76回全国大会講演論文集 2014 ( 1 ) 189 - 190 2014.3

　More details

Language：Japanese

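A hose-shaped robot is a rescue robot characterized by its long, thin body, which can explore spaces such as gaps in collapsed buildings. Posture estimation methods for this robot using accelerometers, camera images, and similar sensors have been proposed to make operation more efficient, but they suffered from problems such as cumulative error. This paper describes a method that attaches a microphone array and small loudspeakers to the robot and estimates its posture by localizing them with sound. The method estimates posture using the time differences of arrival, at each microphone, of test sounds emitted from the loudspeakers; because the time differences of arrival reflect the current positional relationship between the microphones and loudspeakers, past errors can be corrected. The effectiveness of the method was evaluated using real recorded data.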

CiNii Books

researchmap
音ランドマークを用いたマルチコプターの定位

ラナシナパヤ, 中村圭佑, 中臺一博, 高橋秀幸, 木下哲男

第76回全国大会講演論文集 2014 ( 1 ) 185 - 186 2014.3

　More details

Language：English

We propose a novel approach to multicopter localization, using sound landmarks and one embedded microphone. This approach benefits multicopter localization in that it requires less computational power and smaller payloads than image-based approaches. However, the high ego-noise of multicopters is a serious threat to sound-based algorithms. We simulated a 2D localization method based on a Kalman Filter using measurements of acceleration and sound landmarks' intensity. A random walk model is used to update the multicopter's position with the Kalman Filter; the calculated estimate is then corrected using noisy measurements from the embedded microphone and accelerometer. Simulation results show that the proposed algorithm can successfully track the multicopter's motion in a noisy environment. We confirmed the effectiveness of our proposed algorithm by comparing its performance and robustness to a time/phase based algorithm.

CiNii Books

researchmap
DI-1-6 Scene Analysis Based on Robot Audition

Nakadai Kazuhiro, Nakamura Keisuke, Tezuka Taiki

Proceedings of the IEICE General Conference 2014 ( 2 ) "SS - 18"-"SS-19" 2014.3

　More details

Language：Japanese Publisher：一般社団法人電子情報通信学会

CiNii Books

CiNii Research

researchmap
相関行列スケーリングを用いた屋外音源探索手法の解析

大畑琢磨, 長峰諒英, 中村圭佑, 石崎孝幸, 水本武志, 中臺一博

人工知能学会AIチャレンジ研究会(Web) 41st 2014

　More details

J-GLOBAL

researchmap
Online calibration and transfer function estimation of an asynchronous microphone array

Nakadai Kazuhiro, Nakamura Keisuke

THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN 70 ( 7 ) 397 - 402 2014

　More details

Language：Japanese Publisher：一般社団法人日本音響学会

DOI： 10.20697/jasj.70.7_397

CiNii Books

CiNii Research

researchmap
マイクロホンアレイとスピーカをもつ柔軟索状ロボットのための動的スピーカ選択による姿勢推定の高速化

坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

人工知能学会AIチャレンジ研究会(Web) 41st 8 (WEB ONLY) 2014

　More details

Language：Japanese

J-GLOBAL

researchmap
TelePaBot : A telepresence system for supporting multi-party conversation

Koike Kyotaro, Imai Michita, Nakamura Keisuke, Nakadai Kazuhiro

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 372 ) 1 - 6 2013.12

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

A telepresence robot is useful in situations where a user in a remote area has to control the robot to communicate with people. However, some issues remain: the target speech is contaminated with unwanted speech, and the remote user cannot understand the speech in multi-party conversation. We propose a telepresence party robot, "TelePaBot", that visualizes the positions of utterances and provides a selective listening function. A case study suggested that TelePaBot smooths remote communication even when multi-party conversation occurs.

CiNii Books

researchmap
Volume Adaptation and Visualization by Modeling the Volume Level in Actual Noise Environment for Telepresence System

HAYAMIZU Akira, IMAI Michita, NAKAMURA Keisuke, NAKADAI Kazuhiro

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 113 ( 372 ) 35 - 40 2013.12

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

The Lombard effect is the involuntary tendency of speakers to increase their vocal effort when speaking in loud noise to enhance the audibility of their voice. In telecommunication, the Lombard effect causes a problem: the speaker talks more loudly than necessary for the conversation partner at the remote location. In this paper, we describe the design and the model required to automatically adjust the operator's volume in remote communication via a mobile telepresence robot in the real world, and we developed LOMBOT, an optimal volume control system equipped with this model. As a result, we confirmed that the volume is adjusted properly to the noise at the remote location.

CiNii Books

researchmap
Incremental Noise Estimation in Outdoor Auditory Scene Analysis using a Quadrocopter with a Microphone Array

OKUTANI Keita, YOSHIDA Takami, NAKAMURA Keisuke, NAKADAI Kazuhiro

Journal of the Robotics Society of Japan 31 ( 7 ) 676 - 683 2013.9

　More details

Language：Japanese Publisher：The Robotics Society of Japan

This paper addresses sound source localization using an aerial vehicle with a microphone array in an outdoor environment to realize outdoor auditory scene analysis. It, for instance, aims at finding distressed people in a disaster situation. In such an environment, noise is quite loud and dynamically-changing, and conventional microphone array techniques studied in the field of indoor robot audition are of less use. We, thus, proposed MUltiple SIgnal Classification based on incremental Generalized EigenValue Decomposition (iGEVD-MUSIC). It can deal with dynamically-changing high power noise by introducing incrementally-estimated noise correlation matrices. We developed a prototype system for the outdoor auditory scene analysis based on the proposed method using the Parrot AR.Drone with an 8ch microphone array and a Kinect device. Experimental results using the prototype system showed that dynamically-changing noise is properly suppressed with the proposed method even when the signal-to-noise ratio is less than 0dB in an outdoor/indoor environment with the hovering/moving AR.Drone.

DOI： 10.7210/jrsj.31.676

CiNii Books

researchmap
Multirotor UAVを用いた音源定位のための雑音相関行列推定

古川孝太郎, 大塚琢馬, 糸山克寿, 中臺一博, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 31st ROMBUNNO.3D3-02 2013.9

　More details

Language：Japanese

J-GLOBAL

researchmap
ホース型ロボットのマイクロホンアレイを用いた姿勢推定

坂東宜昭, 大塚琢馬, 水本武志, 糸山克寿, 中臺一博, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 31st ROMBUNNO.3D3-01 2013.9

　More details

Language：Japanese

J-GLOBAL

researchmap
話者ダイアライゼーションシステムのための音声区間検出および到来方向推定の精度向上の検討

黄楊暘, 大塚琢馬, 中臺一博, 奥乃博

第75回全国大会講演論文集 2013 ( 1 ) 479 - 480 2013.3

　More details

Language：Japanese Publisher：情報処理学会

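Robot audition requires a sound-environment understanding function that clarifies when, where, and who spoke. In this paper, to address these questions, we define a speaker diarization system as processing that combines voice activity detection, direction-of-arrival estimation, and speaker identification. In the robot audition software HARK, voice activity detection and direction-of-arrival estimation are performed with the MUSIC algorithm as preprocessing. However, when processing is based on the MUSIC spectrum, the number-of-sources parameter and the threshold parameter greatly affect the results. In this paper, we proposed a speaker diarization system that uses blind source separation as preprocessing. Although a volume threshold parameter still needs to be set, improved accuracy is obtained.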

CiNii Books

CiNii Research

researchmap
チューブ型ロボットの姿勢推定のためのEKF-SLAMを用いた可変マイクロホンアレイ位置推定

坂東宜昭, 水本武志, 中臺一博, 奥乃博

全国大会講演論文集 2013 ( 1 ) 439 - 441 2013.3

　More details

Language：Japanese Publisher：一般社団法人情報処理学会

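Tube-shaped robots that can enter rubble are useful for finding victims at disaster sites. Moreover, if a tube-shaped robot has a sound source localization function, victims' positions can be estimated from their voices. However, recent high-accuracy sound source localization methods estimate direction from audio recorded with a microphone array whose positions are known, whereas the microphone layout of a tube-shaped robot cannot be measured in advance. This paper therefore proposes a microphone position estimation method based on EKF-SLAM and solves this problem by estimating the constantly changing robot posture. The effectiveness of the method was confirmed using both numerical experiments and real recordings.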

CiNii Books

CiNii Research

researchmap
チューブ型ロボットの姿勢推定のためのEKF-SLAMを用いた可変マイクロホンアレイ位置推定

坂東宜昭, 水本武志, 中臺一博, 奥乃博

第75回全国大会講演論文集 2013 ( 1 ) 439 - 440 2013.3

　More details

Language：Japanese

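Tube-shaped robots that can enter rubble are useful for finding victims at disaster sites. Moreover, if a tube-shaped robot has a sound source localization function, victims' positions can be estimated from their voices. However, recent high-accuracy sound source localization methods estimate direction from audio recorded with a microphone array whose positions are known, whereas the microphone layout of a tube-shaped robot cannot be measured in advance. This paper therefore proposes a microphone position estimation method based on EKF-SLAM and solves this problem by estimating the constantly changing robot posture. The effectiveness of the method was confirmed using both numerical experiments and real recordings.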

CiNii Books

CiNii Research

researchmap
クアドロコプターを用いた飛行雑音に頑健な音源定位

古川孝太郎, 奥谷啓太, 柳楽浩平, 大塚琢馬, 中臺一博, 奥乃博

第75回全国大会講演論文集 2013 ( 1 ) 489 - 490 2013.3

　More details

Language：Japanese

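This study addresses sound source localization in the surrounding environment using a microphone array mounted on a quadrocopter, a small unmanned aerial vehicle with multiple rotors. During flight, noise caused by wind pressure and rotor drive is usually extremely large and can degrade localization accuracy. In such noisy environments, accuracy is known to improve when the noise correlation matrix is taken into account by the MUSIC method with generalized eigenvalue decomposition. This study therefore reduces the problem to estimating the noise correlation matrix, which changes dynamically with flight. On that basis, we propose an estimation method that uses the vehicle's monitoring information, such as flight control data, and develop a sound source localization method robust to flight noise.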

CiNii Books

researchmap
ホースの伸び縮みによるマイク位置の変化を許容するマイクロホンアレイを用いたホース型ロボットの姿勢推定

坂東宜昭, 大塚琢馬, 糸山克寿, 中村圭佑, 昆陽雅司, 田所諭, 中臺一博, 奥乃博

人工知能学会AIチャレンジ研究会(Web) 38th 10 (WEB ONLY) 2013

　More details

Language：Japanese

J-GLOBAL

researchmap
2P1-P24 Development of a Sound Source Localization System for Assisting Group Conversation(Communication Robot)

Moon Seong-eun, Takagi Kentaro, Kamashima Tsutomu, Nakadai Kazuhiro, Otake Mihoko

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2013 ( 0 ) _2P1 - P24_1-_2P1-P24_2 2013

　More details

Language：Japanese Publisher：The Japan Society of Mechanical Engineers

This paper presents a sound source localization system composed of a wireless microphone array named Jellyfish-02 and the robot audition software HARK. Jellyfish-02 surpasses existing microphone arrays in design and usability because it has a cover with a rechargeable battery and can be connected to a wireless network. We evaluated the sound source localization performance of Jellyfish-02 and investigated the percentage of overlapping speech periods in natural conversation. From the results, Jellyfish-02 is potentially applicable to assisting group conversation by measuring the duration of speech for each participant.

CiNii Books

J-GLOBAL

researchmap
マイクロホンアレイを用いた複数人対話からの音声区間検出および話者方向推定の評価手法

黄楊暘, 大塚琢馬, 中臺一博, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 30th ROMBUNNO.3D1-4 2012.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Sensing Technology for Listening to a Mixture of Sounds

OKUNO Hiroshi G, NAKADAI Kazuhiro, MIZUMOTO Takeshi

The Journal of the Institute of Electronics, Information, and Communication Engineers 95 ( 5 ) 401 - 404 2012.5

　More details

Language：Japanese Publisher：一般社団法人電子情報通信学会

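The sounds we hear every day are mixtures of multiple sounds and background noise. To make use of sound information in the real world, the ability to distinguish sounds by listening is essential. From the viewpoint of instrumentation, sensing technology for distinguishing sounds consists of devices (sensors) that record sound and software that processes the recorded sound. This article reviews trends in sensing technology for sound mixtures with respect to robot audition and the observation of frog choruses. From the standpoint of distinguishing mixed sounds, we have held that sound source localization, sound source separation, and recognition of separated sounds should be addressed, and we have pursued research on sound-environment understanding for the past 15 years. Listening from a distance is an indispensable technology for robots, and we have developed and released HARK, software that provides the functions essential to robot audition in an integrated manner. We survey HARK from its design philosophy to its concrete implementation, and report its applications to sound-environment visualization and to human-robot symbiosis. In the application of analyzing the chorusing mechanism of frogs by distinguishing sounds, acoustic processing alone is difficult because of the various sounds heard in the field, so we developed the 「カエルホタル」 (frog-firefly) device, which picks up nearby sound and lights an LED. We also report observation experiments on frog calling behavior in which many of these devices were deployed in an actual rice paddy. Through these reports, we propose that technology for distinguishing mixed sounds will become important in the future.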

CiNii Books

CiNii Research

researchmap
Kinectによる楽器マスキングを用いた視聴覚統合ビートトラッキング

糸原達彦, 水本武志, 大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

第74回全国大会講演論文集 2012 ( 1 ) 355 - 356 2012.3

　More details

Language：Japanese Publisher：一般社団法人情報処理学会

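Real-time beat tracking of human guitar performances must cope with complex beat patterns such as syncopation and with tempo fluctuations in human playing. We have previously developed audio-visual beat tracking that uses the trajectory of the playing hand, which correlates strongly with the audio information. However, because the guitar and the hand are similar in color, the performance of hand trajectory tracking and beat tracking was insufficient. In this paper, we use Kinect, which has a depth sensor in addition to audio and visual sensors, to mask the image by distance and extract the hand region. We show that this method makes hand tracking more robust and improves beat-tracking accuracy.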

CiNii Books

researchmap
Kinectによる楽器マスキングを用いた視聴覚統合ビートトラッキング

糸原達彦, 水本武志, 大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

全国大会講演論文集 2012 ( 1 ) 355 - 357 2012.3

　More details

Language：Japanese Publisher：一般社団法人情報処理学会

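Real-time beat tracking of human guitar performances must cope with complex beat patterns such as syncopation and with tempo fluctuations in human playing. We have previously developed audio-visual beat tracking that uses the trajectory of the playing hand, which correlates strongly with the audio information. However, because the guitar and the hand are similar in color, the performance of hand trajectory tracking and beat tracking was insufficient. In this paper, we use Kinect, which has a depth sensor in addition to audio and visual sensors, to mask the image by distance and extract the hand region. We show that this method makes hand tracking more robust and improves beat-tracking accuracy.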

CiNii Books

researchmap
多チャンネルマイクロホンアレイを用いた音声区間検出および音源定位の精度の向上の検討

HUANG Yangyang, 大塚琢馬, 中臺一博, 奥乃博

人工知能学会AIチャレンジ研究会(Web) 36th 5 (WEB ONLY) 2012

　More details

Language：Japanese

J-GLOBAL

researchmap
ロボットのための実環境ロバストな実時間超解像三次元音源定位

中村圭佑, 中臺一博, 奥乃博

人工知能学会AIチャレンジ研究会(Web) 36th 2 (WEB ONLY) 2012

　More details

Language：Japanese

J-GLOBAL

researchmap
遠隔ユーザの音環境理解を支援するユーザインタフェース

植田俊輔, 今井倫太, 中村圭佑, 中臺一博

人工知能学会全国大会論文集 2012 ( 0 ) 3K1R111 - 3K1R111 2012

　More details

Language：Japanese Publisher：一般社団法人人工知能学会

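Humans can understand, to some extent, where and what kind of conversation is taking place even in noisy environments, but with a remotely operated robot avatar it is difficult for the remote operator to understand the sound environment at the remote site. This paper proposes UI-ALT, a user interface that supports smooth interaction between the operator and the remote site even in noisy environments. An offline experiment showed that UI-ALT is useful for the remote operator's understanding of noisy environments.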

DOI： 10.11517/pjsai.jsai2012.0_3k1r111

CiNii Books

CiNii Research

researchmap
Intelligent Human Tracking Based on Multimodal Integration

NAKAMURA Keisuke, NAKADAI Kazuhiro, ASANO Futoshi, NAKAJIMA Hirofumi, INCE Gökhan

Transactions of the Society of Instrument and Control Engineers 48 ( 6 ) 349 - 358 2012

　More details

Language：English Publisher：The Society of Instrument and Control Engineers

Localization and tracking of humans are essential research topics in robotics. In particular, Sound Source Localization (SSL) has been of great interest. Despite the numerous reported methods, SSL in a real environment has had three main issues: robustness against high-power noise, the lack of a framework for selective listening to sound sources, and tracking of inactive and/or noisy sound sources. For the first issue, we extended Multiple SIgnal Classification by incorporating Generalized Eigen Value Decomposition (GEVD-MUSIC) so that it can deal with high-power noise and can select target sound sources. For the second issue, we proposed Sound Source Identification (SSI) based on hierarchical Gaussian mixture models and integrated it with GEVD-MUSIC to realize a function to listen to a specific sound source according to the type of the sound source. For the third issue, auditory and visual human tracking were integrated using particle filtering. These three techniques are integrated into an intelligent human tracking system. Experimental results showed that integration of SSL and SSI successfully achieved human tracking by audition alone, and the audio-visual integration showed considerable improvement in tracking by compensating for the loss of auditory or visual information.

researchmap
A Platform for Recognizing Interactive Behavior on Human-Robot Interaction

SHIOMI Masahiro, IWAI Yoshio, SUMI Yasuyuki, NAKADAI Kazuhiro, HAGITA Norihiro

Journal of the Robotics Society of Japan 29 ( 10 ) 883 - 886 2011.12

　More details

Language：Japanese Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.29.883

CiNii Books

researchmap
Intelligent Human Tracking based on Information Integration

NAKAMURA Keisuke, NAKADAI Kazuhiro, INCE Gokhan

IEICE technical report 111 ( 32 ) 35 - 40 2011.5

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

Since scene recognition and robot perception have been of great interest, information integration has become a significant research topic in robotics. From the viewpoint of scalability and reusability, the use of appropriate middleware is a key factor in improving total system performance. This paper presents a methodology for integrating multimodal information through the construction of an intelligent human tracking system. Our system architecture interoperably combines two different types of middleware: HARK and ROS. HARK uses dataflow-oriented middleware for real-time processing, while ROS is event-driven middleware for easy integration. We confirmed that the proposed architecture realized real-time processing and considerable improvements in noise robustness in human tracking.

CiNii Books

CiNii Research

researchmap
ロボット聴覚用オープンソースソフトウェア HARKの展開

中臺一博, 奥乃博

デジタルプラクティス 2 ( 2 ) 133 - 140 2011.4

　More details

Language：Japanese Publisher：情報処理学会

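This article describes the deployment of HARK (HRI-JP Audition for Robots with Kyoto Univ.), which is being researched and developed as open-source software for robot audition. HARK is software that supports sound source localization, sound source separation, and recognition of the separated speech based on input from multiple microphones (a microphone array). By placing and connecting various modules in a GUI programming environment, it can be adapted to robots with different shapes and microphone layouts and used to build robot audition systems tailored to specific purposes. This article explains HARK's design principles and also reports application examples of systems built with HARK and HARK's deployment.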

CiNii Books

CiNii Research

researchmap
累積頻度重みを適用したパーティクルフィルタによる実時間楽譜追従

大塚琢馬, 中臺一博, 高橋徹, 尾形哲也, 奥乃博

第73回全国大会講演論文集 2011 ( 1 ) 305 - 306 2011.3

　More details

Language：Japanese

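In score following with a particle filter, tracking performance depends heavily on the computation of particle weights based on the distance between the audio signal and the score. Conventional weight computation methods using vector inner products or sigmoid functions had the problem that, owing to inharmonic components of the audio signal and variations in instrument timbre, the weights did not differ greatly between correct and incorrect score position estimates, so the finally estimated score position contained errors. In this paper, weights are computed dynamically from the cumulative frequency of previously computed distances, so that higher weights are computed at correct score positions. Evaluation experiments confirmed that the weight computation method using cumulative frequency improves score-following accuracy over conventional weight computation methods.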

CiNii Books

researchmap
Audio-visual musical instrument recognition

Angelica Lim, 中村圭佑, 中臺一博, 尾形哲也, 奥乃博

第73回全国大会講演論文集 2011 ( 1 ) 309 - 310 2011.3

　More details

Language：English

Is this person playing a violin or a flute? Classification of musical instrument performances is usually carried out using audio features such as spectral coefficients. We propose augmenting the typical audio feature set with visual features. We show that a combination of audio and visual features performs better than audio alone, and verify this multimodal recognition approach on a real-time robot platform.

CiNii Books

researchmap
Machine Audition Technology that Listens to Multiple Voiced Speech at Once

G. OKUNO Hiroshi, NAKADAI Kazuhiro

The Journal of The Institute of Electrical Engineers of Japan 131 ( 3 ) 159 - 163 2011.3

　More details

Language：Japanese Publisher：一般社団法人電気学会

This article has no abstract.

DOI： 10.1541/ieejjournal.131.159

CiNii Books

CiNii Research

researchmap
Robot Audition : Hands-Free Automatic Speech Recognition under Highly-Noisy Environments

NAKADAI Kazuhiro, OKUNO Hiroshi G

IEICE technical report 110 ( 401 ) 7 - 12 2011.1

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

This paper addresses robot audition, which realizes listening capabilities for robots using robot-embedded microphones. For robot audition, we propose real-time sound source separation and automatic speech recognition (ASR) techniques for dynamically changing environments based on microphone array processing, which is applicable to hands-free ASR under highly-noisy environments. Implementation of the proposed techniques is open-sourced as robot audition software called "HARK." We show the effectiveness of these techniques through applications of HARK to robots.

CiNii Books

CiNii Research

researchmap
マルチロボットによるKinectを用いた同期合奏

糸原達彦, 水本武志, LIM Angelica, 大塚琢馬, 中村圭佑, 長谷川雄二, 中臺一博, 尾形哲也, 奥乃博

人工知能学会AIチャレンジ研究会(Web) 34th B102-10 (WEB ONLY) 2011

　More details

Language：Japanese

J-GLOBAL

researchmap
音源定位手法MUSICのベイズ拡張

大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

人工知能学会AIチャレンジ研究会(Web) 34th B102-6 (WEB ONLY) 2011

　More details

Language：Japanese

J-GLOBAL

researchmap
AI-1-3 OPEN-SOURCED ROBOT AUDITION SOFTWARE HARK

Okuno Hiroshi G, Nakadai Kazuhiro, Takahashi Toru

Proceedings of the Society Conference of IEICE 2010 "SS - 72"-"SS-73" 2010.8

　More details

Language：Japanese Publisher：一般社団法人電子情報通信学会

CiNii Books

CiNii Research

researchmap
ロボット聴覚ソフトウエアHARKとそのロボットへの適用

高橋徹, 中臺一博, 奥乃博

電気関係学会東海支部連合大会講演論文集(CD-ROM) 2010 ROMBUNNO.S3-1 2010.8

　More details

Language：Japanese

J-GLOBAL

researchmap
Real time speaker orientation estimation using a room microphone array

HARUBARA Takuya, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, KANEDA Yutaka

IEICE technical report 110 ( 131 ) 19 - 24 2010.7

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

This paper addresses a real-time sound source orientation estimation system using a 96ch microphone array. We proposed a beamforming method with estimation of sound source directivity, and reported orientation estimation of a speech source such as a loudspeaker or an actual human. Furthermore, we showed that the precision of the orientation estimation system is improved by introducing four additional techniques: amplitude extraction, correlation-based automatic voice activity detection (VAD), a frequency mask, and histogram integration. We developed a real-time sound source orientation system. However, the precision of the real-time system is not yet sufficient for practical use. In this paper, we investigate the main causes of the estimation error and propose an advanced real-time orientation estimation system. The experimental results show that the advanced system has lower errors than the previous system by 20° to 30°.

CiNii Books

CiNii Research

researchmap
Score Following by Particle Filtering for Music Robots

OTSUKA Takuma, NAKADAI Kazuhiro, TAKAHASHI Toru, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

全国大会講演論文集 72 ( 0 ) 913 - 914 2010.3

　More details

Language：English

CiNii Books

researchmap
Robot audition system development and parameter-tuning in real environment

TAKAHASHI Toru, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

全国大会講演論文集 72 ( 0 ) 29 - 30 2010.3

　More details

Language：Japanese

CiNii Books

researchmap
Self-speech cancellation with Semi-blind ICA for Robot speech interaction

TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G.

全国大会講演論文集 72 ( 0 ) 27 - 28 2010.3

　More details

Language：Japanese

CiNii Books

researchmap
Robot Audition Open-Sourced Software HARK

奥乃博, 中臺一博

日本ロボット学会誌(Journal of the Robotics Society of Japan) 28 ( 1 ) 6 - 9 2010.1

　More details

Language：Japanese Publisher：日本ロボット学会

DOI： 10.7210/jrsj.28.6

CiNii Books

CiNii Research

researchmap
On special issue "Robot Audition"

中臺一博, 宮下敬宏, 奥乃博

日本ロボット学会誌(Journal of the Robotics Society of Japan) 28 ( 1 ) 1 - 1 2010.1

　More details

Language：Japanese Publisher：日本ロボット学会

CiNii Books

CiNii Research

researchmap
On special issue "Robot Audition"

NAKADAI Kazuhiro, MIYASHITA Takahiro, OKUNO Hiroshi G

Journal of the Robotics Society of Japan 28 ( 1 ) 1 - 1 2010.1

　More details

Language：Japanese Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.28.1

CiNii Books

CiNii Research

researchmap
Robot Audition Open-Sourced Software HARK

OKUNO Hiroshi G, NAKADAI Kazuhiro

Journal of the Robotics Society of Japan 28 ( 1 ) 6 - 9 2010.1

　More details

Language：Japanese Publisher：The Robotics Society of Japan

DOI： 10.7210/jrsj.28.6

CiNii Books

researchmap
リサンプル‐ブロック処理と並列化に基づくICAの実時間実装

武田龍, 中臺一博, 高橋徹, 尾形哲也, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 28th ROMBUNNO.1H3-1 2010

　More details

Language：Japanese

J-GLOBAL

researchmap
打楽器とロボットとの合奏のための結合振動子モデルに基づく打撃時刻予測

水本武志, 中臺一博, 大塚琢馬, 高橋徹, 尾形哲也, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 28th ROMBUNNO.1H3-2 2010

　More details

Language：Japanese

J-GLOBAL

researchmap
Blind dereverberation improved by multi-stage processing

NAKAJIMA Hirofumi, DAIGO Tohru, NAKADAI Kazuhiro, KANEDA Yutaka, HASEGAWA Yuji

IEICE technical report 109 ( 136 ) 7 - 12 2009.7

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

This paper addresses a multi-stage processing mechanism that improves various dereverberation methods. In the mechanism, each stage is implemented as an intermediate processing module that connects the outputs of the modules on the previous stage to the inputs of other modules on the next stage. Since the dereverberation performance at each stage depends on the input channel combinations, we proposed two additional processes: channel selection and delay addition. We applied our proposed mechanism with these two processes to two dereverberation methods, i.e., SBM and RDAIF. The proposed system showed the following results: (1) The channel selection process improved performance by 3-10 dB. The optimum combination can reduce the number of input channels without any degradation. (2) The delay addition process improved the suppression performance by 3-10 dB. (3) With the multi-stage mechanism, the performance for SBM and RDAIF is 18.2 dB and 13.6 dB, respectively, whereas without the mechanism it is only 14.6 dB and 3.5 dB, respectively. We conclude that the proposed mechanism and processes are effective in improving dereverberation performance.

CiNii Books

researchmap
Blind Dereverberation Improved By Multi-Stage Processing

NAKAJIMA Hirofumi, DAIGO Tohru, NAKADAI Kazuhiro, KANEDA Yutaka, HASEGAWA Yuji

電子情報通信学会技術研究報告. EA, 応用音響 109 ( 136 ) 7 - 12 2009.7

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

This paper addresses a multi-stage processing mechanism that improves various dereverberation methods. In the mechanism, each stage is implemented as an intermediate processing module that connects the outputs of the modules on the previous stage to the inputs of other modules on the next stage. Since the dereverberation performance at each stage depends on the input channel combinations, we proposed two additional processes: channel selection and delay addition. We applied our proposed mechanism with these two processes to two dereverberation methods, i.e., SBM and RDAIF. The proposed system showed the following results: (1) The channel selection process improved performance by 3-10 dB. The optimum combination can reduce the number of input channels without any degradation. (2) The delay addition process improved the suppression performance by 3-10 dB. (3) With the multi-stage mechanism, the performance for SBM and RDAIF is 18.2 dB and 13.6 dB, respectively, whereas without the mechanism it is only 14.6 dB and 3.5 dB, respectively. We conclude that the proposed mechanism and processes are effective in improving dereverberation performance.

CiNii Books

CiNii Research

researchmap
The design of a directional sound source for numerical simulation based on wave acoustics

鈴木淑正, 中島弘史, 中臺一博

聴覚研究会資料 39 ( 4 ) 325 - 330 2009.6

　More details

Language：Japanese Publisher：日本音響学会聴覚研究委員会

CiNii Books

CiNii Research

researchmap
The design of a directional sound source for numerical simulation based on wave acoustics

SUZUKI Toshimasa, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, ARAI Takahiro, HASEGAWA Yuji

IEICE technical report 109 ( 100 ) 109 - 114 2009.6

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

Thanks to improvements in computer performance, numerical simulation based on wave acoustics runs in practical time on off-the-shelf computers. Such a numerical simulation method accurately estimates a sound field when the environment is simple and simulated, like a free sound field. However, this method has difficulties in simulating a real-world acoustic environment. One of the issues for real-world simulation is dealing with sound directivity. Thus, most numerical simulators assume a point sound source to avoid this issue. Several studies that cope with sound directivity have been reported, but their accuracy and practical utility are insufficient for real-world simulation, because an accurate sound propagation model is necessary to deal with sound directivity. We use a compact finite difference method based on sound field digitization, which has an accurate sound propagation model. However, this method also has a problem: two points are simulated differently even when they are located at the same distance from the sound source, owing to the difference in the effect of numerical dispersion. In this paper, we first confirm the performance of our method by using an omni-directional point source in a free sound field. After that, we show that our method is able to simulate a directional sound source accurately using a combination of a simple loudspeaker and a point source model.

CiNii Books

researchmap
Simultaneous three talker speech recognition using soft mask and model adaptation technique

TAKAHASHI Toru, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

全国大会講演論文集 71 ( 0 ) 35 - 36 2009.3

　More details

Language：Japanese

CiNii Books

researchmap
Realtime Synchronization Method between Audio Signal and Score Using Beats, Melodies, and Harmonies for Singer Robots

OTSUKA Takuma, MURATA Kazumasa, TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, OGATA Tetsuya, OKUNO Hiroshi G.

全国大会講演論文集 71 ( 0 ) 243 - 244 2009.3

　More details

Language：Japanese

CiNii Books

researchmap
Perspective of Robot Systems Coexisting with People

NAKADAI Kazuhiro, HASEGAWA Yuji, SEKIGUCHI Tatsuhiko, TSUJINO Hiroshi

Journal of the Robotics Society of Japan 27 ( 1 ) 6 - 9 2009.1

　More details

Language：Japanese Publisher：一般社団法人日本ロボット学会

DOI： 10.7210/jrsj.27.6

CiNii Books

researchmap
Panel Discussion : Application Developments of Speech Recognition

NISIMURA Ryuichi, NAKANO Teppei, KURIHARA Kazutaka, NAKADAI Kazuhiro, YOSHINO Takashi

IPSJ SIG Notes 2008 ( 102 ) 55 - 60 2008.10

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

To encourage the development of ASR applications, this panel discussion introduces actual case studies. We also point out some problems in ASR application development.

CiNii Books

researchmap
ICA-based Robot Audition for recognizing barge-in speech under reverberation

武田龍, 中臺一博, 高橋徹, 駒谷和範, 尾形哲也, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 26th Paper No. 1A2-02 2008.9

　More details

Language：Japanese

J-GLOBAL

researchmap
A Beat-Tracking Robot for Human-Robot Interaction and Its Evaluation

村田和真, 中臺一博, 武田龍, 吉井和佳, 奥乃博, 鳥井豊隆, 長谷川雄二, 辻野広司

日本ロボット学会学術講演会予稿集(CD-ROM) 26th Paper No. 1A1-03 2008.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Improving Speech Recognition of Periphery Talkers by Generating Soft Masks for Robot Audition

高橋徹, 中臺一博, 駒谷和範, 尾形哲也, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 26th Paper No. 1A1-01 2008.9

　More details

Language：Japanese

J-GLOBAL

researchmap
A Study of Acoustic Features and Masks for Simultaneous Speech Recognition of Multiple Talkers Based on Missing Feature Theory

高橋徹, 中臺一博, 駒谷和範, 尾形哲也, 奥乃博

日本音響学会研究発表会講演論文集(CD-ROM) 2008 Paper No. 2-P-16 2008.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Estimation of sound source orientation using a 96 channel microphone array

KIKUCHI Keiko, DAIGO Tohru, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, KANEDA Yutaka

IEICE technical report 108 ( 143 ) 13 - 18 2008.7

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

This paper addresses sound source orientation estimation using a 96-channel microphone array. We have proposed a beamforming method that estimates sound source directivity, and reported orientation estimation for a speech source such as a loudspeaker or an actual human. In that method, however, the transfer function used to design the beamformer must match that of the target sound source; otherwise, performance deteriorates because of the mismatch between the two transfer functions. In addition, voice activity detection (VAD) was performed manually. To solve the former problem, we propose amplitude-based orientation estimation using a histogram, which relaxes the effect of the mismatch caused mainly by phase errors and outliers. For the latter, we introduce speech frequency component detection based on the inner product and automatic VAD based on auto-correlation to form a frequency-temporal masking pattern. Preliminary experiments showed that sound source orientation estimation with automatic VAD for actual human voices improved drastically, even when using a loudspeaker-based transfer function.

CiNii Books

researchmap
Design and Evaluation of Barge-In enable Robot Audition System with ICA and MFT-based ASR

TAKEDA Ryu, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G.

全国大会講演論文集 70 ( 0 ) 135 - 136 2008.3

　More details

Language：Japanese

CiNii Books

researchmap
1P1-G13 Overview of Open Source Software for Robot Audition

Nakadai Kazuhiro, Yamamoto Shunichi, Okuno Hiroshi G, Nakajima Hirofumi, Hasegawa Yuji, Tsujino Hiroshi

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2008 ( 0 ) 1P1-G13(1) - 1P1-G13(4) 2008

　More details

Language：Japanese Publisher：一般社団法人日本機械学会

This paper describes an open source software system for robot audition called HARK (Honda Research Institute Japan Audition for Robots with Kyoto University). HARK consists of many modules for robot audition, including multi-channel audio input, sound source localization, sound source tracking, sound source separation, and recognition of separated speech, built on FlowDesigner, a data-flow-oriented software programming environment. By combining these modules in a GUI environment, a user can easily build a robot audition system for various types of robots and acoustic environments. By applying HARK to Honda ASIMO and Robovie with different microphone settings, we showed that HARK offers high software portability and reusability.

CiNii Books

CiNii Research

J-GLOBAL

researchmap
Construction and Evaluation of a Beat-Tracking Robot

村田和真, 中臺一博, 武田龍, 奥乃博, 長谷川雄二, 辻野広司

人工知能学会AIチャレンジ研究会 28th 13 - 20 2008

　More details

Language：Japanese

J-GLOBAL

researchmap
E-052 Semi-Blind Source Separation using ICA for Barge-In-Capable Robot Spoken Dialogue

Takeda Ryu, Nakadai Kazuhiro, Komatani Kazunori, Ogata Tetsuya, Okuno Hiroshi G.

情報科学技術フォーラム一般講演論文集 6 ( 2 ) 261 - 262 2007.8

　More details

Language：Japanese Publisher：Forum on Information Technology

CiNii Books

researchmap
High performance blind source separation using an adaptive step-size parameter method

NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, TSUJINO Hiroshi

IEICE technical report 107 ( 120 ) 19 - 24 2007.6

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

This paper describes a novel blind source separation (BSS) method. One of the most important factors in BSS performance is a step-size parameter to update a decomposition matrix which is generally used for extracting a target sound source. A fixed value which was obtained empirically is commonly used as the step-size parameter. However, in the real world, the surrounding environment changes dynamically. So, conventional BSS with a fixed step-size parameter does not provide the best performance and sometimes results in divergence of the decomposition matrix. We propose a method that allows for an adaptive step-size parameter. Since the proposed method is generally applicable to BSS methods, we applied it to six types of BSS algorithms with a microphone array embedded in Honda's ASIMO. Experimental results show that the proposed method improves sound source separation in the four BSS algorithms, and the step-size parameter is maintained optimally even when the surrounding environment changes.

CiNii Books

CiNii Research

researchmap
Robust Domain Selection Using Dialogue History in Multi-domain Spoken Dialogue Systems

神田直之, 駒谷和範, 中野幹生, 中臺一博, 辻野広司, 尾形哲也, 奥乃博

情報処理学会論文誌 48 ( 5 ) 1980 - 1989 2007.5

　More details

Language：Japanese Publisher：一般社団法人情報処理学会

We have developed a robust domain selection method using dialogue history in multi-domain spoken dialogue systems. We define domain selection as a classification problem among (I) the domain in the previous turn, (II) the domain in which N-best speech recognition results can be accepted with the highest recognition score, and (III) other domains. We constructed a classifier by decision tree learning with dialogue data. We implemented a multi-domain spoken dialogue system with 5 domains and collected dialogue data from 10 subjects. The experimental results showed that our method reduced domain selection errors by 16.2% compared with a conventional method using speech recognition likelihoods only.

CiNii Research

J-GLOBAL

researchmap
AS-6-1 Sound Stream Formation and Human Tracking by Integration of Microphone Arrays

Nakadai Kazuhiro, Nakajima Hirofumi, Murase Masamitsu, Okuno Hiroshi G, Hasegawa Yuji, Tsujino Hiroshi

Proceedings of the IEICE General Conference 2007 S-65 - S-66 2007.3

　More details

Language：English Publisher：The Institute of Electronics, Information and Communication Engineers

CiNii Books

researchmap
A Recording and Playback System That Visualizes Sound

吉田雅敏, 海尻聡, 山本俊一, 中臺一博, 駒谷和範, 尾形哲也, 奥乃博

情報処理学会全国大会講演論文集 69th ( 2 ) 2.577-2.578 2007.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Development of a Robot Referee for Spoken Rock-Paper-Scissors: Toward Applications of Robot Audition Systems

中臺一博, 山本俊一, 奥乃博, 中島弘史, 長谷川雄二, 辻野広司

人工知能学会AIチャレンジ研究会 26th 59 - 64 2007

　More details

Language：Japanese

J-GLOBAL

researchmap
Robot Audition System Towards Natural Human-Robot Verbal Communication

中臺一博, 山本俊一, 浅野太

人工知能学会全国大会論文集 21 1 - 4 2007

　More details

Language：Japanese Publisher：人工知能学会

CiNii Books

researchmap
Towards Information Integration for Human-Robot Interaction

NAKADAI Kazuhiro

IEICE technical report 106 ( 298 ) 19 - 26 2006.10

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

To realize natural human-robot interaction, we consider that a robot should have at least two functions: real-world auditory scene analysis to understand the surrounding environment, and robot expression to send information to users intelligibly and naturally. These functions should be highly robust against environmental changes because the robot has to work in dynamically changing noisy environments. This paper addresses information integration as a way to improve the robustness of these functions, and describes four research topics based on the integration approach: integration of sound source separation and automatic speech recognition based on missing feature theory, sound source tracking by integration of two types of microphone arrays, multimodal expression using prosodic information and head colors, and a new speech function using a directional speaker.

researchmap
Towards Information Integration for Human-Robot Interaction

NAKADAI Kazuhiro

IEICE technical report 106 ( 296 ) 19 - 26 2006.10

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

To realize natural human-robot interaction, we consider that a robot should have at least two functions: real-world auditory scene analysis to understand the surrounding environment, and robot expression to send information to users intelligibly and naturally. These functions should be highly robust against environmental changes because the robot has to work in dynamically changing noisy environments. This paper addresses information integration as a way to improve the robustness of these functions, and describes four research topics based on the integration approach: integration of sound source separation and automatic speech recognition based on missing feature theory, sound source tracking by integration of two types of microphone arrays, multimodal expression using prosodic information and head colors, and a new speech function using a directional speaker.

CiNii Books

CiNii Research

researchmap
Towards Information Integration for Human-Robot Interaction

NAKADAI Kazuhiro

IEICE technical report 106 ( 300 ) 37 - 44 2006.10

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

To realize natural human-robot interaction, we consider that a robot should have at least two functions: real-world auditory scene analysis to understand the surrounding environment, and robot expression to send information to users intelligibly and naturally. These functions should be highly robust against environmental changes because the robot has to work in dynamically changing noisy environments. This paper addresses information integration as a way to improve the robustness of these functions, and describes four research topics based on the integration approach: integration of sound source separation and automatic speech recognition based on missing feature theory, sound source tracking by integration of two types of microphone arrays, multimodal expression using prosodic information and head colors, and a new speech function using a directional speaker.

CiNii Books

CiNii Research

researchmap
Improvement in Online Simultaneous Speech Recognizer by Using GA

山本俊一, 中臺一博, 中野幹生, 辻野広司, VALIN Jean‐Marc, 駒谷和範, 尾形哲也, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 24th 1B12 2006.9

　More details

Language：Japanese

J-GLOBAL

researchmap
D-14-10 Improvement for Noise-Robustness of Automatic Speech Recognition Using Coarse Phoneme Recognition

SUMIYA Ryota, NAKADAI Kazuhiro, NAKANO Mikio, ICHIGE Koichi, HIROSE Yasuo, TSUJINO Hiroshi

Proceedings of the IEICE General Conference 2006 ( 1 ) 134 - 134 2006.3

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

CiNii Books

researchmap
Performance Evaluation of Sound Source Tracking Using a Particle Filter

村瀬昌満, 中台一博, 奥乃博

情報処理学会全国大会講演論文集 68th ( 2 ) 345 - 346 2006.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Improving the Accuracy of Domain Selection Using Dialogue History in Multi-Domain Spoken Dialogue Systems

神田直之, 駒谷和範, 中野幹生, 中台一博, 辻野広司, 尾形哲也, 奥乃博

情報処理学会全国大会講演論文集 68th ( 2 ) 329 - 330 2006.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Optimization of a Simultaneous Speech Recognition System for Speaker Positions Using a GA

山本俊一, 中台一博, 中野幹生, 辻野広司, VALIN Jean‐Marc, 武田龍, 駒谷和範, 尾形哲也, 奥乃博

情報処理学会全国大会講演論文集 68th ( 2 ) 5 - 6 2006.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Robust Domain Selection using Dialogue History in Multi-Domain Spoken Dialogue System

KANDA NAOYUKI, KOMATANI KAZUNORI, NAKANO MIKIO, NAKADAI KAZUHIRO, TSUJINO HIROSHI, OGATA TETSUYA, OKUNO HIROSHI G

IPSJ SIG Notes 2006 ( 12 ) 55 - 60 2006.2

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

We have developed a robust domain selection method using dialogue history in multi-domain spoken dialogue systems. We define domain selection as a classification problem among (I) the domain in the previous turn, (II) the domain in which N-best speech recognition results can be accepted with the highest recognition score, and (III) other domains. We constructed a classifier by decision tree learning with a dialogue corpus. The experimental results with 10 subjects show that our method reduced domain selection errors by 11.6% compared with a conventional method using speech recognition likelihoods only.

CiNii Books

researchmap
Human Robot Interaction Research in HRI-JP

TSUJINO Hiroshi, NAKANO Mikio, NAKADAI Kazuhiro, HASEGAWA Yuji

IEICE technical report 105 ( 426 ) 31 - 36 2005.11

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

As computer technology advances, machines are expected to perform more functional tasks at home, and technology that realizes a "human-machine interface that anyone can use" is becoming increasingly important. The intelligent robot is the ultimate machine in this trend, and advanced concepts and values for such robots are being actively investigated. We focus on "bi-directional human-robot interaction" as a future interface between humans and intelligent robots. In this paper, we present our recent results on the "robot architecture for human-robot interaction", "speech recognition by robot" and "speech recognition by human" in our human-robot interaction research.

CiNii Books

researchmap
Multiple Moving Speakers Tracking based on Multiple Kalman Filters and Accuracy Evaluation

村瀬昌満, 山本俊一, VALIN Jean‐Marc, 中台一博, 山田健太郎, 駒谷和範, 尾形哲也, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 23rd 3C26 2005.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Recognition of Three Simultaneous Speech Signals Using MFT for a Humanoid

山本俊一, VALIN Jean‐Marc, 中台一博, 中野幹生, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 23rd 3C35 2005.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Prince Shotoku Robot: Robot Audition by Audio-Visual Integration

奥乃博, 中台一博

画像センシングシンポジウム講演論文集 11th 87 - 92 2005.6

　More details

Language：Japanese

J-GLOBAL

researchmap
Implementation of Sound Source Separation Filter on Dynamically Reconfigurable Processor

KUROTAKI Shunsuke, SUZUKI Noriaki, NAKADAI Kazuhiro, OKUNO Hiroshi, AMANO Hideharu

IEICE technical report 105 ( 43 ) 67 - 72 2005.5

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

CiNii Research

researchmap
Robot Audition : Its Issues and State of the Arts

OKUNO Hiroshi G, NAKADAI Kazuhiro

日本音響学会研究発表会講演論文集 2005 ( 1 ) 633 - 636 2005.3

　More details

Language：Japanese

CiNii Books

CiNii Research

researchmap
Automatic Generation of Missing Feature Masks for Recognition of Speech Separated by a Microphone Array

山本俊一, VALIN J‐M, 中台一博, 駒谷和範, 尾形哲也, 奥乃博

情報処理学会全国大会講演論文集 67th ( 2 ) 377 - 378 2005.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Evaluation of a Simultaneous Speech Recognition System Applying Missing Feature Theory Using Simultaneously Uttered Sentences

山本俊一, VALIN Jean‐Marc, 中台一博, 中野幹生, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

人工知能学会AIチャレンジ研究会 22nd 101 - 106 2005

　More details

Language：Japanese

J-GLOBAL

researchmap
Evaluation of MFT-Based Interface between Sound Source Separation and ASR

山本俊一, 中台一博, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 22nd 1C33 2004.9

　More details

Language：Japanese

J-GLOBAL

researchmap
G-007 Missing Feature Theory Based Interface of Integrating Sound Source Separation and Automatic Speech Recognition

Yamamoto Shunichi, Nakadai Kazuhiro, Tsujino Hiroshi, Okuno Hiroshi G

情報科学技術フォーラム一般講演論文集 3 ( 2 ) 357 - 360 2004.8

　More details

Language：Japanese Publisher：Forum on Information Technology

CiNii Books

researchmap
Behavior Selection for a Humanoid Robot by Multimodal Information Integration

戸田充彦, 中台一博, 駒谷和範, 尾形哲也, 奥乃博

情報処理学会全国大会講演論文集 66th ( 2 ) 2.193-2.194 2004.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Improving Recognition of Three Simultaneous Talkers Using Missing Feature Theory

山本俊一, 中台一博, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

情報処理学会全国大会講演論文集 66th ( 2 ) 2.287-2.288 2004.3

　More details

Language：Japanese

J-GLOBAL

researchmap
A Study on Realizing a Natural Human-Robot Interface by Active Audition (Cognition and Embodiment) (Special Issue: Doctoral Theses in the Field of Artificial Intelligence)

中臺一博

人工知能 19 ( 1 ) 106_2 - 106_2 2004.1

　More details

Language：Japanese Publisher：一般社団法人人工知能学会

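Research on auditory functions for robots has received little attention, even though audition is the most important modality in social interaction with humans. In addition, although problems in realizing robot audition have been pointed out from the viewpoint of real-world, real-time processing, no report has organized them systematically. This study therefore first organizes the issues of robot audition systematically and discusses concrete methods for solving them. Regarding active motion as essential to improving robot audition, we then propose active audition, which applies such motion to robot audition. We also discuss improving perception by integrating multiple kinds of auditory information and by integrating auditory information with other sensory information, as well as robot understanding of general sounds (mixtures of sounds), which aims at more general processing. The system, built on the upper-torso humanoid robot SIG (http://winnie.kuis.kyoto-u.ac.jp/SIG/), cancels the noise generated by the robot's own motion, which makes it possible to use active motion in auditory processing. By using active motion effectively, it also realizes speaker localization and tracking by audio-visual integration, sound source separation by an active direction-pass filter that extracts in real time the sound source in the attended direction, and speech recognition of the separated sounds. Through evaluation of each function and an integrated evaluation of the whole system, we showed the effectiveness and robustness of active audition, sensory integration, and general sound understanding, as well as their effectiveness as a human-robot interface.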

DOI： 10.11517/jjsai.19.1_106_2

CiNii Books

CiNii Research

researchmap
Three Simultaneous Speech Recognition by Applying Missing Feature Theory to Robot Audition System

山本俊一, 中臺一博, 辻野広司

人工知能学会全国大会論文集 18 1 - 4 2004

　More details

Language：Japanese Publisher：人工知能学会

CiNii Books

researchmap
Sound Source Separation with a Robot-Mounted Microphone Array and Speech Recognition Based on Missing Feature Theory

山本俊一, VALIN Jean‐Marc, 中台一博, 奥乃博

人工知能学会AIチャレンジ研究会 20th 27 - 32 2004

　More details

Language：Japanese

J-GLOBAL

researchmap
Recognition of Three Simultaneous Talkers by Applying Missing Feature Theory to Robot Audition

山本俊一, 中臺一博, 辻野広司, 奥乃博

人工知能学会全国大会論文集 4 ( 0 ) 41 - 41 2004

　More details

Language：Japanese Publisher：一般社団法人人工知能学会

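This paper proposes a method for recognizing the simultaneous speech of three talkers, recorded with two microphones mounted on a robot, by means of sound source separation and speech recognition based on missing feature theory. Experiments on two robots confirm the effectiveness of the proposed method.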

researchmap
Robotics Based on AI Technology : Robot Audition: State of the Art and Future Directions

OKUNO Hiroshi G, NAKADAI Kazuhiro

IPSJ Magazine 44 ( 11 ) 1138 - 1144 2003.11

　More details

Language：Japanese Publisher：一般社団法人情報処理学会

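As robots begin to enter the home, communication between robots and people, in particular communication using microphones mounted on the robot and perception of the environment through sound, is becoming increasingly important. Recently, work on auditory functions that use the robot's own ears has finally become active. What kinds of auditory functions, then, do robots need?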

CiNii Books

CiNii Research

researchmap
Improving Localization, Separation, and Recognition of Three Simultaneous Talkers Using Scattering Theory for Robots

中台一博, 奥乃博, 辻野広司

人工知能学会AIチャレンジ研究会 18th 33 - 38 2003.11

　More details

Language：Japanese

CiNii Research

J-GLOBAL

researchmap
Improvement of Recognition of Three Simultaneous Speeches By Hierarchical AV Integration for Humanoid and Scattering Theory

中台一博, 松浦大輔, 奥乃博, 辻野広司

日本ロボット学会学術講演会予稿集(CD-ROM) 21st 2K14 2003.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Design and Implementation of Action Selection System for Humanoid Robot

戸田充彦, 中台一博, 宮下敬宏, 奥乃博

日本ロボット学会学術講演会予稿集(CD-ROM) 21st 3F23 2003.9

　More details

Language：Japanese

J-GLOBAL

researchmap
A Behavior Selection System for Replie, a Robot with a Human-Like Appearance

戸田充彦, 山本俊一, 中台一博, 奥乃博

情報処理学会全国大会講演論文集 65th ( 4 ) 4.211-4.212 2003.3

　More details

Language：Japanese

J-GLOBAL

researchmap
Applying FPGA to Sound Separation by Direction-Pass Filter

SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

情報処理学会研究報告システムLSI設計技術（SLDM） 2003 ( 7 ) 135 - 140 2003.1

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

Reconfigurable systems are an efficient way to achieve high-performance but low-cost, low-power implementations of intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, such as the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on a 12 MHz FPGA are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than on a 1 GHz Pentium III.

CiNii Books

researchmap
Applying FPGA to Sound Separation by Direction-Pass Filter

SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

IEICE technical report. Computer systems 102 ( 611 ) 79 - 84 2003.1

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

Reconfigurable systems are an efficient way to achieve high-performance but low-cost, low-power implementations of intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, such as the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on a 12 MHz FPGA are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than on a 1 GHz Pentium III.

CiNii Books

researchmap
Applying FPGA to Sound Separation by Direction-Pass Filter

SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

Technical report of IEICE. VLD 102 ( 609 ) 79 - 84 2003.1

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

Reconfigurable systems are an efficient way to achieve high-performance but low-cost, low-power implementations of intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, such as the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on a 12 MHz FPGA are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than on a 1 GHz Pentium III.

CiNii Books

CiNii Research

researchmap
Exploiting Auditory Fovea in Humanoid-Human Interaction

Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

Proceedings of Eighteenth National Conference on Artificial Intelligence (AAAI-2002) 431 - 438 2002.12

　More details

Scopus

researchmap
Localization, Separation, and Recognition of Multiple Sound Sources by Active Audition

中台一博, 奥乃博, 北野宏明

人工知能学会AIチャレンジ研究会 16th 25 - 32 2002.11

　More details

Language：Japanese

CiNii Research

J-GLOBAL

researchmap
Building Robot Audition-Development of Humanoid SIG2-

中台一博, 松浦大輔, 宮下敬宏, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集(CD-ROM) 20th 1H19 2002.10

　More details

Language：Japanese

CiNii Research

J-GLOBAL

researchmap
Focus-of-Attention Control in Speaker Tracking by Using Support Vector Machine

松浦大輔, 中台一博, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集(CD-ROM) 20th 1C33 2002.10

　More details

Language：Japanese

J-GLOBAL

researchmap
Auditory fovea based speech enhancement and its application to human-robot dialog system

Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

7th International Conference on Spoken Language Processing, ICSLP 2002 1817 - 1820 2002.1

　More details

Scopus

researchmap
Auditory Fovea Based Speech Separation and Its Application to Dialog System

Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2002) 2 1320 - 1325 2002.1

　More details

Scopus

researchmap
Real-Time Speaker Localization and Speech Separation by Audio-Visual Integration

Kazuhiro Nakadai, Ken Ichi Hidai, Hiroshi G. Okuno, Hiroaki Kitano

Proceedings of IEEE International Conference on Robotics and Automation (ICRA-2002) 1 1043 - 1049 2002.1

　More details

Scopus

researchmap
Active Audition Based Human-Humanoid Interaction

Nakadai Kazuhiro, Okuno Hiroshi G, Kitano Hiroaki

SICE Division Conference Program and Abstracts 2002 ( 0 ) 522 - 522 2002

　More details

Publisher：公益社団法人計測自動制御学会

Robots that interact with people should understand various events simultaneously. To realize such capabilities in robots, integration of audition, vision, and other sensory information, together with active motion for better perception, is essential. This paper describes active audition, which improves robot audition by integrating audition, vision, and active motion. Our active-audition-based upper-torso robot can localize and interact with people even when occlusion and simultaneous speech occur.

DOI： 10.11499/siced.si2002.0.522.0

CiNii Research

researchmap
Real-time active human tracking by hierarchical integration of audition and vision

NAKADAI K.

Proc. IEEE-RAS Int. Conf. on Robots and Automation, Washington, DC, 2002 2002

　More details

researchmap
Are a pair of ears sufficient for robot audition ?

Okuno Hiroshi G, Nakadai Kazuhiro

THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN 58 ( 3 ) 205 - 210 2002

　More details

Language：Japanese Publisher：一般社団法人日本音響学会

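Hearing is the most important sense for humans. It is easy to understand that verbal communication depends on hearing, but it is regrettable that many people do not understand that "humans acquire language only through hearing, and from it culture is born and passed on. Written language is handed down through the eyes, but spoken words can be obtained only through the ears. Written language arises because spoken language exists" (鈴木淳一 and 小林武夫, 『耳科学-難聴に挑む』, 中公新書 1598, 2001).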

DOI： 10.20697/jasj.58.3_205

CiNii Books

CiNii Research

researchmap
Research Issues and Current Status of Robot Audition

OKUNO Hiroshi G, NAKADAI Kazuhiro

IPSJ SIG Notes 2001 ( 123 ) 69 - 74 2001.12

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". The audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. The active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

CiNii Books

CiNii Research

researchmap
Research Issues and Current Status of Robot Audition

OKUNO Hiroshi G, NAKADAI Kazuhiro

IEICE technical report. Natural language understanding and models of communication 101 ( 520 ) 69 - 74 2001.12

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". The audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. The active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

CiNii Books

researchmap
Research Issues and Current Status of Robot Audition

OKUNO Hiroshi G, NAKADAI Kazuhiro

IEICE technical report. Speech 101 ( 522 ) 69 - 74 2001.12

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". The audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. The active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

CiNii Books

CiNii Research

researchmap
Human-Robot Interaction Through Real-Time Auditory and Visual Multiple-Talker Tracking

Hiroshi G. Okuno, Kazuhiro Nakadai, Ken Ichi Hidai, Hiroshi Mizoguchi, Hiroaki Kitano

Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2001) 3 1402 - 1409 2001.12

　More details

Scopus

researchmap
Epipolar Geometry Based Sound Localization and Extraction for Humanoid Audition

Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2001) 3 1395 - 1401 2001.12

　More details

Scopus

researchmap
Active Audio-Visual Integration in Real-Time Human Tracking Humanoid SIG

NAKADAI KAZUHIRO, HIDAI KEN-ICHI, OKUNO HIROSHI G, KITANO HIROAKI

IPSJ SIG Notes. ICS 2001 ( 97 ) 37 - 42 2001.10

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

This paper describes how active motion and audio-visual integration improve auditory processing. In the real world, environmental noise and reverberation generally degrade sound source localization and separation. Our real-time human tracking system for humanoid robots attains robust sound source localization in the real world through active audio-visual integration. We then propose a new sound source separation method using an active direction-pass filter. Our experiments show that active audio-visual integration is essential to robust perception for extracting the tracked sound source.

CiNii Books

CiNii Research

researchmap
Improving the Accuracy of a Real-Time Human Tracking System by Stereo Vision

日台健一, 中台一博, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集 19th 155 2001.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Real-Time Human Tracking by Integrating Auditory and Visual Streams.

中台一博, 日台健一, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集 19th 583 - 584 2001.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Improvement of Real-time Human Tracking System by Stereo Vision.

日台健一, 中台一博, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集 19th 581 - 582 2001.9

　More details

Language：Japanese

J-GLOBAL

researchmap
視聴覚のストリームベース統合による実時間人物追跡システム [Real-Time Human Tracking System Based on Stream-Based Audio-Visual Integration]

中台一博, 日台健一, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集 19th 155 2001.9

　More details

Language：Japanese

J-GLOBAL

researchmap
視聴覚情報の階層的統合による実時間アクティブ人物追跡 [Real-Time Active Human Tracking by Hierarchical Integration of Audio-Visual Information]

中台一博, 日台健一, 奥乃博, 北野宏明

人工知能学会AIチャレンジ研究会 13th 35 - 42 2001.6

　More details

Language：Japanese

J-GLOBAL

researchmap
顔認識とアクティブオーディションを利用した実時間人物追跡 [Real-Time Human Tracking Using Face Recognition and Active Audition]

中台一博, 日台健一, 溝口博, 奥乃博, 北野宏明

人工知能学会AIチャレンジ研究会 11th 27 - 34 2001.3

　More details

Language：Japanese

CiNii Research

J-GLOBAL

researchmap
Real-time auditory and visual multiple-object tracking for robots

NAKADAI K.

Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01) 2001

　More details

researchmap
Active Audition System and Humanoid Exterior Design.

K. Nakadai, T. Matsui, H. G. Okuno, H. Kitano

Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2000) 2 1453 - 1461 2000.12

　More details

Scopus

researchmap
Control an Interactive Robot to Integrate Image sequence and Sound stream in Dynamic Scene.

中川友紀子, 中台一博, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集 18th 113 - 114 2000.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Active Audition System Using Robot Cover Acoustics.

中台一博, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集 18th 103 - 104 2000.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Proposal of Active Audition for Humanoid Auditory Capabilities.

中台一博, 奥乃博, 北野宏明

日本ロボット学会学術講演会予稿集 18th 105 - 106 2000.9

　More details

Language：Japanese

J-GLOBAL

researchmap
Tuning and Evaluation of the Beowulf-Class Cluster ERATO-1

OKUNO HIROSHI G, KYODA KOJI M, NAKADAI KAZUHIRO, KITANO HIROAKI

IPSJ SIG Notes 2000 ( 23 ) 116 - 124 2000.3

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

A Beowulf-class cluster is a logical organization of PC clusters composed of mass-market, off-the-shelf hardware and software. Users may find that their implementation does not work well at the hardware level or performs quite poorly. In this paper, we present a new method to tune and evaluate a Beowulf-class cluster by focusing on three levels: (1) the network level, (2) the message passing system level (e.g., MPI, PVM), and (3) the application level. Performance at the first two levels is measured with NetPIPE, developed by Ames Lab. ScaLAPACK (parallel version of LINPACK) is used as the benchmark for application programs, because it is one of the most common linear algebra subprograms and its evaluation is beneficial for numerical computation users. ScaLAPACK is tuned using parameters determined by NetPIPE. The ERATO-1 Beowulf-class cluster, 32 nodes each with a 450 MHz Pentium II processor and 256 MB of memory, is tuned by the proposed method. First, the network interface card installed in each ERATO-1 node is identified as the cause of poor performance; finally, ERATO-1 attained 6.76 GFlops on the LINPACK benchmark.

CiNii Books

researchmap
Tuning and Evaluation of the Beowulf-Class Cluster ERATO-1

OKUNO HIROSHI G, KYODA KOJI M, NAKADAI KAZUHIRO, KITANO HIROAKI

IPSJ SIG Notes 2000 ( 23 ) 119 - 124 2000.3

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

A Beowulf-class cluster is a logical organization of PC clusters composed of mass-market, off-the-shelf hardware and software. Users may find that their implementation does not work well at the hardware level or performs quite poorly. In this paper, we present a new method to tune and evaluate a Beowulf-class cluster by focusing on three levels: (1) the network level, (2) the message passing system level (e.g., MPI, PVM), and (3) the application level. Performance at the first two levels is measured with NetPIPE, developed by Ames Lab. ScaLAPACK (parallel version of LINPACK) is used as the benchmark for application programs, because it is one of the most common linear algebra subprograms and its evaluation is beneficial for numerical computation users. ScaLAPACK is tuned using parameters determined by NetPIPE. The ERATO-1 Beowulf-class cluster, 32 nodes each with a 450 MHz Pentium II processor and 256 MB of memory, is tuned by the proposed method. First, the network interface card installed in each ERATO-1 node is identified as the cause of poor performance; finally, ERATO-1 attained 6.76 GFlops on the LINPACK benchmark.

CiNii Books

CiNii Research

researchmap
Active audition for humanoid

K Nakadai, T Lourens, HG Okuno, H Kitano

SEVENTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-2000) / TWELFTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-2000) 832 - 839 2000

　More details

Language：English

Web of Science

researchmap
The method of defending system resources against continuous and high-speed setup process in ATM switching system

WATANABE Hiroshi, NAKADAI Kazuhiro, SATOU Yukio, SAKAGUCHI Zenji, ASHIKAWA Hirotoshi

IEICE technical report. Computer systems 98 ( 572 ) 1 - 8 1999.1

　More details

Language：Japanese Publisher：The Institute of Electronics, Information and Communication Engineers

Reliable data communication relies on protocol messages to control the communication. When such control messages are deliberately sent to a node continuously and at high speed, the node can run short of resources and become unable to offer service. In this paper, we propose an effective way to defend against this problem automatically. We propose implementing the defensive operation in the software of the ATM node as a basic rule, instead of having the system manager operate manually. This approach allows self-defense to be executed automatically in an inter-communication environment (e.g., a private network such as an internet). We also propose applying the approach to TCP on the Internet.

CiNii Books

researchmap
Implementation of OPTIMA: Organized Processing toward Intelligent Music Scene Analysis

柏野邦夫, 中臺一博, 木下智義, 田中英彦

全国大会講演論文集 50 ( 0 ) 97 - 98 1995.3

　More details

Language：Japanese

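We regard auditory scene analysis as the problem of separating and extracting "perceptual sounds" (perceptual sound source separation) and structuring them, and we are studying a processing model for music scene analysis (auditory scene analysis of musical audio signals), using monaural audio signals of instrumental performances as the subject. Here, perceptual sound source separation means symbolizing, as a single entity, a cluster of acoustic energy that humans perceive or recognize as one thing (called a perceptual sound). We have already proposed OPTIMA (Organized Processing toward Intelligent Music Scene Analysis), a processing model for music scene analysis equipped with a quantitative and hierarchical information-integration mechanism founded on Bayes' theorem. Based on this model, we implemented and examined an experimental music scene analysis system, and this paper reports an overview of it.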

CiNii Books

CiNii Research

researchmap
Creation and Verification of Note Hypotheses in OPTIMA based on Statistical Information

中臺一博, 柏野邦夫, 木下智義, 田中英彦

全国大会講演論文集 50 ( 0 ) 101 - 102 1995.3

　More details

Language：Japanese

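We proposed OPTIMA as a processing model for music scene analysis, and implemented and evaluated an experimental music scene analysis system based on it. This paper describes the implementation and evaluation of the single-note hypothesis generation component of the experimental system, which handles processing between the frequency-component level and the single-note level.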

CiNii Books

CiNii Research

researchmap
Employment of music scene information in OPTIMA

木下智義, 柏野邦夫, 中臺一博, 田中英彦

全国大会講演論文集 50 ( 0 ) 99 - 100 1995.3

　More details

Language：Japanese

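In OPTIMA, multiple independent modules output sets of hypotheses with associated probabilities, and these are integrated by probability propagation to obtain a maximum-likelihood estimate of the acoustic events in the outside world. This paper discusses the extraction and use of beat positions and chord information as the music scene information used in OPTIMA, and presents the results of evaluation experiments on the experimental system.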

CiNii Books

CiNii Research

researchmap
An Optima-based Music Scene Analysis System I : Implementation and Evaluation of Processing Modules

NAKADAI Kazuhiro, KASHINO Kunio, KINOSHITA Tomoyoshi, TANAKA Hidehiko

日本音響学会研究発表会講演論文集 1995 ( 1 ) 481 - 482 1995.3

　More details

Language：Japanese

CiNii Books

CiNii Research

researchmap
An Optima-based Music Scene Analysis System II : Evaluation of the Information Integration Mechanism

KASHINO Kunio, NAKADAI Kazuhiro, KINOSHITA Tomoyoshi, TANAKA Hidehiko

日本音響学会研究発表会講演論文集 1995 ( 1 ) 483 - 484 1995.3

　More details

Language：Japanese

CiNii Books

researchmap
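楽器演奏における単音の分離抽出とその音楽情景分析システムへの応用 [Separation and Extraction of Single Notes in Instrumental Performances and Its Application to a Music Scene Analysis System]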

中臺一博

Master's thesis, 東京大学 1995

　More details

CiNii Research

researchmap
General Description of OPTIMA : A Process Model of Perceptual Sound Source Separation for Music Scene Analysis

柏野邦夫, 中台一博, 田中英彦

全国大会講演論文集 49 ( 0 ) 325 - 326 1994.9

　More details

Language：Japanese

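We are studying perceptual sound source separation systems, taking source separation of monaural instrumental performances as the subject. In perceptual sound source separation, the essential challenge is to obtain the final result by flexibly combining the observed data with processing based on knowledge and memory about the target. This paper therefore proposes OPTIMA (Organized Processing toward Intelligent Music Scene Analysis), a processing model for perceptual sound source separation equipped with an information-integration mechanism.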

CiNii Books

CiNii Research

researchmap
Creation of Single Note Hypotheses in OPTIMA

中台一博, 柏野邦夫, 田中英彦

全国大会講演論文集 49 ( 0 ) 327 - 328 1994.9

　More details

Language：Japanese

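We proposed OPTIMA as a processing model for a system that generates symbol sequences of musical single notes [1]. In OPTIMA, when modules output sets of hypotheses with confidence values, these can be integrated by probability propagation. It is therefore a useful processing model where multiple pieces of information must be integrated, as in a musical single-note symbol sequence generation system. The single-note hypothesis generation module addressed in this paper must assign a confidence to each hypothesis, so the question is how to assign that confidence. As one such module, a module using tone memory has been implemented. It matches the input against mixed-sound hypotheses generated from tone memory, and it was effective for recognizing mixed sounds such as chords. However, tone memory alone had limits in efficiency and accuracy: a tone memory is needed for every note, and the amount of computation explodes as the number of mixed sounds increases. To solve these problems, we extracted the essential features of timbre and represented them in a timbre space. Studies on classifying and recognizing instruments with such a timbre space include work using neural networks, and good results have been obtained for single notes. This paper therefore proposes a single-note hypothesis generation method that uses the timbre space to output sets of hypotheses with confidence values and that can also recognize mixed sounds. In this method, the confidence of each single-note hypothesis can be computed statistically, and because knowledge is given per timbre, the explosion of knowledge and computation with the number of notes can be suppressed.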

CiNii Books

CiNii Research

researchmap
OPTIMA : Organized Processing toward Intelligent Music Scene Analysis -General Description of the Process Model-

Kashino Kunio, Nakadai Kazuhiro, Tanaka Hidehiko

IPSJ SIG Notes 1994 ( 71 ) 57 - 64 1994.8

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

We describe OPTIMA, a process model for perceptual sound source separation on computers. Our model consists of four parts: bottom-up processing modules, top-down processing modules, knowledge sources, and a hypothesis network for hierarchical and quantitative integration of multiple pieces of information. First, we present a general description of the model. Since one of the most essential problems in perceptual sound source separation is the integration of multiple pieces of information, we then focus our discussion on the hypothesis network: we show that our method permits efficient, autonomous, and stable construction of an optimal internal model of the outer world.

CiNii Books

CiNii Research

researchmap
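音源分離システムにおけるパターン照合モジュールの動的負荷分散を用いた並列実装 [Parallel Implementation of the Pattern Matching Module in a Sound Source Separation System Using Dynamic Load Balancing]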

中臺一博, 柏野邦夫, 田中英彦

情報処理学会研究報告. 人工知能研究会報告 94 ( 67 ) 59 - 60 1994.7

　More details

Language：Japanese Publisher：一般社団法人情報処理学会

CiNii Books

CiNii Research

researchmap
音楽音響信号を対象とする音モデルに基づく音源分離システム [A Sound Source Separation System for Musical Audio Signals Based on Tone Models]

柏野邦夫, 中台一博, 田中英彦

東京大学工学部総合試験所年報 ( 52 ) p79 - 84 1993.9

　More details

Language：Japanese Publisher：東京大学工学部総合試験所

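Material type: text data (plain text)
Collection: National Diet Library Digital Collections > Digitized Materials > Journals
Article category: Vibration engineering / Acoustic engineering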

CiNii Books

CiNii Research

researchmap
A Sound Source Separation System for Polyphonic Music Based on the Tone Models

Nakadai Kazuhiro, Kashino Kunio, Tanaka Hidehiko

IPSJ SIG Notes 1993 ( 32 ) 1 - 8 1993

　More details

Language：Japanese Publisher：Information Processing Society of Japan (IPSJ)

The system configuration, implementation, and evaluation of a sound source separation system are described. The input to the system is assumed to be a monaural audio signal of ensemble music, and the output is MIDI data with several MIDI channels, each assigned to one kind of musical instrument. The approach is based on matching registered tone models against the sound spectrogram derived from the input signal. Experimental results show that, on average, more than 85% of the notes are correctly identified by the system when the number of simultaneous notes in the input is three or fewer.

CiNii Books

researchmap


Presentations

累積頻度重みを適用したパーティクルフィルタによる実時間楽譜追従 [Real-Time Score Following by a Particle Filter with Cumulative-Frequency Weights]

大塚琢馬, 中臺一博, 高橋徹, 尾形哲也, 奥乃博

第73回全国大会講演論文集 2011.3

　More details

Language：Japanese

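In score following with a particle filter, tracking performance depends heavily on how particle weights are computed from the distance between the audio signal and the score. With conventional weight calculations based on vector inner products or sigmoid functions, inharmonic components of the audio signal and variations in instrument timbre mean that the weights differ little between correct and incorrect score position estimates, so the final estimated score position contains errors. In this paper, weights are computed dynamically from the cumulative frequency of previously computed distances, so that higher weights are obtained at correct score positions. Evaluation experiments confirmed that the cumulative-frequency weight calculation improves score-following accuracy over conventional weight calculations.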
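The abstract does not give the weighting formula, so the following is only a minimal sketch of one way a cumulative-frequency weight could work. It assumes (hypothetically) that a particle's weight is the fraction of previously observed audio-to-score distances that are at least as large as its current distance. The function names and the sample distances are illustrative, not taken from the paper.

```python
from bisect import bisect_left


def cumulative_frequency_weight(distance, sorted_history):
    """Return the fraction of past distances that are >= `distance`.

    A distance smaller than most of the history gets a weight near 1;
    a distance larger than most of the history gets a weight near 0.
    """
    n = len(sorted_history)
    if n == 0:
        return 1.0  # no history yet: every particle is weighted equally
    smaller = bisect_left(sorted_history, distance)  # past distances < distance
    return (n - smaller) / n


def normalize(weights):
    """Scale weights to sum to 1, falling back to uniform if all are zero."""
    total = sum(weights)
    if total == 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


# Hypothetical history of distances from earlier frames, kept sorted.
history = sorted([0.2, 0.4, 0.5, 0.7, 0.9])
# Hypothetical distances for three particles in the current frame.
particle_distances = [0.1, 0.45, 0.95]

raw = [cumulative_frequency_weight(d, history) for d in particle_distances]
weights = normalize(raw)
print(raw, weights)
```

Because the weight depends on the rank of a distance within the history and not on its absolute value, the separation between good and poor matches adapts to the instrument and recording at hand, which is the property the abstract attributes to the dynamic weighting.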

researchmap
Intelligent Human Tracking based on Information Integration

NAKAMURA Keisuke, NAKADAI Kazuhiro, INCE Gokhan

IEICE technical report 2011.5

　More details

Language：Japanese

Since scene recognition and robot perception have been of great interest, information integration has become a significant research topic in robotics. From the viewpoint of scalability and reusability, utilization of appropriate middleware is a key factor to improve total system performance. This paper presents an integration methodology of multimodal information through constructing an intelligent human tracking system. Our system architecture interoperably combines two different types of middleware ; HARK and ROS. HARK uses dataflow-oriented middleware for real-time processing while ROS is event-driven middleware for easy integration. We confirmed that the proposed architecture realized real-time processing and considerable improvements of noise-robustness in human tracking.

researchmap
遠隔ユーザの音環境理解を支援するユーザインタフェース [A User Interface Supporting Remote Users' Understanding of the Sound Environment]

植田俊輔, 今井倫太, 中村圭佑, 中臺一博

JSAI大会論文集 2012

　More details

Language：Japanese

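Humans can understand to some extent where and what kind of conversations are taking place even in noisy environments, but with a teleoperated robot avatar it is difficult for the remote operator to understand the sound environment at the remote site. This paper proposes UI-ALT, a user interface that supports smooth interaction between the operator and the remote site even in noisy environments. An offline experiment showed that UI-ALT is useful for the remote operator's understanding of noisy environments.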

researchmap
Kinectによる楽器マスキングを用いた視聴覚統合ビートトラッキング [Audio-Visual Beat Tracking Using Instrument Masking with Kinect]

糸原達彦, 水本武志, 大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

全国大会講演論文集 2012.3

　More details

Language：Japanese

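Real-time beat tracking of human guitar performances must cope with complex beat patterns such as syncopation and with tempo fluctuations in human playing. We have previously developed audio-visual beat tracking that uses the trajectory of the player's hand, which is strongly correlated with the acoustic information. However, because the guitar and the hand are similar in color, the performance of hand-trajectory tracking and beat tracking was insufficient. In this paper, we use Kinect, which has a depth sensor in addition to audio-visual sensors, to mask the image by distance and extract the hand region. We show that this method makes hand tracking more robust and improves beat-tracking accuracy.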
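The distance-based masking step can be illustrated with a small sketch. It assumes a depth image in millimetres aligned pixel-for-pixel with the color image, a value of 0 for pixels with no depth reading, and a hand-picked depth range; the function names, the range, and the sample values are hypothetical and not taken from the paper.

```python
def depth_mask(depth_mm, near_mm, far_mm):
    """Return a boolean mask that is True where depth lies in [near_mm, far_mm].

    Pixels with no reading (assumed to be 0 here) fall outside any positive
    range and are therefore masked out.
    """
    return [[near_mm <= d <= far_mm for d in row] for row in depth_mm]


def apply_mask(image, mask, fill=0):
    """Keep pixels where the mask is True and replace the rest with `fill`."""
    return [
        [px if keep else fill for px, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Tiny hypothetical frame: depth in millimetres and a grayscale image.
depth = [
    [0, 820, 900],
    [810, 830, 1500],
]
image = [
    [10, 20, 30],
    [40, 50, 60],
]

# Assumed depth band in which the strumming hand is expected.
mask = depth_mask(depth, near_mm=800, far_mm=850)
hand_only = apply_mask(image, mask)
print(hand_only)
```

Masking by depth sidesteps the color similarity between hand and guitar that the abstract identifies as the weakness of the earlier color-based tracking.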

researchmap
Kinectによる楽器マスキングを用いた視聴覚統合ビートトラッキング [Audio-Visual Beat Tracking Using Instrument Masking with Kinect]

糸原達彦, 水本武志, 大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

第74回全国大会講演論文集 2012.3

　More details

Language：Japanese

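Real-time beat tracking of human guitar performances must cope with complex beat patterns such as syncopation and with tempo fluctuations in human playing. We have previously developed audio-visual beat tracking that uses the trajectory of the player's hand, which is strongly correlated with the acoustic information. However, because the guitar and the hand are similar in color, the performance of hand-trajectory tracking and beat tracking was insufficient. In this paper, we use Kinect, which has a depth sensor in addition to audio-visual sensors, to mask the image by distance and extract the hand region. We show that this method makes hand tracking more robust and improves beat-tracking accuracy.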

researchmap
2P1-P24 Development of a Sound Source Localization System for Assisting Group Conversation (Communication Robot)

Moon Seong-eun, Takagi Kentaro, Kamashima Tsutomu, Nakadai Kazuhiro, Otake Mihoko

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2013

　More details

Language：Japanese

This paper presents a sound source localization system that combines a wireless microphone array named Jellyfish-02 with the robot audition software HARK. Jellyfish-02 surpasses existing microphone arrays in design and usability, because it has a cover with a rechargeable battery and can be connected to a wireless network. We evaluated the sound source localization performance of Jellyfish-02 and investigated the percentage of speech-overlapped periods in natural conversation. From the results, Jellyfish-02 is potentially applicable to assisting group conversation by measuring the duration of speech for each participant.
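As a hedged illustration of the overlap measurement mentioned above, the sketch below computes the share of speech time during which two or more participants talk at once. The abstract does not define the denominator, so taking it as the total time with any speech is an assumption; the function name and the sample intervals are hypothetical, and each speaker's own intervals are assumed not to overlap one another.

```python
def overlap_percentage(intervals_by_speaker):
    """Percentage of speech time during which at least two speakers are active."""
    events = []
    for intervals in intervals_by_speaker.values():
        for start, end in intervals:
            events.append((start, 1))   # a speaker starts talking
            events.append((end, -1))    # a speaker stops talking
    # At equal times, -1 sorts before +1, so touching intervals do not overlap.
    events.sort()

    active = 0          # number of speakers currently talking
    previous = None     # time of the previous event
    speech = 0.0        # time with at least one active speaker
    overlapped = 0.0    # time with at least two active speakers
    for time, delta in events:
        if previous is not None and time > previous:
            if active >= 1:
                speech += time - previous
            if active >= 2:
                overlapped += time - previous
        active += delta
        previous = time
    return 100.0 * overlapped / speech if speech else 0.0


# Hypothetical speech intervals in seconds for two participants.
talk = {
    "A": [(0, 4), (6, 8)],
    "B": [(3, 7)],
}
percent = overlap_percentage(talk)
print(percent)
```

The same per-speaker intervals also give each participant's total speaking time, which is the quantity the abstract proposes to use for assisting group conversation.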

researchmap
Applying FPGA to Sound Separation by Direction-Pass Filter

SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

Technical report of IEICE. VLD 2003.1

　More details

Language：Japanese

Reconfigurable systems are an efficient way to implement high-performance but low-cost, low-power intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, namely the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on the FPGA at 12 MHz are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than those on a 1 GHz Pentium III.
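As a rough reading of these figures, and assuming the reported factors are wall-clock speedups, the FPGA must complete each operation in far fewer clock cycles than the CPU, since it runs at 12 MHz against 1 GHz. The short sketch below works out that cycle-count ratio; the helper name is hypothetical, and the clock rates and speedups are the ones stated in the abstract.

```python
def cycle_count_ratio(speedup, fpga_hz, cpu_hz):
    """Return CPU cycles per operation divided by FPGA cycles per operation.

    With time = cycles / frequency, a wall-clock speedup S implies
    cpu_cycles / fpga_cycles = S * cpu_hz / fpga_hz.
    """
    return speedup * cpu_hz / fpga_hz


FPGA_HZ = 12_000_000        # 12 MHz FPGA clock, from the abstract
CPU_HZ = 1_000_000_000      # 1 GHz Pentium III clock, from the abstract
reported = {"FFT": 2.9, "square root": 2.9, "arc tangent": 3.3}

ratios = {
    name: cycle_count_ratio(speedup, FPGA_HZ, CPU_HZ)
    for name, speedup in reported.items()
}
for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.0f}x fewer clock cycles on the FPGA")
```

Under that assumption the FPGA needs roughly 240 to 275 times fewer cycles per operation, which is where the low-power argument for reconfigurable hardware comes from.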

researchmap
Applying FPGA to Sound Separation by Direction-Pass Filter

SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

IEICE technical report. Computer systems 2003.1

　More details

Language：Japanese

Reconfigurable systems are an efficient way to implement high-performance but low-cost, low-power intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, namely the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on the FPGA at 12 MHz are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than those on a 1 GHz Pentium III.

researchmap
Applying FPGA to Sound Separation by Direction - Pass Filter

SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

情報処理学会研究報告システムLSI設計技術（SLDM） 2003.1

　More details

Language：Japanese

Reconfigurable systems are an efficient way to implement high-performance but low-cost, low-power intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, namely the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on the FPGA at 12 MHz are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than those on a 1 GHz Pentium III.

researchmap
Three Simultaneous Speech Recognition by Applying Missing Feature Theory to Robot Audition System

山本俊一, 中臺一博, 辻野広司

人工知能学会全国大会論文集 2004

　More details

Language：Japanese

researchmap
ロボット聴覚へのミッシングフィーチャー理論の適用による三話者同時発話認識 [Recognition of Three Simultaneous Talkers by Applying Missing Feature Theory to Robot Audition]

山本俊一, 中臺一博, 辻野広司, 奥乃博

人工知能学会全国大会論文集 2004

　More details

Language：Japanese

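This paper proposes a method for recognizing simultaneous speech from three talkers, recorded with two microphones mounted on a robot, using sound source separation and speech recognition based on missing feature theory. Experiments on two robots confirm the effectiveness of the proposed method.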

researchmap
G-007 Missing Feature Theory Based Interface of Integrating Sound Source Separation and Automatic Speech Recognition

Yamamoto Shunichi, Nakadai Kazuhiro, Tsujino Hiroshi, Okuno Hiroshi G

情報科学技術フォーラム一般講演論文集 2004.8

　More details

Language：Japanese

researchmap
Active Audio - Visual Integration in Real - Time Human Tracking Humanoid SIG

NAKADAI KAZUHIRO, HIDAI KEN-ICHI, OKUNO HIROSHI G, KITANO HIROAKI

IPSJ SIG Notes. ICS 2001.10

　More details

Language：Japanese

This paper describes the improvement of auditory processing by active motion and audio-visual integration. In the real world, environmental noise and reverberation generally degrade sound source localization and separation. Our real-time human tracking system for humanoid robots attained robust sound source localization in the real world through active audio-visual integration. We then propose a new sound source separation method using an active direction-pass filter. Our experiments show that active audio-visual integration is essential to robust perception for extracting the sound source being tracked.

researchmap
Research Issues and Current Status of Robot Audition

OKUNO Hiroshi G, NAKADAI Kazuhiro

IEICE technical report. Speech 2001.12

　More details

Language：Japanese

In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". An audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. The active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition integrating audition, vision, and motor control attains sound source tracking in a variety of conditions.

researchmap
Research Issues and Current Status of Robot Audition

OKUNO Hiroshi G, NAKADAI Kazuhiro

IEICE technical report. Natural language understanding and models of communication 2001.12

　More details

Language：Japanese

In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". An audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. The active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition integrating audition, vision, and motor control attains sound source tracking in a variety of conditions.

researchmap
Research Issues and Current Status of Robot Audition

OKUNO Hiroshi G, NAKADAI Kazuhiro

IPSJ SIG Notes 2001.12

　More details

Language：Japanese

In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". An audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. The active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition integrating audition, vision, and motor control attains sound source tracking in a variety of conditions.

researchmap
A Sound Source Separation System for Polyphonic Music Based on the Tone Models

Nakadai Kazuhiro, Kashino Kunio, Tanaka Hidehiko

IPSJ SIG Notes 1993.4

　More details

Language：Japanese

The system configuration, implementation, and evaluation of a sound source separation system are described. The input to the system is assumed to be a monaural audio signal of ensemble music, and the output is MIDI data with several MIDI channels, each assigned to one kind of musical instrument. The approach is based on matching registered tone models against the sound spectrogram derived from the input signal. Experimental results show that, on average, more than 85% of the notes are correctly identified by the system when the number of simultaneous notes in the input is three or fewer.

researchmap
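音源分離システムにおけるパターン照合モジュールの動的負荷分散を用いた並列実装 [Parallel Implementation of the Pattern Matching Module in a Sound Source Separation System Using Dynamic Load Balancing]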

中臺一博, 柏野邦夫, 田中英彦

情報処理学会研究報告知能と複雑系（ICS） 1994.7

　More details

Language：Japanese

researchmap
OPTIMA : Organized Processing toward Intelligent Music Scene Analysis -General Description of the Process Model-

Kashino Kunio, Nakadai Kazuhiro, Tanaka Hidehiko

IPSJ SIG Notes 1994.8

　More details

Language：Japanese

We describe OPTIMA, a process model for perceptual sound source separation on computers. Our model consists of four parts: bottom-up processing modules, top-down processing modules, knowledge sources, and a hypothesis network for hierarchical and quantitative integration of multiple pieces of information. First, we present a general description of the model. Since one of the most essential problems in perceptual sound source separation is the integration of multiple pieces of information, we then focus our discussion on the hypothesis network: we show that our method permits efficient, autonomous, and stable construction of an optimal internal model of the outer world.

researchmap
Creation of Single Note Hypotheses in OPTIMA

中台一博, 柏野邦夫, 田中英彦

全国大会講演論文集 1994.9

　More details

Language：Japanese

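We proposed OPTIMA as a processing model for a system that generates symbol sequences of musical single notes [1]. In OPTIMA, when modules output sets of hypotheses with confidence values, these can be integrated by probability propagation. It is therefore a useful processing model where multiple pieces of information must be integrated, as in a musical single-note symbol sequence generation system. The single-note hypothesis generation module addressed in this paper must assign a confidence to each hypothesis, so the question is how to assign that confidence. As one such module, a module using tone memory has been implemented. It matches the input against mixed-sound hypotheses generated from tone memory, and it was effective for recognizing mixed sounds such as chords. However, tone memory alone had limits in efficiency and accuracy: a tone memory is needed for every note, and the amount of computation explodes as the number of mixed sounds increases. To solve these problems, we extracted the essential features of timbre and represented them in a timbre space. Studies on classifying and recognizing instruments with such a timbre space include work using neural networks, and good results have been obtained for single notes. This paper therefore proposes a single-note hypothesis generation method that uses the timbre space to output sets of hypotheses with confidence values and that can also recognize mixed sounds. In this method, the confidence of each single-note hypothesis can be computed statistically, and because knowledge is given per timbre, the explosion of knowledge and computation with the number of notes can be suppressed.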

researchmap
Creation and Verification of Note Hypotheses in OPTIMA based on Statistical Information

中臺一博, 柏野邦夫, 木下智義, 田中英彦

全国大会講演論文集 1995.3

　More details

Language：Japanese

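We proposed OPTIMA as a processing model for music scene analysis, and implemented and evaluated an experimental music scene analysis system based on it. This paper describes the implementation and evaluation of the single-note hypothesis generation component of the experimental system, which handles processing between the frequency-component level and the single-note level.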

researchmap
Employment of music scene information in OPTIMA

木下智義, 柏野邦夫, 中臺一博, 田中英彦

全国大会講演論文集 1995.3

　More details

Language：Japanese

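In OPTIMA, multiple independent modules output sets of hypotheses with associated probabilities, and these are integrated by probability propagation to obtain a maximum-likelihood estimate of the acoustic events in the outside world. This paper discusses the extraction and use of beat positions and chord information as the music scene information used in OPTIMA, and presents the results of evaluation experiments on the experimental system.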

researchmap
Implementation of OPTIMA: Organized Processing toward Intelligent Music Scene Analysis

柏野邦夫, 中臺一博, 木下智義, 田中英彦

全国大会講演論文集 1995.3

　More details

Language：Japanese

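We regard auditory scene analysis as the problem of separating and extracting "perceptual sounds" (perceptual sound source separation) and structuring them, and we are studying a processing model for music scene analysis (auditory scene analysis of musical audio signals), using monaural audio signals of instrumental performances as the subject. Here, perceptual sound source separation means symbolizing, as a single entity, a cluster of acoustic energy that humans perceive or recognize as one thing (called a perceptual sound). We have already proposed OPTIMA (Organized Processing toward Intelligent Music Scene Analysis), a processing model for music scene analysis equipped with a quantitative and hierarchical information-integration mechanism founded on Bayes' theorem. Based on this model, we implemented and examined an experimental music scene analysis system, and this paper reports an overview of it.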

researchmap
Music note recognition based on prediction of notes

木下智義, 村岡秀哉, 田中英彦

全国大会講演論文集 1998.3

　More details

Language：Japanese

researchmap
The method of defending system resources against continuous and high-speed setup process in ATM switching system

WATANABE Hiroshi, NAKADAI Kazuhiro, SATOU Yukio, SAKAGUCHI Zenji, ASHIKAWA Hirotoshi

IEICE technical report. Computer systems 1999.1

　More details

Language：Japanese

Reliable data communication relies on protocol messages to control the communication. When such control messages are deliberately sent to a node continuously and at high speed, the node can run short of resources and become unable to offer service. In this paper, we propose an effective way to defend against this problem automatically. We propose implementing the defensive operation in the software of the ATM node as a basic rule, instead of having the system manager operate manually. This approach allows self-defense to be executed automatically in an inter-communication environment (e.g., a private network such as an internet). We also propose applying the approach to TCP on the Internet.

researchmap
Tuning and Evaluation of the Beowulf-Class Cluster ERATO-1

OKUNO HIROSHI G, KYODA KOJI M, NAKADAI KAZUHIRO, KITANO HIROAKI

IPSJ SIG Notes 2000.3

　More details

Language：Japanese

A Beowulf-class cluster is a logical organization of PC clusters composed of mass-market, off-the-shelf hardware and software. Users may find that their implementation does not work well at the hardware level or performs quite poorly. In this paper, we present a new method to tune and evaluate a Beowulf-class cluster by focusing on three levels: (1) the network level, (2) the message passing system level (e.g., MPI, PVM), and (3) the application level. Performance at the first two levels is measured with NetPIPE, developed by Ames Lab. ScaLAPACK (parallel version of LINPACK) is used as the benchmark for application programs, because it is one of the most common linear algebra subprograms and its evaluation is beneficial for numerical computation users. ScaLAPACK is tuned using parameters determined by NetPIPE. The ERATO-1 Beowulf-class cluster, 32 nodes each with a 450 MHz Pentium II processor and 256 MB of memory, is tuned by the proposed method. First, the network interface card installed in each ERATO-1 node is identified as the cause of poor performance; finally, ERATO-1 attained 6.76 GFlops on the LINPACK benchmark.

researchmap
Tuning and Evaluation of the Beowulf-Class Cluster ERATO-1

OKUNO HIROSHI G, KYODA KOJI M, NAKADAI KAZUHIRO, KITANO HIROAKI

IPSJ SIG Notes 2000.3

　More details

Language：Japanese

A Beowulf-class cluster is a logical organization of PC clusters composed of mass-market, off-the-shelf hardware and software. Users may find that their implementation does not work well at the hardware level or performs quite poorly. In this paper, we present a new method to tune and evaluate a Beowulf-class cluster by focusing on three levels: (1) the network level, (2) the message passing system level (e.g., MPI, PVM), and (3) the application level. Performance at the first two levels is measured with NetPIPE, developed by Ames Lab. ScaLAPACK (parallel version of LINPACK) is used as the benchmark for application programs, because it is one of the most common linear algebra subprograms and its evaluation is beneficial for numerical computation users. ScaLAPACK is tuned using parameters determined by NetPIPE. The ERATO-1 Beowulf-class cluster, 32 nodes each with a 450 MHz Pentium II processor and 256 MB of memory, is tuned by the proposed method. First, the network interface card installed in each ERATO-1 node is identified as the cause of poor performance; finally, ERATO-1 attained 6.76 GFlops on the LINPACK benchmark.

researchmap
General Description of OPTIMA : A Process Model of Perceptual Sound Source Separation for Music Scene Analysis

柏野邦夫, 中台一博, 田中英彦

全国大会講演論文集 1994.9

　More details

Language：Japanese

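We are studying perceptual sound source separation systems, taking source separation of monaural instrumental performances as the subject. In perceptual sound source separation, the essential challenge is to obtain the final result by flexibly combining the observed data with processing based on knowledge and memory about the target. This paper therefore proposes OPTIMA (Organized Processing toward Intelligent Music Scene Analysis), a processing model for perceptual sound source separation equipped with an information-integration mechanism.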

researchmap
An Optima-based Music Scene Analysis System II : Evaluation of the Information Integration Mechanism

KASHINO Kunio, NAKADAI Kazuhiro, KINOSHITA Tomoyoshi, TANAKA Hidehiko

日本音響学会研究発表会講演論文集 1995.3

　More details

Language：Japanese

researchmap
An Optima-based Music Scene Analysis System I : Implementation and Evaluation of Processing Modules

NAKADAI Kazuhiro, KASHINO Kunio, KINOSHITA Tomoyoshi, TANAKA Hidehiko

日本音響学会研究発表会講演論文集 1995.3

　More details

Language：Japanese

researchmap
Audio-visual musical instrument recognition

Angelica Lim, 中村圭佑, 中臺一博, 尾形哲也, 奥乃博

第73回全国大会講演論文集 2011.3

　More details

Language：English

Is this person playing a violin or a flute? Classification of musical instrument performances is usually carried out using audio features such as spectral coefficients. We propose augmenting the typical audio feature set with visual features. We show that a combination of audio and visual features performs better than audio alone, and verify this multimodal recognition approach on a real-time robot platform.

researchmap
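チューブ型ロボットの姿勢推定のためのEKF-SLAMを用いた可変マイクロホンアレイ位置推定 [Position Estimation of a Deformable Microphone Array Using EKF-SLAM for Posture Estimation of a Tube-Shaped Robot]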

坂東宜昭, 水本武志, 中臺一博, 奥乃博

第75回全国大会講演論文集 2013.3

　More details

Language：Japanese

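Tube-shaped robots that can enter rubble are useful for finding victims at disaster sites. If such a robot also has a sound source localization capability, victims' positions can be estimated from their voices. However, recent high-accuracy sound source localization methods estimate direction from sound recorded with a microphone array whose microphone positions are known, whereas the microphone layout of a tube-shaped robot cannot be measured in advance. This paper therefore proposes a microphone position estimation method based on EKF-SLAM and solves this problem by estimating the constantly changing robot posture. The effectiveness of the method was confirmed using both numerical experiments and real recordings.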

researchmap
話者ダイアライゼーションシステムのための音声区間検出および到来方向推定の精度向上の検討

黄楊暘, 大塚琢馬, 中臺一博, 奥乃博

第75回全国大会講演論文集 2013.3

Language：Japanese

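In robot audition, an auditory scene understanding function that determines when, where, and who spoke is essential. In this paper, we define a speaker diarization system as processing that combines voice activity detection, direction-of-arrival estimation, and speaker identification to address these problems. The robot audition software HARK performs voice activity detection and direction-of-arrival estimation with the MUSIC algorithm as preprocessing. However, when processing is based on the MUSIC spectrum, the number-of-sources parameter and the threshold parameter strongly affect the results. In this paper, we propose a speaker diarization system that uses blind source separation as preprocessing. A volume threshold parameter still has to be set, but improved accuracy is obtained.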

researchmap
クアドロコプターを用いた飛行雑音に頑健な音源定位

古川孝太郎, 奥谷啓太, 柳楽浩平, 大塚琢馬, 中臺一博, 奥乃博

第75回全国大会講演論文集 2013.3

Language：Japanese

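This study addresses sound source localization in the surrounding environment using a microphone array mounted on a quadrocopter, a small unmanned aerial vehicle with multiple rotors. During flight, noise caused by wind pressure and rotor drive is usually extremely loud and can degrade localization accuracy. In such noisy environments, accuracy is known to improve when the noise correlation matrix is taken into account by the MUSIC method based on generalized eigenvalue decomposition. This study therefore reduces the problem to estimating the noise correlation matrix, which changes dynamically during flight. On that basis, we propose an estimation method that uses monitoring information from the airframe, such as flight control data, and develop a sound source localization method that is robust to flight noise.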

researchmap
Incremental Noise Estimation in Outdoor Auditory Scene Analysis using a Quadrocopter with a Microphone Array

OKUTANI Keita, YOSHIDA Takami, NAKAMURA Keisuke, NAKADAI Kazuhiro

Journal of the Robotics Society of Japan 2013.9

Language：Japanese

This paper addresses sound source localization using an aerial vehicle with a microphone array in an outdoor environment to realize outdoor auditory scene analysis. It, for instance, aims at finding distressed people in a disaster situation. In such an environment, noise is quite loud and dynamically-changing, and conventional microphone array techniques studied in the field of indoor robot audition are of less use. We, thus, proposed MUltiple SIgnal Classification based on incremental Generalized EigenValue Decomposition (iGEVD-MUSIC). It can deal with dynamically-changing high power noise by introducing incrementally-estimated noise correlation matrices. We developed a prototype system for the outdoor auditory scene analysis based on the proposed method using the Parrot AR.Drone with an 8ch microphone array and a Kinect device. Experimental results using the prototype system showed that dynamically-changing noise is properly suppressed with the proposed method even when the signal-to-noise ratio is less than 0dB in an outdoor/indoor environment with the hovering/moving AR.Drone.

researchmap
Volume Adaptation and Visualization by Modeling the Volume Level in Actual Noise Environment for Telepresence System

HAYAMIZU Akira, IMAI Michita, NAKAMURA Keisuke, NAKADAI Kazuhiro

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 2013.12

Language：Japanese

The Lombard effect is the involuntary tendency of speakers to increase their vocal effort when speaking in loud noise to enhance the audibility of their voice. In telecommunication, the Lombard effect causes a problem: a speaker may talk more loudly than necessary for the conversation partner at the remote location. In this paper, we describe the design and the model required to automatically adjust the operator's volume in remote communication via a mobile telepresence robot in the real world, and we developed LOMBOT, an optimal volume control system equipped with this model. We confirmed that the volume is adjusted properly to the noise at the remote location.

researchmap
TelePaBot : A telepresence system for supporting multi-party conversation

Koike Kyotaro, Imai Michita, Nakamura Keisuke, Nakadai Kazuhiro

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 2013.12

Language：Japanese

A telepresence robot is useful in situations where a user in a remote area has to control the robot to communicate with people. However, some issues remain: the target speech is contaminated with unnecessary speech, and the remote user cannot understand the speech in multi-party conversation. We propose a telepresence party robot, "TelePaBot", that visualizes the positions of utterances and provides a selective listening function. A case study suggested that TelePaBot smooths remote communication even when multi-party conversation occurs.

researchmap
マイクロホンアレイの位置推定によるホース型ロボットの姿勢推定

坂東宜昭, 大塚琢馬, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 奥乃博

第76回全国大会講演論文集 2014.3

Language：Japanese

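A hose-shaped robot is a rescue robot characterized by its long, thin body, and it can search spaces such as gaps in collapsed buildings. Posture estimation methods for this robot using acceleration sensors, camera images, and similar inputs have been proposed to make operation more efficient, but they suffer from problems such as accumulated error. This paper describes a method that attaches a microphone array and small loudspeakers to the robot and estimates its posture by estimating their positions with sound. The method estimates the posture from the time differences of arrival, at each microphone, of test sounds emitted from the loudspeakers; because these time differences reflect the current positional relationship between the microphones and loudspeakers, past errors can be corrected. We evaluated the effectiveness of the method using real recorded data.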

researchmap
音ランドマークを用いたマルチコプターの定位

ラナシナパヤ, 中村圭佑, 中臺一博, 高橋秀幸, 木下哲男

第76回全国大会講演論文集 2014.3

Language：English

We propose a novel approach to multicopter localization, using sound landmarks and one embedded microphone. This approach can benefit multicopter localization in that it requires less computational power and smaller payloads than image-based approaches. However, the high ego-noise of multicopters is a serious threat for sound-based algorithms. We simulated a 2D localization method based on a Kalman Filter using measurements of acceleration and sound landmarks' intensity. A random walk model is used to update the multicopter's position with the Kalman Filter; the calculated estimation is then corrected using noisy measurements from the embedded microphone and accelerometer. Simulation results show that the proposed algorithm can successfully track the multicopter's motion in a noisy environment. We confirmed the effectiveness of our proposed algorithm by comparing its performance and robustness to a time/phase based algorithm.

researchmap
Deep Neural Networkを用いたマルチモーダル音声認識の為の特徴量学習

山口雄紀, 野田邦昭, 中臺一博, 奥乃博, 尾形哲也

第76回全国大会講演論文集 2014.3

Language：Japanese

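The goal of this study is to design image features for multimodal speech recognition. To improve the accuracy of multimodal speech recognition, an important question is how to extract, from lip images, information that represents phonemes and visemes, the smallest units of speech recognition. In this study, we built a method that extracts image features from a large number of lip images in a self-organizing manner using a Deep Neural Network (DNN), which is attracting attention as a new approach to feature learning. We evaluated the obtained image features on an isolated word recognition task and, by analyzing the feature space, examined their relationship to visemes. We also examined the improvement in noise robustness achieved by audio-visual integration of the obtained image features and speech.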

researchmap
Design and Implementation of Multidirectional Sound Annotation Tool with HARK

SUGIYAMA Osamu, ITOYAMA Katsutoshi, NAKADAI Kazuhiro, OKUNO Hiroshi G

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 2014.6

Language：Japanese

In this study, we designed and developed a multidirectional sound source annotation tool with the robot audition software HARK. With the rise of inexpensive microphone array products and HARK, multidirectional sound sources can be recorded and analyzed easily. The combination of a microphone array and the software enables us to separate, localize, and track multidirectional sound sources. Most solutions for accessing this separated sound source information provide clients for interpreting simplified information about the separated sources, but not for directly making semantic annotations. Our proposed sound annotation tool provides drag & drop annotation with a 3D sound source view and also provides annotation autocompletion with an SVM trained on the user's annotation history. The proposed features enable users to do the annotation task intuitively and confirm its result. We also conducted an evaluation demonstrating the efficiency of annotation done using the tool.

researchmap
TeleCoBot : A Telepresence system of taking account for conversation environment

TAKAHASHI Masaaki, OGATA Masa, IMAI Michita, NAKAMURA Keisuke, NAKADAI Kazuhiro

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 2014.12

Language：Japanese

Telepresence robots are increasingly studied as a tool for communicating with remote places. However, a telepresence system cannot convey the user's utterance accurately because it does not consider differences in the sound environment, such as noise. In addition, when the user talks with several people in the remote place, the user wants to change the speaker volume freely depending on the situation. We therefore propose a telepresence conversation robot named "TeleCoBot". It automatically regulates the volume of the user's utterance according to the distance to the partner and the noise level in the remote place. The user can also change the volume freely depending on the conversation situation. In this paper, we conduct a case study, and the result indicates that TeleCoBot's UI should be more effective and enhance the presence.

researchmap
Wind-induced Noise Reduction in Time Domain Using Closely-aligned Two Microphones

SAKATA Naoto, NAKAJIMA Hirofumi, NAKADAI Kazuhiro

Technical report of IEICE. EA 2015.3

Language：Japanese

In this report, wind-induced noise reduction in the time domain was investigated using two closely aligned microphones. A linear beamforming filter in the frequency domain, based on time frame decomposition, was applied to signals in the time domain. The filter's ability to reduce wind-induced noise was compared with and without the time frame decomposition. As a result, the signal-to-noise ratio improved by about 2 to 13 dB for recorded signals disturbed by wind-induced noise. When a filter composed of simple delay units was employed, the time frame decomposition strongly influenced the reduction of wind-induced noise.

researchmap
Wind-induced noise reduction by linear beamforming using a 2-channel microphone

坂田直人, 村上哲郎, 中島弘史, 中臺一博

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 2015.8

Language：Japanese

researchmap
Automatic impulse response truncation based on relative amplitude spectrum

中島弘史, 坂田直人, 加科優希, 中臺一博

回路とシステムワークショップ論文集 Workshop on Circuits and Systems 2015.8

Language：Japanese

researchmap
変分ベイズ多チャネルロバストNMFに基づくマイクロホンの移動・被覆を許容する音声強調 (音声) -- (オーガナイズドセッション「あらゆる音を対象とした情報処理の実現に向けて」)

坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 河原達也, 奥乃博

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 2016.8

Language：Japanese

researchmap
A Study on body movements and postures at Human-Robot Interaction using speech and image information

蓮本諒介, 小山大幾, 水本武志, 中村圭佑, 中臺一博, 今井倫太

電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 2017.2

Language：Japanese

researchmap
確率的生成モデルに基づく複数A/Dコンバータのチャネル間同期

糸山克寿, 中臺一博

日本音響学会研究発表会講演論文集(CD-ROM) 2018.2

Language：Japanese

researchmap
振動センサを用いた災害時の避難者の属性推定に関する検討

尾崎翔, 浅野太, 中臺一博

電子情報通信学会大会講演論文集(CD-ROM) 2018.3

Language：Japanese

researchmap
可聴音を用いた周波数自動選択に基づく距離推定法の検討

高尾麻衣子, 干場功太郎, 中臺一博

情報処理学会全国大会講演論文集 2018.3

Language：Japanese

researchmap
Quad‐directional LSTMを用いた音楽音響信号修復とその評価

谷口亮輔, 干場功太郎, 中臺一博

情報処理学会全国大会講演論文集 2018.3

Language：Japanese

researchmap
Development of Robot Audition to Extreme Environments

奥乃博, 糸山克寿, 中臺一博, 公文誠, 坂東宜昭, 干場功太郎

システム制御情報学会研究発表講演会講演論文集(CD-ROM) 2018.5

Language：Japanese

researchmap
スペクトル伸縮に基づく複数A/Dコンバータのチャネル間同期

糸山克寿, 中臺一博

日本機械学会ロボティクス・メカトロニクス講演会講演論文集(CD-ROM) 2018.6

Language：Japanese

researchmap
振動センサを用いた災害時における年少避難者の特定手法に関する検討

尾崎翔, 浅野太, 中臺一博

電子情報通信学会大会講演論文集(CD-ROM) 2018.8

Language：Japanese

researchmap
CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments

Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya Ogata

2018.11

Presentation type：Oral presentation (general)

Casual conversations involving multiple speakers and noises from surrounding
devices are part of everyday environments and pose challenges for automatic
speech recognition systems. These challenges in speech recognition are the
target of the CHiME-5 challenge. In the present study, an attempt is made to overcome
these challenges by employing a convolutional neural network (CNN)-based
multichannel end-to-end speech recognition system. The system comprises an
attention-based encoder-decoder neural network that directly generates a text
as an output from a sound input. The multichannel CNN encoder, which uses
residual connections and batch renormalization, is trained with augmented data,
including white noise injection. The experimental results show that the word
error rate (WER) was reduced by 11.9% absolute from the end-to-end baseline.

researchmap
Robot Audition : Its Issues and State of the Arts

OKUNO Hiroshi G, NAKADAI Kazuhiro

日本音響学会研究発表会講演論文集 2005.3

Language：Japanese

researchmap
Implementation of Sound Source Separation Filter on Dynamically Reconfigurable Processor

KUROTAKI Shunsuke, SUZUKI Noriaki, NAKADAI Kazuhiro, OKUNO Hiroshi, AMANO Hideharu

IEICE technical report 2005.5

Language：Japanese

researchmap
Human Robot Interaction Research in HRI-JP

TSUJINO Hiroshi, NAKANO Mikio, NAKADAI Kazuhiro, HASEGAWA Yuji

IEICE technical report 2005.11

Language：Japanese

As computer technology advances, machines are expected to perform more functional tasks at home, and technology realizing a "human-machine interface that anyone can use" is becoming more important. An intelligent robot is the ultimate machine in this trend, and advanced concepts and values for such robots are being actively investigated. We focus on "bi-directional human-robot interaction" as a future interface between humans and intelligent robots. In this paper, we present our recent results on the "robot architecture for human-robot interaction", "speech recognition by robot" and "speech recognition by human" in our human-robot interaction research.

researchmap
Robust Domain Selection using Dialogue History in Multi-Domain Spoken Dialogue System

KANDA NAOYUKI, KOMATANI KAZUNORI, NAKANO MIKIO, NAKADAI KAZUHIRO, TSUJINO HIROSHI, OGATA TETSUYA, OKUNO HIROSHI G

IPSJ SIG Notes 2006.2

Language：Japanese

We have developed a robust domain selection method using dialogue history in multi-domain spoken dialogue systems. We define domain selection as a classification problem among (I) the domain in the previous turn, (II) the domain in which N-best speech recognition results can be accepted with the highest recognition score, and (III) other domains. We constructed a classifier by decision tree learning with a dialogue corpus. The experimental result using 10 subjects shows that our method reduced domain selection errors by 11.6% compared with a conventional method using speech recognition likelihoods only.

researchmap
D-14-10 Improvement for Noise-Robustness of Automatic Speech Recognition Using Coarse Phoneme Recognition

SUMIYA Ryota, NAKADAI Kazuhiro, NAKANO Mikio, ICHIGE Koichi, HIROSE Yasuo, TSUJINO Hiroshi

Proceedings of the IEICE General Conference 2006.3

Language：Japanese

researchmap
Towards Information Integration for Human-Robot Interaction

NAKADAI Kazuhiro

IEICE technical report 2006.10

Language：Japanese

To realize natural human-robot interaction, we consider that a robot should have at least two functions: real-world auditory scene analysis to understand the surrounding environment, and robot expression to send information to users intelligibly and naturally. These functions should be highly robust against environmental changes because the robot has to work in dynamically changing noisy environments. This paper addresses information integration to improve the robustness of these functions, since we believe that such an integration approach improves robustness, and describes four research topics based on the integration approach: integration of sound source separation and automatic speech recognition based on missing feature theory, sound source tracking by integration of two types of microphone arrays, multimodal expression using prosodic information and head colors, and a new speech function using a directional speaker.

researchmap
Robot Audition System Towards Natural Human-Robot Verbal Communication

中臺一博, 山本俊一, 浅野太

人工知能学会全国大会論文集 2007

Language：Japanese

researchmap
AS-6-1 Sound Stream Formation and Human Tracking by Integration of Microphone Arrays

Nakadai Kazuhiro, Nakajima Hirofumi, Murase Masamitsu, Okuno Hiroshi G, Hasegawa Yuji, Tsujino Hiroshi

Proceedings of the IEICE General Conference 2007.3

Language：English

researchmap
High performance blind source separation using an adaptive step-size parameter method

NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, TSUJINO Hiroshi

IEICE technical report 2007.6

Language：Japanese

This paper describes a novel blind source separation (BSS) method. One of the most important factors in BSS performance is a step-size parameter to update a decomposition matrix which is generally used for extracting a target sound source. A fixed value which was obtained empirically is commonly used as the step-size parameter. However, in the real world, the surrounding environment changes dynamically. So, conventional BSS with a fixed step-size parameter does not provide the best performance and sometimes results in divergence of the decomposition matrix. We propose a method that allows for an adaptive step-size parameter. Since the proposed method is generally applicable to BSS methods, we applied it to six types of BSS algorithms with a microphone array embedded in Honda's ASIMO. Experimental results show that the proposed method improves sound source separation in the four BSS algorithms, and the step-size parameter is maintained optimally even when the surrounding environment changes.

researchmap
Design and Evaluation of Barge-In enable Robot Audition System with ICA and MFT-based ASR

TAKEDA Ryu, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

全国大会講演論文集 2008.3

Language：Japanese

researchmap
Estimation of sound source orientation using a 96 channel microphone array

KIKUCHI Keiko, DAIGO Tohru, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, KANEDA Yutaka

IEICE technical report 2008.7

Language：Japanese

This paper addresses sound source orientation estimation using a 96ch microphone array. We proposed a beam-forming method with estimation of sound source directivity, and reported orientation estimation of a speech source such as a loudspeaker or an actual human. However, in this method, the transfer function used to design a beam-former should be the same as that of the target sound source. Otherwise, the performance deteriorated due to a mismatch between these two transfer functions. In addition, voice activity detection (VAD) was manually performed. To solve the former, we proposed amplitude-based orientation estimation using a histogram to relax the effect of the mismatch problems mainly caused by phase errors and outliers. For the latter, speech frequency component detection based on inner product and automatic VAD based on auto-correlation are introduced to form a frequency-temporal masking pattern. Preliminary experiments showed that sound source orientation estimation with automatic VAD for actual human voices drastically improved even when using a loudspeaker-based transfer function.

researchmap
Panel Discussion : Application Developments of Speech Recognition

NISIMURA Ryuichi, NAKANO Teppei, KURIHARA Kazutaka, NAKADAI Kazuhiro, YOSHINO Takashi

IPSJ SIG Notes 2008.10

Language：Japanese

To induce developments of ASR applications, this panel discussion introduces actual case studies. We also indicate some problems of ASR application developments.

researchmap
Realtime Synchronization Method between Audio Signal and Score Using Beats, Melodies, and Harmonies for Singer Robots

OTSUKA Takuma, MURATA Kazumasa, TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, OGATA Tetsuya, OKUNO Hiroshi G

全国大会講演論文集 2009.3

Language：Japanese

researchmap
Simultaneous three talker speech recognition using soft mask and model adaptation technique

TAKAHASHI Toru, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

全国大会講演論文集 2009.3

Language：Japanese

researchmap
The design of a directional sound source for numerical simulation based on wave acoustics

SUZUKI Toshimasa, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, ARAI Takahiro, HASEGAWA Yuji

IEICE technical report 2009.6

Language：Japanese

Thanks to improvements in computer performance, numerical simulation based on wave acoustics works in practical time with off-the-shelf computers. Such a numerical simulation method accurately estimates a sound field when the environment is simple and simulated, like a free sound field. However, this method has difficulties in simulating a real-world acoustic environment. One of the issues for real-world simulation is dealing with sound directivity. Thus, most numerical simulators assume a point sound source to avoid this issue. Several studies that cope with sound directivity have been reported, but their accuracy and practical utility are insufficient for real-world simulation, because an accurate sound propagation model is necessary to deal with sound directivity. We use a compact finite difference method based on sound field digitization, which has an accurate sound propagation model. However, this method also has a problem: two points are simulated differently even when they are located at the same distance from the sound source, due to the difference in the effect of numerical dispersion. In this paper, we first confirm the performance of our method by using an omni-directional point source in a free sound field. After that, we show that our method is able to simulate a directional sound source accurately using a combination of a simple loudspeaker and a point source model.

researchmap
Blind Dereverberation Improved By Multi-Stage Processing

NAKAJIMA Hirofumi, DAIGO Tohru, NAKADAI Kazuhiro, KANEDA Yutaka, HASEGAWA Yuji

電子情報通信学会技術研究報告. EA, 応用音響 2009.7

Language：Japanese

researchmap
Blind dereverberation improved by multi-stage processing

NAKAJIMA Hirofumi, DAIGO Tohru, NAKADAI Kazuhiro, KANEDA Yutaka, HASEGAWA Yuji

IEICE technical report 2009.7

Language：Japanese

This paper addresses a multi-stage processing mechanism that improves various dereverberation methods. In the mechanism, each stage is implemented as an intermediate processing module that connects the outputs of the modules on the previous stage to the inputs of other modules on the next stage. Since the dereverberation performance at each stage depends on the input channel combinations, we proposed two additional processes: channel selection and delay addition. We applied our proposed mechanism with these two processes to two dereverberation methods, i.e., SBM and RDAIF. The proposed system showed the following results: (1) The channel selection process improved performance by 3-10 dB. The optimum combination can reduce the number of input channels without any degradation. (2) The delay addition process improved the suppression performance by 3-10 dB. (3) With the multi-stage mechanism, the improvements for SBM and RDAIF are 18.2 dB and 13.6 dB, respectively, while the performance without the mechanism is only 14.6 dB and 3.5 dB, respectively. We conclude that the proposed mechanism and processes are effective in improving dereverberation performance.

researchmap
Robot audition system development and parameter-tuning in real environment

TAKAHASHI Toru, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

全国大会講演論文集 2010.3

Language：Japanese

researchmap
Self-speech cancellation with Semi-blind ICA for Robot speech interaction

TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

全国大会講演論文集 2010.3

Language：Japanese

researchmap
Real time speaker orientation estimation using a room microphone array

HARUBARA Takuya, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, KANEDA Yutaka

IEICE technical report 2010.7

Language：Japanese

This paper addresses a real-time sound source orientation estimation system using a 96ch microphone array. We proposed a beam-forming method with estimation of sound source directivity, and reported orientation estimation of a speech source such as a loudspeaker or an actual human. Furthermore, we showed that the precision of the orientation estimation system is improved by introducing four additional techniques: amplitude extraction, correlation-based automatic voice activity detection (VAD), frequency masking, and histogram integration. We developed a real-time sound source orientation system. However, the precision of the real-time system is not sufficient for practical use. In this paper, we investigate the main causes of the estimation error and propose an advanced real-time orientation estimation system. The experimental results show that the advanced system reduces errors by 20° to 30° compared with the previous system.

researchmap
Robot Audition : Hands-Free Automatic Speech Recognition under Highly-Noisy Environments

NAKADAI Kazuhiro, OKUNO Hiroshi G

IEICE technical report 2011.1

Language：Japanese

This paper addresses robot audition, which realizes listening capabilities for robots using robot-embedded microphones. For robot audition, we propose real-time sound source separation and automatic speech recognition (ASR) techniques for dynamically changing environments based on microphone array processing, which is applicable to hands-free ASR under highly-noisy environments. Implementation of the proposed techniques is open-sourced as robot audition software called "HARK." We show the effectiveness of these techniques through applications of HARK to robots.

researchmap

Industrial property rights

音声処理装置、音声処理方法及びプログラム

中臺一博, 佐畑智幸

Applicant：本田技研工業株式会社

Application no：特願2017-062795 Date applied：2017.3

Announcement no：特開2018-165761 Date announced：2018.10

J-GLOBAL

researchmap
会話支援装置、会話支援装置の制御方法、及び会話支援装置のプログラム

中臺一博, 中村圭佑

Applicant：本田技研工業株式会社

Application no：特願2017-042240 Date applied：2017.3

Announcement no：特開2017-129873 Date announced：2017.7

J-GLOBAL

researchmap
音声処理装置、ウェアラブル端末、携帯端末、および音声処理方法

水本武志, 中臺一博

Applicant：本田技研工業株式会社

Application no：特願2016-203690 Date applied：2016.10

Announcement no：特開2018-067050 Date announced：2018.4

J-GLOBAL

researchmap
音響処理装置および音響処理方法

中臺一博, 小島諒介

Applicant：本田技研工業株式会社

Application no：特願2016-172985 Date applied：2016.9

Announcement no：特開2018-040848 Date announced：2018.3

J-GLOBAL

researchmap
音声処理装置、音声処理方法及び音声処理プログラム

ゴメスランディ, 中臺一博

Applicant：本田技研工業株式会社

Application no：特願2016-164608 Date applied：2016.8

Announcement no：特開2018-031909 Date announced：2018.3

J-GLOBAL

researchmap
検査装置および検査方法

水本武志, 中村圭佑, 中臺一博

Applicant：本田技研工業株式会社

Application no：特願2016-065005 Date applied：2016.3

Announcement no：特開2017-183861 Date announced：2017.10

J-GLOBAL

researchmap
受付システム及び受付方法

近藤宏, 住田直亮, 椎名あす香, 山本俊一, 中臺一博, 中村圭佑

Applicant：本田技研工業株式会社

Application no：特願2016-066568 Date applied：2016.3

Announcement no：特開2017-182334 Date announced：2017.10

J-GLOBAL

researchmap
受付システムおよび受付方法

住田直亮, 近藤宏, 椎名あす香, 山本俊一, 中臺一博, 中村圭佑

Applicant：本田技研工業株式会社

Application no：特願2016-062556 Date applied：2016.3

Announcement no：特開2017-174346 Date announced：2017.9

J-GLOBAL

researchmap
音声処理装置および音声処理方法

山本俊一, 住田直亮, 近藤宏, 椎名あす香, 中臺一博, 中村圭佑

Applicant：本田技研工業株式会社

Application no：特願2016-051137 Date applied：2016.3

Announcement no：特開2017-167270 Date announced：2017.9

J-GLOBAL

researchmap
音声処理装置および音声処理方法

水本武志, 中村圭佑, 中臺一博

Applicant：本田技研工業株式会社

Application no：特願2015-191879 Date applied：2015.9

Announcement no：特開2017-067948 Date announced：2017.4

J-GLOBAL

researchmap
ロボット聴覚装置

中臺一博, 奥乃博, 北野宏明

Applicant：科学技術振興事業団

Application no：特願2000-022678 Date applied：2000.1

Announcement no：特開2001-215990 Date announced：2001.8

J-GLOBAL

researchmap
ロボット聴覚装置

中臺一博, 松井龍哉, 奥乃博, 北野宏明

Applicant：科学技術振興事業団

Application no：特願2000-022679 Date applied：2000.1

Announcement no：特開2001-215991 Date announced：2001.8

J-GLOBAL

researchmap
ロボット聴覚システム

中臺一博, 奥乃博, 北野宏明

Applicant：科学技術振興事業団

Application no：特願2000-022677 Date applied：2000.1

Announcement no：特開2001-215989 Date announced：2001.8

J-GLOBAL

researchmap
ロボット聴覚装置

中臺一博, 奥乃博, 北野宏明

Applicant：科学技術振興事業団

Application no：特願平11-341240 Date applied：1999.11

Announcement no：特開2001-157988 Date announced：2001.6

Patent/Registration no：特許第3277279号 Date issued：2002.2

J-GLOBAL

researchmap

Awards

Best Paper Award

2023.9

researchmap
Fellow

2023.1 IEEE

researchmap
2021 IEEE/SICE International Symposium on System Integration (SII 2021) Best Paper Finalist Award

2022.1 IEEE

researchmap
日本ロボット学会フェロー

2021.9 日本ロボット学会

researchmap
日本ロボット学会功労賞

2021.9 日本ロボット学会

researchmap
双葉電子財団衛藤細矢記念賞

2021.5 双葉電子財団

researchmap
10th International Conference on Cloud Computing, Data Science & Engineering (Confluence-2020), Amity Research Award for Significant contribution in the field of Artificial Intelligence

2021.1

researchmap
Amity School of Engineering and Technology, Honorary Professor

2021.1

researchmap
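Best Poster Award, 29th Annual Meeting of the Japan Association for Landscape Ecology

2020.3

Encouragement Award, 81st National Convention of the Information Processing Society of Japan

2019.3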

2019 IEEE/SICE International Symposium on System Integration (SII 2019) Best Paper Finalist Award

2019.1 IEEE

best generation award of innovation program

2018.10 Ministry of Internal Affairs and Communications

Kazuhiro Nakadai

The 36th Annual Conference of the Robotics Society of Japan (RSJ 2018) International Session BEST PAPER AWARD

2018.9 The Robotics Society of Japan

Kazuhiro Nakadai

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017) Best Paper Award Finalist on Safety, Security, and Rescue Robotics (in memory of Motohiro Kisoi)

2017.9 IEEE

Kazuhiro Nakadai

Best Paper Award, Advanced Robotics

2016.9 The Robotics Society of Japan

Kazuhiro Nakadai

Incentive Award

2016.6 JSAI

Kazuhiro Nakadai

IEEE-RAS International Symposium on Safety, Security, and Rescue Robotics (SSRR) Innovative Paper Award

2015.10 IEEE

Kazuhiro Nakadai

IEEE-RAS International Symposium on Safety, Security, and Rescue Robotics Best Demonstration Award

2015.10 IEEE

Kazuhiro Nakadai

Best Paper Award, Advanced Robotics

2014.9 The Robotics Society of Japan

Kazuhiro Nakadai

Best Paper Award (1st Prize), International Conference on Industrial Engineering & Other Applications of Applied Intelligent Systems(IEA/AIE 2013)

2013.6 International Society of Applied Intelligence (ISAI)

Kazuhiro Nakadai

Incentive Award

2012.6 JSAI

Kazuhiro Nakadai

International Conference on Intelligent Robots and Systems (IROS 2011) BEST PAPER Nomination Finalist

2011.10 IEEE

Kazuhiro Nakadai

A Best Paper Award, International Conference on Industrial Engineering & Other Applications of Applied Intelligent Systems(IEA/AIE 2010)

2010.6 International Society of Applied Intelligence (ISAI)

Kazuhiro Nakadai

Incentive Award

2009.6 JSAI

Kazuhiro Nakadai

Best paper award (3rd place)

2009.6 IEEE Vail Computer Elements Workshop

Kazuhiro Nakadai

International Conference on Intelligent Robots and Systems (IROS 2008) New Technology Foundation (NTF) Award For Entertainment Robots and Systems Finalist

2008.10 IEEE

Kazuhiro Nakadai

SI Best Session Award

2006.12 SICE

Kazuhiro Nakadai

Funai Promotion Award

2003.3 Funai Foundation on Information Technology

Kazuhiro Nakadai

International Conference on Intelligent Robots and Systems (IROS 2001) BEST PAPER Nomination Finalist

2002.10 IEEE

Kazuhiro Nakadai

Telecommunication System Technology Award

2002.3 The Telecommunications Advancement Foundation

Kazuhiro Nakadai

Best Paper Award (1st Prize), International Conference on Industrial Engineering & Other Applications of Applied Intelligent Systems(IEA/AIE 2001)

2001.6 International Society of Applied Intelligence (ISAI)

Kazuhiro Nakadai

Best Paper Award, International Conference on Information Society (IS-2000)

2000.10

Kazuhiro Nakadai

Research Projects

Smart drone audition: A search and rescue drone system that listens and communicates

Grant number：22KF0141 2023.3 - 2025.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for JSPS Fellows

Grant amount：¥2200000 （ Direct Cost: ¥2200000 ）

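Construction of Multimodal Ecological Environment Understanding and Analysis Technology for Wild Bird Behavior Analysis

Grant number：20H00475 2020.4 - 2023.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)

Nakadai Kazuhiro, Ide Ichiro, Suzuki Reiji, Morimoto Gen, Matsubayashi Shiho, Kojima Ryosuke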

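Grant amount：¥45500000 （ Direct Cost: ¥35000000 、 Indirect Cost：¥10500000 ）

This project extends the "robot audition technology" developed in robotics, integrates it with visual processing and machine learning, and aims to establish a "multimodal environment understanding technology" applicable to ecology and environmental science. The goal is to bring ecology and environmental science to a new level through next-generation wildlife observation technology that raises both the quality and quantity of wildlife observation data several hundredfold. The objectives are (1) to develop simultaneous three-dimensional tracking of multiple wild birds from their calls and images, and to apply it to the analysis of inter-individual communication within flocks, nocturnal behavior, and mating behavior; and (2) to establish soundscape analysis technology through the analysis of background sounds in real fields, and to assess the impact of the environment and humans on wild bird ecosystems and intergenerational transmission. Both are pursued through method development as well as real-field observation and analysis. In the first fiscal year, the project was heavily affected by the COVID-19 pandemic and the resulting semiconductor shortage: outdoor observation work could not be carried out, and construction of the planned new observation devices was delayed. A one-year carry-over was therefore arranged, but the situation did not improve substantially in fiscal 2021, and the project as a whole is behind schedule. Even so, research on the items that could proceed was advanced through ingenuity, yielding the following results.
Technical results: three-dimensional tracking and calibration techniques using multiple microphone arrays; development of a long-term recording device with a camera and the start of long-term fixed-point observation; and a low-dimensional embedding method as soundscape analysis technology.
Publications: 7 journal papers, 11 international conference papers, 22 domestic conference papers, and 5 awards.
Other results: to promote the project internationally, an organized session was held at the international conference IEEE/SICE SII 2021; two meetings of the JSAI Special Interest Group on AI Challenges were held on the theme of this project; and, as outreach, tutorials on the robot audition software HARK were held twice at conferences in Japan and abroad (IJCAI2020 and the JSAI joint SIG meeting).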

Applications of robot audition techniques to multi-scale observations of ecological dynamics in bird vocalizations

Grant number：19KK0260 2019.10 - 2023.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Fund for the Promotion of Joint International Research (Fostering Joint International Research (B))

Grant amount：¥18460000 （ Direct Cost: ¥14200000 、 Indirect Cost：¥4260000 ）

Audio-Visual Integration to Target Recognition by Drone Audition

Grant number：17K00365 2017.4 - 2020.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (C)

Kumon Makoto

Grant amount：¥4550000 （ Direct Cost: ¥3500000 、 Indirect Cost：¥1050000 ）

This study considers recognizing targets on the ground from drones equipped with microphones. The target acoustic signal captured at the drone is generally heavily distorted by ego-noise, so it is difficult to recognize the target from acoustic signals alone. This study aims to develop technology that compensates for this difficulty by incorporating visual sensor information.
Acoustic features, which contain pauses, are fused with visual features, which are normally provided sequentially; associating the visual information with the acoustic target is not trivial.
Based on the developed methods, audio-visual integration is shown to improve audio target recognition in noisy conditions. As an example, three-dimensional position estimation of multiple moving targets by a drone with microphones was achieved.

Cognitive Interaction Model of Interaction Gap in Human-Robot Interaction

Grant number：16H02884 2016.4 - 2020.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Imai Michita

Grant amount：¥16250000 （ Direct Cost: ¥12500000 、 Indirect Cost：¥3750000 ）

Our project studied communication between humans and robots from the viewpoint of timing and gaps, with the goal of natural communication. The first result was a method for estimating the tiredness that a person feels when communicating with a robot. By detecting the direction of the person's face, the method estimated this tiredness and improved the quality of the robot's conversation. Second, we constructed a method for imitating human body movements in real time. Previous research used a time delay to keep people from noticing the imitation; our study instead devised a method that changes the size of the imitated body movements. The method improved communication with people by imitating their body movements.

Outdoor Scene Understanding for Bird Song Analysis

Grant number：16K00294 2016.4 - 2019.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (C)

Nakadai Kazuhiro

Grant amount：¥4550000 （ Direct Cost: ¥3500000 、 Indirect Cost：¥1050000 ）

By integrating robot audition and machine learning technologies, we developed outdoor sound environment understanding technology that extracts structured information on the what, when, and where of bird song events from wild bird songs recorded by multiple microphone arrays, and that estimates the relationships between wild birds. In addition, we built an outdoor sound environment understanding system for bird song analysis that is easy to use even for non-experts and that reduces the burden of bird song analysis previously performed manually, contributing to animal behavioral science and bioacoustics.

Deployment of Robot Audition Toward Understanding Real World

Grant number：24220006 2012.5 - 2017.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (S)

Okuno Hiroshi G.

Grant amount：¥218140000 （ Direct Cost: ¥167800000 、 Indirect Cost：¥50340000 ）

This research project aims to deploy robot audition in natural and disaster environments by enhancing the robot audition software HARK. Since HARK for Windows was released, it has been downloaded about 90K times. Applications to multi-party interaction and music co-player robots demonstrate its feasibility. The robustness of sound source localization for UAVs provided by iGSVD-MUSIC, together with sound-based shape estimation and speech enhancement for hose-shaped robots, demonstrates the feasibility of using sound for search and rescue robots. Acoustic analysis of frog choruses, and the development of HARKBird based on HARK and its evaluation in observing and analyzing bird song communication in actual fields, demonstrate the feasibility of acoustic analysis in ecology. Finally, we have established fundamental technologies of robot audition for acoustic understanding of the real world.

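Research on Real-World Robot Audition toward Realizing Auditory Interaction

Grant number：24118702 2012.4 - 2014.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)

Nakadai Kazuhiro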

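Grant amount：¥9360000 （ Direct Cost: ¥7200000 、 Indirect Cost：¥2160000 ）

To realize "auditory interaction for human-robot symbiosis", in which humans and robots interact more naturally in real environments, this project aims to develop real-environment robot audition technology. In this fiscal year, we refined the individual fundamental technologies and worked on their integration.
(1) For sensor synchronization technology for real-environment robot audition, we focused on evaluating ego-noise estimation on actual robots. Ego-noise suppression that extends non-negative matrix factorization with a nonparametric Bayesian model needs only one microphone and no motion reference, so it has the advantage of requiring neither (i) inter-microphone synchronization nor (ii) sound-motion synchronization. First, on Hearbo, a humanoid robot with a mobile base, we compared it with the template method, which has been reported to perform well among conventional methods, and confirmed that it outperformed the conventional method in signal-to-noise ratio and signal-to-interference ratio. We then evaluated it on Robovie W, one of the target robots of the human-robot symbiosis research area, and obtained performance nearly equal to that on Hearbo. Considering that the conventional method cannot be applied to Robovie W because joint angle information is unavailable, the proposed method offers both high performance and wide applicability.
(2) For real-environment robot audition technology for building a good-listener robot, we built (i) a sound source identification method based on a nonparametric Bayesian model for distinguishing voices and (ii) an integrated localization, separation, and recognition technology using a microphone array for understanding the sound environment, both of which we had been developing, and made them run on the open-source robot audition software HARK. Furthermore, (iii) for visualization, in collaboration with the Otake Laboratory at Chiba University, we developed the tabletop microphone array "Kurage-kun" and, by running HARK on it, built a tool that visualizes the direction and timing of utterances intuitively and clearly.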

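Multifaceted Deployment of Robot Audition toward Real-World Understanding

Grant number：24240035 2012

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)

Okuno Hiroshi, Kagami Satoshi, Itoyama Katsutoshi, Kumon Makoto, Nakadai Kazuhiro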

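Grant amount：¥21060000 （ Direct Cost: ¥16200000 、 Indirect Cost：¥4860000 ）

Because sound diffuses more strongly than images, sound environment understanding by robot audition can work even in environments that images alone cannot capture, but how to exploit information obtained from a wide area remains a challenge. Building on previously developed robot audition, this project aims at a multifaceted deployment of robot audition technology for safety and security that can understand real-world sound environments.
Specifically, the project was to address:
WP1: Extension to diverse microphone configurations: improving the performance of HARK-16 and synchronizing multiple distributed microphone arrays;
WP2: Extension from indoors to outdoors: from indoor acoustic map creation to sound acquisition from the air by unmanned aerial vehicles and sound source localization;
WP3: Extension from speech to sound in general, including musical and environmental sounds: in particular nonparametric Bayesian signal processing, animal acoustics using sound-to-light conversion, real-time separation of instrument sounds from musical performances, and onomatopoeia recognition of environmental sounds.
In the two months from the start of the research until withdrawal, we prepared the experimental equipment, worked out the details of how the unmanned helicopter would be used, designed a multichannel A/D device to be mounted on the unmanned helicopter, and in particular designed a timestamped acoustic data transfer scheme to improve the processing of asynchronous distributed microphones. In addition, we worked on refining HARK-Binaural, developing a Bayesian method for localizing moving sound sources, developing a Bayesian MUSIC (Multiple Signal Classification) method that suppresses impulsive and reflected sounds, developing an IVA method based on a nonparametric Bayesian approach that simultaneously estimates source activity and performs source separation, developing an instrument sound separation method for polyphonic music that tolerates fluctuations in instrument sound models, and enhancing Kaeru-Hotaru (Frog-Firefly) using band-pass filters.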

Computational Auditory Scene Analysis Using Active Audio-Visual Integration in a Dynamically Changing Environment

Grant number：22700165 2010 - 2012

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Young Scientists (B)

NAKADAI Kazuhiro

Grant amount：¥4030000 （ Direct Cost: ¥3100000 、 Indirect Cost：¥930000 ）

A framework for Audio-Visual Integration (AVI), which provides optimal integration according to the quality of the audio and visual information obtained from a robot’s camera and microphone, was proposed and implemented. In addition, the framework was extended into “Active Audio Visual Integration (AAVI)”, which improves the quality of audio and visual information through active robot motion. Preliminary experiments on automatic speech recognition and voice activity detection showed that the AAVI framework worked effectively even in visually and/or auditorily noisy conditions.

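Human-Robot Symbiosis through Music

Grant number：22118502 2010 - 2011

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)

Nakadai Kazuhiro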

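Grant amount：¥11960000 （ Direct Cost: ¥9200000 、 Indirect Cost：¥2760000 ）

In FY2011, we integrated the individual music-processing functions built so far (robust beat tracking using score information, ego-noise suppression, hand motion detection using Kinect, detection of the start and end of a piece from the motion of a flutist's flute, and a human-robot ensemble model based on oscillators) and built an ensemble demonstration with real robots. Specifically, we realized a quartet of two humanoid robots and two human performers, achieving harmonious human-robot musical interaction in which the robots adapt to the humans and the humans adapt to the robots. We also built a robot that plays the theremin in time with a human's instrumental performance and demonstrated it on the real robot at the Exhibition Session of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011), a top international conference in robotics, and at the JSAI AI-Challenge SIG meeting, showing its effectiveness. Furthermore, to contribute more to the human-robot symbiosis research area, we used a 16-channel indoor-installed microphone array developed at ATR to develop a technique that estimates each speaker's position and utterance intervals in spontaneous multi-party conversation. We also proposed a metric for measuring estimation errors and demonstrated its effectiveness. Beyond the musical interaction proposed at the planning stage, we were also able to develop fundamental technologies toward a good-listener robot using a microphone array, advancing the research further than planned.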

Development of Robot Audition based on Computational Auditory Scene Analysis

Grant number：19100003 2007 - 2011

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (S)

OKUNO Hiroshi, OGATA Tetsuya, KOMATANI Kazunori, TAKAHASHI Toru, SHIRAMATSU Shun, NAKADAI Kazuhiro, KITAHARA Tetsuro, ITOYAMA Katsutoshi, ASANO Futoshi

Grant amount：¥119340000 （ Direct Cost: ¥91800000 、 Indirect Cost：¥27540000 ）

Three main functions of Computational Auditory Scene Analysis (sound source localization, sound source separation, and recognition of separated sounds) have been developed, and the collection is available as open-source robot audition software called "HARK". As a proof of concept for this robot audition, we developed "Prince Shotoku" robots that can listen to simultaneous talkers, and a spoken dialogue system that accepts barge-in utterances from the user. We also developed various technologies for separating musical instrument parts in polyphonic performances, and real-time score following systems. These music-related technologies are applied to build musical robots that play in ensemble with human players.

Audio-Visual Speech Recognition for Robots

Grant number：19700158 2007 - 2008

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Young Scientists (B)

NAKADAI Kazuhiro

Grant amount：¥3480000 （ Direct Cost: ¥3300000 、 Indirect Cost：¥180000 ）
