Updated on 2026/03/05

Nakadai Kazuhiro
 
Organization
School of Engineering Professor
Title
Professor
Profile

Kazuhiro Nakadai received a B.E. in electrical engineering in 1993, an M.E. in information engineering in 1995, and a Ph.D. in electrical engineering in 2003, all from the University of Tokyo. He worked at Nippon Telegraph and Telephone as a system engineer for four years, from 1995 to 1999. He then worked as a researcher on the Kitano Symbiotic Systems Project, ERATO, JST, from 1999 to 2003. Currently he is a principal researcher at Honda Research Institute Japan Co., Ltd. He has held a concurrent position at Tokyo Institute of Technology as a visiting associate professor from 2006 to 2010, a visiting professor from 2011 to 2017, and a specially-appointed professor since July 2017. He also held a concurrent position as a guest professor at Waseda University from 2011 to 2018. His research interests include AI, robotics, signal processing, computational auditory scene analysis, multi-modal integration, and robot audition. He was an executive board member of JSAI from 2015 to 2016 and of RSJ from 2017 to 2018. He is also a member of IPSJ, ASJ, HIS, ISCA, ACM, and IEEE.

News & Topics
  • Listening drone helps find victims needing rescue in disasters

    2017/12/22

    Languages: English

    As part of the ImPACT Tough Robotics Challenge Program, an initiative of the Cabinet Office of Japan, a Japanese research group has developed the world's first system able to detect acoustic signals such as voices from victims needing rescue, even when the victims are difficult to find or are in places where cameras cannot be used. The system combines three technological elements: microphone array technology for the robot ears, an interface for visualizing invisible sounds, and a microphone array that is easily connected to a drone, even in rainy weather.

  • Drone listens closely to detect the location of people needing rescue: technology developed to enable rapid rescue when disasters strike

    2017/12/08

    Languages: Japanese

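    Life-saving by robots such as drones has relied mainly on visual methods such as cameras. By refining the sound collection method to reduce noise, the system detects sounds such as the voices of people under rubble. The result is an all-weather system that can be used for rapid and efficient rescue. It can detect sound in dark, noisy, or out-of-sight locations.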

Degree

  • Ph. D. ( The Univ. of Tokyo )

Research Interests

  • Robot Audition

  • Computational Auditory Scene Analysis

  • Acoustic Signal Processing

  • Robotics

  • Artificial Intelligence

Research Areas

  • Informatics / Intelligent robotics  / Robot Audition

  • Informatics / Intelligent informatics  / Computational Auditory Scene Analysis

  • Informatics / Human interface and interaction  / HMI, HRI

  • Informatics / Software  / OSS

Education

  • Graduate School of Engineering, The University of Tokyo   Information Engineering

    1993.4 - 1995.3

  • The University of Tokyo   Faculty of Engineering   Department of Electrical and Electronics Engineering

    1991.4 - 1993.3

  • The University of Tokyo   School of Arts and Sciences   Natural Sciences I

    1989.4 - 1991.3

Research History

  • Institute of Science Tokyo   Dept. of Systems and Control Engineering, School of Engineering   Professor   Ph.D.

    2024.10

    Country:Japan

  • Tokyo Institute of Technology   Department of Systems and Control Engineering, School of Engineering   Professor   Ph.D.

    2022.4 - 2022.9

    Country:Japan

  • Tokyo Institute of Technology   School of Engineering, Department of Systems and Control Engineering   Specially-appointed Professor

    2016.4 - 2022.3

  • Waseda University   School of Creative Science and Engineering   Guest Professor

    2011.4 - 2018.3

  • Tokyo Institute of Technology   Graduate School of Information Science and Engineering   Adjunct Associate Professor -> Adjunct Professor (2012)

    2006.4 - 2016.3

  • Honda Research Inst. Japan Co., Ltd.   Principal Scientist

    2003.5 - 2022.3

  • JST ERATO Kitano Symbiotic Systems Project   Researcher

    1999.7 - 2003.4

  • NTT Comware   Employee

    1997.9 - 1999.6

  • Nippon Telegraph and Telephone Corporation   Employee

    1995.4 - 1999.6

Committee Memberships

  • RSJ   executive committee member

    2025.3 - 2027.3   

    Committee type:Academic society

  • JSAI   executive committee member

    2024.7 - 2026.6   

    Committee type:Academic society

  • The Robotics Society of Japan (RSJ)   Executive board member

    2017.4 - 2019.3   

    Committee type:Academic society

  • The Japanese Society for Artificial Intelligence (JSAI)   Executive board member

    2015.7 - 2017.6   

    Committee type:Academic society

Books

  • AIの活用と感情に寄り添う音声認識・合成の新展開 Reviewed

    伊藤, 彰則, 森川, 大輔, 上江洲, 安史, 鳥谷, 輝樹, 高野, 佐代子, 河原, 達也, 鵜木, 祐史, 齊藤, 剛史, 吉村, 奈津江, 平井, 重行, 中島, 佐和子, 大河内, 直之, 中臺, 一博, 糸山, 克寿, 福森, 隆寛, 周藤, 唯, 松田, 裕之, 渡辺, 光太朗, 白土, 浩司, 三井, 祥幹, 鳥居, 崇, 中川, 達也, 高橋, 敏, 加藤, 集平

    エヌ・ティー・エス  2025.4  ( ISBN:9784860439361 )

    Total pages:1, 7, 254, 6p, plates 5p   Language:Japanese

  • ロボット聴覚の基礎 : 実環境での音源定位・分離技術 Reviewed

    中臺, 一博, 糸山, 克寿

    オーム社  2025.2  ( ISBN:9784274232527 )

    Total pages:vi, 214p   Language:Japanese

  • 感覚デバイス開発―機器が担うヒト感覚の生成・拡張・代替技術 Reviewed

    廣瀬通孝, 小柳光正, 石鍋隆宏, 川上徹, 小澤史朗, 八木康史, 長原一, 鏡慎吾, 徐剛, 奥乃博, 中臺一博 (ホンダ・リサーチ・インスティチュート・ジャパン), and other contributors

    エヌティーエス  2014.11  ( ISBN:4864690642 )

    Total pages:424   Language:Japanese

MISC

  • 野鳥の歌分析用マイクロホンアレイの開発とその応用

    中臺 一博

    人工知能学会第二種研究会資料   2024 ( Challenge-064 )   01   2024.3

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:一般社団法人 人工知能学会

    DOI: 10.11517/jsaisigtwo.2024.challenge-064_01

  • Video Vision Transformerに基づく音源定位の提案

    横田遥大, BOZKURTLAR Mert, YEN Benjamin, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   42nd   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • 屋外環境下でのドローンのローターノイズによる地表材質推定手法の検討

    矢野翼, YEN Benjamin, 糸山克寿, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   42nd   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Detection of small moving objects as rare events in videos

    西田健次, 糸山克寿, 中臺一博

    人工知能学会第二種研究会資料(Web)   2024 ( Challenge-064 )   05   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

    DOI: 10.11517/jsaisigtwo.2024.challenge-064_05

  • 複数のドローンを用いた音源探査のためのROSネットワークの構築

    山本拓実, 干場功太郎, YEN Benjamin, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   42nd   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • 距離学習を用いた話者識別に基づく話者ダイアライゼーションの検討

    阿坂脩平, 西田健次, 糸山克寿, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   42nd   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • LCMVベースのScan-and-Sum Beamformerによる面領域内音源の抽出

    安江蒼人, YEN Benjamin, 糸山克寿, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   42nd   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • ガウス過程回帰を用いた音響伝達関数の環境変化適応

    藤田侑樹, 糸山克寿, 西田健次, 中臺一博

    人工知能学会第二種研究会資料   2024 ( Challenge-066 )   06   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:一般社団法人 人工知能学会

    DOI: 10.11517/jsaisigtwo.2024.challenge-066_06

  • Biasing Networkを用いた音声認識の雑音耐性向上

    大崎崇博, 周藤唯, 糸山克寿, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   42nd   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Improvement in multi-drone sound source tracking considering self and other drone noise

    三好智大, 山田泰基, YEN Benjamin, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   25th   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Environmental sound classification using microphones with a drone

    野島稔生, 大崎崇博, 矢野翼, YEN Benjamin, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   25th   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Improvement in Target Speech Extraction Using Distance- and Speaker-Based Time-Frequency Masks

    田口鐵人, 石井遼平, 大崎崇博, 阿坂脩平, YEN Benjamin, 糸山克寿, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   25th   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • HARK3.6 and Its Application to Active Drone Audition

    中臺一博, 公文誠, 佐々木洋子, 干場功太郎, YEN Benjamin, 糸山克寿, 瀧ヶ平将行, 寺門直哉, LIN Zirui, GULZAR Haris, BUSTO Monikka Rosalianna, 江田毅晴, 天野英晴

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   25th   2024

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • 気配センシングに向けた磁束密度センサと風速センサを用いた動作検出

    川口洋慶, SHAKEEL Muhammad, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   41st   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • ロボット聴覚のための音源定位と深層ブラインド音源分離の統合

    合澤隆拓, 坂東宜昭, 糸山克寿, 西田健次, 中臺一博, 大西正輝

    日本ロボット学会学術講演会予稿集(CD-ROM)   41st   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • 面音源抽出のための複数拘束MVDRビームフォーマーの逐次計算による高速化

    安江蒼人, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   41st   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • フォンミーゼス分布に基づく音響伝達関数オンライン適応の向上

    藤田侑樹, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   41st   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • 音声強調ネットワークとアダプターを用いた音声認識の耐雑音ロバスト性向上

    大崎崇博, 周藤唯, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   41st   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Introduction of a Python Package for Robot Audition Open Source Software HARK and its implementation for embedded use

    中臺一博, LIN Zirui, 糸山克寿, 瀧ヶ平将行, 寺門直哉, GULZAR Haris, BUSTO Monikka Rosalianna, 江田毅晴, 天野英晴

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   24th   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Towards Natural Spoken Dialogue Systems Based on AI Services

    阿坂脩平, 西田健次, 糸山克寿, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   24th   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Ground Surface Material Estimation Using Drone Rotor Noise

    矢野翼, 糸山克寿, 西田健次, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   24th   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Improved 3D spatial recognition based on audible sound-based echolocation with a 5-channel microphone array

    小林宙輝, 糸山克寿, 西田健次, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   24th   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • Online Adaptation of Fourier series based Lightweight Transfer Function to Improve Sound Source Localization and Separation

    周藤唯, 瀧ケ平将行, 中臺一博, 中島弘史

    人工知能学会第二種研究会資料(Web)   2023 ( Challenge-063 )   08   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

    DOI: 10.11517/jsaisigtwo.2023.challenge-063_08

  • Improving Noise Robustness of Automatic Speech Recognition based on a Parallel Adapter Model with Near-Identity Initialization

    大崎崇博, 周藤唯, 糸山克寿, 西田健次, 中臺一博

    人工知能学会第二種研究会資料(Web)   2023 ( Challenge-063 )   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • An approach to integrating evolutionary models and field experiments on avian vocalization using trait representations based on generative models

    鈴木麗璽, 古山諒, HARLOW Zachary, 中臺一博, 有田隆也

    人工知能学会第二種研究会資料(Web)   2023 ( Challenge-063 )   07   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

    DOI: 10.11517/jsaisigtwo.2023.challenge-063_07

  • 鳥類の鳴き声行動の理解に対するロボット聴覚に基づく観測と生成進化モデル

    古山諒, 鈴木麗璽, 中臺一博, 有田隆也

    日本鳥学会大会講演要旨集   2023   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • 鳴き声の音源定位によるシマフクロウの生息位置把握の試み

    土門優介, 鈴木祐太郎, 石塚正仁, 内山秀樹, 矢野幹也, 鈴木麗璽, 中臺一博

    日本鳥学会大会講演要旨集   2023   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • マイクロホンアレイを用いた渡り鳥の群れの飛行ルート推定

    山本悠貴, 鈴木麗璽, 中臺一博, 東信行

    日本鳥学会大会講演要旨集   2023   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • 一夫一妻制鳥類のリュウキュウコノハズクは交尾声で異性を惹きつけるのか?

    金杉尚紀, 澤田明, 佐々木瑠太, 細江隼平, 中臺一博, 高木昌興

    日本鳥学会大会講演要旨集   2023   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • ヒバリの求愛飛行実測の試み

    松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2023   2023

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

  • 深層フルランク空間相関分析に基づく遠隔音声認識のフロントエンド

    合澤, 隆拓, 坂東, 宜昭, 糸山, 克寿, 西田, 健次, 中臺, 一博

    第84回全国大会講演論文集   2022 ( 1 )   285 - 286   2022.2

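    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

    Realizing speech recognition that is robust even in crowded environments requires a front end that extracts the target source through sound source separation. From the standpoint of training cost, such separation should preferably operate without supervision, and methods based on linear probabilistic models, such as the complex angular central Gaussian mixture model and multichannel non-negative matrix factorization, have been proposed. This paper proposes a front end based on neural full-rank spatial covariance analysis (neural FCA), which has higher expressive power. Neural FCA is a nonlinear probabilistic model that integrates a full-rank spatial model with a deep source model, and it can achieve more precise separation than conventional frameworks without supervision. We extend neural FCA as a speech recognition front end for multi-party conversation and report its recognition performance on mixtures of multiple speakers that contain diffuse noise.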

  • Integration of Blockwise Streaming Automatic Speech Recognition with Voice Activity Detection International coauthorship

    周藤唯, SHAKEEL Muhammad, 中臺一博, SHI Jiatong, 渡部晋二

    人工知能学会第二種研究会資料(Web)   2022 ( Challenge-061 )   10   2022

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)

    DOI: 10.11517/jsaisigtwo.2022.challenge-061_10

  • PyHARK: HARK Python package supporting online and offline processing

    中臺一博, 瀧ヶ平将行, 糸山克寿

    人工知能学会第二種研究会資料(Web)   2022 ( Challenge-061 )   04   2022

  • Investigation of a method for detecting small objects from low-resolution images

    西田健次, 糸山克寿, 中臺一博

    人工知能学会第二種研究会資料(Web)   2022 ( Challenge-061 )   03   2022

  • 音声に基づくヒクイナの個体数推定と生息地利用状況の可視化

    松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2022   2022

  • 野外鳥類集団における音声相互作用分析のためのマイクロホンアレイに基づく自動観測の検討

    鈴木麗璽, 炭谷晋司, 有田隆也, 松林志保, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2022   2022

  • ロボット聴覚用音響処理ソフトウェアHARKを用いたサウンドスケープの解析

    山本遼, 西田健次, 糸山克寿, 松林志保, 鈴木麗璽, 中臺一博

    日本鳥学会大会講演要旨集   2022   2022

  • 複数マイクロホンアレイのパラメータ同時最適化

    杉山地塩, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   40th   2022

  • 音源定位結果の3D可視化とmAPベースの評価指標の提案

    山本遼, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   40th   2022

  • 環境イベント識別学習フレームワークの提案とその日本語テキスト入力からの音響シーン生成部の実装

    露口弘毅, MUHAMMAD Shakeel, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   40th   2022

  • アンサンブル時間周波数マスクを用いた複数の音声強調手法の統合

    藤田雅彦, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   40th   2022

  • 複数のマイクロホンアレイ搭載ドローンの配置最適化による音源追跡性能の向上

    山田泰基, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   40th   2022

  • An Implementation of GHDSS on an FPGA board

    QIN Ziquan, WEI Kaijie, 天野英晴, 中臺一博

    電子情報通信学会技術研究報告(Web)   122 ( 174(RECONF2022 26-41) )   2022

  • Adapting Acoustic Transfer Functions to Environmental Changes with Mode Filter

    藤田侑樹, 糸山克寿, 西田健次, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   23rd   2022

  • 2-Dimensional Interpolation of Acoustic Transfer Function and Application for Sound Source Localization

    大崎崇博, 糸山克寿, 西田健次, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   23rd   2022

  • HARK 3.4-Introduction to PyHARK-

    中臺一博, 糸山克寿

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   23rd   2022

  • Extraction of Two Dimensional Area with Expanded Scan-and-Sum Beamforming

    安江蒼人, 糸山克寿, 西田健次, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   23rd   2022

  • Study of drone swarm action planning in multiple sound source tracking

    山田泰基, 糸山克寿, 西田健次, 中臺一博

    人工知能学会第二種研究会資料(Web)   2022 ( Challenge-061 )   07   2022

  • Calibration of microphone array shape with arbitrary sound mixtures as input

    糸山克寿, 中臺一博

    人工知能学会第二種研究会資料(Web)   2022 ( Challenge-061 )   11   2022

  • Off-loading of sound localization on an FPGA board

    HOU Zhongyang, WEI Kaijie, 天野英晴, 中臺一博

    電子情報通信学会技術研究報告(Web)   122 ( 174(RECONF2022 26-41) )   2022

  • Soundscape analysis using robot audition open source software HARK

    山本遼, 西田健次, 糸山克寿, 中臺一博

    日本生態学会大会講演要旨(Web)   69th   2022

  • Analysis of acoustic interactions among wild birds based on sound source localization techniques

    鈴木麗璽, 炭谷晋司, 松林志保, 有田隆也, 中臺一博, 奥乃博

    日本生態学会大会講演要旨(Web)   69th   2022

  • 野外での鳥類鳴き声観測のためのWebベース録音ユニットと可視化ツールの試作

    炭谷晋司, 大和祐介, 鈴木麗璽, 小島諒介, 有田隆也, 中臺一博, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   39th   2021

  • Robot audition approaches to observations of bird vocalizations

    鈴木麗璽, 炭谷晋司, 松林志保, 有田隆也, 中臺一博, 奥乃博

    日本生態学会大会講演要旨(Web)   68th   2021

  • 類似度行列を考慮した野鳥の歌自動識別の検討

    山本遼, 中臺一博, 西田健次, 糸山克寿

    日本ロボット学会学術講演会予稿集(CD-ROM)   39th   2021

  • エコロケーションに基づく視覚シーンの再構成手法の提案と入力特徴量の検討

    岸波華彦, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   39th   2021

  • A playback experiment on songbirds using simulated vocalizations based on a generative model

    炭谷晋司, 鈴木麗璽, 有田隆也, 和多和宏, 松林志保, 中臺一博, 奥乃博

    人工知能学会第二種研究会資料(Web)   2021 ( Challenge-058 )   2021

  • Improvement of Sound Source Localization and Separation with Fully-Online Always-Adaptation of Transfer Functions

    中臺一博, 瀧ケ平雅行, 河合熊輔, 中島弘史

    人工知能学会第二種研究会資料(Web)   2021 ( Challenge-058 )   2021

  • Evaluation of spatial source separation using NMF with multiple microphone arrays under reverberation

    鍵本泰宏, 糸山克寿, 西田健次, 中臺一博

    人工知能学会第二種研究会資料(Web)   2021 ( Challenge-058 )   05   2021

  • A Study of Sound Classification Using Transfer Learning

    露口弘毅, 西田健次, 糸山克寿, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   22nd   2021

  • Robot Audition 5.0-Evolution & Prospect-

    中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   22nd   2021

  • Evaluation of Speech Recognition Performance Improvement by Spotforming

    合澤隆拓, 鍵本泰宏, 西田健次, 糸山克寿, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   22nd   2021

  • 複数マイクロホンアレイの同期および3次元位置・姿勢推定の同時最適化の検討

    杉山地塩, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   39th   2021

  • アンサンブル時間周波数マスクによる音声強調手法の評価

    藤田雅彦, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   39th   2021

  • ヒクイナの鳴き声自動観測の可能性と今後の課題

    松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2021 (CD-ROM)   2021

  • 類似度行列による野鳥の歌識別器の検討

    山本遼, 中臺一博, 糸山克寿, 西田健次, 鈴木麗璽, 松林志保

    日本鳥学会大会講演要旨集   2021 (CD-ROM)   2021

  • ロボット聴覚技術に基づく鳥類音声の方位角・仰角に関する音源定位と音風景の観測

    鈴木麗璽, 林晃一郎, 大坂英樹, 松林志保, 有田隆也, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2021 (CD-ROM)   2021

  • Acoustic monitoring of owl fledglings

    松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

    景観生態学   25 ( 1 )   87 - 89   2020.6

  • Multi-scale approaches to observations of bird vocalizations using robot audition techniques

    鈴木麗璽, 炭谷晋司, 松林志保, 有田隆也, 中臺一博, 奥乃博

    日本生態学会大会講演要旨(Web)   67th   2020.3

    Language:Japanese

  • ロボット聴覚からのクロスモーダルへの期待—メディアエクスペリエンス・バーチャル環境基礎

    中臺 一博

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   119 ( 386 )   107 - 112   2020.1

    Language:Japanese   Publisher:東京 : 電子情報通信学会

    Other Link: https://ndlsearch.ndl.go.jp/books/R000000004-I030249880

  • ドローン搭載マイクロホンアレイを用いた音源探査の高精度化に向けた静音プロペラの開発

    干場功太郎, 野田龍介, 中田敏是, 劉浩, 泉田啓, 中臺一博, 公文誠, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   38th   2020

  • Examination of voice-based sentiment estimation method using facial expression-based sentiment estimation

    西田健次, 山田亨, 糸山克寿, 中臺一博

    人工知能学会AIチャレンジ研究会(Web)   57th   2020

  • 重み付け尤度関数と定在波を用いた可聴音による二次元環境認識

    岸波華彦, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   38th   2020

  • テニスの打球音による球種識別の検討

    山本修己, 西田健次, 糸山克寿, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   38th   2020

  • ロボット聴覚技術の活用による鳥類音声の到来方向に基づく音風景の可視化の検討

    鈴木麗璽, ZHAO Hao, 炭谷晋司, 松林志保, 有田隆也, 中臺一博, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   38th   2020

  • 複数マイクロホンアレイを用いたNMFによる空間音源分離法の提案と評価

    鍵本泰宏, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   38th   2020

  • 環境音情報と画像情報を用いた物体検出による音ラベル付きセグメントの生成

    鈴木啓, 糸山克寿, 西田健次, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   38th   2020

  • The 31st IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018)

    Nakadai Kazuhiro

    Journal of the Robotics Society of Japan   37 ( 1 )   70 - 72   2019

    Language:Japanese   Publisher:The Robotics Society of Japan

    DOI: 10.7210/jrsj.37.70

    Other Link: https://ndlsearch.ndl.go.jp/books/R000000004-I029462341

  • 柔軟索状レスキューロボットのための空気噴射音下での単チャネル音声強調

    坂東宜昭, 安部祐一, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 奥乃博

    日本機械学会ロボティクス・メカトロニクス講演会講演論文集(CD-ROM)   2019   2019

  • 「見えない」鳥を音で追う:定位技術を活用した鳥類観測

    松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

    日本景観生態学会大会発表要旨集(Web)   29th   2019

  • ドローンによる地上音源の位置推定―HARKを用いたドローン聴覚の取り組み―

    公文誠, 若林瑞保, 干場功太郎, 中臺一博, 奥乃博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   19th   ROMBUNNO.2E3‐09   2018.12

    Language:Japanese

  • An Experiment of Drone Control and Sensor Data Transmission using 920MHz Multi-hop Wireless Communication System

    加川敏規, 小野文枝, SHAN Lin, 三浦龍, 中臺一博, 干場功太郎, 公文誠, 奥乃博, 加藤晋, 児島史秀

    電子情報通信学会技術研究報告   118 ( 344(RCC2018 58-106) )   217‐221   2018.11

    Language:Japanese

  • Fine-scale observations of spatiotemporal dynamics and vocalization type of birdsongs using microphone arrays and unsupervised feature mapping

    Reiji Suzuki, Shinji Sumitani, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

    Proceedings of the 10th International Conference on Ecological Informatics   72-73   2018.9

    Language:English   Publishing type:Research paper, summary (national, other academic conference)

  • Spatial localization of vocalizations of Spotted Towhee (Pipilo maculatus) in playback experiments using robot audition techniques Reviewed

    Shinji Sumitani, Reiji Suzuki, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

    Proceedings of the 10th International Conference on Ecological Informatics   265   2018.9

    Language:English   Publishing type:Research paper, summary (national, other academic conference)

  • 音情報を活用したフクロウの歌行動観測の試み

    松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2018   72   2018.9

    Language:Japanese

  • ロボット聴覚技術に基づく鳥類の歌行動の二次元定位精度改善と次元圧縮に基づく分類支援

    炭谷晋司, 鈴木麗璽, 松林志保, 有田隆也, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2018   73   2018.9

    Language:Japanese

  • マイクロホンアレイを用いた鳥類の歌行動の三次元音源到来方向推定

    林晃一郎, 鈴木麗璽, 松林志保, 有田隆也, 小島諒介, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2018   74   2018.9

    Language:Japanese

  • 複数のマイクロホンアレイの遠隔制御に基づく鳥類の歌行動の二次元定位

    森松健充, 炭谷晋司, 鈴木麗璽, 松林志保, 有田隆也, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2018   72   2018.9

    Language:Japanese

  • 複数のマイクロホンアレイをネットワーク制御可能な鳥類の歌行動観測システムの構築

    森松健充, 炭谷晋司, 鈴木麗璽, 松林志保, 有田隆也, 中臺一博, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   36th   ROMBUNNO.2J2‐03   2018.9

    Language:Japanese

  • 音響センサによるサイバー救助犬のパンディングの検出

    鈴木拓也, 中臺一博, 奥乃博, 星達也, 水野直希, 大貫和也, 濱田龍之介, 大野和則, 干場功太郎

    日本ロボット学会学術講演会予稿集(CD-ROM)   36th   ROMBUNNO.2J2‐05   2018.9

    Language:Japanese

  • マイクロホンアレイを用いた鳥類の3次元音源到来方向推定

    林晃一郎, 鈴木麗璽, 松林志保, 有田隆也, 小島諒介, 中臺一博, 奥乃博

    日本鳥学会2018年度大会講演要旨集   74   2018.9

    Language:Japanese

  • Understanding relationships between spatial movements and bird song-types using a robot audition system HARK with microphone arrays Reviewed

    Shinji Sumitani, Reiji Suzuki, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

    Proc. of the 27th International Ornithological Congress   188   2018.8

    Language:English   Publishing type:Research paper, summary (national, other academic conference)

  • Acoustic monitoring of the nocturnal owl (Strix uralensis) using microphone arrays and a robot audition system, HARK: A case study in the Ikoma mountains, Japan Reviewed

    Shiho Matsubayashi, Fumiyuki Saito, Reiji Suzuki, Kazuhiro Nakadai, Hiroshi G. Okuno

    Proc. of the 27th International Ornithological Congress   213   2018.8

    Language:English   Publishing type:Research paper, summary (national, other academic conference)

  • Introduction to Sound Source Localization and Separation Software Using Microphone Array Processing

    Nakadai Kazuhiro

    SYSTEMS, CONTROL AND INFORMATION   62 ( 2 )   42 - 49   2018.8

    Language:Japanese   Publisher:一般社団法人 システム制御情報学会

    DOI: 10.11509/isciesci.62.2_42

  • Understanding ecoacoustic interactions among songbirds as complex systems using robot audition techniques

    Reiji Suzuki, Shinji Sumitani, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

    Abstract Booklet of EVOSLACE: Workshop on the emergence and evolution of social learning, communication, language and culture in natural and artificial agents in ALIFE2018   22   2018.7

    Language:English   Publishing type:Research paper, summary (national, other academic conference)

  • Transition and the current technologies in acoustic signal processing: From the viewpoint of robot audition Reviewed

    Nakadai Kazuhiro

    THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN   74 ( 7 )   394 - 400   2018.7

    Language:Japanese   Publisher:Acoustical Society of Japan

    DOI: 10.20697/jasj.74.7_394

  • Field observations of ecoacoustic dynamics of a Japanese bush warbler using an open-source software for robot audition HARK Reviewed

    Reiji Suzuki, Shinji Sumitani, Shiho Matsubayashi, Takaya Arita, Kazuhiro Nakadai, Hiroshi G. Okuno

    Journal of Ecoacoustics   2   EYAJ46   2018.6

    Language:English   Publishing type:Rapid communication, short report, research note, etc. (scientific journal)

  • Development of Robot Audition to Extreme Environments

    奥乃博, 糸山克寿, 中臺一博, 公文誠, 坂東宜昭, 干場功太郎

    システム制御情報学会研究発表講演会講演論文集(CD-ROM)   62nd   ROMBUNNO.221‐1   2018.5

    Language:Japanese   Publisher:システム制御情報学会

  • ロボット聴覚技術を活用した鳥類の行動観測

    鈴木麗璽, 中臺一博, 奥乃博

    日本鳥学会誌(フォーラム)   67 ( 1 )   155-157   2018.5

    Language:Japanese   Publishing type:Internal/External technical report, pre-print, etc.

  • ロボット聴覚技術を用いた鳥類の歌行動分析の試み―複数のマイクロホンアレイを用いた二次元リアルタイム歌定位―

    鈴木麗璽, 炭谷晋司, 中臺一博, 奥乃博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   18th   ROMBUNNO.1D6‐04   2017.12

    Language:Japanese

  • 人間とロボットとの対話環境における対話終了タイミングの検討 (情報ネットワーク)

    北川 遼, 蓮本 諒介, 今井 倫太, 中臺 一博

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   117 ( 306 )   31 - 34   2017.11

    Language:Japanese   Publisher:電子情報通信学会

  • Establishment and Experimental Demonstration of Distant Speech Recognition System for Communication Robot

    山本 俊一, 住田 直亮, 中臺 一博

    Honda R&D technical review   29 ( 2 )   110 - 117   2017.10

    Language:Japanese   Publisher:本田技術研究所

  • マイクロホンアレイを利用したウグイスの歌行動の時空間分析

    炭谷晋司, 鈴木麗璽, 有田隆也, 松林志保, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2017   92   2017.9

    Language:Japanese

  • マイクロフォンアレイを用いた野鳥観測:ソウシチョウの歌行動をめぐる予備的調査報告

    松林志保, 斎藤史之, 鈴木麗璽, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2017   92   2017.9

    Language:Japanese

  • ロボット聴覚技術を活用した野鳥の歌行動観測・分析ツールHARKBirdの機能強化

    千葉尚彬, 炭谷晋司, 松林志保, 鈴木麗璽, 有田隆也, 中臺一博, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   35th   ROMBUNNO.3A3‐03   2017.9

    Language:Japanese

  • UAV搭載マイクロホンアレイを用いた組み込みシステムによる音源探査性能の評価

    干場功太郎, 中臺一博, 公文誠, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   35th   ROMBUNNO.3A2‐04   2017.9

    Language:Japanese

  • マルチロータヘリコプタ収録音の音源分離におけるシステムパラメータと分離性能について―GHDSSとBNP‐MAPの比較

    鷲崎海, 公文誠, 大塚琢馬, 奥乃博, 干場功太郎, 中臺一博

    日本ロボット学会学術講演会予稿集(CD-ROM)   35th   ROMBUNNO.3A2‐05   2017.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 災害救助犬の呼吸音と周囲の音を同時に計測するサイバスーツの開発

    水野直希, 大貫和也, 星達也, 山口竣平, 濱田龍之介, 大野和則, 中臺一博, 奥乃博, 田所諭

    日本ロボット学会学術講演会予稿集(CD-ROM)   35th   ROMBUNNO.3A3‐02   2017.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Contributing to a Community of Open Source Software

    中臺 一博

    映像情報メディア学会誌 = The journal of the Institute of Image Information and Television Engineers   71 ( 5 )   647 - 653   2017.9

    Language:Japanese   Publisher:映像情報メディア学会  

    DOI: 10.3169/itej.71.647

    CiNii Books

    CiNii Research

    researchmap

  • Investigation of passive acoustic anemometer using wind noise correlation

    73 ( 8 )   472 - 479   2017.8

    Language:Japanese  

    CiNii Books

    researchmap

  • Field observations and virtual experiences of bird songs in the soundscape using an open-source software for robot audition HARK

    Shinji Sumitani, Reiji Suzuki, Takaya Arita, Naren, Shiho Matsubayashi, Kazuhiro Nakadai, Hiroshi G. Okuno

    Abstract Book of 4th International Symposium on Acoustic Communication by Animals   116-117   2017.7

    Language:English   Publishing type:Rapid communication, short report, research note, etc. (scientific journal)  

    researchmap

  • Bird song explorer: 野鳥の歌行動体験のための立体音響に基づく仮想森林アプリケーション

    娜 仁, 鈴木 麗璽, 有田 隆也, 中臺 一博, 奥乃 博

    第79回全国大会講演論文集   2017 ( 1 )   239 - 240   2017.3

    Language:Japanese   Publisher:情報処理学会  

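    We are developing HARKBird, a simple system for observing and analyzing the song behavior of wild birds using microphone arrays and the robot audition software HARK. Experiencing the observed sound space immersively is expected to have a wide range of uses, including contributing to the understanding of wild bird ecology as well as education and outreach. In this presentation, we propose an application that uses the Unity game engine to represent wild birds living and singing in a virtual forest or similar setting in three-dimensional space. Specifically, bird songs that were recorded at several survey sites and then localized and separated are placed and played back in a virtual field with the same timing and direction as in the real environment. Users can move an avatar to search for birds while listening to the songs immersively in 3D audio. Arbitrary songs can also be placed according to the purpose.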

    CiNii Books

    CiNii Research

    researchmap

  • マイクロホンアレイ搭載UAVを用いた屋外実環境実時間音源探査

    干場功太郎, 若林瑞保, 鷲崎海, 石木隆洋, 公文誠, GABRIEL Daniel, 中臺一博, 奥乃博

    情報処理学会全国大会講演論文集   79th ( 1 )   1.199‐1.200   2017.3

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Bird song explorer:野鳥の歌行動体験のための立体音響に基づく仮想森林アプリケーション

    NARAN, 鈴木麗璽, 有田隆也, 中臺一博, 奥乃博

    情報処理学会全国大会講演論文集   79th ( 4 )   4.239‐4.240   2017.3

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Report of JSAI SigConf 2016

    中臺 一博, 小林 一郎, 和泉 潔

    人工知能 : 人工知能学会誌 : journal of the Japanese Society for Artificial Intelligence   32 ( 2 )   297 - 304   2017.3

    Language:Japanese   Publisher:人工知能学会 ; 2014-  

    DOI: 10.11517/jjsai.32.2_297

    CiNii Books

    CiNii Research

    researchmap

  • A Study on body movements and postures at Human-Robot Interaction using speech and image information

    蓮本 諒介, 小山 大幾, 水本 武志, 中村 圭佑, 中臺 一博, 今井 倫太

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   116 ( 461 )   19 - 22   2017.2

    Language:Japanese   Publisher:電子情報通信学会  

    CiNii Books

    researchmap

  • A Study on body movements and postures at Human-Robot Interaction using speech and image information

    蓮本 諒介, 小山 大幾, 水本 武志, 中村 圭佑, 中臺 一博, 今井 倫太

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   116 ( 462 )   19 - 22   2017.2

    Language:Japanese   Publisher:電子情報通信学会  

    CiNii Books

    researchmap

  • 外来種ソウシチョウが在来種の歌行動へ与える影響を探る:マイクロフォンアレイを用いた森林性鳥類の観測実例

    松林志保, 斉藤史之, 鈴木麗璽, 千葉尚彬, 中臺一博, 奥乃博

    人工知能学会AIチャレンジ研究会(Web)   49th   23‐28 (WEB ONLY)   2017

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Evaluation of microphone array for sound source localization using UAV

    HOSHIBA Kotaro, WASHIZAKI Kai, WAKABAYASHI Mizuho, KUMON Makoto, NAKADAI Kazuhiro

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)   2017 ( 0 )   1P1 - R05   2017

    Language:Japanese   Publisher:The Japan Society of Mechanical Engineers  

    Sound source localization using a microphone array embedded on an unmanned aerial vehicle has been studied to detect and localize people who need help in a disaster-stricken area. Because such sound source localization should work in outdoor environments, the design of the microphone array is crucial. We thus developed two types of microphone array: a 16ch two-storied hexagonal array and a 12ch spherical array. These two microphone arrays were evaluated via numerical simulation, with discussion of the appropriate design of microphone arrays.

    DOI: 10.1299/jsmermd.2017.1P1-R05

    researchmap

  • Real-Time Human-Voice Enhancement for a Hose-Shaped Rescue Robot Based on Multi-Channel Low-Rank Sparse Decomposition

    Bando Yoshiaki, Ambe Yuichi, Itoyama Katsutoshi, Konyo Masashi, Tadokoro Satoshi, Nakadai Kazuhiro, Yoshii Kazuyoshi, Okuno Hiroshi G.

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)   2017 ( 0 )   1P2 - P05   2017

    Language:Japanese   Publisher:The Japan Society of Mechanical Engineers  

    This paper presents a real-time human-voice enhancement method for a hose-shaped rescue robot based on multi-channel low-rank sparse decomposition. Although microphone arrays mounted on hose-shaped robots are crucial for finding victims under collapsed buildings, human voices captured by the microphone array are contaminated by environment-dependent and non-stationary ego-noise. Our method decomposes multi-channel amplitude spectrograms into sparse and low-rank components (human voice and noise) without any prior training. This decomposition is conducted with a state-space model representing the dynamics of these components in a mini-batch manner. Experimental results show that the performance difference between our method and its offline version is less than 3 dB in signal-to-distortion ratio.

    DOI: 10.1299/jsmermd.2017.1p2-p05

    researchmap

  • アクティブ周波数レンジフィルタを用いた雑音にロバストな音源定位手法の提案

    干場功太郎, 中臺一博, 公文誠, 奥乃博

    人工知能学会AIチャレンジ研究会(Web)   49th   9‐14 (WEB ONLY)   2017

    Language:Japanese  

    J-GLOBAL

    researchmap

  • HARK2.3の紹介とタフロボティクスチャレンジへの展開

    中臺一博, 坂東宜昭, 水本武志, 干場功太郎, 小島諒介, 糸山克寿, 杉山治, 公文誠, 奥乃博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   17th   ROMBUNNO.3A3‐3   2016.12

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 空間情報を用いた鳥の歌分析 Invited

    小島 諒介, 杉山 治, 干場 功太郎, 鈴木 麗璽, 中臺 一博

    第46回AIチャレンジ研究会予稿集 (SIG-Challenge)   046-05   25-31   2016.11

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)  

    researchmap

  • 複数のマイクロホンアレイとロボット聴覚ソフトウエアHARKを用いた野鳥の観測精度の検討 Invited

    松林志保, 鈴木麗璽, 小島諒介, 中臺一博

    人工知能学会2015年度研究会優秀賞記念講演集   10-15   2016.11

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)  

    researchmap

  • Semi-Automatic Bird Song Analysis by Spatial-Cue-Based Integration of Sound Source Detection, Localization, Separation, and Identification Reviewed

    Ryosuke Kojima, Osamu Sugiyama, Reiji Suzuki, Kazuhiro Nakadai, Charles E. Taylor

    IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2016)   1287-1292   2016.10

    Language:English   Publishing type:Research paper, summary (national, other academic conference)  

    researchmap

  • Automatic impulse-response-truncating method affecting less on broadband phase information

    72 ( 10 )   627 - 634   2016.10

    Language:Japanese  

    CiNii Books

    researchmap

  • マイクロホンアレイを用いた森林性野鳥の定位精度の検証とその応用:歌の空間的な位置およびタイミングから知る複数種の棲み分け

    松林志保, 鈴木麗璽, 小島諒介, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2016   138   2016.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • マイクロホンアレイを用いたオオヨシキリのソングポスト定位

    鈴木麗璽, 松林志保, 斎藤史之, 村手達佳, 増田智久, 山本晃一, 小島諒介, 中臺一博, 奥乃博

    日本鳥学会大会講演要旨集   2016   151   2016.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 音源位置を考慮した音源同定のための確率モデルとその学習

    小島諒介, 杉山治, 鈴木麗璽, 中臺一博

    第34回日本ロボット学会学術講演会 (RSJ2016)資料   4 pages   2016.9

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)  

    researchmap

  • 変分ベイズ多チャネルRNMFに基づく柔軟索状レスキューロボットのための音声強調

    坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   34th   ROMBUNNO.1C2‐04   2016.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 変分ベイズ多チャネルロバストNMFに基づくマイクロホンの移動・被覆を許容する音声強調 (音声) -- (オーガナイズドセッション「あらゆる音を対象とした情報処理の実現に向けて」)

    坂東 宜昭, 糸山 克寿, 昆陽 雅司, 田所 諭, 中臺 一博, 吉井 和佳, 河原 達也, 奥乃 博

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   116 ( 189 )   47 - 52   2016.8

    Language:Japanese   Publisher:電子情報通信学会  

    CiNii Books

    researchmap

  • The Past, the Present, and the Future of Special Interest Groups of JSAI

    Journal of the Japanese Society for Artificial Intelligence   31 ( 4 )   531 - 549   2016.7

    Language:Japanese   Publisher:The Japanese Society for Artificial Intelligence  

    DOI: 10.11517/jjsai.31.4_531

    CiNii Books

    researchmap

  • The Past, the Present, and the Future of Special Interest Groups of JSAI

    和泉 潔, 中臺 一博, 栗原 聡

    人工知能 : 人工知能学会誌 : journal of the Japanese Society for Artificial Intelligence   31 ( 4 )   531 - 549,530   2016.7

    Language:Japanese   Publisher:人工知能学会 ; 2014-  

    DOI: 10.11517/jjsai.31.4_531

    CiNii Books

    CiNii Research

    researchmap

  • 3D Posture Estimation for a Hose-shaped Rescue Robot using a Microphone and Accelerometer Array

    Bando Yoshiaki, Itoyama Katsutoshi, Konyo Masashi, Tadokoro Satoshi, Nakadai Kazuhiro, Yoshii Kazuyoshi, Okuno Hiroshi G.

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)   2016 ( 0 )   1A2 - 10a6   2016.6

    Language:Japanese   Publisher:一般社団法人 日本機械学会  

    This paper presents an online method that estimates the 3D posture of a hose-shaped rescue robot using a microphone and accelerometer array. Posture (shape) estimation of a self-driving hose-shaped rescue robot is crucial for handling the robot body because the unseen robot posture deforms in narrow spaces under collapsed buildings. The conventional sound-based method, which uses time differences of arrival (TDOAs), works only on a two-dimensional surface and is often hampered by the rubble around the robot. Our method eliminates the outliers of sound-based TDOA measurements and compensates for the lack of posture information with the tilt information measured by accelerometers. Experimental results using a 3-m hose-shaped robot deployed in a simple 3D structure demonstrate that our method reduces the errors of initial states to about 20 cm in 3D space.

    DOI: 10.1299/jsmermd.2016.1A2-10a6

    J-GLOBAL

    researchmap

  • Development of Robot Audition under Severe Conditions

    OKUNO Hiroshi G., NAKADAI Kazuhiro, KUMON Makoto, ITOYAMA Katsutoshi, YOSHII Kazuyoshi, BANDO Yoshiaki, SASAKI Yoko

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)   2016 ( 0 )   1A2 - 09b3   2016.6

    Language:Japanese   Publisher:一般社団法人 日本機械学会  

    The ability of robots to listen to several things at once with their own "ears", i.e., robot audition, is critical in improving the performance of search and rescue activities under severe conditions. This paper introduces the "HARK" robot audition open-source software and its capability of suppressing ego-noise caused by the robot's own movements, such as motor, propeller, and/or flying noise. It then describes three main applications of robot audition: 1) An Unmanned Aerial Vehicle (UAV) with a microphone array to capture sounds can localize a sound source by suppressing ego-noise while hovering, gliding slowly, or gliding fast. It can also recognize a sound source with a CNN. 2) A serpentine robot with a microphone array can estimate its posture by sound. It can also enhance a voice with Online Robust PCA. 3) A robot with a LiDAR and a 32-channel microphone array can visualize a sound map by superimposing sound source directions on point clouds.

    DOI: 10.1299/jsmermd.2016.1a2-09b3

    CiNii Research

    J-GLOBAL

    researchmap

  • Online Localization of Multiple Sound Sources and Multiple Robots with Asynchronous Microphone Arrays

    Sekiguchi Kouhei, Bando Yoshiaki, Nakamura Keisuke, Nakadai Kazuhiro, Itoyama Katsutoshi, Yoshii Kazuyoshi

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)   2016   1A2-09b5   2016.6

    Language:Japanese   Publisher:The Japan Society of Mechanical Engineers  

    This paper presents an online method for localizing multiple sound sources and stationary robots and for synchronizing the microphone arrays attached to those robots. Since each robot can estimate only the directions of sound sources, the two-dimensional source positions can be estimated by triangulation from the source directions estimated by each robot. In addition, mixture signals can be separated accurately by regarding multiple microphone arrays as one big array. To perform these tasks, some methods have been proposed for localizing and synchronizing microphone arrays. These methods, however, assume that only a single sound source exists. To overcome this limitation, we estimate the directions of arrival (DOAs) and separate observed signals to estimate the time differences of arrival (TDOAs) by using microphone array techniques, and integrate the DOAs and TDOAs by using a state-space model. The latent variables are estimated in an online manner with the FastSLAM 2.0 algorithm.

    DOI: 10.1299/jsmermd.2016.1A2-09b5

    CiNii Research

    J-GLOBAL

    researchmap

  • Report of JSAI SigConf 2015(Special Interest Group Report)

    Izumi Kiyoshi, Nakadai Kazuhiro, Yamakawa Hiroshi

    journal of the Japanese Society for Artificial Intelligence   31 ( 2 )   299 - 304   2016.3

    Language:Japanese   Publisher:The Japanese Society for Artificial Intelligence  

    DOI: 10.11517/jjsai.31.2_299

    CiNii Books

    CiNii Research

    researchmap

  • Computational Creation of Footsteps Illusion Art and Its Practical Applications

    Nakadai Kazuhiro, Okuno Hiroshi G, Mizumoto Takeshi, Nakamura Keisuke

    シミュレーション = Journal of the Japan Society for Simulation Technology   35 ( 1 )   32 - 38   2016.3

    Language:Japanese   Publisher:小宮山印刷工業  

    CiNii Books

    CiNii Research

    researchmap

  • 音源到来方向・時間差を用いた非同期複数マイクロホンアレイ位置のオンライン推定

    関口 航平, 中村 圭佑, 坂東 宜昭, 糸山 克寿, 吉井 和佳, 中臺 一博

    情報処理学会 第78回全国大会   2016 ( 1 )   483 - 484   2016.3

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)  

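    This paper describes a method for estimating the synchronization offsets and positions of multiple asynchronous microphone arrays. Auditory scene recognition techniques such as sound source localization and separation that use multiple robots equipped with microphone arrays can achieve more accurate processing than a single robot. However, microphone array signal processing with multiple robots requires estimating the position of each robot and the synchronization offsets between the microphone arrays. In this paper, we estimate the synchronization offsets and positions of multiple asynchronous microphone arrays from the sound source localization and phase information estimated individually for each microphone array. We design a state-space model whose latent variables are the positions of the robots and sound sources and the synchronization offsets, and estimate its posterior distribution online.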

    CiNii Books

    CiNii Research

    researchmap

  • Robust Recognition of Simultaneous Speech By a Mobile Robot

    Jean-Marc Valin, Shun'ichi Yamamoto, Jean Rouat, Francois Michaud, Kazuhiro Nakadai, Hiroshi G. Okuno

    IEEE Transactions on Robotics, Vol. 23, No. 4, pp. 742-752, 2007   2016.2

    Publishing type:Internal/External technical report, pre-print, etc.  

    This paper describes a system that gives a mobile robot the ability to
    perform automatic speech recognition with simultaneous speakers. A microphone
    array is used along with a real-time implementation of Geometric Source
    Separation and a post-filter that gives a further reduction of interference
    from other sources. The post-filter is also used to estimate the reliability of
    spectral features and compute a missing feature mask. The mask is used in a
    missing feature theory-based speech recognition system to recognize the speech
    from simultaneous Japanese speakers in the context of a humanoid robot.
    Recognition rates are presented for three simultaneous speakers located at 2
    meters from the robot. The system was evaluated on a 200 word vocabulary at
    different azimuths between sources, ranging from 10 to 90 degrees. Compared to
    the use of the microphone array source separation alone, we demonstrate an
    average reduction in relative recognition error rate of 24% with the
    post-filter and of 42% when the missing features approach is combined with the
    post-filter. We demonstrate the effectiveness of our multi-source microphone
    array post-filter and the improvement it provides when used in conjunction with
    the missing features theory.

    DOI: 10.1109/TRO.2007.900612

    arXiv

    researchmap

  • UAV搭載マイクアレイを用いた高雑音環境下における音イベント検出・識別の並列最適化

    杉山治, 小島諒介, 中臺一博

    人工知能学会AIチャレンジ研究会(Web)   46th   32‐36 (WEB ONLY)   2016

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 部分共有アーキテクチャを用いた深層学習ベースの音源同定の検討

    森戸隆之, 杉山治, 小島諒介, 中臺一博

    人工知能学会AIチャレンジ研究会(Web)   46th   12‐17 (WEB ONLY)   2016

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 深層学習による多チャネル音響信号に対する音源同定の検討

    森戸隆之, 杉山治, 上村知史, 小島諒介, 中臺一博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   16th   ROMBUNNO.2D1‐4   2015.12

    Language:Japanese  

    J-GLOBAL

    researchmap

  • HARK2.2の新機能とその組込み,SaaSへの展開

    中臺一博, 水本武志, 中村圭佑, 奥乃博

    計測自動制御学会システムインテグレーション部門講演会(CD-ROM)   16th   ROMBUNNO.2M2‐1   2015.12

    Language:Japanese  

    J-GLOBAL

    researchmap

  • ロバスト主成分分析を用いた動作雑音抑圧に基づく柔軟索状ロボットのための音声強調

    坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   33rd   ROMBUNNO.2D2-05   2015.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Automatic impulse response truncation based on relative amplitude spectrum

    中島 弘史, 坂田 直人, 加科 優希, 中臺 一博

    回路とシステムワークショップ論文集 Workshop on Circuits and Systems   28   208 - 213   2015.8

    Language:Japanese   Publisher:[電子情報通信学会]  

    CiNii Research

    J-GLOBAL

    researchmap

  • Wind-induced noise reduction by linear beamforming using a 2-channel microphone

    坂田 直人, 村上 哲郎, 中島 弘史, 中臺 一博

    回路とシステムワークショップ論文集 Workshop on Circuits and Systems   28   359 - 364   2015.8

    Language:Japanese   Publisher:[電子情報通信学会]  

    CiNii Research

    J-GLOBAL

    researchmap

  • 両耳聴ロボット聴覚ソフトウェアHARK‐BinauralとRaspberry Pi2を用いたヒューマノイドロボットへの適用

    坂東宜昭, 金宜鉉, 糸山克寿, 吉井和佳, 中臺一博, 奥乃博

    情報処理学会研究報告(Web)   2015 ( MUS-107 )   VOL.2015-MUS-107,NO.33 (WEB ONLY)   2015.5

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 柔軟索状レスキューロボットのためのロバスト主成分分析を用いた走行雑音抑圧

    坂東 宜昭, 池宮 由楽, 糸山 克寿, 昆陽 雅司, 田所 諭, 中臺 一博, 吉井 和佳, 奥乃 博

    第77回全国大会講演論文集   2015 ( 1 )   505 - 506   2015.3

    Language:Japanese  

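    This paper describes a running-noise suppression method for a flexible hose-shaped rescue robot. At disaster sites that are difficult for humans to enter (e.g., collapsed houses), searching with rescue robots that use victims' voices as cues is useful. With ground-traveling robots such as flexible hose-shaped rescue robots, the robot's own running noise makes victims' voices hard to hear, and because the running noise depends on the contact surface, it has been difficult to predict in advance. To solve this problem, this study suppresses running noise using robust principal component analysis, which can remove repeatedly appearing frequency components without prior information. The proposed method was evaluated in experiments using recordings obtained by actually operating the robot.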

    CiNii Books

    researchmap

  • 柔軟索状レスキューロボットのためのロバスト主成分分析を用いた走行雑音抑圧

    坂東宜昭, 池宮由楽, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

    情報処理学会全国大会講演論文集   77th ( 2 )   2.505-2.506   2015.3

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Wind-induced Noise Reduction in Time Domain Using Closely-aligned Two Microphones

    SAKATA Naoto, NAKAJIMA Hirofumi, NAKADAI Kazuhiro

    IEICE technical report. Signal processing   114 ( 474 )   1 - 6   2015.3

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this report, wind-induced noise reduction in the time domain was investigated using two closely aligned microphones. A linear beamforming filter in the frequency domain, based on time frame decomposition, was applied to signals in the time domain. The filter's ability to reduce wind-induced noise was compared with and without the time frame decomposition. For recorded signals disturbed by wind-induced noise, the signal-to-noise ratio improved by about 2 to 13 dB. When a filter composed of simple delay units was employed, the time frame decomposition strongly influenced the ability to reduce wind-induced noise.

    CiNii Books

    CiNii Research

    researchmap

  • Wind-induced Noise Reduction in Time Domain Using Closely-aligned Two Microphones

    SAKATA Naoto, NAKAJIMA Hirofumi, NAKADAI Kazuhiro

    IEICE technical report. Speech   114 ( 475 )   1 - 6   2015.3

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this report, wind-induced noise reduction in the time domain was investigated using two closely aligned microphones. A linear beamforming filter in the frequency domain, based on time frame decomposition, was applied to signals in the time domain. The filter's ability to reduce wind-induced noise was compared with and without the time frame decomposition. For recorded signals disturbed by wind-induced noise, the signal-to-noise ratio improved by about 2 to 13 dB. When a filter composed of simple delay units was employed, the time frame decomposition strongly influenced the ability to reduce wind-induced noise.

    CiNii Books

    researchmap

  • Wind-induced Noise Reduction in Time Domain Using Closely-aligned Two Microphones

    SAKATA Naoto, NAKAJIMA Hirofumi, NAKADAI Kazuhiro

    Technical report of IEICE. EA   114 ( 473 )   1 - 6   2015.3

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this report, wind-induced noise reduction in the time domain was investigated using two closely aligned microphones. A linear beamforming filter in the frequency domain, based on time frame decomposition, was applied to signals in the time domain. The filter's ability to reduce wind-induced noise was compared with and without the time frame decomposition. For recorded signals disturbed by wind-induced noise, the signal-to-noise ratio improved by about 2 to 13 dB. When a filter composed of simple delay units was employed, the time frame decomposition strongly influenced the ability to reduce wind-induced noise.

    CiNii Books

    researchmap

  • TeleCoBot : A Telepresence system of taking account for conversation environment

    TAKAHASHI Masaaki, OGATA Masa, IMAI Michita, NAKAMURA Keisuke, NAKADAI Kazuhiro

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   114 ( 351 )   1 - 5   2014.12

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Telepresence robots have become a popular research topic as communication tools for remote places. However, a telepresence system cannot accurately convey the user's utterances because it does not consider differences in the sound environment, such as noise. In addition, when talking with several people at a remote place, the user wants to change the speaker volume freely depending on the situation. We therefore propose a telepresence conversation robot named "TeleCoBot". It automatically regulates the volume of the user's utterances according to the distance to the conversation partner and the noise level at the remote place. The user can also change the volume freely depending on the conversation situation. In this paper, we conduct a case study, and the results indicated that TeleCoBot's UI should be more effective and enhance the presence.

    CiNii Books

    researchmap

  • Deep Neural Networkを用いたマルチモーダル音声認識

    野田邦昭, 山口雄紀, 中臺一博, 奥乃博, 尾形哲也

    日本ロボット学会学術講演会予稿集(CD-ROM)   32nd   ROMBUNNO.1I1-04   2014.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • マイクロホンアレイを用いた駆動機構付ホース型ロボットの姿勢推定

    坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   32nd   ROMBUNNO.1I2-02   2014.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Design and Implementation of Multidirectional Sound Annotation Tool with HARK

    SUGIYAMA Osamu, ITOYAMA Katsutoshi, NAKADAI Kazuhiro, OKUNO Hiroshi G

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   114 ( 85 )   23 - 26   2014.6

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this study, we designed and developed a multidirectional sound source annotation tool with the robot audition software HARK. With the rise of inexpensive microphone array products and HARK, multidirectional sound sources can be recorded and analyzed easily. The combination of a microphone array and the software enables us to separate, localize, and track multidirectional sound sources. Most solutions for accessing this separated sound source information provide clients for interpreting simplified information about the separated sources, but not for directly executing semantic annotations. Our proposed sound annotation tool provides drag & drop annotation with a 3D sound source view and also provides annotation autocompletion with an SVM trained on the user's annotation history. The proposed features enable users to do the annotation task intuitively and confirm its result. We also conducted an evaluation demonstrating the efficiency of annotation done using the tool.

    CiNii Books

    CiNii Research

    researchmap

  • Deep Neural Networkを用いたマルチモーダル音声認識の為の特徴量学習

    山口雄紀, 野田邦昭, 中臺一博, 奥乃博, 尾形哲也

    第76回全国大会講演論文集   2014 ( 1 )   465 - 466   2014.3

    Language:Japanese  

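    The goal of this study is to design visual features for multimodal speech recognition. To improve the accuracy of multimodal speech recognition, an important issue is how to extract from lip images the information that represents phonemes and visemes, the smallest units of speech recognition. In this study, we built a method that extracts visual features from a large number of lip images in a self-organizing manner using a Deep Neural Network (DNN), which has been attracting attention as a new approach to feature learning. We verified the obtained visual features on an isolated word recognition task and examined their relationship to visemes by analyzing the feature space. We also verified the improvement in noise robustness achieved by audio-visual integration using the obtained visual features and speech.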

    CiNii Books

    researchmap

  • マイクロホンアレイの位置推定によるホース型ロボットの姿勢推定

    坂東宜昭, 大塚琢馬, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 奥乃博

    第76回全国大会講演論文集   2014 ( 1 )   189 - 190   2014.3

    Language:Japanese  

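    A hose-shaped robot is a rescue robot characterized by its long, thin shape, and it can explore spaces such as gaps in collapsed buildings. To make operation more efficient, posture estimation methods for this robot using accelerometers, camera images, and other sensors have been proposed, but they suffer from problems such as accumulated error. This paper describes a method in which a microphone array and small loudspeakers are attached to the robot and its posture is estimated by localizing them with sound. The method estimates the posture from the time differences of arrival at each microphone of test sounds emitted from the loudspeakers. Because the time differences of arrival represent the current positional relationship between the microphones and loudspeakers, past errors can be corrected. The effectiveness of the method was evaluated using real recordings.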

    CiNii Books

    researchmap

  • 音ランドマークを用いたマルチコプターの定位

    ラナシナパヤ, 中村圭佑, 中臺一博, 高橋秀幸, 木下哲男

    第76回全国大会講演論文集   2014 ( 1 )   185 - 186   2014.3

    Language:English  

    We propose a novel approach to multicopter localization using sound landmarks and one embedded microphone. This approach can benefit multicopter localization in that it requires less computational power and smaller payloads than image-based approaches. However, the high ego-noise of multicopters is a serious threat for sound-based algorithms. We simulated a 2D localization method based on a Kalman Filter using measurements of acceleration and the sound landmarks' intensity. A random walk model is used to update the multicopter's position with the Kalman Filter; the calculated estimate is then corrected using noisy measurements from the embedded microphone and accelerometer. Simulation results show that the proposed algorithm can successfully track the multicopter's motion in a noisy environment. We confirmed the effectiveness of our proposed algorithm by comparing its performance and robustness to a time/phase-based algorithm.

    CiNii Books

    researchmap

  • DI-1-6 Scene Analysis Based on Robot Audition

    Nakadai Kazuhiro, Nakamura Keisuke, Tezuka Taiki

    Proceedings of the IEICE General Conference   2014 ( 2 )   "SS - 18"-"SS-19"   2014.3

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    CiNii Books

    CiNii Research

    researchmap

  • 相関行列スケーリングを用いた屋外音源探索手法の解析

    大畑琢磨, 長峰諒英, 中村圭佑, 石崎孝幸, 水本武志, 中臺一博

    人工知能学会AIチャレンジ研究会(Web)   41st   2014

  • Online calibration and transfer function estimation of an asynchronous microphone array

    Nakadai Kazuhiro, Nakamura Keisuke

    THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN   70 ( 7 )   397 - 402   2014

    Language:Japanese   Publisher:一般社団法人 日本音響学会  

    DOI: 10.20697/jasj.70.7_397

    CiNii Books

    CiNii Research

    researchmap

  • マイクロホンアレイとスピーカをもつ柔軟索状ロボットのための動的スピーカ選択による姿勢推定の高速化

    坂東宜昭, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 吉井和佳, 奥乃博

    人工知能学会AIチャレンジ研究会(Web)   41st   8 (WEB ONLY)   2014

    Language:Japanese  

    J-GLOBAL

    researchmap

  • TelePaBot : A telepresence system for supporting multi-party conversation

    Koike Kyotaro, Imai Michita, Nakamura Keisuke, Nakadai Kazuhiro

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   113 ( 372 )   1 - 6   2013.12

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    A telepresence robot is useful in situations where a user in a remote area has to control the robot to communicate with people. However, some issues remain: the target speech is contaminated with unnecessary speech, and the remote user cannot understand the speech in multi-party conversation. We propose a telepresence party robot, "TelePaBot", that visualizes the positions of utterances and provides a selective listening function. A case study suggested that TelePaBot smooths remote communication even when multi-party conversation occurs.

    CiNii Books

    researchmap

  • Volume Adaptation and Visualization by Modeling the Volume Level in Actual Noise Environment for Telepresence System

    HAYAMIZU Akira, IMAI Michita, NAKAMURA Keisuke, NAKADAI Kazuhiro

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報   113 ( 372 )   35 - 40   2013.12

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    The Lombard effect is the involuntary tendency of speakers to increase their vocal effort when speaking in loud noise to enhance the audibility of their voice. In telecommunication, the Lombard effect causes a problem: the speaker may talk more loudly than necessary for the conversation partner at a remote location. In this paper, we describe the design and the model required to automatically adjust the operator's volume in remote communication via a mobile telepresence robot in the real world, and we developed LOMBOT, an optimal volume control system equipped with this model. As a result, we confirmed that the volume is adjusted properly to the noise at the remote location.

    CiNii Books

    researchmap

  • Incremental Noise Estimation in Outdoor Auditory Scene Analysis using a Quadrocopter with a Microphone Array

    OKUTANI Keita, YOSHIDA Takami, NAKAMURA Keisuke, NAKADAI Kazuhiro

    Journal of the Robotics Society of Japan   31 ( 7 )   676 - 683   2013.9

    Language:Japanese   Publisher:The Robotics Society of Japan  

    This paper addresses sound source localization using an aerial vehicle with a microphone array in an outdoor environment to realize outdoor auditory scene analysis. It, for instance, aims at finding distressed people in a disaster situation. In such an environment, noise is quite loud and dynamically-changing, and conventional microphone array techniques studied in the field of indoor robot audition are of less use. We, thus, proposed MUltiple SIgnal Classification based on incremental Generalized EigenValue Decomposition (iGEVD-MUSIC). It can deal with dynamically-changing high power noise by introducing incrementally-estimated noise correlation matrices. We developed a prototype system for the outdoor auditory scene analysis based on the proposed method using the Parrot AR.Drone with an 8ch microphone array and a Kinect device. Experimental results using the prototype system showed that dynamically-changing noise is properly suppressed with the proposed method even when the signal-to-noise ratio is less than 0dB in an outdoor/indoor environment with the hovering/moving AR.Drone.

    DOI: 10.7210/jrsj.31.676

    CiNii Books

    researchmap

  • Multirotor UAVを用いた音源定位のための雑音相関行列推定

    古川孝太郎, 大塚琢馬, 糸山克寿, 中臺一博, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   31st   ROMBUNNO.3D3-02   2013.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • ホース型ロボットのマイクロホンアレイを用いた姿勢推定

    坂東宜昭, 大塚琢馬, 水本武志, 糸山克寿, 中臺一博, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   31st   ROMBUNNO.3D3-01   2013.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 話者ダイアライゼーションシステムのための音声区間検出および到来方向推定の精度向上の検討

    黄楊暘, 大塚琢馬, 中臺一博, 奥乃博

    第75回全国大会講演論文集   2013 ( 1 )   479 - 480   2013.3

    Language:Japanese   Publisher:情報処理学会  

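    In robot audition, an auditory scene understanding function that clarifies when, where, and who spoke is indispensable. To address these problems, this paper defines a speaker diarization system as processing that combines voice activity detection, direction-of-arrival estimation, and speaker identification. The robot audition software HARK performs voice activity detection and direction-of-arrival estimation using the MUSIC algorithm as preprocessing. However, when processing is based on the MUSIC spectrum, the number-of-sources parameter and the threshold parameter strongly affect the results. This paper proposes a speaker diarization system that uses blind source separation as preprocessing. Although a volume threshold parameter still has to be set, improved accuracy was obtained.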

    CiNii Books

    CiNii Research

    researchmap

  • チューブ型ロボットの姿勢推定のためのEKF-SLAMを用いた可変マイクロホンアレイ位置推定

    坂東宜昭, 水本武志, 中臺一博, 奥乃博

    全国大会講演論文集   2013 ( 1 )   439 - 441   2013.3

    Language:Japanese   Publisher:一般社団法人情報処理学会  

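    Tube-shaped robots that can enter rubble are useful for finding victims at disaster sites. If such a robot also has a sound source localization function, a victim's position can be estimated from his or her voice. However, recent high-accuracy sound source localization methods estimate direction from sound recorded with a microphone array whose microphone positions are known, whereas the microphone layout of a tube-shaped robot cannot be measured in advance. This paper therefore proposes a microphone position estimation method based on EKF-SLAM and solves the problem by estimating the constantly changing posture of the robot. The effectiveness of the method was confirmed with both numerical experiments and real recordings.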

    CiNii Books

    CiNii Research

    researchmap

  • チューブ型ロボットの姿勢推定のためのEKF-SLAMを用いた可変マイクロホンアレイ位置推定

    坂東宜昭, 水本武志, 中臺一博, 奥乃博

    第75回全国大会講演論文集   2013 ( 1 )   439 - 440   2013.3

    Language:Japanese  

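    Tube-shaped robots that can enter rubble are useful for finding victims at disaster sites. If such a robot also has a sound source localization function, a victim's position can be estimated from his or her voice. However, recent high-accuracy sound source localization methods estimate direction from sound recorded with a microphone array whose microphone positions are known, whereas the microphone layout of a tube-shaped robot cannot be measured in advance. This paper therefore proposes a microphone position estimation method based on EKF-SLAM and solves the problem by estimating the constantly changing posture of the robot. The effectiveness of the method was confirmed with both numerical experiments and real recordings.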

    CiNii Books

    CiNii Research

    researchmap

  • クアドロコプターを用いた飛行雑音に頑健な音源定位

    古川孝太郎, 奥谷啓太, 柳楽浩平, 大塚琢馬, 中臺一博, 奥乃博

    第75回全国大会講演論文集   2013 ( 1 )   489 - 490   2013.3

    Language:Japanese  

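    This study mounts a microphone array on a quadrocopter, a small unmanned aerial vehicle with multiple rotors, and addresses sound source localization in the surrounding environment. During flight, noise caused by wind pressure and rotor drive is usually extremely loud and can degrade localization accuracy. In such noisy environments, accuracy is known to improve when the noise correlation matrix is taken into account by the MUSIC method based on generalized eigenvalue decomposition. This study therefore reduces the problem to estimating the noise correlation matrix, which changes dynamically during flight. On that basis, we propose an estimation method that uses monitoring information from the vehicle, such as flight control data, and develop a sound source localization method that is robust against flight noise.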

    CiNii Books

    researchmap

  • ホースの伸び縮みによるマイク位置の変化を許容するマイクロホンアレイを用いたホース型ロボットの姿勢推定

    坂東宜昭, 大塚琢馬, 糸山克寿, 中村圭佑, 昆陽雅司, 田所諭, 中臺一博, 奥乃博

    人工知能学会AIチャレンジ研究会(Web)   38th   10 (WEB ONLY)   2013

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 2P1-P24 Development of a Sound Source Localization System for Assisting Group Conversation(Communication Robot)

    Moon Seong-eun, Takagi Kentaro, Kamashima Tsutomu, Nakadai Kazuhiro, Otake Mihoko

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)   2013 ( 0 )   _2P1 - P24_1-_2P1-P24_2   2013

    Language:Japanese   Publisher:The Japan Society of Mechanical Engineers  

    This paper presents a sound source localization system that combines a wireless microphone array named Jellyfish-02 with the robot audition software HARK. Jellyfish-02 surpasses existing microphone arrays in design and usability, because it has a cover with a rechargeable battery and can be connected to a wireless network. We evaluated the sound source localization performance of Jellyfish-02 and investigated the percentage of overlapping speech periods in natural conversation. From the results, Jellyfish-02 is potentially applicable to assisting group conversation by measuring the duration of speech for each participant.

    CiNii Books

    J-GLOBAL

    researchmap

  • マイクロホンアレイを用いた複数人対話からの音声区間検出および話者方向推定の評価手法

    黄楊暘, 大塚琢馬, 中臺一博, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   30th   ROMBUNNO.3D1-4   2012.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Sensing Technology for Listening to a Mixture of Sounds

    OKUNO Hiroshi G, NAKADAI Kazuhiro, MIZUMOTO Takeshi

    The Journal of the Institute of Electronics, Information, and Communication Engineers   95 ( 5 )   401 - 404   2012.5

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

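    The sounds we hear every day are mixtures of multiple sounds and background noise. To make use of sound information in the real world, the ability to distinguish individual sounds is indispensable. From the viewpoint of instrumentation, sensing technology for distinguishing sounds consists of devices (sensors) that record sound and software that processes the recorded sound. This article reviews trends in sensing technology for mixed sounds, taking robot audition and the observation of frog choruses as examples. From the standpoint of distinguishing sounds in a mixture, we considered that sound source localization, sound source separation, and recognition of separated sounds should be tackled, and have pursued research on auditory scene understanding for the past 15 years. Listening from a distance is an indispensable technology for robots, and we have developed and released HARK, software that provides the functions essential to robot audition in an integrated manner. We give an overview of HARK from its design philosophy to its concrete implementation and report on its applications to auditory scene visualization and to human-robot symbiosis. For the application of analyzing the mechanism of frog choruses by distinguishing sounds, acoustic processing alone is difficult because of the variety of sounds heard in the field, so we developed 「カエルホタル」 (literally "frog firefly"), a device that picks up nearby sound and lights an LED. We also report on observation experiments on frog calling behavior in which many of these devices were deployed in an actual rice paddy. Through these reports, we argue that technology for distinguishing sounds in a mixture will become important in the future.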

    CiNii Books

    CiNii Research

    researchmap

  • Kinectによる楽器マスキングを用いた視聴覚統合ビートトラッキング

    糸原達彦, 水本武志, 大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

    第74回全国大会講演論文集   2012 ( 1 )   355 - 356   2012.3

    Language:Japanese   Publisher:一般社団法人情報処理学会  

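    Real-time beat tracking of human guitar performances must cope with complex beat patterns such as syncopation and with tempo fluctuations in human playing. We have previously developed audio-visual beat tracking that uses the trajectory of the playing hand, which is strongly correlated with the acoustic information. However, because the guitar and the hand are similar in color, the performance of hand tracking and of beat tracking was insufficient. In this paper, we use a Kinect, which has a depth sensor in addition to audio and visual sensors, to mask the image by distance and extract the hand region. We show that this method makes hand tracking more robust and improves beat tracking accuracy.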

    CiNii Books

    researchmap

  • Kinectによる楽器マスキングを用いた視聴覚統合ビートトラッキング

    糸原達彦, 水本武志, 大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

    全国大会講演論文集   2012 ( 1 )   355 - 357   2012.3

    Language:Japanese   Publisher:一般社団法人情報処理学会  

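    Real-time beat tracking of human guitar performances must cope with complex beat patterns such as syncopation and with tempo fluctuations in human playing. We have previously developed audio-visual beat tracking that uses the trajectory of the playing hand, which is strongly correlated with the acoustic information. However, because the guitar and the hand are similar in color, the performance of hand tracking and of beat tracking was insufficient. In this paper, we use a Kinect, which has a depth sensor in addition to audio and visual sensors, to mask the image by distance and extract the hand region. We show that this method makes hand tracking more robust and improves beat tracking accuracy.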

    CiNii Books

    researchmap

  • 多チャンネルマイクロホンアレイを用いた音声区間検出および音源定位の精度の向上の検討

    HUANG Yangyang, 大塚琢馬, 中臺一博, 奥乃博

    人工知能学会AIチャレンジ研究会(Web)   36th   5 (WEB ONLY)   2012

    Language:Japanese  

    J-GLOBAL

    researchmap

  • ロボットのための実環境ロバストな実時間超解像三次元音源定位

    中村圭佑, 中臺一博, 奥乃博

    人工知能学会AIチャレンジ研究会(Web)   36th   2 (WEB ONLY)   2012

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 遠隔ユーザの音環境理解を支援するユーザインタフェース

    植田 俊輔, 今井 倫太, 中村 圭佑, 中臺 一博

    人工知能学会全国大会論文集   2012 ( 0 )   3K1R111 - 3K1R111   2012

    Language:Japanese   Publisher:一般社団法人 人工知能学会  

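    Even in a noisy environment, humans can understand to some extent where and what kind of conversations are taking place, but with a teleoperated robot avatar it is difficult for the remote operator to understand the auditory scene at the remote site. This paper proposes UI-ALT, a user interface that helps the operator and the remote site interact smoothly even in noisy environments. Offline experiments showed that UI-ALT is useful for helping the remote operator understand the noisy environment.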

    DOI: 10.11517/pjsai.jsai2012.0_3k1r111

    CiNii Books

    CiNii Research

    researchmap

  • Intelligent Human Tracking Based on Multimodal Integration

    NAKAMURA Keisuke, NAKADAI Kazuhiro, ASANO Futoshi, NAKAJIMA Hirofumi, INCE Gökhan

    Transactions of the Society of Instrument and Control Engineers   48 ( 6 )   349 - 358   2012

    Language:English   Publisher:The Society of Instrument and Control Engineers  

    Localization and tracking of humans are essential research topics in robotics. In particular, Sound Source Localization (SSL) has been of great interest. Despite the numerous reported methods, SSL in a real environment had mainly three issues; robustness against noise with high power, no framework for selective listening to sound sources, and tracking of inactive and/or noisy sound sources. For the first issue, we extended Multiple SIgnal Classification by incorporating Generalized Eigen Value Decomposition (GEVD-MUSIC) so that it can deal with high power noise and can select target sound sources. For the second issue, we proposed Sound Source Identification (SSI) based on hierarchical Gaussian mixture models and integrated it with GEVD-MUSIC to realize a function to listen to a specific sound source according to the sort of the sound source. For the third issue, auditory and visual human tracking were integrated using particle filtering. These three techniques are integrated into an intelligent human tracking system. Experimental results showed that integration of SSL and SSI successfully achieved human tracking only by audition, and the audio-visual integration showed considerable improvement in tracking by compensating the loss of auditory or visual information.

    researchmap

  • A Platform for Recognizing Interactive Behavior on Human-Robot Interaction

    SHIOMI Masahiro, IWAI Yoshio, SUMI Yasuyuki, NAKADAI Kazuhiro, HAGITA Norihiro

    Journal of the Robotics Society of Japan   29 ( 10 )   883 - 886   2011.12

    Language:Japanese   Publisher:The Robotics Society of Japan  

    DOI: 10.7210/jrsj.29.883

    CiNii Books

    researchmap

  • Intelligent Human Tracking based on Information Integration

    NAKAMURA Keisuke, NAKADAI Kazuhiro, INCE Gokhan

    IEICE technical report   111 ( 32 )   35 - 40   2011.5

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Since scene recognition and robot perception have been of great interest, information integration has become a significant research topic in robotics. From the viewpoint of scalability and reusability, the use of appropriate middleware is a key factor in improving total system performance. This paper presents a methodology for integrating multimodal information through the construction of an intelligent human tracking system. Our system architecture interoperably combines two different types of middleware: HARK and ROS. HARK uses dataflow-oriented middleware for real-time processing, while ROS is event-driven middleware for easy integration. We confirmed that the proposed architecture realized real-time processing and considerable improvements in noise robustness in human tracking.

    CiNii Books

    CiNii Research

    researchmap

  • ロボット聴覚用オープンソースソフトウェア HARKの展開

    中臺一博, 奥乃博

    デジタルプラクティス   2 ( 2 )   133 - 140   2011.4

    Language:Japanese   Publisher:情報処理学会  

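    This article describes the deployment of HARK (HRI-JP Audition for Robots with Kyoto Univ.), which is being researched and developed as open-source software for robot audition. Based on input from multiple microphones (a microphone array), HARK supports sound source localization, sound source separation, and recognition of the separated speech. By placing and connecting various modules in a GUI programming environment, users can adapt it to robots with different shapes and microphone layouts and build robot audition systems suited to their purposes. This article explains the design principles of HARK and also reports on application examples of systems built with HARK and on how HARK has been deployed.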

    CiNii Books

    CiNii Research

    researchmap

  • 累積頻度重みを適用したパーティクルフィルタによる実時間楽譜追従

    大塚琢馬, 中臺一博, 高橋徹, 尾形哲也, 奥乃博

    第73回全国大会講演論文集   2011 ( 1 )   305 - 306   2011.3

    Language:Japanese  

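    In score following with a particle filter, tracking performance depends heavily on how particle weights are computed from the distance between the audio signal and the score. With conventional weight calculation methods based on vector inner products or a sigmoid function, inharmonic components of the audio signal and variations in instrument timbre mean that the weights differ little between correct and incorrect score position estimates, so the finally estimated score position contains errors. In this paper, weights are computed dynamically from the cumulative frequency of previously computed distances, so that correct score positions receive higher weights. Evaluation experiments confirmed that the weight calculation based on cumulative frequency improves score following accuracy over the conventional weight calculation methods.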

    CiNii Books

    researchmap

  • Audio-visual musical instrument recognition

    AngelicaLim, 中村圭佑, 中臺一博, 尾形哲也, 奥乃博

    第73回全国大会講演論文集   2011 ( 1 )   309 - 310   2011.3

    Language:English  

    Is this person playing a violin or a flute? Classification of musical instrument performances is usually carried out using audio features such as spectral coefficients. We propose augmenting the typical audio feature set with visual features. We show that a combination of audio features and video perform better than audio alone, and verify this multimodal recognition approach on a real-time robot platform.

    CiNii Books

    researchmap

  • Machine Audition Technology that Listens to Multiple Voiced Speech at Once

    G. OKUNO Hiroshi, NAKADAI Kazuhiro

    The Journal of The Institute of Electrical Engineers of Japan   131 ( 3 )   159 - 163   2011.3

    Language:Japanese   Publisher:一般社団法人 電気学会  

    This article has no abstract.

    DOI: 10.1541/ieejjournal.131.159

    CiNii Books

    CiNii Research

    researchmap

  • Robot Audition : Hands-Free Automatic Speech Recognition under Highly-Noisy Environments

    NAKADAI Kazuhiro, OKUNO Hiroshi G

    IEICE technical report   110 ( 401 )   7 - 12   2011.1

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    This paper addresses robot audition, which realizes listening capabilities for robots using robot-embedded microphones. For robot audition, we propose real-time sound source separation and automatic speech recognition (ASR) techniques for dynamically changing environments based on microphone array processing, which is applicable to hands-free ASR under highly-noisy environments. Implementation of the proposed techniques is open-sourced as robot audition software called "HARK." We show the effectiveness of these techniques through applications of HARK to robots.

    CiNii Books

    CiNii Research

    researchmap

  • マルチロボットによるKinectを用いた同期合奏

    糸原達彦, 水本武志, LIM Angelica, 大塚琢馬, 中村圭佑, 長谷川雄二, 中臺一博, 尾形哲也, 奥乃博

    人工知能学会AIチャレンジ研究会(Web)   34th   B102-10 (WEB ONLY)   2011

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 音源定位手法MUSICのベイズ拡張

    大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

    人工知能学会AIチャレンジ研究会(Web)   34th   B102-6 (WEB ONLY)   2011

    Language:Japanese  

    J-GLOBAL

    researchmap

  • AI-1-3 OPEN-SOURCED ROBOT AUDITION SOFTWARE HARK

    Okuno Hiroshi G, Nakadai Kazuhiro, Takahashi Toru

    Proceedings of the Society Conference of IEICE   2010   "SS - 72"-"SS-73"   2010.8

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    CiNii Books

    CiNii Research

    researchmap

  • ロボット聴覚ソフトウエアHARKとそのロボットへの適用

    高橋徹, 中臺一博, 奥乃博

    電気関係学会東海支部連合大会講演論文集(CD-ROM)   2010   ROMBUNNO.S3-1   2010.8

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Real time speaker orientation estimation using a room microphone array

    HARUBARA Takuya, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, KANEDA Yutaka

    IEICE technical report   110 ( 131 )   19 - 24   2010.7

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    This paper addresses a real-time sound source orientation estimation system using a 96ch microphone array. We proposed a beam-forming method with estimation of sound source directivity, and reported orientation estimation of a speech source such as a loudspeaker or an actual human. Furthermore, we showed that the precision of the orientation estimation system is improved by introducing four additional techniques: amplitude extraction, correlation-based automatic voice activity detection (VAD), a frequency mask, and histogram integration. We developed a real-time sound source orientation estimation system. However, the precision of the real-time system is insufficient for practical use. In this paper, we investigate the main causes of the estimation error and propose an advanced real-time orientation estimation system. The experimental results show that the advanced system has lower errors than the previous system by 20° to 30°.

    CiNii Books

    CiNii Research

    researchmap

  • Score Following by Particle Filtering for Music Robots

    OTSUKA Takuma, NAKADAI Kazuhiro, TAKAHASHI Toru, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

    全国大会講演論文集   72 ( 0 )   913 - 914   2010.3

    Language:English  

    CiNii Books

    researchmap

  • Robot audition system development and parameter-tuning in real environment

    TAKAHASHI Toru, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

    全国大会講演論文集   72 ( 0 )   29 - 30   2010.3

    Language:Japanese  

    CiNii Books

    researchmap

  • Self-speech cancellation with Semi-blind ICA for Robot speech interaction

    TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G.

    全国大会講演論文集   72 ( 0 )   27 - 28   2010.3

    Language:Japanese  

    CiNii Books

    researchmap

  • Robot Audition Open-Sourced Software HARK

    奥乃 博, 中臺 一博

    日本ロボット学会誌(Journal of the Robotics Society of Japan)   28 ( 1 )   6 - 9   2010.1

    Language:Japanese   Publisher:日本ロボット学会  

    DOI: 10.7210/jrsj.28.6

    CiNii Books

    CiNii Research

    researchmap

  • On special issue "Robot Audition"

    中臺 一博, 宮下 敬宏, 奥乃 博

    日本ロボット学会誌(Journal of the Robotics Society of Japan)   28 ( 1 )   1 - 1   2010.1

    Language:Japanese   Publisher:日本ロボット学会  

    CiNii Books

    CiNii Research

    researchmap

  • On special issue "Robot Audition"

    NAKADAI Kazuhiro, MIYASHITA Takahiro, OKUNO Hiroshi G

    Journal of the Robotics Society of Japan   28 ( 1 )   1 - 1   2010.1

    Language:Japanese   Publisher:The Robotics Society of Japan  

    DOI: 10.7210/jrsj.28.1

    CiNii Books

    CiNii Research

    researchmap

  • Robot Audition Open-Sourced Software HARK

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    Journal of the Robotics Society of Japan   28 ( 1 )   6 - 9   2010.1

    Language:Japanese   Publisher:The Robotics Society of Japan  

    DOI: 10.7210/jrsj.28.6

    CiNii Books

    researchmap

  • リサンプル‐ブロック処理と並列化に基づくICAの実時間実装

    武田龍, 中臺一博, 高橋徹, 尾形哲也, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   28th   ROMBUNNO.1H3-1   2010

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 打楽器とロボットとの合奏のための結合振動子モデルに基づく打撃時刻予測

    水本武志, 中臺一博, 大塚琢馬, 高橋徹, 尾形哲也, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   28th   ROMBUNNO.1H3-2   2010

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Blind dereverberation improved by multi-stage processing

    NAKAJIMA Hirofumi, DAIGO Tohru, NAKADAI Kazuhiro, KANEDA Yutaka, HASEGAWA Yuji

    IEICE technical report   109 ( 136 )   7 - 12   2009.7

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    This paper addresses a multi-stage processing mechanism that improves various dereverberation methods. In the mechanism, each stage is implemented as an intermediate processing module that connects the outputs of the modules on the previous stage to the inputs of other modules on the next stage. Since the dereverberation performance at each stage depends on the input channel combinations, we proposed two additional processes: channel selection and delay addition. We applied our proposed mechanism with these two processes to two dereverberation methods, i.e., SBM and RDAIF. The proposed system showed the following results: (1) The channel selection process improved performance by 3-10dB. The optimum combination can reduce the number of input channels without any degradation. (2) The delay addition process improved the suppression performance by 3-10dB. (3) With the multi-stage mechanism, the performance for SBM and RDAIF is 18.2dB and 13.6dB, respectively, while the performance without the mechanism is only 14.6dB and 3.5dB, respectively. We can conclude that the proposed mechanism and processes are effective in improving dereverberation performance.

    CiNii Books

    researchmap

  • Blind Dereverberation Improved By Multi-Stage Processing

    NAKAJIMA Hirofumi, DAIGO Tohru, NAKADAI Kazuhiro, KANEDA Yutaka, HASEGAWA Yuji

    電子情報通信学会技術研究報告. EA, 応用音響   109 ( 136 )   7 - 12   2009.7

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    This paper addresses a multi-stage processing mechanism that improves various dereverberation methods. In the mechanism, each stage is implemented as an intermediate processing module that connects the outputs of the modules on the previous stage to the inputs of other modules on the next stage. Since the dereverberation performance at each stage depends on the input channel combinations, we proposed two additional processes: channel selection and delay addition. We applied our proposed mechanism with these two processes to two dereverberation methods, i.e., SBM and RDAIF. The proposed system showed the following results: (1) The channel selection process improved performance by 3-10dB. The optimum combination can reduce the number of input channels without any degradation. (2) The delay addition process improved the suppression performance by 3-10dB. (3) With the multi-stage mechanism, the performance for SBM and RDAIF is 18.2dB and 13.6dB, respectively, while the performance without the mechanism is only 14.6dB and 3.5dB, respectively. We can conclude that the proposed mechanism and processes are effective in improving dereverberation performance.

    CiNii Books

    CiNii Research

    researchmap

  • The design of a directional sound source for numerical simulation based on wave acoustics

    鈴木 淑正, 中島 弘史, 中臺 一博

    聴覚研究会資料   39 ( 4 )   325 - 330   2009.6

    Language:Japanese   Publisher:日本音響学会聴覚研究委員会  

    CiNii Books

    CiNii Research

    researchmap

  • The design of a directional sound source for numerical simulation based on wave acoustics

    SUZUKI Toshimasa, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, ARAI Takahiro, HASEGAWA Yuji

    IEICE technical report   109 ( 100 )   109 - 114   2009.6

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Thanks to improvements in computer performance, numerical simulation based on wave acoustics works in practical time with off-the-shelf computers. Such a numerical simulation method accurately estimates a sound field when it is a simple, simulated environment like a free sound field. However, this method has difficulties in simulating a real-world acoustic environment. One of the issues for real-world simulation is dealing with sound directivity. Thus, most numerical simulators assume a point sound source to avoid this issue. Several studies that cope with sound directivity have been reported, but their accuracy and practical utility are insufficient for real-world simulation, because an accurate sound propagation model is necessary to deal with sound directivity. We use a compact finite difference method based on sound field digitization, which has an accurate sound propagation model. However, this method also has a problem: two points are simulated differently even when they are located at the same distance from the sound source, due to the difference in the effect of numerical dispersion. In this paper, we first confirm the performance of our method by using an omni-directional point source in a free sound field. After that, we show that our method is able to simulate a directional sound source accurately using a combination of a simple loudspeaker and a point source model.

    CiNii Books

    researchmap

  • Simultaneous three talker speech recognition using soft mask and model adaptation technique

    TAKAHASHI Toru, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

    全国大会講演論文集   71 ( 0 )   35 - 36   2009.3

    Language:Japanese  

    CiNii Books

    researchmap

  • Realtime Synchronization Method between Audio Signal and Score Using Beats, Melodies, and Harmonies for Singer Robots

    OTSUKA Takuma, MURATA Kazumasa, TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, OGATA Tetsuya, OKUNO Hiroshi G.

    全国大会講演論文集   71 ( 0 )   243 - 244   2009.3

    Language:Japanese  

    CiNii Books

    researchmap

  • Perspective of Robot Systems Coexisting with People

    NAKADAI Kazuhiro, HASEGAWA Yuji, SEKIGUCHI Tatsuhiko, TSUJINO Hiroshi

    Journal of the Robotics Society of Japan   27 ( 1 )   6 - 9   2009.1

    Language:Japanese   Publisher:一般社団法人 日本ロボット学会  

    DOI: 10.7210/jrsj.27.6

    CiNii Books

    researchmap

  • Panel Discussion : Application Developments of Speech Recognition

    NISIMURA Ryuichi, NAKANO Teppei, KURIHARA Kazutaka, NAKADAI Kazuhiro, YOSHINO Takashi

    IPSJ SIG Notes   2008 ( 102 )   55 - 60   2008.10

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    To induce developments of ASR applications, this panel discussion introduces actual case studies. We also indicate some problems of ASR application developments.

    CiNii Books

    researchmap

  • ICA-based Robot Audition for recognizing barge-in speech under reverberation

    武田龍, 中臺一博, 高橋徹, 駒谷和範, 尾形哲也, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   26th   ROMBUNNO.1A2-02   2008.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • A Beat-Tracking Robot for Human-Robot Interaction and Its Evaluation

    村田和真, 中臺一博, 武田龍, 吉井和佳, 奥乃博, 鳥井豊隆, 長谷川雄二, 辻野広司

    日本ロボット学会学術講演会予稿集(CD-ROM)   26th   ROMBUNNO.1A1-03   2008.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Improving Speech Recognition of Periphery Talkers by Generating Soft Masks for Robot Audition

    高橋徹, 中臺一博, 駒谷和範, 尾形哲也, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   26th   ROMBUNNO.1A1-01   2008.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • ミッシングフィーチャ理論に基づく複数話者同時発話音声認識における音響特徴量とマスクの検討

    高橋徹, 中臺一博, 駒谷和範, 尾形哲也, 奥乃博

    日本音響学会研究発表会講演論文集(CD-ROM)   2008   ROMBUNNO.2-P-16   2008.9

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Estimation of sound source orientation using a 96 channel microphone array

    KIKUCHI Keiko, DAIGO Tohru, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, KANEDA Yutaka

    IEICE technical report   108 ( 143 )   13 - 18   2008.7

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    This paper addresses sound source orientation estimation using a 96ch microphone array. We proposed a beam-forming method with estimation of sound source directivity, and reported orientation estimation of a speech source such as a loudspeaker or an actual human. However, in this method, a transfer function to design a beam-former should be the same as that of target sound source. Otherwise the performance deteriorated due to a mismatch between these two transfer functions. In addition, voice activity detection (VAD) was manually performed. To solve the former, we proposed amplitude-based orientation estimation using a histogram to relax the effect of the mismatch problems mainly caused by phase errors and outliers. For the latter, speech frequency component detection based on inner product and automatic VAD based on auto-correlation are introduced to form a frequency-temporal masking pattern. Preliminary experiments showed that sound source orientation estimation with automatic VAD for actual human voices drastically improved even when using a loudspeaker-based transfer function.

    CiNii Books

    researchmap

  • Design and Evaluation of Barge-In enable Robot Audition System with ICA and MFT-based ASR

    TAKEDA Ryu, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G.

    全国大会講演論文集   70 ( 0 )   135 - 136   2008.3

    Language:Japanese  

    CiNii Books

    researchmap

  • 1P1-G13 Overview of Open Source Software for Robot Audition

    Nakadai Kazuhiro, Yamamoto Shunichi, Okuno Hiroshi G, Nakajima Hirofumi, Hasegawa Yuji, Tsujino Hiroshi

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)   2008 ( 0 )   _1P1 - G13_1-_1P1-G13_4   2008

    Language:Japanese   Publisher:一般社団法人 日本機械学会  

    This paper describes an open source software system for robot audition called HARK (Honda Research Institute Japan Audition for Robots with Kyoto University). HARK consists of a lot of modules including multi-channel audio input, sound source localization, sound source tracking, sound source separation and recognition of separated speech for robot audition based on the data-flow oriented software programming environment, FlowDesigner. By combining these modules using a GUI environment, a user can easily build a robot audition system for various types of robots and acoustic environments. Through HARK applications to Honda ASIMO and Robovie with different microphone settings, we showed high software portability and reusability of HARK.

    CiNii Books

    CiNii Research

    J-GLOBAL

    researchmap

  • ビートトラッキングロボットの構築と評価

    村田和真, 中臺一博, 武田龍, 奥乃博, 長谷川雄二, 辻野広司

    人工知能学会AIチャレンジ研究会   28th   13 - 20   2008

    Language:Japanese  

    J-GLOBAL

    researchmap

  • E-052 Semi-Blind Source Separation using ICA for Barge-In-Capable Robot Spoken Dialogue

    Takeda Ryu, Nakadai Kazuhiro, Komatani Kazunori, Ogata Tetsuya, Okuno Hiroshi G.

    情報科学技術フォーラム一般講演論文集   6 ( 2 )   261 - 262   2007.8

    Language:Japanese   Publisher:Forum on Information Technology  

    CiNii Books

    researchmap

  • High performance blind source separation using an adaptive step-size parameter method

    NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, TSUJINO Hiroshi

    IEICE technical report   107 ( 120 )   19 - 24   2007.6

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    This paper describes a novel blind source separation (BSS) method. One of the most important factors in BSS performance is a step-size parameter to update a decomposition matrix which is generally used for extracting a target sound source. A fixed value which was obtained empirically is commonly used as the step-size parameter. However, in the real world, the surrounding environment changes dynamically. So, conventional BSS with a fixed step-size parameter does not provide the best performance and sometimes results in divergence of the decomposition matrix. We propose a method that allows for an adaptive step-size parameter. Since the proposed method is generally applicable to BSS methods, we applied it to six types of BSS algorithms with a microphone array embedded in Honda's ASIMO. Experimental results show that the proposed method improves sound source separation in the four BSS algorithms, and the step-size parameter is maintained optimally even when the surrounding environment changes.

    CiNii Books

    CiNii Research

    researchmap

  • Robust Domain Selection Using Dialogue History in Multi-domain Spoken Dialogue Systems

    神田直之, 駒谷和範, 中野幹生, 中臺一博, 辻野広司, 尾形哲也, 奥乃博

    情報処理学会論文誌   48 ( 5 )   1980 - 1989   2007.5

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    We have developed a robust domain selection method using dialogue history in multi-domain spoken dialogue systems. We define domain selection as a classification problem among (I) the domain in the previous turn, (II) the domain in which N-best speech recognition results can be accepted with the highest recognition score, and (III) other domains. We constructed a classifier by decision tree learning with dialogue data. We implemented a multi-domain spoken dialogue system with 5 domains and collected dialogue data from 10 subjects. The experimental results showed that our method reduced domain selection errors by 16.2% compared with a conventional method using speech recognition likelihoods only.

    CiNii Research

    J-GLOBAL

    researchmap

  • AS-6-1 Sound Stream Formation and Human Tracking by Integration of Microphone Arrays

    Nakadai Kazuhiro, Nakajima Hirofumi, Murase Masamitsu, Okuno Hiroshi G, Hasegawa Yuji, Tsujino Hiroshi

    Proceedings of the IEICE General Conference   2007   S-65 - S-66   2007.3

     More details

    Language:English   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Books

    researchmap

  • 音を視覚化する録音再生システム

    吉田雅敏, 海尻聡, 山本俊一, 中臺一博, 駒谷和範, 尾形哲也, 奥乃博

    情報処理学会全国大会講演論文集   69th ( 2 )   2.577-2.578   2007.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 口じゃんけん判定ロボットの開発~ロボット聴覚システムの応用に向けて~

    中臺一博, 山本俊一, 奥乃博, 中島弘史, 長谷川雄二, 辻野広司

    人工知能学会AIチャレンジ研究会   26th   59 - 64   2007

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Robot Audition System Towards Natural Human-Robot Verbal Communication

    中臺 一博, 山本 俊一, 浅野 太

    人工知能学会全国大会論文集   21   1 - 4   2007

     More details

    Language:Japanese   Publisher:人工知能学会  

    CiNii Books

    researchmap

  • Towards Information Integration for Human-Robot Interaction

    NAKADAI Kazuhiro

    IEICE technical report   106 ( 298 )   19 - 26   2006.10

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    To realize natural human-robot interaction, we consider that a robot should have at least two functions: real-world auditory scene analysis, by which the robot understands its surrounding environment, and robot expression, by which it sends information to users intelligibly and naturally. These functions should be highly robust against environmental changes because the robot has to work in dynamically changing noisy environments. This paper addresses information integration to improve the robustness of these functions, since we believe that such an integration approach improves robustness, and describes four research topics based on the integration approach: integration of sound source separation and automatic speech recognition based on missing feature theory, sound source tracking by integration of two types of microphone arrays, multimodal expression using prosodic information and head colors, and a new speech function using a directional speaker.

    researchmap

  • Towards Information Integration for Human-Robot Interaction

    NAKADAI Kazuhiro

    IEICE technical report   106 ( 300 )   37 - 44   2006.10

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    To realize natural human-robot interaction, we consider that a robot should have at least two functions: real-world auditory scene analysis, by which the robot understands its surrounding environment, and robot expression, by which it sends information to users intelligibly and naturally. These functions should be highly robust against environmental changes because the robot has to work in dynamically changing noisy environments. This paper addresses information integration to improve the robustness of these functions, since we believe that such an integration approach improves robustness, and describes four research topics based on the integration approach: integration of sound source separation and automatic speech recognition based on missing feature theory, sound source tracking by integration of two types of microphone arrays, multimodal expression using prosodic information and head colors, and a new speech function using a directional speaker.

    CiNii Books

    CiNii Research

    researchmap

  • Towards Information Integration for Human-Robot Interaction

    NAKADAI Kazuhiro

    IEICE technical report   106 ( 296 )   19 - 26   2006.10

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    To realize natural human-robot interaction, we consider that a robot should have at least two functions: real-world auditory scene analysis, by which the robot understands its surrounding environment, and robot expression, by which it sends information to users intelligibly and naturally. These functions should be highly robust against environmental changes because the robot has to work in dynamically changing noisy environments. This paper addresses information integration to improve the robustness of these functions, since we believe that such an integration approach improves robustness, and describes four research topics based on the integration approach: integration of sound source separation and automatic speech recognition based on missing feature theory, sound source tracking by integration of two types of microphone arrays, multimodal expression using prosodic information and head colors, and a new speech function using a directional speaker.

    CiNii Books

    CiNii Research

    researchmap

  • Improvement in Online Simultaneous Speech Recognizer by Using GA

    山本俊一, 中臺一博, 中野幹生, 辻野広司, VALIN Jean‐Marc, 駒谷和範, 尾形哲也, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   24th   1B12   2006.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • D-14-10 Improvement for Noise-Robustness of Automatic Speech Recognition Using Coarse Phoneme Recognition

    SUMIYA Ryota, NAKADAI Kazuhiro, NAKANO Mikio, ICHIGE Koichi, HIROSE Yasuo, TSUJINO Hiroshi

    Proceedings of the IEICE General Conference   2006 ( 1 )   134 - 134   2006.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Books

    researchmap

  • パーティクルフィルタによる音源追跡の性能評価

    村瀬昌満, 中台一博, 奥乃博

    情報処理学会全国大会講演論文集   68th ( 2 )   345 - 346   2006.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 複数ドメイン音声対話システムにおける対話履歴を利用したドメイン選択の高精度化

    神田直之, 駒谷和範, 中野幹生, 中台一博, 辻野広司, 尾形哲也, 奥乃博

    情報処理学会全国大会講演論文集   68th ( 2 )   329 - 330   2006.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • GAによる話者位置への同時発話認識システムの最適化

    山本俊一, 中台一博, 中野幹生, 辻野広司, VALIN Jean‐Marc, 武田龍, 駒谷和範, 尾形哲也, 奥乃博

    情報処理学会全国大会講演論文集   68th ( 2 )   5 - 6   2006.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Robust Domain Selection using Dialogue History in Multi-Domain Spoken Dialogue System

    KANDA NAOYUKI, KOMATANI KAZUNORI, NAKANO MIKIO, NAKADAI KAZUHIRO, TSUJINO HIROSHI, OGATA TETSUYA, OKUNO HIROSHI G

    IPSJ SIG Notes   2006 ( 12 )   55 - 60   2006.2

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    We have developed a robust domain selection method using dialogue history in multi-domain spoken dialogue systems. We define domain selection as a classification problem among (I) the domain in the previous turn, (II) the domain in which N-best speech recognition results can be accepted with the highest recognition score, and (III) other domains. We constructed a classifier by decision tree learning with a dialogue corpus. The experimental results using 10 subjects show that our method reduced domain selection errors by 11.6% compared with a conventional method using speech recognition likelihoods only.

    CiNii Books

    researchmap

  • Human Robot Interaction Research in HRI-JP

    TSUJINO Hiroshi, NAKANO Mikio, NAKADAI Kazuhiro, HASEGAWA Yuji

    IEICE technical report   105 ( 426 )   31 - 36   2005.11

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    As computer technology advances, machines are expected to perform more functional tasks at home, and the importance of technology realizing a "human-machine interface that anyone can use" is increasing. An intelligent robot is the ultimate machine in this trend, and the advanced concept and sense of value for such robots are being actively investigated. We focus on "bi-directional human-robot interaction" as a future interface between humans and intelligent robots. In this paper, we present our recent results on "robot architecture for human-robot interaction", "speech recognition by robot" and "speech recognition by human" in our human-robot interaction research.

    CiNii Books

    researchmap

  • Multiple Moving Speakers Tracking based on Multiple Kalman Filters and Accuracy Evaluation

    村瀬昌満, 山本俊一, VALIN Jean‐Marc, 中台一博, 山田健太郎, 駒谷和範, 尾形哲也, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   23rd   3C26   2005.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Recognition of Three Simultaneous Speech Signals Using MFT for a Humanoid

    山本俊一, VALIN Jean‐Marc, 中台一博, 中野幹生, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   23rd   3C35   2005.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 聖徳太子ロボット―視聴覚統合によるロボット聴覚―

    奥乃博, 中台一博

    画像センシングシンポジウム講演論文集   11th   87 - 92   2005.6

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Implementation of Sound Source Separation Filter on Dynamically Reconfigurable Processor

    KUROTAKI Shunsuke, SUZUKI Noriaki, NAKADAI Kazuhiro, OKUNO Hiroshi, AMANO Hideharu

    IEICE technical report   105 ( 43 )   67 - 72   2005.5

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Research

    researchmap

  • Robot Audition : Its Issues and State of the Arts

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    日本音響学会研究発表会講演論文集   2005 ( 1 )   633 - 636   2005.3

     More details

  • マイクロフォンアレイによる分離音声認識のためのミッシングフィーチャーマスク自動生成

    山本俊一, VALIN J‐M, 中台一博, 駒谷和範, 尾形哲也, 奥乃博

    情報処理学会全国大会講演論文集   67th ( 2 )   377 - 378   2005.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • ミッシングフィーチャ理論を適用した同時発話認識システムの同時発話文による評価

    山本俊一, VALIN Jean‐Marc, 中台一博, 中野幹生, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

    人工知能学会AIチャレンジ研究会   22nd   101 - 106   2005

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Evaluation of MFT-Based Interface between Sound Source Separation and ASR

    山本俊一, 中台一博, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   22nd   1C33   2004.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • G-007 Missing Feature Theory Based Interface of Integrating Sound Source Separation and Automatic Speech Recognition

    Yamamoto Shunichi, Nakadai Kazuhiro, Tsujino Hiroshi, Okuno Hiroshi G

    情報科学技術フォーラム一般講演論文集   3 ( 2 )   357 - 360   2004.8

     More details

    Language:Japanese   Publisher:Forum on Information Technology  

    CiNii Books

    researchmap

  • マルチモーダル情報統合によるヒューマノイドロボットの挙動選択

    戸田充彦, 中台一博, 駒谷和範, 尾形哲也, 奥乃博

    情報処理学会全国大会講演論文集   66th ( 2 )   2.193-2.194   2004.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • ミッシングフィーチャー理論による三話者同時発話認識の向上

    山本俊一, 中台一博, 辻野広司, 駒谷和範, 尾形哲也, 奥乃博

    情報処理学会全国大会講演論文集   66th ( 2 )   2.287-2.288   2004.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • アクティブオーディションによる自然なヒューマン・ロボットインターフェースの実現に関する研究(認知と身体性)(<特集>人工知能分野における博士論文)

    中臺 一博

    人工知能   19 ( 1 )   106_2 - 106_2   2004.1

     More details

    Language:Japanese   Publisher:一般社団法人 人工知能学会  

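    Research on the auditory functions of robots has so far received little attention, even though hearing is the most important modality in social interaction with humans. Although problems in realizing robot audition have been pointed out from the viewpoint of real-world, real-time processing, no report has organized them systematically. This study therefore first organizes the issues of robot audition systematically and discusses concrete methods for solving them. Regarding active motion as essential to improving robot audition, it then proposes active audition, which applies active motion to robot audition. It also discusses improving perception by integrating multiple kinds of auditory information and by integrating auditory information with other sensory information, as well as the understanding of general sounds (mixtures of sounds) by robots, aimed at more general processing. The system built on the upper-torso humanoid robot SIG (http://winnie.kuis.kyoto-u.ac.jp/SIG/) cancels the noise that a robot characteristically generates when it moves, which makes active motion usable for auditory processing. By using active motion effectively, the system also achieves speaker localization and tracking through audio-visual integration, sound source separation with an active direction-pass filter that extracts in real time the sound source in the attended direction, and speech recognition of the separated sounds. Evaluations of each function and an integrated evaluation of the whole system demonstrate the effectiveness and robustness of active audition, sensory integration, and general sound understanding, and the effectiveness of the system as a human-robot interface.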

    DOI: 10.11517/jjsai.19.1_106_2

    CiNii Books

    CiNii Research

    researchmap

  • Three Simultaneous Speech Recognition by Applying Missing Feature Theory to Robot Audition System

    山本 俊一, 中臺 一博, 辻野 広司

    人工知能学会全国大会論文集   18   1 - 4   2004

     More details

    Language:Japanese   Publisher:人工知能学会  

    CiNii Books

    researchmap

  • ロボット聴覚へのミッシングフィーチャー理論の適用による三話者同時発話認識

    山本 俊一, 中臺 一博, 辻野 広司, 奥乃 博

    人工知能学会全国大会論文集   4 ( 0 )   41 - 41   2004

     More details

    Language:Japanese   Publisher:一般社団法人 人工知能学会  

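    This paper proposes a method for recognizing three-speaker simultaneous speech recorded with two microphones mounted on a robot, using sound source separation and speech recognition based on missing feature theory. Experiments on two robots confirm the effectiveness of the proposed method.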

    researchmap

  • ロボットに装着したマイクロフォンアレイによる音源分離とミッシングフィーチャー理論に基づく音声認識

    山本俊一, VALIN Jean‐Marc, 中台一博, 奥乃博

    人工知能学会AIチャレンジ研究会   20th   27 - 32   2004

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Robotics Based on AI Technology: Robot Audition: State of the Art and Future Directions

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    IPSJ Magazine   44 ( 11 )   1138 - 1144   2003.11

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

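    As robots enter the home, communication between robots and people is becoming increasingly important, in particular communication using microphones mounted on the robot and perception of the environment through sound. Recently, work on auditory functions that use the robot's own ears has at last become active. What kinds of auditory functions, then, does a robot need?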

    CiNii Books

    CiNii Research

    researchmap

  • ロボットを対象とした散乱理論による三話者同時発話の定位・分離・認識の向上

    中台一博, 奥乃博, 辻野広司

    人工知能学会AIチャレンジ研究会   18th   33 - 38   2003.11

     More details

  • Improvement of Recognition of Three Simultaneous Speeches By Hierarchical AV Integration for Humanoid and Scattering Theory

    中台一博, 松浦大輔, 奥乃博, 辻野広司

    日本ロボット学会学術講演会予稿集(CD-ROM)   21st   2K14   2003.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Design and Implementation of Action Selection System for Humanoid Robot

    戸田充彦, 中台一博, 宮下敬宏, 奥乃博

    日本ロボット学会学術講演会予稿集(CD-ROM)   21st   3F23   2003.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 人間に似た外見を持つロボットReplieにおける挙動選択システム

    戸田充彦, 山本俊一, 中台一博, 奥乃博

    情報処理学会全国大会講演論文集   65th ( 4 )   4.211-4.212   2003.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Applying FPGA to Sound Separation by Direction-Pass Filter

    SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

    情報処理学会研究報告システムLSI設計技術(SLDM)   2003 ( 7 )   135 - 140   2003.1

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    Reconfigurable systems are well suited to high-performance yet low-cost, low-power implementation of intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, such as the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on a 12 MHz FPGA are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than those on a 1 GHz Pentium III.

    CiNii Books

    researchmap

  • Applying FPGA to Sound Separation by Direction-Pass Filter

    SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

    IEICE technical report. Computer systems   102 ( 611 )   79 - 84   2003.1

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Reconfigurable systems are well suited to high-performance yet low-cost, low-power implementation of intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, such as the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on a 12 MHz FPGA are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than those on a 1 GHz Pentium III.

    CiNii Books

    researchmap

  • Applying FPGA to Sound Separation by Direction-Pass Filter

    SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

    Technical report of IEICE. VLD   102 ( 609 )   79 - 84   2003.1

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Reconfigurable systems are well suited to high-performance yet low-cost, low-power implementation of intelligent systems for robots. In this paper, parts of the processing for the direction-pass filter used in the auditory system of humanoid robots, such as the Fast Fourier Transform (FFT), square root, and arc tangent, are implemented on an FPGA, and their performance is evaluated. Our results show that the FFT, square root, and arc tangent implemented on a 12 MHz FPGA are 2.9 times, 2.9 times, and 3.3 times faster, respectively, than those on a 1 GHz Pentium III.

    CiNii Books

    CiNii Research

    researchmap

  • Exploiting Auditory Fovea in Humanoid-Human Interaction

    Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

    Proceedings of Eighteenth National Conference on Artificial Intelligence (AAAI-2002)   431 - 438   2002.12

     More details

  • アクティブオーディションによる複数音源の定位・分離・認識

    中台一博, 奥乃博, 北野宏明

    人工知能学会AIチャレンジ研究会   16th   25 - 32   2002.11

     More details

  • Building Robot Audition-Development of Humanoid SIG2-

    中台一博, 松浦大輔, 宮下敬宏, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集(CD-ROM)   20th   1H19   2002.10

     More details

  • Focus-of-Attention Control in Speaker Tracking by Using Support Vector Machine

    松浦大輔, 中台一博, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集(CD-ROM)   20th   1C33   2002.10

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Auditory fovea based speech enhancement and its application to human-robot dialog system

    Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

    7th International Conference on Spoken Language Processing, ICSLP 2002   1817 - 1820   2002.1

     More details

  • Auditory Fovea Based Speech Separation and Its Application to Dialog System

    Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

    Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2002)   2   1320 - 1325   2002.1

     More details

  • Real-Time Speaker Localization and Speech Separation by Audio-Visual Integration

    Kazuhiro Nakadai, Ken Ichi Hidai, Hiroshi G. Okuno, Hiroaki Kitano

    Proceedings of IEEE International Conference on Robotics and Automation (ICRA-2002)   1   1043 - 1049   2002.1

     More details

  • Active Audition Based Human-Humanoid Interaction

    Nakadai Kazuhiro, Okuno Hiroshi G, Kitano Hiroaki

    SICE Division Conference Program and Abstracts   2002 ( 0 )   522 - 522   2002

     More details

    Publisher:公益社団法人 計測自動制御学会  

    Robots that interact with people should understand various events simultaneously. To realize such capabilities in robots, integration of audition, vision, and other sensory information, together with active motion for better perception, is essential. This paper describes active audition, which improves robot audition by integrating audition, vision, and active motion. Our active-audition-based upper-torso robot can localize and interact with people even when occlusion and simultaneous speech occur.

    DOI: 10.11499/siced.si2002.0.522.0

    CiNii Research

    researchmap

  • Real-time active human tracking by hierarchical integration of audition and vision

    NAKADAI K.

    Proc. IEEE-RAS Int. Conf. on Robotics and Automation, Washington, DC, 2002   2002

     More details

  • Are a pair of ears sufficient for robot audition?

    Okuno Hiroshi G, Nakadai Kazuhiro

    THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN   58 ( 3 )   205 - 210   2002

     More details

    Language:Japanese   Publisher:一般社団法人 日本音響学会  

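    Hearing is the most important sense for humans. It is easy to understand that linguistic communication depends on hearing, but it is regrettable that many people do not appreciate that "humans acquire language only through hearing, and from it culture is born and passed on. Written language is transmitted through the eyes, but spoken words can be acquired only through the ears. Written language arises because spoken language exists" (鈴木淳一, 小林武夫, 『耳科学-難聴に挑む』, 中公新書 1598, 2001).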

    DOI: 10.20697/jasj.58.3_205

    CiNii Books

    CiNii Research

    researchmap

  • Research Issues and Current Status of Robot Audition

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    IPSJ SIG Notes   2001 ( 123 )   69 - 74   2001.12

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". An audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. Active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

    CiNii Books

    CiNii Research

    researchmap

  • Research Issues and Current Status of Robot Audition

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    IEICE technical report. Natural language understanding and models of communication   101 ( 520 )   69 - 74   2001.12

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". An audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. Active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

    CiNii Books

    researchmap

  • Research Issues and Current Status of Robot Audition

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    IEICE technical report. Speech   101 ( 522 )   69 - 74   2001.12

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this paper, we present an active audition system implemented on the humanoid robot "SIG the humanoid". An audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. Active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such active head movement inevitably creates motor noise. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

    CiNii Books

    CiNii Research

    researchmap

  • Human-Robot Interaction Through Real-Time Auditory and Visual Multiple-Talker Tracking

    Hiroshi G. Okuno, Kazuhiro Nakadai, Ken Ichi Hidai, Hiroshi Mizoguchi, Hiroaki Kitano

    Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2001)   3   1402 - 1409   2001.12

     More details

  • Epipolar Geometry Based Sound Localization and Extraction for Humanoid Audition

    Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

    Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2001)   3   1395 - 1401   2001.12

     More details

  • Active Audio-Visual Integration in Real-Time Human Tracking Humanoid SIG

    NAKADAI KAZUHIRO, HIDAI KEN-ICHI, OKUNO HIROSHI G, KITANO HIROAKI

    IPSJ SIG Notes. ICS   2001 ( 97 )   37 - 42   2001.10

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    This paper describes the improvement of auditory processing by active motion and audio-visual integration. Generally, environmental noise and reverberation degrade sound source localization and separation in the real world. Our real-time human tracking system for humanoid robots attains robust sound source localization in the real world through active audio-visual integration. We then propose a new sound source separation method using an active direction-pass filter. Our experiments prove that active audio-visual integration is essential to robust perception for extracting the sound source being tracked.

    CiNii Books

    CiNii Research

    researchmap

  • ステレオ視による実時間人物追跡システムの高精度化

    日台健一, 中台一博, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集   19th   155   2001.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Real-Time Human Tracking by Integrating Auditory and Visual Streams.

    中台一博, 日台健一, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集   19th   583 - 584   2001.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Improvement of Real-time Human Tracking System by Stereo Vision.

    日台健一, 中台一博, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集   19th   581 - 582   2001.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 視聴覚のストリームベース統合による実時間人物追跡システム

    中台一博, 日台健一, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集   19th   155   2001.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 視聴覚情報の階層的統合による実時間アクティブ人物追跡

    中台一博, 日台健一, 奥乃博, 北野宏明

    人工知能学会AIチャレンジ研究会   13th   35 - 42   2001.6

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • 顔認識とアクティブオーディションを利用した実時間人物追跡

    中台一博, 日台健一, 溝口博, 奥乃博, 北野宏明

    人工知能学会AIチャレンジ研究会   11th   27 - 34   2001.3

     More details

  • Real-time auditory and visual multiple-object tracking for robots

    NAKADAI K.

    Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01)   2001

     More details

  • Active Audition System and Humanoid Exterior Design.

    K. Nakadai, T. Matsui, H. G. Okuno, H. Kitano

    Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2000)   2   1453 - 1461   2000.12

     More details

  • Control an Interactive Robot to Integrate Image sequence and Sound stream in Dynamic Scene.

    中川友紀子, 中台一博, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集   18th   113 - 114   2000.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Active Audition System Using Robot Cover Acoustics.

    中台一博, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集   18th   103 - 104   2000.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Proposal of Active Audition for Humanoid Auditory Capabilities.

    中台一博, 奥乃博, 北野宏明

    日本ロボット学会学術講演会予稿集   18th   105 - 106   2000.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Tuning and Evaluation of the Beowulf-Class Cluster ERATO-1

    OKUNO HIROSHI G, KYODA KOJI M, NAKADAI KAZUHIRO, KITANO HIROAKI

    IPSJ SIG Notes   2000 ( 23 )   116 - 124   2000.3

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    A Beowulf-class cluster is a logical organization of PC clusters composed of mass-market off-the-shelf hardware and software. Users may find that their implementation does not work well at the hardware level or provides quite poor performance. In this paper, we present a new method for tuning and evaluating a Beowulf-class cluster by focusing on three levels: (1) the network level, (2) the message passing system level (e.g., MPI, PVM), and (3) the application level. Performance at the first two levels is measured with NetPIPE, developed by Ames Lab. ScaLAPACK (parallel version of LINPACK) is used as the benchmark for application programs, because it is one of the most common linear algebra subprograms and its evaluation is beneficial for numerical computation users. ScaLAPACK is tuned using parameters determined with NetPIPE. The ERATO-1 Beowulf-class cluster, 32 nodes of Pentium-II 450MHz processors with 256MByte of memory, is tuned by the proposed method. First, the network interface card installed in each ERATO-1 node was identified as the cause of poor performance, and finally ERATO-1 attained 6.76 GFlops on the LINPACK benchmark.

    CiNii Books

    researchmap

  • Tuning and Evaluation of the Beowulf-Class Cluster ERATO-1

    OKUNO HIROSHI G, KYODA KOJI M, NAKADAI KAZUHIRO, KITANO HIROAKI

    IPSJ SIG Notes   2000 ( 23 )   119 - 124   2000.3

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    A Beowulf-class cluster is a logical organization of PC clusters composed of mass-market off-the-shelf hardware and software. Users may find that their implementation does not work well at the hardware level or provides quite poor performance. In this paper, we present a new method for tuning and evaluating a Beowulf-class cluster by focusing on three levels: (1) the network level, (2) the message passing system level (e.g., MPI, PVM), and (3) the application level. Performance at the first two levels is measured with NetPIPE, developed by Ames Lab. ScaLAPACK (parallel version of LINPACK) is used as the benchmark for application programs, because it is one of the most common linear algebra subprograms and its evaluation is beneficial for numerical computation users. ScaLAPACK is tuned using parameters determined with NetPIPE. The ERATO-1 Beowulf-class cluster, 32 nodes of Pentium-II 450MHz processors with 256MByte of memory, is tuned by the proposed method. First, the network interface card installed in each ERATO-1 node was identified as the cause of poor performance, and finally ERATO-1 attained 6.76 GFlops on the LINPACK benchmark.

    CiNii Books

    CiNii Research

    researchmap

  • Active audition for humanoid

    K Nakadai, T Lourens, HG Okuno, H Kitano

    SEVENTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-2000) / TWELFTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-2000)   832 - 839   2000

    Language:English  

    Web of Science

    researchmap

  • The method of defending system resources against continuous and high-speed setup process in ATM switching system

    WATANABE Hiroshi, NAKADAI Kazuhiro, SATOU Yukio, SAKAGUCHI Zenji, ASHIKAWA Hirotoshi

    IEICE technical report. Computer systems   98 ( 572 )   1 - 8   1999.1

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Reliable data communication relies on protocol messages to control the communication. When such messages are deliberately sent to a node continuously and at high speed, the node runs short of resources and can no longer offer its service. In this paper, we propose an effective way to defend against this problem automatically. As a basic rule, we propose implementing the defense in the software of the ATM node instead of having the system manager operate it manually. This approach allows self-defense to be executed automatically in an inter-communication environment (e.g., a private network used as an internet). We also propose applying the approach to TCP on the Internet.

    CiNii Books

    researchmap

  • Implementation of OPTIMA: Organized Processing toward Intelligent Music Scene Analysis

    柏野 邦夫, 中臺 一博, 木下 智義, 田中 英彦

    全国大会講演論文集   50 ( 0 )   97 - 98   1995.3

    Language:Japanese  

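    We regard auditory scene analysis as the problem of separating, extracting, and structuring "perceptual sounds" (perceptual sound source separation), and we have been studying a processing model for music scene analysis (auditory scene analysis applied to musical acoustic signals), using monaural acoustic signals of instrumental performances as material. Here, perceptual sound source separation means symbolizing, as a single entity, a cluster of acoustic energy that humans perceive or recognize as one thing (we call this a perceptual sound). We have already proposed OPTIMA (Organized Processing toward Intelligent Music Scene Analysis), a processing model for music scene analysis equipped with a quantitative and hierarchical information integration mechanism founded on Bayes' theorem. Based on this processing model, we implemented and examined an experimental music scene analysis system, and this paper reports an overview of it.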

    CiNii Books

    CiNii Research

    researchmap

  • Creation and Verification of Note Hypotheses in OPTIMA based on Statistical Information

    中臺 一博, 柏野 邦夫, 木下 智義, 田中 英彦

    全国大会講演論文集   50 ( 0 )   101 - 102   1995.3

    Language:Japanese  

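    We proposed OPTIMA as a processing model for music scene analysis, and implemented and evaluated an experimental music scene analysis system based on it. This paper describes the implementation and evaluation of the single-note hypothesis generation component of the experimental system, which handles processing between the frequency-component level and the single-note level.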

    CiNii Books

    CiNii Research

    researchmap

  • Employment of music scene information in OPTIMA

    木下 智義, 柏野 邦夫, 中臺 一博, 田中 英彦

    全国大会講演論文集   50 ( 0 )   99 - 100   1995.3

    Language:Japanese  

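    In OPTIMA, multiple independent modules output sets of hypotheses with associated probabilities, and these are integrated by probability propagation to obtain a maximum likelihood estimate of the acoustic events in the outside world. This paper discusses the extraction and use of beat positions and chord information as the music scene information used in OPTIMA, and presents the results of evaluation experiments on the experimental system.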

    CiNii Books

    CiNii Research

    researchmap

  • An Optima-based Music Scene Analysis System I : Implementation and Evaluation of Processing Modules

    NAKADAI Kazuhiro, KASHINO Kunio, KINOSHITA Tomoyoshi, TANAKA Hidehiko

    日本音響学会研究発表会講演論文集   1995 ( 1 )   481 - 482   1995.3

  • An Optima-based Music Scene Analysis System II : Evaluation of the Information Integration Mechanism

    KASHINO Kunio, NAKADAI Kazuhiro, KINOSHITA Tomoyoshi, TANAKA Hidehiko

    日本音響学会研究発表会講演論文集   1995 ( 1 )   483 - 484   1995.3

    Language:Japanese  

    CiNii Books

    researchmap

  • 楽器演奏における単音の分離抽出とその音楽情景分析システムへの応用

    中臺一博

    Master's thesis, 東京大学   1995

  • General Description of OPTIMA : A Process Model of Perceptual Sound Source Separation for Music Scene Analysis

    柏野 邦夫, 中台 一博, 田中 英彦

    全国大会講演論文集   49 ( 0 )   325 - 326   1994.9

    Language:Japanese  

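    Taking sound source separation of monaural instrumental performances as our subject, we have been studying perceptual sound source separation systems. In perceptual sound source separation, the essential challenge is to obtain the final result by flexibly combining the observed data with processing based on knowledge and memory about the target. In this paper, we therefore propose OPTIMA (Organized Processing toward Intelligent Music Scene Analysis), a processing model for perceptual sound source separation equipped with an information integration mechanism.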

    CiNii Books

    CiNii Research

    researchmap

  • Creation of Single Note Hypotheses in OPTIMA

    中台 一博, 柏野 邦夫, 田中 英彦

    全国大会講演論文集   49 ( 0 )   327 - 328   1994.9

    Language:Japanese  

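    We proposed OPTIMA as a processing model for a system that generates symbol sequences of musical notes [1]. In OPTIMA, when modules output sets of hypotheses with confidence values, these can be integrated by probability propagation. It is therefore a useful processing model when multiple pieces of information must be integrated, as in a musical note symbol sequence generation system. The single-note hypothesis generation module treated in this paper must assign a confidence value to each hypothesis, so how to assign that confidence is the issue. As one such module, a module using sound memory has been implemented. This module matches mixed-sound hypotheses generated from the sound memory against the input, and it was effective for recognizing mixed sounds such as chords. However, sound memory alone had limits in efficiency and accuracy: a sound memory is required for every note, and the amount of computation explodes as the number of mixed sounds increases. To solve these problems, we extracted the essential features of timbre and represented them in a timbre space. Studies on classifying and recognizing instruments using such a timbre space include ones using neural networks, and good results have been obtained for single notes. In this paper, we therefore propose a single-note hypothesis generation method that uses the timbre space to output sets of hypotheses with confidence values and that can also recognize mixed sounds. In this method, the confidence of each single-note hypothesis can be computed statistically, and because knowledge is given per timbre, the explosion of both the amount of knowledge and the amount of computation with respect to the number of notes can be suppressed.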

    CiNii Books

    CiNii Research

    researchmap

  • OPTIMA : Organized Processing toward Intelligent Music Scene Analysis -General Description of the Process Model-

    Kashino Kunio, Nakadai Kazuhiro, Tanaka Hidehiko

    IPSJ SIG Notes   1994 ( 71 )   57 - 64   1994.8

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    We describe OPTIMA, a process model for perceptual sound source separation on computers. Our model consists of four parts: bottom-up processing modules, top-down processing modules, knowledge sources, and a hypothesis network for hierarchical and quantitative integration of multiple pieces of information. First, we present a general description of the model. Since one of the most essential problems in perceptual sound source separation is the integration of multiple pieces of information, we then focus our discussion on the hypothesis network: we show that our method permits efficient, autonomous, and stable construction of an optimal internal model of the outer world.

    CiNii Books

    CiNii Research

    researchmap

  • 音源分離システムにおけるパターン照合モジュールの動的負荷分散を用いた並列実装

    中臺 一博, 柏野 邦夫, 田中 英彦

    情報処理学会研究報告. 人工知能研究会報告   94 ( 67 )   59 - 60   1994.7

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    CiNii Books

    CiNii Research

    researchmap

  • 音楽音響信号を対象とする音モデルに基づく音源分離システム

    柏野 邦夫, 中台 一博, 田中 英彦

    東京大学工学部総合試験所年報   ( 52 )   p79 - 84   1993.9

    Language:Japanese   Publisher:東京大学工学部総合試験所  

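    Material type: text data, plain text
    Collection: National Diet Library Digital Collections > Digitized Materials > Periodicals
    Article classification: vibration engineering and acoustic engineering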

    CiNii Books

    CiNii Research

    researchmap

  • A Sound Source Separation System for Polyphonic Music Based on the Tone Models

    Nakadai Kazuhiro, Kashino Kunio, Tanaka Hidehiko

    IPSJ SIG Notes   1993 ( 32 )   1 - 8   1993

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    The configuration, implementation, and evaluation of a sound source separation system are described. The input to the system is assumed to be a monaural audio signal of ensemble music, and the output is MIDI data with several MIDI channels, each of which is assigned to one kind of musical instrument. The present approach is based on matching registered tone models against the sound spectrogram derived from the input signal. Experimental results show that, on average, more than 85% of the notes are correctly identified by the system, under the condition that the number of simultaneous notes in the input is three or fewer.

    CiNii Books

    researchmap

Presentations

  • チューブ型ロボットの姿勢推定のためのEKF-SLAMを用いた可変マイクロホンアレイ位置推定

    坂東宜昭, 水本武志, 中臺一博, 奥乃博

    第75回全国大会講演論文集  2013.3 

    Language:Japanese  

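    Tube-shaped robots that can enter rubble are useful for finding victims at disaster sites. Moreover, if a tube-shaped robot has a sound source localization function, the position of a victim can be estimated from the victim's voice. However, recent high-accuracy sound source localization methods estimate direction from audio recorded by a microphone array whose positions are known, whereas the microphone layout of a tube-shaped robot cannot be measured in advance. In this paper, we therefore propose a microphone position estimation method based on EKF-SLAM and solve this problem by estimating the constantly changing robot posture. We confirmed the effectiveness of the method using both numerical experiments and real recordings.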

    researchmap

  • 話者ダイアライゼーションシステムのための音声区間検出および到来方向推定の精度向上の検討

    黄楊暘, 大塚琢馬, 中臺一博, 奥乃博

    第75回全国大会講演論文集  2013.3 

    Language:Japanese  

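    In robot audition, a sound environment understanding function that determines when, where, and who spoke is indispensable. In this paper, to solve these problems, we define a speaker diarization system as processing that combines voice activity detection, direction-of-arrival estimation, and speaker identification. The robot audition software HARK performs voice activity detection and direction-of-arrival estimation using the MUSIC algorithm as preprocessing. However, when processing is based on the MUSIC spectrum, the number-of-sources parameter and the threshold parameter greatly affect the results. In this paper, we proposed a speaker diarization system that uses blind source separation as preprocessing. Although a volume threshold parameter still has to be set, improved accuracy was obtained.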

    researchmap

  • クアドロコプターを用いた飛行雑音に頑健な音源定位

    古川孝太郎, 奥谷啓太, 柳楽浩平, 大塚琢馬, 中臺一博, 奥乃博

    第75回全国大会講演論文集  2013.3 

    Language:Japanese  

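    This study deals with sound source localization in the surrounding environment using a microphone array mounted on a quadrocopter, a small unmanned aerial vehicle with multiple rotors. During flight, noise caused by wind pressure and rotor drive is usually extremely loud and can degrade localization accuracy. In such noisy environments, it is known that accuracy improves when the noise correlation matrix is taken into account by the MUSIC method with generalized eigenvalue decomposition. This study therefore reduces the problem to estimating the noise correlation matrix, which changes dynamically during flight. We then propose an estimation method that uses monitoring information from the airframe, such as flight control data, and develop a sound source localization method that is robust to flight noise.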

    researchmap

  • Incremental Noise Estimation in Outdoor Auditory Scene Analysis using a Quadrocopter with a Microphone Array

    OKUTANI Keita, YOSHIDA Takami, NAKAMURA Keisuke, NAKADAI Kazuhiro

    Journal of the Robotics Society of Japan  2013.9 

    Language:Japanese  

    This paper addresses sound source localization using an aerial vehicle with a microphone array in an outdoor environment to realize outdoor auditory scene analysis. It, for instance, aims at finding distressed people in a disaster situation. In such an environment, noise is quite loud and dynamically-changing, and conventional microphone array techniques studied in the field of indoor robot audition are of less use. We, thus, proposed MUltiple SIgnal Classification based on incremental Generalized EigenValue Decomposition (iGEVD-MUSIC). It can deal with dynamically-changing high power noise by introducing incrementally-estimated noise correlation matrices. We developed a prototype system for the outdoor auditory scene analysis based on the proposed method using the Parrot AR.Drone with an 8ch microphone array and a Kinect device. Experimental results using the prototype system showed that dynamically-changing noise is properly suppressed with the proposed method even when the signal-to-noise ratio is less than 0dB in an outdoor/indoor environment with the hovering/moving AR.Drone.

    researchmap

  • Volume Adaptation and Visualization by Modeling the Volume Level in Actual Noise Environment for Telepresence System

    HAYAMIZU Akira, IMAI Michita, NAKAMURA Keisuke, NAKADAI Kazuhiro

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報  2013.12 

    Language:Japanese  

    The Lombard effect is the involuntary tendency of speakers to increase their vocal effort when speaking in loud noise to enhance the audibility of their voice. In telecommunication, the Lombard effect causes a problem: the speaker may talk more loudly than necessary for the conversation partner at the remote location. In this paper, we describe the design and the model required to automatically adjust the operator's volume during remote communication via a mobile telepresence robot in the real world, and we developed LOMBOT, an optimal volume control system equipped with this model. As a result, we confirmed that the volume is adjusted properly according to the noise at the remote location.

    researchmap

  • TelePaBot : A telepresence system for supporting multi-party conversation

    Koike Kyotaro, Imai Michita, Nakamura Keisuke, Nakadai Kazuhiro

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報  2013.12 

    Language:Japanese  

    A telepresence robot is useful in situations where a user in a remote area has to control the robot to communicate with people. However, some issues remain: in multi-party conversation, the target speech is contaminated with unnecessary speech, and the remote user cannot understand it. We propose a telepresence party robot, "TelePaBot", that visualizes the positions of utterances and provides a selective listening function. A case study suggested that TelePaBot smooths remote communication even when multi-party conversation occurs.

    researchmap

  • マイクロホンアレイの位置推定によるホース型ロボットの姿勢推定

    坂東宜昭, 大塚琢馬, 糸山克寿, 昆陽雅司, 田所諭, 中臺一博, 奥乃博

    第76回全国大会講演論文集  2014.3 

    Language:Japanese  

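    A hose-shaped robot is a rescue robot characterized by its long, thin shape, and it can search spaces such as gaps in collapsed buildings. To make operation more efficient, posture estimation methods for this robot using accelerometers, camera images, and similar sensors have been proposed, but they suffer from problems such as accumulated error. This paper describes a method that attaches a microphone array and small loudspeakers to the robot and estimates its posture by estimating their positions using sound. The method estimates the posture from the differences in arrival time, at each microphone, of a test sound emitted from a loudspeaker; because these time differences represent the current positional relationship between the microphones and the loudspeaker, past errors can be corrected. We evaluated the effectiveness of the method using real recorded data.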

    researchmap

  • 音ランドマークを用いたマルチコプターの定位

    ラナシナパヤ, 中村圭佑, 中臺一博, 高橋秀幸, 木下哲男

    第76回全国大会講演論文集  2014.3 

    Language:English  

    We propose a novel approach to multicopter localization using sound landmarks and one embedded microphone. This approach can benefit multicopter localization in that it requires less computational power and smaller payloads than image-based approaches. However, the high ego-noise of multicopters is a serious threat to sound-based algorithms. We simulated a 2D localization method based on a Kalman filter using measurements of acceleration and of the sound landmarks' intensity. A random walk model is used to update the multicopter's position with the Kalman filter; the resulting estimate is then corrected using noisy measurements from the embedded microphone and accelerometer. Simulation results show that the proposed algorithm can successfully track the multicopter's motion in a noisy environment. We confirmed the effectiveness of our proposed algorithm by comparing its performance and robustness with those of a time/phase-based algorithm.

    researchmap

  • Deep Neural Networkを用いたマルチモーダル音声認識の為の特徴量学習

    山口雄紀, 野田邦昭, 中臺一博, 奥乃博, 尾形哲也

    第76回全国大会講演論文集  2014.3 

    Language:Japanese  

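    The goal of this study is to design image features for multimodal speech recognition. To improve the accuracy of multimodal speech recognition, an important issue is how to extract from lip images the information that represents phonemes and visemes, the minimal units of speech recognition. In this study, we built a method that extracts image features from a large number of lip images in a self-organizing manner using a Deep Neural Network (DNN), which has been attracting attention as a new approach to feature learning. We verified the obtained image features on an isolated word recognition task and also analyzed the feature space to examine its relationship to visemes. In addition, we verified the improvement in noise robustness achieved by audio-visual integration using the obtained image features and audio.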

    researchmap

  • Design and Implementation of Multidirectional Sound Annotation Tool with HARK

    SUGIYAMA Osamu, ITOYAMA Katsutoshi, NAKADAI Kazuhiro, OKUNO Hiroshi G

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報  2014.6 

    Language:Japanese  

    In this study, we designed and developed a multidirectional sound source annotation tool with the robot audition software HARK. With the rise of inexpensive microphone array products and HARK, multidirectional sound sources can be recorded and analyzed easily. The combination of a microphone array and the software enables us to separate, localize, and track multidirectional sound sources. Most solutions for accessing this separated sound source information provide clients for interpreting simplified information about the separated sources, but not for directly executing semantic annotations. Our proposed sound annotation tool provides drag-and-drop annotation with a 3D sound source view and also provides annotation autocompletion with an SVM trained on the user's annotation history. The proposed features enable users to perform the annotation task intuitively and confirm its result. We also conducted an evaluation demonstrating the efficiency of annotation done using the tool.

    researchmap

  • TeleCoBot : A Telepresence system of taking account for conversation environment

    TAKAHASHI Masaaki, OGATA Masa, IMAI Michita, NAKAMURA Keisuke, NAKADAI Kazuhiro

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報  2014.12 

    Language:Japanese  

    Telepresence robots are becoming a popular research topic as tools for communicating with remote places. However, telepresence systems cannot accurately convey the user's utterances because they do not consider differences in the sound environment, such as noise. In addition, when talking with several people at the remote place, the user wants to change the speaker volume freely depending on the situation. We therefore propose a telepresence conversation robot named "TeleCoBot". It automatically regulates the volume of the user's utterances according to the distance to the partner and the noise level at the remote place. In addition, the user can change the volume freely depending on the conversation situation. In this paper, we conduct a case study, and the results indicated that TeleCoBot's UI should be more effective and should enhance the sense of presence.

    researchmap

  • Wind-induced Noise Reduction in Time Domain Using Closely-aligned Two Microphones

    SAKATA Naoto, NAKAJIMA Hirofumi, NAKADAI Kazuhiro

    Technical report of IEICE. EA  2015.3 

    Language:Japanese  

    In this report, wind-induced noise reduction in the time domain was investigated using two closely aligned microphones. A linear beamforming filter designed in the frequency domain on the basis of time frame decomposition was applied to signals in the time domain. The beamforming filter's ability to reduce wind-induced noise was compared with and without the time frame decomposition. As a result, the signal-to-noise ratio of recorded signals disturbed by wind-induced noise improved by about 2 to 13 dB. When a filter composed of simple delay units was employed, the time frame decomposition strongly influenced the ability to reduce wind-induced noise.

    researchmap

  • Wind-induced noise reduction by linear beamforming using a 2-channel microphone

    坂田 直人, 村上 哲郎, 中島 弘史, 中臺 一博

    回路とシステムワークショップ論文集 Workshop on Circuits and Systems  2015.8 

    Language:Japanese  

    researchmap

  • Automatic impulse response truncation based on relative amplitude spectrum

    中島 弘史, 坂田 直人, 加科 優希, 中臺 一博

    回路とシステムワークショップ論文集 Workshop on Circuits and Systems  2015.8 

    Language:Japanese  

    researchmap

  • 変分ベイズ多チャネルロバストNMFに基づくマイクロホンの移動・被覆を許容する音声強調 (音声) -- (オーガナイズドセッション「あらゆる音を対象とした情報処理の実現に向けて」)

    坂東 宜昭, 糸山 克寿, 昆陽 雅司, 田所 諭, 中臺 一博, 吉井 和佳, 河原 達也, 奥乃 博

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報  2016.8 

    Language:Japanese  

    researchmap

  • A Study on body movements and postures at Human-Robot Interaction using speech and image information

    蓮本 諒介, 小山 大幾, 水本 武志, 中村 圭佑, 中臺 一博, 今井 倫太

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報  2017.2 

    Language:Japanese  

    researchmap

  • 確率的生成モデルに基づく複数A/Dコンバータのチャネル間同期

    糸山克寿, 中臺一博

    日本音響学会研究発表会講演論文集(CD-ROM)  2018.2 

    Language:Japanese  

    researchmap

  • 振動センサを用いた災害時の避難者の属性推定に関する検討

    尾崎翔, 浅野太, 中臺一博

    電子情報通信学会大会講演論文集(CD-ROM)  2018.3 

    Language:Japanese  

    researchmap

  • 可聴音を用いた周波数自動選択に基づく距離推定法の検討

    高尾麻衣子, 干場功太郎, 中臺一博

    情報処理学会全国大会講演論文集  2018.3 

    Language:Japanese  

    researchmap

  • Quad‐directional LSTMを用いた音楽音響信号修復とその評価

    谷口亮輔, 干場功太郎, 中臺一博

    情報処理学会全国大会講演論文集  2018.3 

    Language:Japanese  

    researchmap

  • Development of Robot Audition to Extreme Environments

    奥乃博, 糸山克寿, 中臺一博, 公文誠, 坂東宜昭, 干場功太郎

    システム制御情報学会研究発表講演会講演論文集(CD-ROM)  2018.5 

    Language:Japanese  

    researchmap

  • スペクトル伸縮に基づく複数A/Dコンバータのチャネル間同期

    糸山克寿, 中臺一博

    日本機械学会ロボティクス・メカトロニクス講演会講演論文集(CD-ROM)  2018.6 

    Language:Japanese  

    researchmap

  • 振動センサを用いた災害時における年少避難者の特定手法に関する検討

    尾崎翔, 浅野太, 中臺一博

    電子情報通信学会大会講演論文集(CD-ROM)  2018.8 

    Language:Japanese  

    researchmap

  • CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments

    Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya Ogata

    2018.11 

    Presentation type:Oral presentation (general)  

    Casual conversations involving multiple speakers and noises from surrounding devices are part of everyday environments and pose challenges for automatic speech recognition systems. These challenges in speech recognition are the target of the CHiME-5 challenge. In the present study, an attempt is made to overcome these challenges by employing a convolutional neural network (CNN)-based multichannel end-to-end speech recognition system. The system comprises an attention-based encoder-decoder neural network that directly generates text as output from a sound input. The multichannel CNN encoder, which uses residual connections and batch renormalization, is trained with augmented data, including white noise injection. The experimental results show that the word error rate (WER) was reduced by 11.9% absolute from the end-to-end baseline.

    researchmap

  • Robot Audition : Its Issues and State of the Arts

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    日本音響学会研究発表会講演論文集  2005.3 

    Language:Japanese  

    researchmap

  • Implementation of Sound Source Separation Filter on Dynamically Reconfigurable Processor

    KUROTAKI Shunsuke, SUZUKI Noriaki, NAKADAI Kazuhiro, OKUNO Hiroshi, AMANO Hideharu

    IEICE technical report  2005.5 

    Language:Japanese  

    researchmap

  • Human Robot Interaction Research in HRI-JP

    TSUJINO Hiroshi, NAKANO Mikio, NAKADAI Kazuhiro, HASEGAWA Yuji

    IEICE technical report  2005.11 

    Language:Japanese  

    As computer technology advances, machines are expected to perform more functional tasks at home, and the importance of technology realizing a "human-machine interface that anyone can use" is increasing. The intelligent robot is the ultimate machine in this trend, and advanced concepts and values for such robots are being actively investigated. We focus on "bi-directional human-robot interaction" as a future interface between humans and intelligent robots. In this paper, we present our recent results on "robot architecture for human-robot interaction", "speech recognition by robot", and "speech recognition by human" in our human-robot interaction research.

    researchmap

  • Robust Domain Selection using Dialogue History in Multi-Domain Spoken Dialogue System

    KANDA NAOYUKI, KOMATANI KAZUNORI, NAKANO MIKIO, NAKADAI KAZUHIRO, TSUJINO HIROSHI, OGATA TETSUYA, OKUNO HIROSHI G

    IPSJ SIG Notes  2006.2 

    Language:Japanese  

    We have developed a robust domain selection method using dialogue history in multi-domain spoken dialogue systems. We define domain selection as a classification problem among (I) the domain in the previous turn, (II) the domain in which the N-best speech recognition results can be accepted with the highest recognition score, and (III) other domains. We constructed a classifier by decision tree learning with a dialogue corpus. The experimental result using 10 subjects shows that our method reduced domain selection errors by 11.6% compared with a conventional method using speech recognition likelihoods only.

    researchmap

  • D-14-10 Improvement for Noise-Robustness of Automatic Speech Recognition Using Coarse Phoneme Recognition

    SUMIYA Ryota, NAKADAI Kazuhiro, NAKANO Mikio, ICHIGE Koichi, HIROSE Yasuo, TSUJINO Hiroshi

    Proceedings of the IEICE General Conference  2006.3 

    Language:Japanese  

    researchmap

  • Towards Information Integration for Human-Robot Interaction

    NAKADAI Kazuhiro

    IEICE technical report  2006.10 

    Language:Japanese  

    To realize natural human-robot interaction, we consider that a robot should have at least two functions: real-world auditory scene analysis to understand the surrounding environment, and robot expression to send information to users intelligibly and naturally. These functions should be highly robust against environmental changes because the robot has to work in dynamically changing noisy environments. This paper addresses information integration to improve the robustness of these functions, since we believe that such an integration approach improves robustness, and describes four research topics based on the integration approach: integration of sound source separation and automatic speech recognition based on missing feature theory, sound source tracking by integration of two types of microphone arrays, multimodal expression using prosodic information and head colors, and a new speech function using a directional speaker.

    researchmap

  • Robot Audition System Towards Natural Human-Robot Verbal Communication

    中臺 一博, 山本 俊一, 浅野 太

    人工知能学会全国大会論文集  2007 

    Language:Japanese  

    researchmap

  • AS-6-1 Sound Stream Formation and Human Tracking by Integration of Microphone Arrays

    Nakadai Kazuhiro, Nakajima Hirofumi, Murase Masamitsu, Okuno Hiroshi G, Hasegawa Yuji, Tsujino Hiroshi

    Proceedings of the IEICE General Conference  2007.3 

    Language:English  

    researchmap

  • High performance blind source separation using an adaptive step-size parameter method

    NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, TSUJINO Hiroshi

    IEICE technical report  2007.6 

    Language:Japanese  

    This paper describes a novel blind source separation (BSS) method. One of the most important factors in BSS performance is a step-size parameter to update a decomposition matrix which is generally used for extracting a target sound source. A fixed value which was obtained empirically is commonly used as the step-size parameter. However, in the real world, the surrounding environment changes dynamically. So, conventional BSS with a fixed step-size parameter does not provide the best performance and sometimes results in divergence of the decomposition matrix. We propose a method that allows for an adaptive step-size parameter. Since the proposed method is generally applicable to BSS methods, we applied it to six types of BSS algorithms with a microphone array embedded in Honda's ASIMO. Experimental results show that the proposed method improves sound source separation in the four BSS algorithms, and the step-size parameter is maintained optimally even when the surrounding environment changes.

    researchmap

  • Design and Evaluation of Barge-In enable Robot Audition System with ICA and MFT-based ASR

    TAKEDA Ryu, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

    全国大会講演論文集  2008.3 

    Language:Japanese  

    researchmap

  • Estimation of sound source orientation using a 96 channel microphone array

    KIKUCHI Keiko, DAIGO Tohru, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, HASEGAWA Yuji, KANEDA Yutaka

    IEICE technical report  2008.7 

    Language:Japanese  

    This paper addresses sound source orientation estimation using a 96ch microphone array. We previously proposed a beam-forming method with estimation of sound source directivity, and reported orientation estimation of a speech source such as a loudspeaker or an actual human. However, in this method, the transfer function used to design the beam-former should be the same as that of the target sound source; otherwise, the performance deteriorates due to a mismatch between these two transfer functions. In addition, voice activity detection (VAD) was performed manually. To solve the former, we propose amplitude-based orientation estimation using a histogram to relax the effect of the mismatch problems mainly caused by phase errors and outliers. For the latter, speech frequency component detection based on inner products and automatic VAD based on auto-correlation are introduced to form a frequency-temporal masking pattern. Preliminary experiments showed that sound source orientation estimation with automatic VAD for actual human voices improved drastically even when using a loudspeaker-based transfer function.

    researchmap

  • Panel Discussion : Application Developments of Speech Recognition

    NISIMURA Ryuichi, NAKANO Teppei, KURIHARA Kazutaka, NAKADAI Kazuhiro, YOSHINO Takashi

    IPSJ SIG Notes  2008.10 

    Language:Japanese  

    To induce developments of ASR applications, this panel discussion introduces actual case studies. We also indicate some problems of ASR application developments.

    researchmap

  • Realtime Synchronization Method between Audio Signal and Score Using Beats, Melodies, and Harmonies for Singer Robots

    OTSUKA Takuma, MURATA Kazumasa, TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, OGATA Tetsuya, OKUNO Hiroshi G

    全国大会講演論文集  2009.3 

    Language:Japanese  

    researchmap

  • Simultaneous three talker speech recognition using soft mask and model adaptation technique

    TAKAHASHI Toru, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

    全国大会講演論文集  2009.3 

    Language:Japanese  

    researchmap

  • The design of a directional sound source for numerical simulation based on wave acoustics

    SUZUKI Toshimasa, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, ARAI Takahiro, HASEGAWA Yuji

    IEICE technical report  2009.6 

     More details

    Language:Japanese  

    Thanks to improvements in computer performance, numerical simulation based on wave acoustics works in practical time with off-the-shelf computers. Such a numerical simulation method accurately estimates a sound field when it is a simple and simulated environment like a free sound field. However, this method has difficulties in simulating a real-world acoustic environment. One of the issues for real-world simulation is dealing with sound directivity. Thus, most numerical simulators assume a point sound source to avoid this issue. Several studies that cope with sound directivity have been reported, but their accuracy and practical utility are insufficient for real-world simulation, because an accurate sound propagation model is necessary to deal with sound directivity. We use a compact finite difference method based on sound field digitization, which has an accurate sound propagation model. However, this method also has a problem: two points are simulated differently even when they are located at the same distance from the sound source, due to the difference in the effect of numerical dispersion. In this paper, we first confirm the performance of our method by using an omni-directional point source in a free sound field. After that, we show that our method is able to simulate a directional sound source accurately using a combination of a simple loudspeaker and a point source model.

    researchmap

  • Blind Dereverberation Improved By Multi-Stage Processing

    NAKAJIMA Hirofumi, DAIGO Tohru, NAKADAI Kazuhiro, KANEDA Yutaka, HASEGAWA Yuji

    電子情報通信学会技術研究報告. EA, 応用音響  2009.7 

     More details

    Language:Japanese  

    researchmap

  • Blind dereverberation improved by multi-stage processing

    NAKAJIMA Hirofumi, DAIGO Tohru, NAKADAI Kazuhiro, KANEDA Yutaka, HASEGAWA Yuji

    IEICE technical report  2009.7 

     More details

    Language:Japanese  

    This paper addresses a multi-stage processing mechanism that improves various dereverberation methods. In the mechanism, each stage is implemented as an intermediate processing module that connects the outputs of the modules on the previous stage to the inputs of other modules on the next stage. Since the dereverberation performance at each stage depends on the input channel combinations, we proposed two additional processes: channel selection and delay addition. We applied our proposed mechanism with these two processes to two dereverberation methods, i.e., SBM and RDAIF. The proposed system showed the following results: (1) The channel selection process improved performance by 3-10 dB. The optimum combination can reduce the number of input channels without any degradation. (2) The delay addition process improved the suppression performance by 3-10 dB. (3) With the multi-stage mechanism, the performance for SBM and RDAIF is 18.2 dB and 13.6 dB, respectively, while the performance without the mechanism is only 14.6 dB and 3.5 dB, respectively. We conclude that the proposed mechanism and processes are effective in improving dereverberation performance.

    researchmap

  • Robot audition system development and parameter-tuning in real environment

    TAKAHASHI Toru, NAKADAI Kazuhiro, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

    全国大会講演論文集  2010.3 

     More details

    Language:Japanese  

    researchmap

  • Self-speech cancellation with Semi-blind ICA for Robot speech interaction

    TAKEDA Ryu, NAKADAI Kazuhiro, TAKAHASHI Toru, KOMATANI Kazunori, OGATA Tetsuya, OKUNO Hiroshi G

    全国大会講演論文集  2010.3 

     More details

    Language:Japanese  

    researchmap

  • Real time speaker orientation estimation using a room microphone array

    HARUBARA Takuya, NAKAJIMA Hirofumi, NAKADAI Kazuhiro, KANEDA Yutaka

    IEICE technical report  2010.7 

     More details

    Language:Japanese  

    This paper addresses a real-time sound source orientation estimation system using a 96ch microphone array. We proposed a beam-forming method with estimation of sound source directivity, and reported orientation estimation of a speech source such as a loudspeaker or an actual human. Furthermore, we showed that the precision of the orientation estimation system is improved by introducing four additional techniques: amplitude extraction, correlation-based automatic voice activity detection (VAD), frequency masking, and histogram integration. We developed a real-time sound source orientation system. However, the precision of the real-time system is not sufficient for practical use. In this paper, we investigate the main causes of the estimation error and propose an advanced real-time orientation estimation system. The experimental results show that the advanced system has lower errors than the previous system by 20°-30°.

    researchmap

  • Robot Audition : Hands-Free Automatic Speech Recognition under Highly-Noisy Environments

    NAKADAI Kazuhiro, OKUNO Hiroshi G

    IEICE technical report  2011.1 

     More details

    Language:Japanese  

    This paper addresses robot audition, which realizes listening capabilities for robots using robot-embedded microphones. For robot audition, we propose real-time sound source separation and automatic speech recognition (ASR) techniques for dynamically changing environments based on microphone array processing, which is applicable to hands-free ASR under highly-noisy environments. Implementation of the proposed techniques is open-sourced as robot audition software called "HARK." We show the effectiveness of these techniques through applications of HARK to robots.

    researchmap

  • Audio-visual musical instrument recognition

    AngelicaLim, 中村圭佑, 中臺一博, 尾形哲也, 奥乃博

    第73回全国大会講演論文集  2011.3 

     More details

    Language:English  

    Is this person playing a violin or a flute? Classification of musical instrument performances is usually carried out using audio features such as spectral coefficients. We propose augmenting the typical audio feature set with visual features. We show that a combination of audio and visual features performs better than audio alone, and verify this multimodal recognition approach on a real-time robot platform.

    researchmap

  • 累積頻度重みを適用したパーティクルフィルタによる実時間楽譜追従

    大塚琢馬, 中臺一博, 高橋徹, 尾形哲也, 奥乃博

    第73回全国大会講演論文集  2011.3 

     More details

    Language:Japanese  

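    In score following with a particle filter, tracking performance depends heavily on how particle weights are computed from the distance between the audio signal and the score. With conventional weight calculations based on vector inner products or sigmoid functions, inharmonic components of the audio signal and variations in instrument timbre leave little difference between the weights for correct and incorrect score-position estimates, so the final estimated score position contains errors. In this paper, weights are computed dynamically from the cumulative frequency of previously computed distances, so that higher weights are assigned at correct score positions. Evaluation experiments confirmed that the cumulative-frequency weight calculation improves score-following accuracy over conventional weight calculations.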

    researchmap

  • Intelligent Human Tracking based on Information Integration

    NAKAMURA Keisuke, NAKADAI Kazuhiro, INCE Gokhan

    IEICE technical report  2011.5 

     More details

    Language:Japanese  

    Since scene recognition and robot perception have been of great interest, information integration has become a significant research topic in robotics. From the viewpoint of scalability and reusability, utilization of appropriate middleware is a key factor in improving total system performance. This paper presents a methodology for integrating multimodal information through the construction of an intelligent human tracking system. Our system architecture interoperably combines two different types of middleware: HARK and ROS. HARK uses dataflow-oriented middleware for real-time processing, while ROS is event-driven middleware for easy integration. We confirmed that the proposed architecture achieved real-time processing and considerable improvements in noise robustness in human tracking.

    researchmap

  • 遠隔ユーザの音環境理解を支援するユーザインタフェース

    植田 俊輔, 今井 倫太, 中村 圭佑, 中臺 一博

    JSAI大会論文集  2012 

     More details

    Language:Japanese  

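    Humans can, to some extent, understand where and what kind of conversations are taking place even in noisy environments, but with a teleoperated robot avatar it is difficult for the remote operator to understand the sound environment at the remote site. This paper proposes UI-ALT, a user interface that supports smooth interaction between the operator and the remote site even in noisy environments. Offline experiments showed that UI-ALT is useful for the remote operator's understanding of noisy environments.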

    researchmap

  • Kinectによる楽器マスキングを用いた視聴覚統合ビートトラッキング

    糸原達彦, 水本武志, 大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

    全国大会講演論文集  2012.3 

     More details

    Language:Japanese  

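    Real-time beat tracking of human guitar performances must cope with complex beat patterns such as syncopation and with tempo fluctuations in human playing. We have previously developed audio-visual integrated beat tracking that uses the trajectory of the player's hand, which correlates strongly with the acoustic information. However, because the guitar and the hand are similar in color, the performance of hand trajectory tracking and beat tracking was insufficient. In this paper, we use a Kinect, which has a depth sensor in addition to audio and visual sensors, to mask the image by distance and extract the hand region. We show that this method makes hand tracking more robust and improves beat tracking accuracy.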

    researchmap

  • Kinectによる楽器マスキングを用いた視聴覚統合ビートトラッキング

    糸原達彦, 水本武志, 大塚琢馬, 中臺一博, 尾形哲也, 奥乃博

    第74回全国大会講演論文集  2012.3 

     More details

    Language:Japanese  

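    Real-time beat tracking of human guitar performances must cope with complex beat patterns such as syncopation and with tempo fluctuations in human playing. We have previously developed audio-visual integrated beat tracking that uses the trajectory of the player's hand, which correlates strongly with the acoustic information. However, because the guitar and the hand are similar in color, the performance of hand trajectory tracking and beat tracking was insufficient. In this paper, we use a Kinect, which has a depth sensor in addition to audio and visual sensors, to mask the image by distance and extract the hand region. We show that this method makes hand tracking more robust and improves beat tracking accuracy.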

    researchmap

  • 2P1-P24 Development of a Sound Source Localization System for Assisting Group Conversation (Communication Robot)

    Moon Seong-eun, Takagi Kentaro, Kamashima Tsutomu, Nakadai Kazuhiro, Otake Mihoko

    The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)  2013 

     More details

    Language:Japanese  

    This paper presents a sound source localization system composed of a wireless microphone array named Jellyfish-02 and the robot audition software HARK. Jellyfish-02 surpasses existing microphone arrays in design and usability, because it has a cover with a rechargeable battery and can be connected to a wireless network. We evaluated the sound source localization performance of Jellyfish-02, and investigated the percentage of overlapping speech periods in natural conversation. From the results, Jellyfish-02 is potentially applicable to assisting group conversation by measuring the duration of speech for each participant.

    researchmap

  • Applying FPGA to Sound Separation by Direction-Pass Filter

    SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

    Technical report of IEICE. VLD  2003.1 

     More details

    Language:Japanese  

    Reconfigurable systems are efficient for high performance but low cost/power implementation for intelligent systems for robots. In this paper, a part of processing for the direction-pass filter, such as Fast Fourier Transform (FFT), square root, and arc tangent used in auditory system of humanoid robots are implemented on an FPGA, and their performance is evaluated. Our result shows that FFT, square root and arc tangent implemented on the FPGA of 12MHz are 2.9 times, 2.9 times and 3.3 times faster, respectively, than those in Pentium III of 1GHz.

    researchmap

  • Applying FPGA to Sound Separation by Direction-Pass Filter

    SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

    IEICE technical report. Computer systems  2003.1 

     More details

    Language:Japanese  

    Reconfigurable systems are efficient for high performance but low cost/power implementation for intelligent systems for robots. In this paper, a part of processing for the direction-pass filter, such as Fast Fourier Transform (FFT), square root, and arc tangent used in auditory system of humanoid robots are implemented on an FPGA, and their performance is evaluated. Our result shows that FFT, square root and arc tangent implemented on the FPGA of 12MHz are 2.9 times, 2.9 times and 3.3 times faster, respectively, than those in Pentium III of 1GHz.

    researchmap

  • Applying FPGA to Sound Separation by Direction-Pass Filter

    SUZUKI Noriaki, NAKADAI Kazuhiro, AMANO Hideharu, OKUNO Hiroshi G, KITANO Hiroaki

    情報処理学会研究報告システムLSI設計技術(SLDM)  2003.1 

     More details

    Language:Japanese  

    Reconfigurable systems are efficient for high performance but low cost/power implementation for intelligent systems for robots. In this paper, a part of processing for the direction-pass filter, such as Fast Fourier Transform (FFT), square root, and arc tangent used in auditory system of humanoid robots are implemented on an FPGA, and their performance is evaluated. Our result shows that FFT, square root and arc tangent implemented on the FPGA of 12MHz are 2.9 times, 2.9 times and 3.3 times faster, respectively, than those in Pentium III of 1GHz.

    researchmap

  • Three Simultaneous Speech Recognition by Applying Missing Feature Theory to Robot Audition System

    山本 俊一, 中臺 一博, 辻野 広司

    人工知能学会全国大会論文集  2004 

     More details

    Language:Japanese  

    researchmap

  • ロボット聴覚へのミッシングフィーチャー理論の適用による三話者同時発話認識

    山本 俊一, 中臺 一博, 辻野 広司, 奥乃 博

    人工知能学会全国大会論文集  2004 

     More details

    Language:Japanese  

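    This paper proposes a method for recognizing the speech of three simultaneous speakers recorded with two microphones mounted on a robot, using sound source separation and automatic speech recognition based on missing feature theory. Experiments on two robots confirm the effectiveness of the proposed method.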

    researchmap

  • G-007 Missing Feature Theory Based Interface of Integrating Sound Source Separation and Automatic Speech Recognition

    Yamamoto Shunichi, Nakadai Kazuhiro, Tsujino Hiroshi, Okuno Hiroshi G

    情報科学技術フォーラム一般講演論文集  2004.8 

     More details

    Language:Japanese  

    researchmap

  • Active Audio-Visual Integration in Real-Time Human Tracking Humanoid SIG

    NAKADAI KAZUHIRO, HIDAI KEN-ICHI, OKUNO HIROSHI G, KITANO HIROAKI

    IPSJ SIG Notes. ICS  2001.10 

     More details

    Language:Japanese  

    This paper describes the improvement of auditory processing by active motion and audio-visual integration. Generally, environmental noise and reverberation badly affect sound source localization and separation in the real world. Our real-time human tracking system for humanoid robots attained robust sound source localization in the real world by active audio-visual integration. We then propose a new sound source separation method using an active direction-pass filter. Our experiments prove that active audio-visual integration is essential to robust perception for extraction of a tracked sound source.

    researchmap

  • Research Issues and Current Status of Robot Audition

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    IEICE technical report. Speech  2001.12 

     More details

    Language:Japanese  

    In this paper, we present an active audition system which is implemented on the humanoid robot "SIG the humanoid". The audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. Active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given the multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such an active head movement inevitably creates motor noises. The system adaptively cancels motor noises using motor control signals. The experimental result demonstrates that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

    researchmap

  • Research Issues and Current Status of Robot Audition

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    IEICE technical report. Natural language understanding and models of communication  2001.12 

     More details

    Language:Japanese  

    In this paper, we present an active audition system which is implemented on the humanoid robot "SIG the humanoid". The audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. Active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given the multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such an active head movement inevitably creates motor noises. The system adaptively cancels motor noises using motor control signals. The experimental result demonstrates that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

    researchmap

  • Research Issues and Current Status of Robot Audition

    OKUNO Hiroshi G, NAKADAI Kazuhiro

    IPSJ SIG Notes  2001.12 

     More details

    Language:Japanese  

    In this paper, we present an active audition system which is implemented on the humanoid robot "SIG the humanoid". The audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. Active audition reported in this paper enables SIG to track sound sources by integrating audition, vision, and motor movements. Given the multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such an active head movement inevitably creates motor noises. The system adaptively cancels motor noises using motor control signals. The experimental result demonstrates that active audition by integration of audition, vision, and motor control attains sound source tracking in a variety of conditions.

    researchmap

  • A Sound Source Separation System for Polyphonic Music Based on the Tone Models

    Nakadai Kazuhiro, Kashino Kunio, Tanaka Hidehiko

    IPSJ SIG Notes  1993.4 

     More details

    Language:Japanese  

    The configuration, implementation, and evaluation of a sound source separation system are described. The input of the system is assumed to be a monaural audio signal of ensemble music, and the output is MIDI data with several MIDI channels, each of which is assigned to one kind of musical instrument. The present approach is based on matching between registered tone models and the sound spectrogram derived from the input signal. Experimental results show that more than 85% of the notes are correctly identified by the system on average, under the condition that the number of simultaneous notes in the input is three or less.

    researchmap

  • 音源分離システムにおけるパターン照合モジュールの動的負荷分散を用いた並列実装

    中臺一博, 柏野邦夫, 田中英彦

    情報処理学会研究報告知能と複雑系(ICS)  1994.7 

     More details

    Language:Japanese  

    researchmap

  • OPTIMA : Organized Processing toward Intelligent Music Scene Analysis -General Description of the Process Model-

    Kashino Kunio, Nakadai Kazuhiro, Tanaka Hidehiko

    IPSJ SIG Notes  1994.8 

     More details

    Language:Japanese  

    We describe OPTIMA, a process model for the perceptual sound source separation on computers. Our model consists of four parts: bottom-up processing modules, top-down processing modules, knowledge sources, and a hypothesis network for hierarchical and quantitative integration of multiple bits of information. First we present general description of the model. Since one of the most essential problems in the perceptual sound source separation is integration of multiple bits of information, we then focus our discussion on the hypothesis network: we show that our method has permitted efficient, autonomous and stable construction of an optimal internal model of the outer world.

    researchmap

  • Creation of Single Note Hypotheses in OPTIMA

    中台 一博, 柏野 邦夫, 田中 英彦

    全国大会講演論文集  1994.9 

     More details

    Language:Japanese  

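    We proposed OPTIMA as a processing model for a system that generates sequences of musical note symbols [1]. In OPTIMA, when modules output sets of hypotheses with confidence values, these can be integrated by probability propagation. It is therefore a useful processing model when multiple pieces of information must be integrated, as in a musical note symbol sequence generation system. Within OPTIMA, the single-note hypothesis generation module treated in this paper must assign a confidence value to each hypothesis, so how to assign confidence is the issue. As such a module, one that uses tone memory has been implemented. This module matches mixed-sound hypotheses generated from tone memory against the input, and was effective for recognizing mixed sounds such as chords. However, tone memory alone had limits in efficiency and accuracy: a tone memory is needed for every note, and the computational cost explodes as the number of simultaneous sounds increases. To solve these problems, we extracted the essential features of timbre and represented them in a timbre space. Research on classifying and recognizing instruments using such a timbre space includes work using neural networks, and good results have been obtained for single notes. In this paper, we therefore propose a single-note hypothesis generation method that uses a timbre space to output sets of hypotheses with confidence values and can also recognize mixed sounds. In this method, the confidence of each single-note hypothesis can be computed statistically, and because knowledge is given per timbre, the explosion of knowledge and computation with respect to the number of notes can be suppressed.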

    researchmap

  • Creation and Verification of Note Hypotheses in OPTIMA based on Statistical Information

    中臺 一博, 柏野 邦夫, 木下 智義, 田中 英彦

    全国大会講演論文集  1995.3 

     More details

    Language:Japanese  

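    We proposed OPTIMA as a processing model for music scene analysis, and implemented and evaluated an experimental music scene analysis system based on it. This paper describes the implementation and evaluation of the single-note hypothesis generation part of the experimental system, which handles processing between the frequency-component level and the single-note level.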

    researchmap

  • Employment of music scene information in OPTIMA

    木下 智義, 柏野 邦夫, 中臺 一博, 田中 英彦

    全国大会講演論文集  1995.3 

     More details

    Language:Japanese  

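    In OPTIMA, multiple independent modules output sets of hypotheses with probabilities, and these are integrated by probability propagation to obtain a maximum-likelihood estimate of acoustic events in the outside world. This paper discusses the extraction and use of beat positions and chord information as the music scene information used in OPTIMA, and presents the results of evaluation experiments on the experimental system.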

    researchmap

  • Implementation of OPTIMA : Organized Processing toward Intelligent Music Scene Analysis

    柏野 邦夫, 中臺 一博, 木下 智義, 田中 英彦

    全国大会講演論文集  1995.3 

     More details

    Language:Japanese  

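    We regard auditory scene analysis as the problem of separating and extracting "perceptual sounds" (perceptual sound source separation) and structuring them, and are studying a processing model for music scene analysis (auditory scene analysis of musical audio signals) using monaural audio signals of instrumental performances. Here, perceptual sound source separation means symbolizing as a single entity a cluster of acoustic energy that humans perceive or recognize as one thing (called a perceptual sound). We have already proposed OPTIMA (Organized Processing toward Intelligent Music Scene Analysis), a processing model for music scene analysis equipped with a quantitative and hierarchical information integration mechanism founded on Bayes' theorem. We implemented and examined an experimental music scene analysis system based on this processing model, and this paper reports an overview.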

    researchmap

  • Music note recognition based on prediction of notes

    木下 智義, 村岡 秀哉, 田中 英彦

    全国大会講演論文集  1998.3 

     More details

    Language:Japanese  

    researchmap

  • The method of defending system resources against continuous and high-speed setup process in ATM switching system

    WATANABE Hiroshi, NAKADAI Kazuhiro, SATOU Yukio, SAKAGUCHI Zenji, ASHIKAWA Hirotoshi

    IEICE technical report. Computer systems  1999.1 

     More details

    Language:Japanese  

    Reliable data communication uses protocol messages to control the communication. When such messages are deliberately sent to a node continuously and at high speed, the node runs short of resources and can no longer provide service. In this paper, we propose an effective way to defend against this problem automatically. We propose implementing the defensive operation in the software of the ATM node as a basic rule, instead of having the system manager perform it manually. This approach enables automatic self-defense in an inter-communication environment (e.g., a private network such as an internet). We also propose applying the approach to TCP on the Internet.

    researchmap

  • Tuning and Evaluation of the Beowulf-Class Cluster ERATO-1

    OKUNO HIROSHI G, KYODA KOJI M, NAKADAI KAZUHIRO, KITANO HIROAKI

    IPSJ SIG Notes  2000.3 

     More details

    Language:Japanese  

    A Beowulf-class cluster is a logical organization of PC clusters composed of mass-market off-the-shelf hardware and software. Users may find that their implementation does not work well at the hardware level or delivers quite poor performance. In this paper, we present a new method to tune and evaluate a Beowulf-class cluster by focusing on three levels: (1) the network level, (2) the message passing system level (e.g., MPI, PVM), and (3) the application level. Performance at the first two levels is measured with NetPIPE, developed by Ames Lab. ScaLAPACK (parallel version of LINPACK) is used as the benchmark for application programs, because it is one of the most common linear algebra subprograms and its evaluation is beneficial for numerical computation users. ScaLAPACK is tuned using parameters determined by NetPIPE. The ERATO-1 Beowulf-class cluster, 32 nodes of Pentium-II 450MHz processors with 256MByte of memory, is tuned by the proposed method. First, a network interface card installed in each ERATO-1 node is identified as the cause of poor performance, and finally ERATO-1 attained 6.76 GFlops with the LINPACK benchmark.

    researchmap

  • General Description of OPTIMA : A Process Model of Perceptual Sound Source Separation for Music Scene Analysis

    柏野 邦夫, 中台 一博, 田中 英彦

    全国大会講演論文集  1994.9 

     More details

    Language:Japanese  

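    We are studying perceptual sound source separation systems, taking sound source separation of monaural instrumental performances as the subject. In perceptual sound source separation, the essential challenge is to obtain the final result by flexibly combining the observed data with processing based on knowledge and memory about the target. In this paper, we therefore propose OPTIMA (Organized Processing toward Intelligent Music Scene Analysis), a processing model for perceptual sound source separation equipped with an information integration mechanism.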

    researchmap

  • An Optima-based Music Scene Analysis System II : Evaluation of the Information Integration Mechanism

    KASHINO Kunio, NAKADAI Kazuhiro, KINOSHITA Tomoyoshi, TANAKA Hidehiko

    日本音響学会研究発表会講演論文集  1995.3 

     More details

    Language:Japanese  

    researchmap

  • An Optima-based Music Scene Analysis System I : Implementation and Evaluation of Processing Modules

    NAKADAI Kazuhiro, KASHINO Kunio, KINOSHITA Tomoyoshi, TANAKA Hidehiko

    日本音響学会研究発表会講演論文集  1995.3 

     More details

    Language:Japanese  

    researchmap

Industrial property rights

Awards

  • Best Paper Award

    2023.9  

     More details

  • Fellow

    2023.1   IEEE  

     More details

  • 2021 IEEE/SICE International Symposium on System Integration (SII 2021) Best Paper Finalist Award

    2022.1   IEEE  

     More details

  • 日本ロボット学会 フェロー

    2021.9   日本ロボット学会  

     More details

  • 日本ロボット学会 功労賞

    2021.9   日本ロボット学会  

     More details

  • 双葉電子財団 衛藤細矢記念賞

    2021.5   双葉電子財団  

     More details

  • 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence-2020), Amity Research Award for Significant contribution in the field of Artificial Intelligence

    2021.1  

     More details

  • Amity School of Engineering and Technology, Honorary Professor

    2021.1  

     More details

  • 日本景観生態学会第29回大会ベストポスター賞

    2020.3  

     More details

  • 情報処理学会第81回全国大会奨励賞

    2019.3  

     More details

  • 2019 IEEE/SICE International Symposium on System Integration (SII 2019) Best Paper Finalist Award

    2019.1   IEEE  

     More details

  • best generation award of innovation program

    2018.10   Ministry of Internal Affairs and Communications  

    Kazuhiro Nakadai

     More details

  • The 36th Annual Conference of the Robotics Society of Japan (RSJ 2018) International Session BEST PAPER AWARD

    2018.9   The Robotics Society of Japan  

    Kazuhiro Nakadai

     More details

  • IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017) Best Paper Award Finalist on Safety, Security, and Rescue Robotics (in memory of Motohiro Kisoi)

    2017.9   IEEE  

    Kazuhiro Nakadai

     More details

  • Best Paper Award, Advanced Robotics

    2016.9   The Robotics Society of Japan  

    Kazuhiro Nakadai

     More details

  • Incentive Award

    2016.6   JSAI  

    Kazuhiro Nakadai

     More details

  • IEEE-RAS International Symposium on Safety, Security, and Rescue Robotics (SSRR) Innovative Paper Award

    2015.10   IEEE  

    Kazuhiro Nakadai

     More details

  • IEEE-RAS International Symposium on Safety, Security, and Rescue Robotics Best Demonstration Award

    2015.10   IEEE  

    Kazuhiro Nakadai

     More details

  • Best Paper Award, Advanced Robotics

    2014.9   The Robotics Society of Japan  

    Kazuhiro Nakadai

     More details

  • Best Paper Award (1st Prize), International Conference on Industrial Engineering & Other Applications of Applied Intelligent Systems(IEA/AIE 2013)

    2013.6   International Society of Applied Intelligence (ISAI)  

    Kazuhiro Nakadai

     More details

  • Incentive Award

    2012.6   JSAI  

    Kazuhiro Nakadai

     More details

  • International Conference on Intelligent Robots and Systems (IROS 2011) BEST PAPER Nomination Finalist

    2011.10   IEEE  

    Kazuhiro Nakadai

     More details

  • A Best Paper Award, International Conference on Industrial Engineering & Other Applications of Applied Intelligent Systems(IEA/AIE 2010)

    2010.6   International Society of Applied Intelligence (ISAI)  

    Kazuhiro Nakadai

     More details

  • Incentive Award

    2009.6   JSAI  

    Kazuhiro Nakadai

     More details

  • Best paper award (3rd place)

    2009.6   IEEE Vail Computer Elements Workshop  

    Kazuhiro Nakadai

     More details

  • International Conference on Intelligent Robots and Systems (IROS 2008) New Technology Foundation (NTF) Award For Entertainment Robots and Systems Finalist

    2008.10   IEEE  

    Kazuhiro Nakadai

     More details

  • SI Best Session Award

    2006.12   SICE  

    Kazuhiro Nakadai

     More details

  • Funai Promotion Award

    2003.3   Funai Foundation on Information Technology  

    Kazuhiro Nakadai

     More details

  • International Conference on Intelligent Robots and Systems (IROS 2001) BEST PAPER Nomination Finalist

    2002.10   IEEE  

    Kazuhiro Nakadai

     More details

  • Telecommunication System Technology Award

    2002.3   The Telecommunications Advancement Foundation  

    Kazuhiro Nakadai

     More details

  • Best Paper Award (1st Prize), International Conference on Industrial Engineering & Other Applications of Applied Intelligent Systems(IEA/AIE 2001)

    2001.6   International Society of Applied Intelligence (ISAI)  

    Kazuhiro Nakadai

     More details

  • Best Paper Award, International Conference on Information Society (IS-2000)

    2000.10  

    Kazuhiro Nakadai

     More details

Research Projects

  • Smart drone audition: A search and rescue drone system that listens and communicates

    Grant number:22KF0141  2023.3 - 2025.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for JSPS Fellows

      More details

    Grant amount: ¥2200000 (Direct Cost: ¥2200000)

    researchmap

  • 野鳥行動解析のためのマルチモーダル生態環境理解・解析技術の構築

    Grant number:20H00475  2020.4 - 2023.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    中臺 一博, 井手 一郎, 鈴木 麗璽, 森本 元, 松林 志保, 小島 諒介

      More details

    Grant amount: ¥45500000 (Direct Cost: ¥35000000, Indirect Cost: ¥10500000)

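    This project aims to advance the "robot audition technology" developed in robotics, integrate it with visual processing and machine learning, and establish "multimodal environment understanding technology" applicable to ecology and environmental science. By developing next-generation wildlife observation technology that raises both the quality and quantity of wildlife observation data several hundred-fold, the goal is to bring ecology and environmental science to a new level. Specifically, the project aims to develop simultaneous three-dimensional tracking of multiple wild birds from their vocalizations and images, and to apply it to the analysis of inter-individual communication within flocks, nocturnal behavior, and mating behavior. It also aims, through the analysis of background sound in real fields, to establish soundscape analysis technology and to assess the impact of the environment and humans on wild bird ecosystems and intergenerational transmission. Both goals are to be addressed through method development and through real-field observation and analysis. In the first year, the project was strongly affected by the COVID-19 pandemic and the resulting semiconductor shortage: outdoor observation could not be carried out, and construction of the planned new observation devices was delayed. The budget was therefore carried over for one year, but the situation did not improve much in fiscal 2021 either, and the project as a whole is behind schedule. Even so, by being resourceful, research was advanced on the items that could proceed, and the following results were achieved.
    Technical results: development of three-dimensional tracking and calibration techniques using multiple microphone arrays; development of a long-term recording device with a camera and the start of long-term fixed-point observation; and development of a low-dimensional embedding method as a soundscape analysis technique.
    Publications: 7 journal papers, 11 international conference papers, 22 domestic conference papers, and 5 awards.
    Other results: as international promotion of this project, an organized session was held at the international conference IEEE/SICE SII 2021; two meetings of the JSAI Special Interest Group on AI Challenges were held on the theme of this project; and as outreach, tutorials on the robot audition software HARK were held twice in total at academic meetings in Japan and abroad (IJCAI2020 and the JSAI joint SIG meeting).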

  • Applications of robot audition techniques to multi-scale observations of ecological dynamics in bird vocalizations

    Grant number:19KK0260  2019.10 - 2023.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Fund for the Promotion of Joint International Research (Fostering Joint International Research (B))

    Grant amount: ¥18,460,000 (Direct Cost: ¥14,200,000, Indirect Cost: ¥4,260,000)

  • Audio-Visual Integration to Target Recognition by Drone Audition

    Grant number:17K00365  2017.4 - 2020.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Kumon Makoto

    Grant amount: ¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)

    This study addresses the recognition of targets on the ground from microphone-equipped drones. The target acoustic signal captured at the drone is generally heavily distorted by ego-noise, so it is difficult to recognize the target from acoustic signals alone. This study aims to develop technology that compensates for this difficulty by incorporating visual sensor information.
    Acoustic features, which contain pauses, are fused with visual features, which are normally provided sequentially; associating the visual information with the acoustic target is not trivial.
    With the developed methods, audio-visual integration was shown to improve audio target recognition in noisy conditions. As an example, three-dimensional position estimation of multiple moving targets by a microphone-equipped drone was achieved.

  • Cognitive Interaction Model of Interaction Gap in Human-Robot Interaction

    Grant number:16H02884  2016.4 - 2020.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Imai Michita

    Grant amount: ¥16,250,000 (Direct Cost: ¥12,500,000, Indirect Cost: ¥3,750,000)

    Our project studied communication between humans and robots from the viewpoint of timing and gaps, with the aim of achieving natural communication. First, we developed a method for estimating the tiredness a person feels when communicating with a robot. By detecting the direction of the person's face, the method estimated this tiredness and improved the quality of the robot's conversation. Second, we constructed a method for imitating human body movements in real time. Previous studies used a time delay to prevent people from noticing the imitation; our study instead devised a method that changes the size of the imitated movements. This method improved communication with people through body-movement imitation.

  • Outdoor Scene Understanding for Bird Song Analysis

    Grant number:16K00294  2016.4 - 2019.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Nakadai Kazuhiro

    Grant amount: ¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)

    By integrating robot audition and machine learning technologies, we developed outdoor sound environment understanding technology that extracts structured information on the what, when, and where of bird song events from wild bird songs recorded with multiple microphone arrays, and that estimates relationships between wild birds. In addition, we built an outdoor sound environment understanding system for bird song analysis that is easy to use even for non-experts. It reduces the burden of analyzing wild bird songs, which has so far been done manually, and thereby contributes to animal behavioral science and bioacoustics.

  • Deployment of Robot Audition Toward Understanding Real World

    Grant number:24220006  2012.5 - 2017.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (S)

    Okuno Hiroshi G.

    Grant amount: ¥218,140,000 (Direct Cost: ¥167,800,000, Indirect Cost: ¥50,340,000)

    This research project aims to deploy robot audition even in natural and disaster environments by enhancing the robot audition software HARK. Since HARK for Windows was released, it has been downloaded about 90K times. Applications to multi-party interaction and music co-player robots demonstrate the feasibility of the approach. The robustness of sound source localization for UAVs provided by iGSVD-MUSIC, together with sound-based shape estimation and speech enhancement for hose-shaped robots, demonstrates the feasibility of using sound for search and rescue robots. Acoustic analysis of frog choruses, and the development of HARKBird based on HARK and its evaluation in observing and analyzing bird song communication in actual fields, demonstrate the feasibility of acoustic analysis in ecology. Overall, we have established fundamental robot audition technologies for acoustic understanding of the real world.

  • Research on real-environment robot audition toward the realization of auditory interaction

    Grant number:24118702  2012.4 - 2014.3

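    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)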

    Nakadai Kazuhiro

    Grant amount: ¥9,360,000 (Direct Cost: ¥7,200,000, Indirect Cost: ¥2,160,000)

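    To realize "auditory interaction for human-robot symbiosis," in which humans and robots interact more naturally in real environments, this project aims to develop real-environment robot audition technology. In this fiscal year, we refined the individual basic technologies and worked on integrating them.
    (1) For sensor synchronization for real-environment robot audition, we focused on evaluating ego-noise estimation on actual robots. Ego-noise suppression based on non-negative matrix factorization extended with a nonparametric Bayesian model needs only a single microphone and no motion reference, so it has the advantage of requiring neither (a) synchronization between microphones nor (b) synchronization between sound and motion. First, on Hearbo, a humanoid robot with a mobile base, we compared the method with the template method, which has been reported to perform well among conventional methods, and confirmed that it outperforms the conventional method in signal-to-noise ratio and signal-to-interference ratio. We also evaluated it on Robovie W, one of the target robots of the human-robot symbiosis research area, and obtained performance nearly equal to that on Hearbo. Because Robovie W does not provide joint angle information, the conventional method cannot be applied to it; the proposed method can therefore be said to offer both high performance and wide applicability.
    (2) For real-environment robot audition for building a good-listener robot, we built two technologies that we had been developing, (a) a sound source identification method based on a nonparametric Bayesian model for distinguishing voices and (b) integrated localization, separation, and recognition using a microphone array for sound environment understanding, and made them run on the open-source robot audition software HARK. In addition, (c) for visualization, in collaboration with the Otake Laboratory at Chiba University, we developed a tabletop microphone array, "Kurage-kun," and, by running HARK on it, built a tool that visualizes the direction and timing of utterances intuitively and clearly.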

  • Multifaceted deployment of robot audition toward real-environment understanding

    Grant number:24240035  2012

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Okuno Hiroshi, Kagami Satoshi, Itoyama Katsutoshi, Kumon Makoto, Nakadai Kazuhiro

    Grant amount: ¥21,060,000 (Direct Cost: ¥16,200,000, Indirect Cost: ¥4,860,000)

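    Because sound diffuses more strongly than images, sound environment understanding by robot audition can cover environments that images alone cannot capture; on the other hand, how to exploit information obtained from a wide area becomes an issue. Building on previously developed robot audition, this project aims at a multifaceted deployment of robot audition technology for safety and security that is capable of understanding real-world sound environments.
    Specifically, the project was to address:
    WP1: deployment to diverse microphone configurations, including improving the performance of HARK-16 and methods for synchronizing multiple distributed microphone arrays;
    WP2: deployment from indoors to outdoors, from indoor acoustic map creation to sound acquisition from the air by unmanned aerial vehicles and sound source localization;
    WP3: deployment from speech to sound in general, including musical and environmental sounds, in particular nonparametric Bayesian signal processing, animal acoustics using sound-to-light conversion, real-time separation of instrument sounds from musical performances, and onomatopoeia recognition of environmental sounds.
    In the two months between the start of the research and its withdrawal, we prepared the experimental equipment, worked out the details of using the unmanned helicopter, designed a multichannel A/D device to be mounted on the unmanned helicopter, and, in particular, designed a time-stamped acoustic data transfer scheme to improve the processing of asynchronous distributed microphones. We also worked on refining HARK-Binaural; developing a Bayesian method for localizing moving sound sources; developing a MUSIC (Multiple Signal Classification) method that uses a Bayesian approach to suppress impulsive and reflected sounds; developing an IVA method based on a nonparametric Bayesian approach that simultaneously estimates source activity and source separation; developing an instrument sound separation method for polyphonic music that tolerates fluctuations in the instrument sound models; and enhancing the "Kaeru-Hotaru" (frog firefly) device using band-pass filters.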

  • Computational Auditory Scene Analysis Using Active Audio-Visual Integration in a Dynamically Changing Environment

    Grant number:22700165  2010 - 2012

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (B)

    NAKADAI Kazuhiro

    Grant amount: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)

    A framework for Audio-Visual Integration (AVI), which provides optimal integration according to the quality of the audio and visual information obtained from a robot’s camera and microphone, was proposed and implemented. In addition, the framework was extended by proposing “Active Audio-Visual Integration (AAVI)”, which improves the quality of audio and visual information through active robot motion. Preliminary experiments on automatic speech recognition and voice activity detection showed that the AAVI framework worked effectively even in visually and/or auditorily noisy conditions.

  • Human-robot symbiosis through music

    Grant number:22118502  2010 - 2011

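    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)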

    Nakadai Kazuhiro

    Grant amount: ¥11,960,000 (Direct Cost: ¥9,200,000, Indirect Cost: ¥2,760,000)

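    In fiscal year 2011 (Heisei 23), we integrated the individual music-processing functions built so far (robust beat tracking using score information, ego-noise suppression, hand motion detection using Kinect, detection of the start and end of a piece from the motion of a flutist's flute, and a human-robot ensemble model using oscillators) into an ensemble demonstration with real robots. Specifically, we realized a quartet of two humanoid robots and two human performers, and achieved harmonious human-robot musical interaction in which the robots adapt to the humans and the humans adapt to the robots. We also built a robot that plays the theremin in time with a human's instrumental performance, and showed its effectiveness through live demonstrations at the Exhibition Session of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011), a top international conference in robotics, and at the JSAI AI-Challenge SIG meeting. Furthermore, to contribute more to the human-robot symbiosis research area, we used a 16-channel indoor-installed microphone array developed at ATR to develop a technique that estimates each speaker's position and utterance intervals in spontaneous conversations among multiple people. We also proposed a metric for measuring estimation errors and demonstrated its effectiveness. Beyond the musical interaction proposed at the planning stage, we were also able to develop basic technologies toward a good-listener robot using microphone arrays, and thus advanced the research beyond the original plan.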

  • Development of Robot Audition based on Computational Auditory Scene Analysis

    Grant number:19100003  2007 - 2011

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (S)

    OKUNO Hiroshi, OGATA Tetsuya, KOMATANI Kazunori, TAKAHASHI Toru, SHIRAMATSU Shun, NAKADAI Kazuhiro, KITAHARA Tetsuro, ITOYAMA Katsutoshi, ASANO Futoshi

    Grant amount: ¥119,340,000 (Direct Cost: ¥91,800,000, Indirect Cost: ¥27,540,000)

    Three main functions of Computational Auditory Scene Analysis (sound source localization, sound source separation, and recognition of separated sounds) have been developed and made available together as open-source robot audition software called "HARK". As a proof of concept for this robot audition, we developed "Prince Shotoku" robots that can listen to simultaneous talkers, and a spoken dialogue system that accepts barge-in utterances from the user. We also developed various technologies for separating musical instrument parts in polyphonic performances, as well as real-time score following systems. These music-related technologies were applied to build musical robots that play in ensemble with human players.

  • Audio-visual speech recognition for robots

    Grant number:19700158  2007 - 2008

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (B)

    NAKADAI Kazuhiro

    Grant amount: ¥3,480,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥180,000)
