Updated on 2026/03/11

YOKOTA RIO
 
Organization
Institute of Integrated Research, Supercomputing Research Center, Professor
Title
Professor
Degree

  • Ph.D. (Engineering) ( Keio University )

Research Interests

  • Numerical Analysis

  • Deep Learning

  • High Performance Computing

  • GPU

Research Areas

  • Manufacturing Technology (Mechanical Engineering, Electrical and Electronic Engineering, Chemical Engineering) / Fluid engineering

  • Informatics / Intelligent informatics

  • Informatics / Mathematical informatics

  • Informatics / Computational science

  • Informatics / High performance computing

Education

  • Keio University   Graduate School of Science and Technology   School of Science for Open and Environmental Systems (PhD)

    2005.4 - 2009.3

      More details

  • Keio University   Graduate School of Science and Technology   School of Science for Open and Environmental Systems (Masters)

    2003.4 - 2005.3

      More details

    Country: Japan

    researchmap

  • Keio University   Faculty of Science and Technology   Department of Mechanical Engineering

    1997.4 - 2003.3

      More details

    Country: Japan

    researchmap

Research History

  • Institute of Science Tokyo   Institute of Integrated Research, Supercomputing Research Center   Professor

    2024.10

      More details

  • Tokyo Institute of Technology   Global Scientific Information and Computing Center   Professor

    2023.1 - 2024.9

      More details

  • Tokyo Institute of Technology   Global Scientific Information and Computing Center   Associate Professor

    2015.4 - 2022.12

      More details

  • King Abdullah University of Science and Technology   Applied Mathematics and Computer Science   Research Scientist

    2011.9 - 2015.3

      More details

  • Boston University   Department of Mechanical Engineering   Post-doctoral Research Associate

    2010.9 - 2011.8

      More details

  • University of Bristol   Department of Mathematics   Post-doctoral Research Associate

    2009.2 - 2010.8

      More details

Professional Memberships

  • THE JAPAN SOCIETY FOR INDUSTRIAL AND APPLIED MATHEMATICS

      More details

  • THE JAPAN SOCIETY FOR COMPUTATIONAL ENGINEERING AND SCIENCE

      More details

  • INFORMATION PROCESSING SOCIETY OF JAPAN

      More details

  • American Society of Mechanical Engineers

      More details

  • Japan Society of Mechanical Engineers

      More details

  • Society for Industrial and Applied Mathematics

      More details

  • Association for Computing Machinery

      More details

  • Institute of Electrical and Electronics Engineers

      More details

  • THE JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE

      More details

Committee Memberships

  • IEEE International Conference on Cluster Computing (IEEE CLUSTER 2025)   track co-chair  

    2025   

      More details

    Committee type:Academic society

    researchmap

  • Platform for Advanced Scientific Computing (PASC 2025)   domain co-chair  

    2025   

      More details

    Committee type:Academic society

    researchmap

  • 39th IEEE International Parallel and Distributed Processing Symposium, (IPDPS 2025)   program committee, best OSS judge  

    2025   

      More details

    Committee type:Academic society

    researchmap

  • International Conference on Parallel Architectures and Compilation Techniques (PACT 2025)   publicity chair  

    2025   

      More details

    Committee type:Academic society

    researchmap

  • The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2025)   proceedings vice-chair, program committee, AI4S committee  

    2025   

      More details

    Committee type:Academic society

    researchmap

  • Conference on Neural Information Processing Systems (NeurIPS 2025)   reviewer  

    2025   

      More details

    Committee type:Academic society

    researchmap

  • ISC High Performance (ISC 2025)   program committee  

    2025   

      More details

    Committee type:Academic society

    researchmap

  • International Conference on Machine Learning (ICML 2025)   reviewer  

    2025   

      More details

    Committee type:Academic society

    researchmap

  •   SIAM Conference on Linear Algebra (LA24), scientific committee  

    2024   

      More details

  •   The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2024), reviewer  

    2024   

      More details

  •   38th Conference on Neural Information Processing Systems (NeurIPS 2024), reviewer  

    2024   

      More details

  •   The 30th International Conference on Parallel Processing (Euro-Par 2024), program committee  

    2024   

      More details

  •   ACM International Symposium on High-Performance, Parallel and Distributed Computing (HPDC 2024), program committee  

    2024   

      More details

  •   SC Asia (SCA 2024), program committee  

    2024   

      More details

  •   Platform for Advanced Scientific Computing (PASC 2024), program committee  

    2024   

      More details

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2024), poster, ML track, AI4S/TPC WS, LLMs for HPC BoF  

    2024   

      More details

  •   International Conference on Preconditioning Techniques for Scientific and Industrial Applications, (Precond 2024), program committee  

    2024   

      More details

  •   38th IEEE International Parallel and Distributed Processing Symposium, (IPDPS 2024), track co-chair  

    2024   

      More details

  •   ISC High Performance (ISC 2024), posters chair  

    2024   

      More details

  •   The 12th International Conference on Learning Representations (ICLR 2024), reviewer  

    2024   

      More details

  •   The Eleventh International Conference on Learning Representations (ICLR 2023), reviewer  

    2023   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2023), posters, workshops, ML tech paper, panelist, best paper  

    2023   

      More details

  •   37th Conference on Neural Information Processing Systems (NeurIPS 2023), reviewer  

    2023   

      More details

  •   10th International Congress on Industrial and Applied Mathematics (ICIAM 2023), program committee  

    2023   

      More details

  •   Platform for Advanced Scientific Computing (PASC 2023), mini symposia and posters committee, program committee  

    2023   

      More details

  •   The 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2023), program committee  

    2023   

      More details

  •   The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2023), reviewer  

    2023   

      More details

  •   The 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT 2023), publicity chair Asia  

    2023   

      More details

  •   IEEE International Conference on Cluster Computing (IEEE CLUSTER 2023), publicity chair  

    2023   

      More details

  •   Principles and Practice of Parallel Programming (PPoPP 2023), publicity chair  

    2023   

      More details

  •   37th IEEE International Parallel and Distributed Processing Symposium, (IPDPS 2023), program committee  

    2023   

      More details

  •   ISC High Performance (ISC 2023), posters deputy chair  

    2023   

      More details

  •   SIAM Conference on Computational Science and Engineering (CSE23), poster judge  

    2023   

      More details

  •   IEEE Cluster (IEEE Cluster 2022), program committee  

    2022   

      More details

    Committee type:Academic society

    researchmap

  •   International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2022), Applications track co-chair  

    2022   

      More details

    Committee type:Academic society

    researchmap

  •   European Conference on Computer Vision (ECCV 2022), reviewer  

    2022   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2022), program committee, workshop chair  

    2022   

      More details

    Committee type:Academic society

    researchmap

  •   49th International Conference on Parallel Processing (ICPP 2022), program committee  

    2022   

      More details

    Committee type:Academic society

    researchmap

  •   The 36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022), program committee  

    2022   

      More details

    Committee type:Academic society

    researchmap

  •   ISC High Performance (ISC 2022), poster committee  

    2022   

      More details

    Committee type:Academic society

    researchmap

  •   IEEE Cluster (IEEE Cluster 2021), program committee  

    2021   

      More details

    Committee type:Academic society

    researchmap

  •   ISC High Performance (ISC 2021), program committee, best paper committee, steering committee  

    2021   

      More details

    Committee type:Academic society

    researchmap

  •   The 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), program committee  

    2021   

      More details

    Committee type:Academic society

    researchmap

  •   International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2021), organizing committee  

    2021   

      More details

    Committee type:Academic society

    researchmap

  •   SIAM Computational Science and Engineering (SIAM CSE 2021), organizing committee  

    2021   

      More details

    Committee type:Academic society

    researchmap

  •   IEEE International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2021), program committee  

    2021   

      More details

    Committee type:Academic society

    researchmap

  •   48th International Conference on Parallel Processing (ICPP 2021), program committee  

    2021   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2021), program committee  

    2021   

      More details

    Committee type:Academic society

    researchmap

  •   SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP 2020), program committee  

    2020   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2020), ML track Chair  

    2020   

      More details

    Committee type:Academic society

    researchmap

  •   The 26th International Conference on Parallel Processing (Euro-Par 2020), program committee  

    2020   

      More details

    Committee type:Academic society

    researchmap

  •   International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020), program committee  

    2020   

      More details

    Committee type:Academic society

    researchmap

  •   48th International Conference on Parallel Processing (ICPP 2019), Applications track chair, Workshop co-organizer  

    2019   

      More details

    Committee type:Academic society

    researchmap

  •   SIAM Conference on Computational Science and Engineering (SIAM CSE19), organizing committee  

    2019   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2019), program committee  

    2019   

      More details

    Committee type:Academic society

    researchmap

  •   ISC High Performance (ISC 2019), program committee  

    2019   

      More details

    Committee type:Academic society

    researchmap

  •   IEEE Cluster (IEEE Cluster 2019), program committee  

    2019   

      More details

    Committee type:Academic society

    researchmap

  •   Platform for Advanced Scientific Computing Conference (PASC 2019), program committee  

    2019   

      More details

    Committee type:Academic society

    researchmap

  •   The 3rd cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSIG 2019), program committee  

    2019   

      More details

    Committee type:Academic society

    researchmap

  •   IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC 2019), ATMG Track Chair  

    2019   

      More details

    Committee type:Academic society

    researchmap

  •   SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP 2018), organizing committee  

    2018   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2018), proceedings chair  

    2018   

      More details

    Committee type:Academic society

    researchmap

  •   ISC High Performance (ISC 2018), proceedings chair  

    2018   

      More details

    Committee type:Academic society

    researchmap

  •   IEEE Cluster (IEEE Cluster 2018), program committee  

    2018   

      More details

    Committee type:Academic society

    researchmap

  •   The 32nd IEEE International Parallel & Distributed Processing Symposium (IPDPS 2018), program committee  

    2018   

      More details

    Committee type:Academic society

    researchmap

  •   18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2018), program committee  

    2018   

      More details

    Committee type:Academic society

    researchmap

  •   SC Asia (SCA 2018), program committee  

    2018   

      More details

    Committee type:Academic society

    researchmap

  •   The 23rd International Conference on Parallel Processing (Euro-Par 2017), program committee  

    2017   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2017), program committee  

    2017   

      More details

    Committee type:Academic society

    researchmap

  •   Platform for Advanced Scientific Computing Conference (PASC 2017), program committee  

    2017   

      More details

    Committee type:Academic society

    researchmap

  •   The 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS 2017), program committee  

    2017   

      More details

    Committee type:Academic society

    researchmap

  •   International Conference on Supercomputing (ICS 2017), program committee  

    2017   

      More details

    Committee type:Academic society

    researchmap

  •   IEEE Cluster (IEEE Cluster 2017), program committee  

    2017   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   International Conference on Supercomputing (ICS 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   Platform for Advanced Scientific Computing Conference (PASC 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   ISC High Performance (ISC 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   The 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   The 23rd annual IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   IEEE Cluster (IEEE Cluster 2016), program committee  

    2016   

      More details

    Committee type:Academic society

    researchmap

  •   IEEE Cluster (IEEE Cluster 2015), program committee  

    2015   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2015), program committee  

    2015   

      More details

    Committee type:Academic society

    researchmap

  •   The 22nd annual IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2015), program committee  

    2015   

      More details

    Committee type:Academic society

    researchmap

  •   15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2015), program committee  

    2015   

      More details

    Committee type:Academic society

    researchmap

  •   14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014), program committee  

    2014   

      More details

    Committee type:Academic society

    researchmap

  •   The annual IEEE International Conference on High Performance Computing (HiPC 2014), program committee  

    2014   

      More details

    Committee type:Academic society

    researchmap

  •   The 20th International Conference on Parallel Processing (Euro-Par 2014), program committee  

    2014   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2014), program committee  

    2014   

      More details

    Committee type:Academic society

    researchmap

  •   The International Meeting on High-Performance Computing for Computational Science (VECPAR 2014), program committee  

    2014   

      More details

    Committee type:Academic society

    researchmap

  •   The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2013), program committee  

    2013   

      More details

    Committee type:Academic society

    researchmap

  •   The 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), program committee  

    2013   

      More details

    Committee type:Academic society

    researchmap

  •   18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2013), program committee  

    2013   

      More details

    Committee type:Academic society

    researchmap

  •   The annual IEEE International Conference on High Performance Computing (HiPC 2013), program committee  

    2013   

      More details

    Committee type:Academic society

    researchmap

  •   15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2013), program committee  

    2013   

      More details

    Committee type:Academic society

    researchmap

  •   The Third International Workshop on Frontier of GPU Computing, program committee  

    2012   

      More details

    Committee type:Academic society

    researchmap

  •   The 12th International Symposium on Parallel and Distributed Computing (ISPDC 2012), program committee  

    2012   

      More details

    Committee type:Academic society

    researchmap

Papers

  • Tensor-Core-Optimized Strategies for BLR × Tall-Skinny Matrix Multiplication in BEM

    Akihiro Ida, Kazuya Goto, Rio Yokota, Tasuku Hiraishi, Toshihiro Hanawa, Takeshi Iwashita, Masatoshi Kawai, Satoshi Ohshima, Tetsuya Hoshino

    Proceedings of the Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region   153 - 164   2026.1

     More details

    Publishing type:Research paper (international conference proceedings)   Publisher:ACM  

    DOI: 10.1145/3773656.3773678

    researchmap

  • A Study on the Performance and Usability of Managed Memory and Unified Memory for Accelerating Numerical Calculation Program

    Satoshi Ohshima, Akihiro Ida, Masatoshi Kawai, Takeshi Fukaya, Rio Yokota

    2025 IEEE 18th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)   41 - 48   2025.12

     More details

    Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    DOI: 10.1109/mcsoc67473.2025.00017

    researchmap

  • Variational Learning Finds Flatter Solutions at the Edge of Stability Reviewed International journal

    Rio Yokota

    Annual Conference on Neural Information Processing Systems (NeurIPS)   2025.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Masked Gated Linear Unit Reviewed

    Yukito Tajima, Nakamasa Inoue, Yusuke Sekikawa, Ikuro Sato, Rio Yokota

    Annual Conference on Neural Information Processing Systems (NeurIPS)   2025.12

     More details

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Leveraging High-Resource English Corpora for Cross-lingual Domain Adaptation in Low-Resource Japanese Medicine via Continued Pre-training Reviewed International journal

    Rio Yokota

    EMNLP Findings   2025.11

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models Reviewed

    Youmi Ma, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Koki Maeda, Kakeru Hattori, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki

    Conference on Language Modeling (COLM)   2025.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks Reviewed

    Taishi Nakamura, Satoki Ishikawa, Masaki Kawamura, Takumi Okamoto, Daisuke Nohara, Jun Suzuki, Rio Yokota

    ICML 2025 2nd AI for Math Workshop   2025.7

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Lion Cub: Minimizing Communication Overhead in Distributed Lion Reviewed

    Satoki Ishikawa, Tal Ben-Nun, Brian Van Essen, Rio Yokota, Nikoli Dryden

    ICML 2025 Workshop TTODLer-FM   2025.7

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers

    Chen Zhuang, Lingqi Zhang, Du Wu, Peng Chen, Jiajun Huang, Xin Liu, Rio Yokota, Nikoli Dryden, Toshio Endo, Satoshi Matsuoka, Mohamed Wahib

    Proceedings of the 39th ACM International Conference on Supercomputing   57 - 72   2025.6

     More details

    Publishing type:Research paper (international conference proceedings)   Publisher:ACM  

    DOI: 10.1145/3721145.3730422

    researchmap

  • Improving LoRA with Variational Learning.

    Bai Cong, Nico Daheim, Yuesong Shen, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff

    CoRR   abs/2506.14280   2025.6

     More details

    Publishing type:Research paper (scientific journal)  

    DOI: 10.48550/arXiv.2506.14280

    researchmap

  • On the Interplay Between Precision, Rank, Admissibility, and Iterative Refinement for Hierarchical Low-Rank Matrix Solvers Reviewed

    Thomas Spendlhofer, Qianxiang Ma, Yasuhiro Matsumoto, Rio Yokota

    ISC High Performance   2025.6

     More details

    Language:English  

    researchmap

  • Rewriting Pre-Training Data Boosts LLM Performance in Math and Code.

    Kazuki Fujii, Yukito Tajima, Sakae Mizuki, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Masanari Ohi, Masaki Kawamura, Taishi Nakamura, Takumi Okamoto, Shigeki Ishida, Kakeru Hattori, Youmi Ma, Hiroya Takamura, Rio Yokota, Naoaki Okazaki

    CoRR   abs/2505.02881   2025.5

     More details

    Publishing type:Research paper (scientific journal)  

    DOI: 10.48550/arXiv.2505.02881

    researchmap

  • Quantum Turbulence Coupled with Externally Driven Normal-Fluid Turbulence in Counterflow of Superfluid 4He Reviewed

    Satoshi Yui, Hiromichi Kobayashi, Makoto Tsubota, Rio Yokota

    Journal of the Physical Society of Japan   94 ( 4 )   2025.4

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Physical Society of Japan  

    DOI: 10.7566/jpsj.94.043601

    researchmap

  • Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation Reviewed

    Satoki Ishikawa, Ryo Karakida, Rio Yokota

    The 13th International Conference on Learning Representations (ICLR)   2025.4

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Drop Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization Reviewed

    Taishi Nakamura, Takuya Akiba, Kazuki Fujii, Yusuke Oda, Rio Yokota, Jun Suzuki

    The 13th International Conference on Learning Representations (ICLR)   2025.4

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • SuperGCN: General and Scalable Framework for GCN Training on CPU-powered Supercomputers Reviewed

    Chen Zhuang, Peng Chen, Xin Liu, Nikoli Dryden, Rio Yokota, Toshio Endo, Satoshi Matsuoka, Mohamed Wahib

    ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)   2025.3

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

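  • Continual Pre-training of Japanese LLMs Using Newspaper Articles and Synthetic Data

    服部翔, 水木栄, 藤井一喜, 中村泰士, 塩谷泰平, 植木快, 新妻巧朗, 田森秀明, Youmi Ma, 前田航希, 大井聖也, 齋藤幸史郎, 岡本拓己, 石田茂樹, 横田理央, 高村大也, 岡崎直観

    The 31st Annual Meeting of the Association for Natural Language Processing   2025.3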

     More details

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

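  • Instruction Tuning of Large Language Models via Imitation Learning

    Youmi Ma, 水木栄, 藤井一喜, 中村泰士, 大井聖也, 島田比奈理, 塩谷泰平, 齋藤幸史郎, 前田航希, 服部翔, 岡本拓己, 石田茂樹, 横田理央, 高村大也, 岡崎直観

    The 31st Annual Meeting of the Association for Natural Language Processing   2025.3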

     More details

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Reviewed

    Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, et al.

    The 31st International Conference on Computational Linguistics (COLING), Industry Track   2025.1

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • A General and Scalable GCN Training Framework on CPU Supercomputers.

    Chen Zhuang, Peng Chen, Xin Liu, Rio Yokota, Nikoli Dryden, Lingqi Zhang, Toshio Endo, Satoshi Matsuoka, Mohamed Wahib

    PPoPP   566 - 568   2025

     More details

    Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1145/3710848.3710860

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/ppopp/ppopp2025.html#ZhuangCLYD0EMW25

  • On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process

    Shun Iwase, Shuya Takahashi, Nakamasa Inoue, Rio Yokota, Ryo Nakamura, Hirokatsu Kataoka, Eisaku Maeda

    2025

  • Rethinking Image Super-Resolution from Training Data Perspectives Reviewed

    Go Ohtani, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Yoshimitsu Aoki

    European Conference on Computer Vision (ECCV)   2025

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-031-72643-9_2

    researchmap

  • Scaling Backwards: Minimal Synthetic Pre-Training? Reviewed

    Ryo Nakamura, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka

    European Conference on Computer Vision (ECCV)   2025

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-031-72633-0_9

    researchmap

  • A Framework for Seamless Integration and Efficient Continual Pre-Training of Large Language Models Reviewed

    Kazuki Fujii, Taishi Nakamura, Rio Yokota

    SC’24 TPC Workshop   2024.11

     More details

    Language:English   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities Reviewed

    Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Shota Hirai, Sakae Mizuki, Rio Yokota, Naoaki Okazaki

    Conference on Language Modeling (COLM)   2024.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Building a Large Japanese Web Corpus for Large Language Models Reviewed

    Naoaki Okazaki, Kakeru Hattori, Shota Hirai, Hiroki Iida, Masanari Ohi, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Rio Yokota, Sakae Mizuki

    Conference on Language Modeling (COLM)   2024.10

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Formula-Supervised Visual-Geometric Pre-training Reviewed

    Ryosuke Yamada, Kensho Hara, Hirokatsu Kataoka, Koshi Makihara, Nakamasa Inoue, Rio Yokota, Yutaka Satoh

    European Conference on Computer Vision (ECCV)   2024.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-031-72670-5_4

    researchmap

  • Rethinking Training Data Construction for Image Super-Resolution Reviewed

    大谷 豪, 田所 龍, 山田 亮佑, Yuki M. Asano, Iro Laina, Christian Rupprecht, 井上 中順, 横田 理央, 片岡 裕雄, 青木 義満

    The 27th Meeting on Image Recognition and Understanding (MIRU)   2024.8

     More details

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Scaling Backwards: Minimal Synthetic Pre-training? Reviewed

    田所 龍, 中村 凌, 山田 亮佑, Yuki M. Asano, Iro Laina, Christian Rupprecht, 井上 中順, 横田 理央, 片岡 裕雄

    The 27th Meeting on Image Recognition and Understanding (MIRU)   2024.8

     More details

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Dense Bundle Adjustment for Real-Time Change Detection in 3D Maps Using a Monocular Camera Reviewed

    大川快, 櫻田健, 横田理央

    The 27th Meeting on Image Recognition and Understanding (MIRU)   2024.8

     More details

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • An inherently parallel ℋ2-ULV factorization for solving dense linear systems on GPUs Reviewed

    Qianxiang Ma, Rio Yokota

    The International Journal of High Performance Computing Applications   2024.7

    Authorship:Last author   Publishing type:Research paper (scientific journal)  

    DOI: 10.1177/10943420241242021

    researchmap

  • Variational Learning is Effective for Large Deep Networks Reviewed

    Yuesong Shen, Nico Daheim, Gian Maria Marconi, Peter Nickl, Bai Cong, Bazan Clemen, Emile Marcel Raoul, Rio Yokota, Iryna Gurevych, Daniel Cremers, Mohammad Emtiyaz Khan, Thomas Möllenhoff

    The 41st International Conference on Machine Learning (ICML)   2024.7

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Interaction between quantum turbulence and normal-fluid turbulence in superfluid helium Reviewed

    Hiromichi Kobayashi, Satoshi Yui, Makoto Tsubota, Tomokazu Saito, Rio Yokota

    Thirteenth International Symposium on Turbulence and Shear Flow Phenomena (TSFP13)   2024.6

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • SIFTer: Self-improving Synthetic Datasets for Pre-training Classification Models Reviewed

    Ryo Hayamizu, Shota Nakamura, Sora Takashima, Hirokatsu Kataoka, Ikuro Sato, Nakamasa Inoue, Rio Yokota

    CVPR SynData Workshop   2024.6

    Authorship:Last author, Corresponding author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • One-Shot NASによるBERTのモデル圧縮

    岡本拓己, 横田理央

    人工知能学会全国大会 (第38回)   2024.5

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • When Does Second-Order Optimization Speed Up Training? Reviewed

    Satoki Ishikawa, Rio Yokota

    The 12th International Conference on Learning Representations (ICLR), Tiny paper   2024.5

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • DGEMM on integer matrix multiplication unit Reviewed

    Hiroyuki Ootomo, Katsuhisa Ozaki, Rio Yokota

    The International Journal of High Performance Computing Applications   38 ( 4 )   297 - 313   2024.3

    Authorship:Last author   Language:English   Publishing type:Research paper (scientific journal)   Publisher:SAGE Publications  

    Deep learning hardware achieves high throughput and low power consumption by reducing computing precision and specializing in matrix multiplication. For machine learning inference, fixed-point value computation is commonplace, where the input and output values and the model parameters are quantized. Thus, many processors are now equipped with fast integer matrix multiplication units (IMMU). It is of significant interest to find a way to harness these IMMUs to improve the performance of HPC applications while maintaining accuracy. We focus on computing double-precision equivalent matrix multiplication using the Ozaki scheme, which computes a high-precision matrix multiplication by using lower-precision computing units, and show the advantages and disadvantages of using IMMU. The experiment using integer Tensor Cores shows that we can compute double-precision matrix multiplication faster than cuBLAS and an existing Ozaki scheme implementation on FP16 Tensor Cores on NVIDIA consumer GPUs. Furthermore, we demonstrate accelerating a quantum circuit simulation by up to 4.85× while maintaining the FP64 accuracy.

    DOI: 10.1177/10943420241239588

    researchmap

    Other Link: https://journals.sagepub.com/doi/full-xml/10.1177/10943420241239588

  • 継続事前学習による日本語に強い大規模言語モデルの構築

    藤井一喜, 中村泰士, Mengsay Loem, 飯田大貴, 大井聖也, 服部翔, 平井翔太, 水木栄, 横田理央, 岡崎直観

    言語処理学会第30回年次大会   2024.3

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • 大規模言語モデルの分散並列学習

    藤井一喜, 横田理央

    情報処理学会全国大会   2024.3

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • 大規模言語モデルの構造探索

    岡本拓己, 横田理央

    情報処理学会全国大会   2024.3

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Swallowコーパス: 日本語大規模ウェブコーパス

    岡崎直観, 服部翔, 平井翔太, 飯田大貴, 大井聖也, 藤井一喜, 中村泰士, Mengsay Loem, 横田理央, 水木栄

    言語処理学会第30回年次大会   2024.3

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • 大規模言語モデルの日本語能力の効率的な強化: 継続事前学習における語彙拡張と対訳コーパスの活用

    水木栄, 飯田大貴, 藤井一喜, 中村泰士, Mengsay Loem, 大井聖也, 服部翔, 平井翔太, 横田理央, 岡崎直観

    言語処理学会第30回年次大会   2024.3

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • 継続学習を用いた効率の良いマルチリンガル・マルチエキスパートモデルの開発

    中村泰士, 横田理央

    情報処理学会全国大会   2024.3

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Visual SLAM を目的とした深度の一貫性を考慮した密なバンドル調整

    大川 快, 櫻田 健, 横田 理央

    CVIM 2024年1月研究会   2024.1

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Natural Gradient Primal-Dual Method for Decentralized Learning Reviewed

    Kenta Niwa, Hiro Ishii, Hiroshi Sawada, Akinori Fujino, Noboru Harada, Rio Yokota

    IEEE Transactions on Signal and Information Processing over Networks   10   417 - 433   2024

    Authorship:Last author   Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1109/TSIPN.2024.3388948

    researchmap

  • Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs

    Koshiro Saito, Sakae Mizuki, Masanari Ohi, Taishi Nakamura, Taihei Shiotani, Koki Maeda, Youmi Ma, Kakeru Hattori, Kazuki Fujii, Takumi Okamoto, Shigeki Ishida, Hiroya Takamura, Rio Yokota, Naoaki Okazaki

    CoRR   abs/2412.14471   2024

    Publishing type:Research paper (scientific journal)  

    DOI: 10.48550/arXiv.2412.14471

    researchmap

  • Variational Low-Rank Adaptation Using IVON

    Bai Cong, Nico Daheim, Yuesong Shen, Daniel Cremers, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff

    CoRR   abs/2411.04421   2024

    Publishing type:Research paper (scientific journal)  

    DOI: 10.48550/arXiv.2411.04421

    researchmap

  • 超解像事前学習における核心的要素の解明

    大谷 豪, 田所 龍, 片岡 裕雄, 井上 中順, 横田 理央, 青木 義満

    ViEW2023   2023.12

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Improving Continual Learning by Accurate Gradient Reconstructions of the Past Reviewed

    Erik Daxberger, Siddharth Swaroop, Kazuki Osawa, Rio Yokota, Richard E. Turner, Jose Miguel Hernandez-Lobato, Mohammad Emtiyaz Khan

    Transactions on Machine Learning Research   2023.11

    Language:English   Publishing type:Research paper (scientific journal)  

    researchmap

  • Two-way coupled simulation of quantum turbulence and normal-fluid turbulence in superfluid helium-4

    Hiromichi Kobayashi, Satoshi Yui, Makoto Tsubota, Tomokazu Saito, Rio Yokota

    APS DFD 76th Annual Meeting of the Division of Fluid Dynamics   2023.11

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • SegRCDB: Semantic Segmentation via Formula-Driven Supervised Learning Reviewed

    Risa Shinoda, Ryo Hayamizu, Kodai Nakashima, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka

    IEEE/CVF International Conference on Computer Vision (ICCV)   2023.10

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Pre-training Vision Transformers with Very Limited Synthesized Images Reviewed

    Ryo Nakamura, Sora Takashima, Edgar Josafat Martinez-Noriega, Nakamasa Inoue, Hirokatsu Kataoka, Rio Yokota

    IEEE/CVF International Conference on Computer Vision (ICCV)   2023.10

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors Reviewed

    Sameer Deshmukh, Rio Yokota, George Bosilca

    ACM Transactions on Mathematical Software   49 ( 3 )   1 - 29   2023.9

    Publishing type:Research paper (scientific journal)  

    researchmap

  • Two-phase flow of quantum turbulence and normal-fluid turbulence in superfluid helium-4

    Hiromichi Kobayashi, Satoshi Yui, Makoto Tsubota, Rio Yokota

    ERCOFTAC Symposium on Engineering Turbulence Modeling and Measurements (ETMM)   2023.9

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • 超流動ヘリウム4の量子乱流:常流体乱流による渦糸バンドルの形成

    湯井悟志, 小林宏充, 坪田誠, 齋藤智和, 横田理央

    日本物理学会 第78回年次大会   2023.9

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Vortex-filament bundle induced by normal-fluid turbulence in turbulent superfluid helium-4

    Hiromichi Kobayashi, Satoshi Yui, Makoto Tsubota, Tomokazu Saito, Rio Yokota

    International Symposium on Quantum Fluids and Solids (QFS)   2023.8

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves Reviewed

    Sora Takashima, Ryo Hayamizu, Nakamasa Inoue, Hirokatsu Kataoka, Rio Yokota

    IEEE/CVF Conference on Computer Vision and Pattern Recognition   2023.6

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • ラージバッチ学習における汎化性能の低下を抑制する正則化手法 Reviewed

    中村 祥大, 横田 理央

    人工知能学会全国大会   2023.6

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Quantum Circuit Simulation by SGEMM Emulation on Tensor Cores and Automatic Precision Selection Reviewed

    Hiroyuki Ootomo, Hidetaka Manabe, Kenji Harada, Rio Yokota

    ISC High Performance   2023.5

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-031-32041-5_14

    researchmap

  • 深層学習における勾配の前処理法に関する検討

    石川 智貴, 横田 理央

    情報処理学会全国大会   2023.3

     More details

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • 量子渦計算の高速多重極展開法を用いた高速化

    齋藤 智和, 横田 理央

    情報処理学会全国大会   2023.3

     More details

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • GPUとA64FXにおけるTransformerの性能比較

    中村 秋海, 横田 理央

    第188回HPC研究発表会   2023.3

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • 敵対的距離学習モジュールを用いた特徴変動に頑健な画像認識のための対照学習

    杉山 佳史, 片岡 裕雄, 横田 理央, 井上 中順

    パターン認識・メディア理解研究会 (PRMU)   2023.3

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • ニュートンフラクタル画像による事前学習効果

    近江 俊樹, 中村 凌, 片岡 裕雄, 井上 中順, 横田 理央

    情報処理学会全国大会   2023.3

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Reducing Shared Memory Footprint to Leverage High Throughput on Tensor Cores and its Flexible API Extension Library Reviewed

    Hiroyuki Ootomo, Rio Yokota

    HPC Asia   2023.2

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • 蒸留画像による事前学習効果についての検討 Reviewed

    田所龍, 片岡裕雄, 川上玲, 横田理央, 井上中順

    ビジョン技術の実利用ワークショップ   2022.12

    Language:Japanese   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • ASDL: A Unified Interface for Gradient Preconditioning in PyTorch Reviewed

    Kazuki Osawa, Satoki Ishikawa, Rio Yokota, Shigang Li, Torsten Hoefler

    NeurIPS Workshop Order up! The Benefits of Higher-Order Optimization in Machine Learning   2022.12

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Empirical Study on Optimizer Selection for Out-of-Distribution Generalization Reviewed

    Hiroki Naganuma, Kartik Ahuja, Ioannis Mitliagkas, Shiro Takagi, Tetsuya Motokawa, Rio Yokota, Kohta Ishikawa, Ikuro Sato

    NeurIPS Workshop Distshift   abs/2211.08583   2022.12

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.48550/arXiv.2211.08583

    researchmap

  • QR Factorization of Block Low-Rank Matrices on Multi-Instance GPU Reviewed

    Satoshi Ohshima, Akihiro Ida, Rio Yokota, Ichitaro Yamazaki

    The 23rd International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT’22)   2022.12

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Tensorコアを用いた単精度行列積エミュレーションのアプリケーションでの評価

    大友広幸, 横田 理央

    第185回ハイパフォーマンスコンピューティング研究発表会   2022.12

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Scalable Linear Time Dense Direct Solver for 3-D Problems Without Trailing Sub-Matrix Dependencies Reviewed

    Qianxiang Ma, Sameer Deshmukh, Rio Yokota

    The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22)   2022.11

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • 対称ブロック低ランク行列の精度保証付き固有値問題解法

    伊田 明弘, 荻田 武史, 横田 理央

    日本応用数理学会2022年度年会   2022   2022.9

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    J-GLOBAL

    researchmap

  • 走行動画の大規模自己教師あり学習の検討と計画 Reviewed

    高橋那弥, 八嶋晋吾, 石川康太, 佐藤育郎, 横田理央

    第25回 画像の認識・理解シンポジウム   2022.7

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Replacing Labeled Real-image Datasets with Auto-generated Contours Reviewed

    Hirokatsu Kataoka, Ryo Hayamizu, Ryosuke Yamada, Kodai Nakashima, Sora Takashima, Xinyu Zhang, Edgar Josafat Martinez-Noriega, Nakamasa Inoue, Rio Yokota

    IEEE/CVF Conference on Computer Vision and Pattern Recognition   2022.6

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • TensorCoreを用いた精度補正単精度行列積

    大友 広幸, 横田 理央

    第180回ハイパフォーマンスコンピューティング研究発表会   2022.6

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • OPIRL: Sample Efficient Off-Policy Inverse Reinforcement Learning via Distribution Matching Reviewed

    Hana Hoshino, Kei Ota, Asako Kanezaki, Rio Yokota

    Proceedings of IEEE International Conference on Robotics and Automation   2022.5

    Authorship:Last author   Language:English   Publishing type:Research paper (scientific journal)  

    researchmap

  • Parallel QR Factorization of Block Low-Rank Matrices Reviewed

    Muhammad Ridwan Apriansyah, Rio Yokota

    ACM Transactions on Mathematical Software   2022.5

    Authorship:Last author   Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1145/3538647

    researchmap

  • Recovering Single Precision Accuracy from Tensor Cores While Surpassing the FP32 Theoretical Peak Performance Reviewed

    Hiroyuki Ootomo, Rio Yokota

    The International Journal of High Performance Computing Applications   2022.2

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1177/10943420221090256

    researchmap

  • RePOSE: Real-Time Iterative Rendering and Refinement for 6D Object Pose Estimation Reviewed

    Shun Iwase, Xingyu Liu, Rawal Khirodkar, Rio Yokota, Kris M. Kitani

    Proceedings of the International Conference on Computer Vision   2021.10

    Language:English   Publishing type:Research paper (scientific journal)  

    researchmap

  • Self-supervised Continual Pretraining for Class Incremental Image Classification Reviewed

    Hikaru Nakata, Nakamasa Inoue, Rio Yokota

    Proceedings CVPR CLVISION Workshop (Findings)   2021.6

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • ExaFMM: a high-performance fast multipole method library with C++ and Python interfaces Reviewed

    Tingyu Wang, Rio Yokota, Lorena A. Barba

    The Journal of Open Source Software   6 ( 61 )   3145   2021.5

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.21105/joss.03145

    researchmap

  • Rich Information is Affordable: A Systematic Performance Analysis of Second-order Optimization Using K-FAC Reviewed

    Yuichiro Ueno, Kazuki Osawa, Yohei Tsuji, Akira Naruse, Rio Yokota

    Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining   2145 - 2153   2020.8

    Publishing type:Research paper (international conference proceedings)   Publisher:ACM  

    DOI: 10.1145/3394486.3403265

    Scopus

    researchmap

    Other Link: https://dblp.uni-trier.de/conf/kdd/2020

  • Scalable and Practical Natural Gradient for Large-Scale Deep Learning Reviewed

    Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Chuan-Sheng Foo, Rio Yokota

    IEEE Transactions on Pattern Analysis and Machine Intelligence   1 - 1   2020.6

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Institute of Electrical and Electronics Engineers (IEEE)

    DOI: 10.1109/TPAMI.2020.3004354

    researchmap

  • Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis Reviewed

    Rise Ooi, Takeshi Iwashita, Takeshi Fukaya, Akihiro Ida, Rio Yokota

    ACM International Conference Proceeding Series   92 - 101   2020.1

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1145/3368474.3368479

    Web of Science

    Scopus

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1911.00093v1

  • Measuring the Effects to Beneficial Batch Size and Required Iteration by LARS on Neural Network Training

    NAGANUMA Hiroki, IDE Tatsuro, YOKOTA Rio

    Proceedings of the Annual Conference of JSAI   JSAI2020   4Rin169 - 4Rin169   2020

    Language:Japanese   Publisher:The Japanese Society for Artificial Intelligence  

    Deep neural networks (DNNs), which have extremely large numbers of parameters, have outperformed other machine learning methods by training on enormous volumes of data. Because training a DNN takes a significant amount of computation time, large-scale parallelization has been employed to reduce the training time. Large-batch training increases the batch size to reduce the number of required iterations and hence speeds up training. However, recent research has shown that this speedup saturates as the batch size becomes very large. In this paper, we conduct experiments to study the relationship between the batch size and the number of required iterations as the batch size increases up to the full batch using LARS, a commonly used method for adjusting the learning rate. Our results experimentally verify that LARS is superior to other optimization methods both in reducing the number of iterations and in generalization performance.

    DOI: 10.11517/pjsai.jsai2020.0_4rin169

    CiNii Research

    researchmap

  • Regularizing the fast multipole method for use in molecular simulation Reviewed

    D. S. Shamshirgar, R. Yokota, A. K. Tornberg, B. Hess

    Journal of Chemical Physics   151 ( 23 )   2019.12

    Publishing type:Research paper (scientific journal)  

    DOI: 10.1063/1.5122859

    Scopus

    PubMed

    researchmap

  • Practical deep learning with Bayesian principles Reviewed

    Kazuki Osawa, Siddharth Swaroop, Anirudh Jain, Runa Eschenhagen, Richard E. Turner, Rio Yokota, Mohammad Emtiyaz Khan

    Advances in Neural Information Processing Systems   32   4289 - 4301   2019.12

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Web of Science

    Scopus

    researchmap

    Other Link: https://dblp.uni-trier.de/conf/nips/2019

  • QR factorization of block low-rank matrices with weak admissibility condition Reviewed

    Akihiro Ida, Hiroshi Nakashima, Tasuku Hiraishi, Ichitaro Yamazaki, Rio Yokota, Takeshi Iwashita

    Journal of Information Processing   27   831 - 839   2019.11

    Publishing type:Research paper (scientific journal)  

    DOI: 10.2197/ipsjjip.27.831

    Scopus

    researchmap

  • Optimization of numerous small dense-matrix-vector multiplications in H-matrix arithmetic on GPU Reviewed

    Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

    Proceedings - 2019 IEEE 13th International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2019   9 - 16   2019.10

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/MCSoC.2019.00009

    Web of Science

    Scopus

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/mcsoc/mcsoc2019.html#OhshimaYIY19

  • Distributed-memory lattice H-matrix factorization Reviewed

    Ichitaro Yamazaki, Akihiro Ida, Rio Yokota, Jack Dongarra

    International Journal of High Performance Computing Applications   33 ( 5 )   1046 - 1063   2019.9

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1177/1094342019861139

    Web of Science

    Scopus

    researchmap

  • Tensorコアを用いたTSQR Reviewed

    大友 広幸, 横田 理央

    日本応用数理学会年会   2019.9

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Performance optimizations and analysis of distributed deep learning with approximated second-order optimization method Reviewed

    Yohei Tsuji, Kazuki Osawa, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka

    ACM International Conference Proceeding Series   21 - 8   2019.8

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1145/3339186.3339202

    Web of Science

    Scopus

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/icppw/icppw2019.html#TsujiOUNYM19

  • Extreme scale FMM-accelerated boundary integral equation solver for wave scattering Reviewed

    Mustafa Abduljabbar, Mohammed Al Farhan, Noha Al-Harthi, Rui Chen, Rio Yokota, Hakan Bagci, David Keyes

    SIAM Journal on Scientific Computing   41 ( 3 )   C245 - C268   2019.6

    Publishing type:Research paper (scientific journal)  

    DOI: 10.1137/18M1173599

    Scopus

    researchmap

  • Large-scale distributed second-order optimization using Kronecker-factored approximate curvature for deep convolutional neural networks Reviewed

    Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka

    Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition   2019-June   12351 - 12359   2019.6

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/CVPR.2019.01264

    Web of Science

    Scopus

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1811.12019v5

  • Exhaustive study of hierarchical allreduce patterns for large messages between GPUs Reviewed

    Yuichiro Ueno, Rio Yokota

    Proceedings - 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2019   430 - 439   2019.5

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/CCGRID.2019.00057

    Web of Science

    Scopus

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2019.html#UenoY19

  • A Performance Improvement Approach for Second-Order Optimization in Large Mini-batch Training Reviewed

    Hiroki Naganuma, Rio Yokota

    19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing(CCGRID)   696 - 703   2019.5

    Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    DOI: 10.1109/CCGRID.2019.00092

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2019.html#NaganumaY19

  • Fisher情報行列の解析に基づく大規模深層学習のための二次最適化手法

    大沢 和樹, 横田 理央, Chuan-Sheng Foo, Vijay Chandrasekhar

    第81回情報処理学会全国大会講演論文集   2019 ( 1 )   45 - 46   2019.2

    Language:Japanese  

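    In large-scale deep learning on huge datasets such as the ImageNet image dataset, the enormous training time is an obstacle to searching for optimal parameters. Existing work aimed at shortening training time has used simple first-order optimization methods to minimize the cost function and has proposed speedup techniques that rely on hardware performance. Natural gradient descent, on the other hand, is known as an efficient second-order optimization method for deep learning, but computing the Fisher information matrix, whose size depends on the number of parameters, is a bottleneck, so its applications have been limited. In this work, we propose a more efficient second-order optimization method based on an analysis of the Fisher information matrix in large-scale deep learning, which had not previously been clarified.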

    CiNii Books

    researchmap

  • Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis Reviewed

    Rise Ooi, Takeshi Iwashita, Takeshi Fukaya, Akihiro Ida, Rio Yokota

    CoRR   abs/1911.00093   2019

  • Highly productive, high-performance application frameworks for Post-Petascale computing Reviewed

    Naoya Maruyama, Takayuki Aoki, Kenjiro Taura, Rio Yokota, Mohamed Wahib, Motohiko Matsuda, Keisuke Fukuda, Takashi Shimokawabe, Naoyuki Onodera, Michel Müller, Shintaro Iwasaki

    Advanced Software Technologies for Post-Peta Scale Computing: The Japanese Post-Peta CREST Research Project   77 - 98   2018.12

    Language:English   Publishing type:Part of collection (book)  

    DOI: 10.1007/978-981-13-1924-2_5

    Scopus

    researchmap

  • 自然勾配近似法を用いた大規模並列深層学習におけるハイパーパラメータ最適化 Reviewed

    長沼大樹, 岩瀬 駿, 郭 林昇, 中田 光, 横田 理央

    第17回情報科学技術フォーラム   2018.9

    Authorship:Last author   Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)  

    researchmap

  • Performance of Hierarchical-matrix BiCGStab Solver on GPU Clusters Reviewed

    Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack Dongarra

    Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018   930 - 939   2018.8

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/IPDPS.2018.00102

    Web of Science

    Scopus

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/ipps/ipdps2018.html#YamazakiAIOTYD18

  • Fast multipole preconditioners for sparse matrices arising from elliptic equations Reviewed

    Huda Ibeid, Rio Yokota, Jennifer Pestana, David Keyes

    Computing and Visualization in Science   18 ( 6 )   213 - 229   2018.3

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Springer Verlag  

    DOI: 10.1007/s00791-017-0287-5

    Scopus

    arXiv

    researchmap

  • Accelerating Convolutional Neural Networks Using Low Precision Arithmetic Reviewed

    Hiroki Naganuma, Rio Yokota

    HPC Asia   2018.1

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Optimization of hierarchical matrix computation on GPU Reviewed

    Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   10776 LNCS   274 - 292   2018

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-319-69953-0_16

    Web of Science

    Scopus

    researchmap

    Other Link: https://dblp.uni-trier.de/db/conf/scfa/scfa2018.html#OhshimaYIY18

  • Fast Multipole Preconditioners for Sparse Matrices Arising from Elliptic Equations Reviewed

    Huda Ibeid, Rio Yokota, Jennifer Pestana, David Keyes

    Computing and Visualization in Science   18 ( 6 )   213 - 229   2017.11

    Language:English   Publishing type:Research paper (scientific journal)  

    researchmap

  • Accelerating matrix multiplication in deep learning by using low-rank approximation Reviewed

    Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota

    Proceedings - 2017 International Conference on High Performance Computing and Simulation, HPCS 2017   186 - 192   2017.9

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/HPCS.2017.37

    Web of Science

    Scopus

    researchmap

  • Communication Reducing Algorithms for Distributed Hierarchical N-Body Methods Reviewed

    Mustafa AbdulJabbar, George Markomanolis, Huda Ibeid, Rio Yokota, David Keyes

    32nd International Conference, ISC High Performance   2017.6

  • Communication Reducing Algorithms for Distributed Hierarchical N-Body Problems with Boundary Distributions Reviewed

    Mustafa Abduljabbar, George Markomanolis, Huda Ibeid, Rio Yokota, David Keyes

    Lecture Notes in Computer Science   10266   79 - 96   2017.2

    Language:English   Publisher:Springer Verlag  

    DOI: 10.1007/978-3-319-58667-0_5

    Scopus

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1702.05459v1

  • Compute-Memory Tradeoff in Hierarchical Low-Rank Approximation Methods Reviewed

    SIAM Conference on Computational Science and Engineering   2017.2

  • Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation Reviewed

    Rio Yokota, Huda Ibeid, David Keyes

    EIGENVALUE PROBLEMS: ALGORITHMS, SOFTWARE AND APPLICATIONS IN PETASCALE COMPUTING (EPASA 2015)   117   267 - 286   2017

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-319-62426-6_17

    Web of Science

    Scopus

    researchmap

  • Performance evaluation of computation and communication kernels of the fast multipole method on intel manycore architecture Reviewed

    Mustafa Abduljabbar, Mohammed Al Farhan, Rio Yokota, David Keyes

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   10417   553 - 564   2017

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:Springer Verlag  

    DOI: 10.1007/978-3-319-64203-1_40

    Scopus

    researchmap

  • Evaluating the compression efficiency of the filters in convolutional neural networks Reviewed

    Kazuki Osawa, Rio Yokota

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   10614 LNCS   459 - 466   2017

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-319-68612-7_52

    Web of Science

    Scopus

    researchmap

  • Tapas: An Implicitly Parallel Programming Framework For Hierarchical N-body Algorithms Reviewed

    Fukuda Keisuke, Maruyama Naoya, Yokota Rio, Taura Kenjiro, MATSUOKA SATOSHI

    The 22nd IEEE International Conference on Parallel And Distributed Systems   1100 - 1109   2016.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/ICPADS.2016.143

    Web of Science

    researchmap

  • Fast Multipole Preconditioners for Sparse Matrices Arising from Elliptic Equations Reviewed

    IBEID Huda, YOKOTA Rio, PESTANA Jennifer, KEYES David

    Computing and Visualization in Science   2016.12

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    researchmap

  • Tradeoff Between FMM and H^2(HSS) matrices Invited Reviewed

    Yokota Rio

    Journal of the Japan Society for Computational Engineering and Science   21 ( 4 )   6 - 8   2016.10

     More details

    Language:Japanese   Publishing type:Research paper (scientific journal)  

    researchmap

  • Communication Optimization of Distributed Memory FMM for Large Scale Boundary Element Methods Invited Reviewed

    Yokota Rio

    Simulation   35 ( 3 )   2016.9

     More details

    Language:Japanese   Publishing type:Research paper (scientific journal)   Publisher:Japan Society for Simulation Technology  

    researchmap

  • A Performance Model for the Communication in Fast Multipole Methods on HPC Platforms Reviewed

    Huda Ibeid, Rio Yokota, David Keyes

    International Journal of High Performance Computing Applications   30 ( 4 )   423 - 437   2016.6

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1177/1094342016634819

    Web of Science

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1405.6362v1

  • Portability of the Performance of FMM Invited Reviewed

    21   3p   2016.5

     More details

    Language:Japanese  

    researchmap

  • Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation Reviewed

    Rio Yokota, Huda Ibeid, David Keyes

    Lecture Notes in Computational Science and Engineering   117   267 - 286   2016.2

     More details

    Language:English   Publisher:Springer Verlag  

    DOI: 10.1007/978-3-319-62426-6_17

    Scopus

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1602.02244v1

  • On the Tradeoff Between FMM and H^2(HSS) Matrices Reviewed

    横田理央

    計算工学   21 ( 4 )   3498 - 3501   2016

     More details

    Language:Japanese   Publishing type:Research paper (scientific journal)  

    CiNii Books

    researchmap

  • Scaling FMM with data-driven OpenMP tasks on multicore architectures Reviewed

    Amer, A., Matsuoka, S., Pericàs, M., Maruyama, N., Taura, K., Rio Yokota, Balaji, P.

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   9903 LNCS   156 - 170   2016

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1007/978-3-319-45550-1_12

    Web of Science

    researchmap

  • Preconditioning Sparse Matrices Using a Highly Scalable Fast Multipole Method Reviewed

    3rd International Workshops on Advances in Computational Mechanics   2015.10

     More details

  • Multi-Level Restricted Maximum Likelihood Covariance Estimation and Kriging for Large Non-Gridded Spatial Datasets Reviewed

    Julio E. Castrillon-Candas, Marc G. Genton, Rio Yokota

    Spatial Statistics   18 ( 18 )   105 - 124   2015.4

  • Fast Multipole Preconditioners for Sparse Linear Solvers Reviewed

    Huda Ibeid, Rio Yokota, Jennifer Pestana, David Keyes

    Proceedings of the 11th World Congress on Computational Mechanics   2014.7

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Communication Complexity of the Fast Multipole Method and its Algebraic Variants Invited Reviewed

    Rio Yokota, George Turkiyyah, David Keyes

    Supercomputing Frontiers and Innovations   1 ( 1 )   63 - 84   2014.6

     More details

    Language:English  

    A combination of hierarchical tree-like data structures and data access
    patterns from fast multipole methods and hierarchical low-rank approximation of
    linear operators from H-matrix methods appears to form an algorithmic path
    forward for efficient implementation of many linear algebraic operations of
    scientific computing at the exascale. The combination provides asymptotically
    optimal computational and communication complexity and applicability to large
    classes of operators that commonly arise in scientific computing applications.
    A convergence of the mathematical theories of the fast multipole and H-matrix
    methods has been underway for over a decade. We recap this mathematical
    unification and describe implementation aspects of a hybrid of these two
    compelling hierarchical algorithms on hierarchical distributed-shared memory
    architectures, which are likely to be the first to reach the exascale. We
    present a new communication complexity estimate for fast multipole methods on
    such architectures. We also show how the data structures and access patterns of
    H-matrices for low-rank operators map onto those of fast multipole, leading to
    an algebraically generalized form of fast multipole that compromises none of
    its architecturally ideal properties.

    DOI: 10.14529/jsfi140104

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1406.1974v1

  • Petascale molecular dynamics simulation using the fast multipole method on K computer Reviewed

    Ohno Yousuke, Yokota Rio, Koyama Hiroshi, Morimoto Gentaro, Hasegawa Aki, Masumoto Gen, Okimoto Noriaki, Hirano Yoshinori, Ibeid Huda, Narumi Tetsu, Taiji Makoto

    Computer Physics Communications   185 ( 10 )   2575 - 2585   2014.6

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1016/j.cpc.2014.06.004

    Web of Science

    researchmap

  • N-Body Methods

    Yokota, R., AbdulJabbar, M.

    High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches   2014

     More details

    Publishing type:Research paper (scientific journal)   Publisher:High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches  

    DOI: 10.1016/B978-0-12-802118-7.00010-8

    Scopus

    researchmap

  • Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM Reviewed

    Abdelhalim Amer, Naoya Maruyama, Miquel Pericàs, Kenjiro Taura, Rio Yokota, Satoshi Matsuoka

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   7905   255 - 266   2013

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-642-38750-0_19

    Scopus

    researchmap

  • An FMM Based on Dual Tree Traversal for Many-core Architectures Reviewed

    Rio Yokota

    Journal of Algorithms and Computational Technology   7 ( 3 )   301 - 324   2012.9

     More details

    Language:English  

    The present work attempts to integrate the independent efforts in the fast
    N-body community to create the fastest N-body library for many-core and
    heterogeneous architectures. Focus is placed on low accuracy optimizations, in
    response to the recent interest to use FMM as a preconditioner for sparse
    linear solvers. A direct comparison with other state-of-the-art fast N-body
    codes demonstrates that orders of magnitude increase in performance can be
    achieved by careful selection of the optimal algorithm and low-level
    optimization of the code. The current N-body solver uses a fast multipole
    method with an efficient strategy for finding the list of cell-cell
    interactions by a dual tree traversal. A task-based threading model is used to
    maximize thread-level parallelism and intra-node load-balancing. In order to
    extract the full potential of the SIMD units on the latest CPUs, the inner
    kernels are optimized using AVX instructions. Our code -- exaFMM -- is an order
    of magnitude faster than the current state-of-the-art FMM codes, which are
    themselves an order of magnitude faster than the average FMM code.

    DOI: 10.1260/1748-3018.7.3.301

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1209.3516v3

  • Petascale turbulence simulation using a highly parallel fast multipole method Reviewed

    Yokota Rio, Barba Lorena, Narumi Tetsu, Yasuoka Kenji

    Computer Physics Communications   184 ( 3 )   445 - 455   2012.9

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1016/j.cpc.2012.09.011

    Web of Science

    Scopus

    researchmap

  • Vortex Method Analysis of Homogeneous Isotropic Turbulence at 4096³ Scale Using 4096 GPUs

    横田 理央, Barba Lorena, 成見 哲

    Tsubame ESJ. : e-science journal   6   1 - 6   2012.7

     More details

    Language:Japanese   Publisher:東京工業大学学術国際情報センター  

    researchmap

  • Data-Driven Fast Multipole Method on Distributed Memory Systems with Hardware Accelerators Reviewed

    Hatem Ltaief, Rio Yokota

    21st International Conference on Domain Decomposition Methods   2012.6

     More details

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Scalable Force Directed Graph Layout Algorithms Using Fast Multipole Methods Reviewed

    Enas Yunis, Rio Yokota, Aron Ahmadia

    Proceedings of the 11th International Symposium on Parallel and Distributed Computing   2012.6

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • A Parallel Numerical Simulation of Dust Particles Using Direct Numerical Simulation Reviewed

    Hoang Vu Nguyen, Rio Yokota, Georgiy Stenchikov

    Proceedings of the European Geosciences Union General Assembly   2012.4

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Data-Driven Execution of Fast Multipole Methods Reviewed

    Hatem Ltaief, Rio Yokota

    Concurrency and Computation: Practice and Experience   26 ( 11 )   1935 - 1946   2012.3

  • Optimization of Molecular Dynamics Core Program on the K computer Reviewed

    Yousuke Ohno, Rio Yokota, Hiroshi Koyama, Gentaro Morimoto, Aki Hasegawa, Gen Masumoto, Tetsu Narumi, Makoto Taiji

    Proceedings of JSST 2012 International Conference on Simulation Technology   2012

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • FMM Tree Construction on GPUs Reviewed

    YOKOTA Rio

    Ensemble   14 ( 2 )   85 - 89   2012

     More details

    Language:Japanese   Publishing type:Research paper (scientific journal)  

    DOI: 10.11436/mssj.14.85

    researchmap

  • A Task Parallel Implementation of Fast Multipole Methods Reviewed

    Kenjiro Taura, Jun Nakashima, Rio Yokota, Naoya Maruyama

    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC)   617 - 625   2012

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1109/SC.Companion.2012.86

    Web of Science

    researchmap

  • Scalable Fast Multipole Methods for Vortex Element Methods Reviewed

    Qi Hu, Nail A. Gumerov, Rio Yokota, Lorena Barba, Ramani Duraiswami

    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC)   1408 - +   2012

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    Web of Science

    researchmap

  • Petascale Turbulence Simulation Using FMM

    Rio Yokota, Tetsu Narumi, Lorena Barba, Kenji Yasuoka

    IPSJ SIG Notes   2011 ( 29 )   1 - 8   2011.11

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    Fast multipole methods (FMM) were originally developed for accelerating N-body problems in astrophysics and other particle based methods. A recent trend in HPC has been to use FMMs in unconventional application areas. We have performed a 2048³ turbulence calculation using an FMM designed for large scale GPU systems. The proposed method uses a hybridization of the treecode and FMM, and combines the data-parallel treecode with the O(N) FMM. The run on TSUBAME 2.0 using 4096 GPUs achieved 74 % parallel efficiency, and the sustained performance reached 1.01 PFlops.

    CiNii Books

    researchmap

  • FMM-based vortex method for simulation of isotropic turbulence on GPUs, compared with a spectral method Reviewed

    Rio Yokota, L. A. Barba

    Computers and Fluids   80 ( 80 )   17 - 27   2011.10

  • Hierarchical N-body simulations with auto-tuning for heterogeneous systems Reviewed

    Rio Yokota, Lorena A. Barba

    Computing in Science and Engineering   14 ( 3 )   30 - 39   2011.8

  • A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems Reviewed

    Rio Yokota, Lorena Barba

    International Journal of High Performance Computing Applications   26 ( 4 )   337 - 346   2011.6

  • Comparing the treecode with FMM on GPUs for vortex particle simulations of a leapfrogging vortex ring Reviewed

    Rio Yokota, L. A. Barba

    COMPUTERS & FLUIDS   45 ( 1 )   155 - 161   2011.6

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1016/j.compfluid.2010.11.029

    Web of Science

    researchmap

  • Fast Multipole Method vs. Spectral Methods for the Simulation of Isotropic Turbulence on GPUs Reviewed

    Rio Yokota, Lorena Barba

    Proceedings of the 23rd International Conference on Parallel Computational Fluid Dynamics   2011.5

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Vortex methods for the simulation of turbulent flows Reviewed

    YOKOTA Rio, OBI Shinnosuke

    Journal of Fluid Science and Technology   6 ( 1 )   14 - 29   2011.1

     More details

  • N-body Simulation and FMM on the Large-scale GPU Cluster at Nagasaki University Reviewed

    YOKOTA Rio, HAMADA Tsuyoshi

    Journal of the Japan Society for Computational Engineering and Science   15 ( 4 )   2416 - 2419   2010.10

     More details

    Language:Japanese   Publishing type:Research paper (scientific journal)  

    CiNii Books

    researchmap

  • Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUs and a billion unknowns Reviewed

    Rio Yokota, Jaydeep P. Bardhan, Matthew G. Knepley, L. A. Barba, Tsuyoshi Hamada

    COMPUTER PHYSICS COMMUNICATIONS   182 ( 6 )   1272 - 1283   2010.7

  • Performance of the Fast Multipole Method on GPUs Using Various Kernels Reviewed

    Rio Yokota, Lorena Barba

    Proceedings of the 9th World Congress on Computational Mechanics   2010.7

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Comparing vortex methods and finite difference methods in a homogeneous turbulent shear flow Reviewed

    R. Yokota, S. Obi

    INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS   63 ( 7 )   828 - 846   2010.7

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    DOI: 10.1002/fld.2102

    Web of Science

    researchmap

  • PetRBF--A parallel O(N) algorithm for radial basis function interpolation Reviewed

    Rio Yokota, L. A. Barba, Matthew G. Knepley

    COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING   199 ( 25 )   1793 - 1804   2009.9

  • Fast Multipole Methods on GPUs for the Meshfree Simulation of Turbulence Reviewed

    Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Kenji Yasuoka, Shinnosuke Obi

    Proceedings of the 10th US National Congress on Computational Mechanics   2009.7

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • DNS of Homogeneous Turbulence Using Vortex Methods Accelerated by the FMM on a Cluster of GPUs Reviewed

    Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Shun Kameoka, Kenji Yasuoka, Shinnosuke Obi

    Proceedings of the 21st International Conference on Parallel Computational Fluid Dynamics   2009.5

     More details

    Authorship:Lead author  

    researchmap

  • Validation of Vortex Methods for a Turbulent Channel Flow

    Yokota Rio, Obi Shinnosuke

    2009   253 - 253   2009

     More details

    Language:Japanese  

    The vortex method is applied to the calculation of a turbulent channel flow of Re_b=5600, and the results are compared with a finite difference calculation. The fast multipole method was modified for the two way periodic boundary condition. The particle strength exchange was selected as the viscous diffusion scheme. The wall vorticity flux is calculated exactly, using a Neumann condition for the vorticity equation at the wall. The mean velocity profile agrees quantitatively between the vortex method and finite difference method.

    CiNii Books

    researchmap

  • Lagrangian Vortex Methods in Turbulent Channel Flows Reviewed

    R. Yokota, K. Fukagata, S. Obi

    ADVANCES IN TURBULENCE XII - PROCEEDINGS OF THE 12TH EUROMECH EUROPEAN TURBULENCE CONFERENCE   132   893 - 893   2009

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    DOI: 10.1007/978-3-642-03085-7_214

    Web of Science

    researchmap

  • 2002 Numerical simulation of biological cell behavior in a micro channel by using CIP-Level Set method Reviewed

    TAMURA Shuichi, YOKOTA Rio, FUKAGATA Koji

    The Proceedings of The Computational Mechanics Conference   2009 ( 0 )   546 - 547   2009

     More details

    Language:Japanese   Publisher:The Japan Society of Mechanical Engineers  

    We developed a numerical method based on the Level Set method, which can simulate biological cell behavior in micro channels. The isotropic compliant wall model is used to model the cell membrane, and the surface force is calculated from the local displacement of the membrane computed by using the Level Set function. The behavior of the biological cell model is investigated in a channel with sudden expansion and contraction. The shape of the cell model is found to be elongated in the contracted region and semicircular in the expanded region. The collision between two cell models in a T-junction is also investigated. When the cell models touch each other, their boundaries overlap. A repulsion model is proposed to avoid this overlap.

    DOI: 10.1299/jsmecmd.2009.22.546

    CiNii Books

    researchmap

  • Meshfree Simulation of Turbulence Using the Fast Multipole Methods on GPUs Reviewed

    Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Shun Kameoka, Kenji Yasuoka, Shinnosuke Obi

    Proceedings of the 22nd Symposium on Computational Fluid Dynamics   2008.12

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Direct Numerical Simulation of Homogeneous Shear Flow Using Vortex Methods Reviewed

    Rio Yokota, Shinnosuke Obi

    Proceedings of the 4th International Conference on Vortex Flows and Vortex Models   2008.4

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Mesh-Free Simulation of the Homogeneous Shear Flow Using Vortex Methods Reviewed

    Rio Yokota, Shinnosuke Obi

    Proceedings of the 23rd IIS Turbulence and Shear Flow Dynamics Symposium   2008.3

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • The study of colliding vortex rings using a special-purpose computer and FMM Reviewed

    SHEEL Tarun, YOKOTA Rio, YASUOKA Kenji, OBI Shinnosuke

    Transactions of the Japan Society for Computational Engineering and Science   20080003   20080003   2008.2

     More details

    Language:English  

    researchmap

  • Vortex Method Calculation of a Turbulent Channel Flow

    Yokota Rio, Fukagata Koji, Obi Shinnosuke

    2008   337 - 337   2008

     More details

    Language:Japanese  

    The vortex method is applied to the calculation of a turbulent channel flow of Re_b=5600, and the results are compared with a finite difference calculation. The mean velocity profile agrees quantitatively between the vortex method and finite difference method for the duration of one walkout time.

    CiNii Books

    researchmap

  • Calculation of isotropic turbulence using a pure Lagrangian vortex method Reviewed

    R. Yokota, T. K. Sheel, S. Obi

    JOURNAL OF COMPUTATIONAL PHYSICS   226 ( 2 )   1589 - 1606   2007.10

     More details

  • Pure Lagrangian vortex methods for the simulation of decaying isotropic turbulence Reviewed

    YOKOTA RIO, OBI SHINNOSUKE

    Proceedings of 5th Int. Symp. Turbulent Shear Flow Phenomena   2007.8

     More details

    Language:English   Publishing type:Research paper (conference, symposium, etc.)  

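    A homogeneous isotropic turbulent field was computed with the vortex method. Through a direct comparison with DNS, which had not been carried out before, the two methods were compared, and properties such as energy conservation, which had not been sufficiently evaluated for the vortex method, were examined.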

    researchmap

  • Vortex Methods for the Calculation of Homogeneous Shear Flows

    Yokota Rio, Obi Shinnosuke

    2007   195 - 195   2007

     More details

    Language:Japanese  

    The vortex method is applied to the calculation of a homogeneous shear flow of Re_λ=25, and the results are compared with a finite difference calculation. The fast multipole method was modified for the shear periodic boundary condition. The core spreading method and particle strength exchange were selected as the viscous diffusion scheme. The time evolution of the energy spectrum, kinetic energy and enstrophy are shown for these different cases. The component energy ratio and particle density distribution were also examined for all cases.

    CiNii Books

    researchmap

  • 662 Mesh-free Turbulence Simulation Using Vortex Methods

    YOKOTA Rio, OBI Shinnosuke

    The Proceedings of Conference of Tokai Branch   2007 ( 0 )   323 - 324   2007

     More details

    Language:Japanese   Publisher:The Japan Society of Mechanical Engineers  

    DOI: 10.1299/jsmetokai.2007.56.323

    CiNii Books

    researchmap

  • 1202 Calculation of Fluid Structure Interaction using VEM and BEM(1)

    Yokota Rio, Obi Shinnosuke

    The Proceedings of the Fluids engineering conference   2006 ( 0 )   _1202 - a_   2006

     More details

    Language:Japanese   Publisher:The Japan Society of Mechanical Engineers  

    Fluid structure interaction of a circular cylinder is simulated by the coupling of a Vortex Element Method and Boundary Element Method. Both methods are volume-mesh-free, and thus alleviate the burden of handling moving boundary problems in general. The rigid vibration of the cylinder is examined first, and compared with experimental studies. Then, the elastic deformation is considered.

    DOI: 10.1299/jsmefed.2006._1202-a_

    CiNii Books

    researchmap

  • 2016 Simulation of a Wake using a 3-D Vortex Element Method

    YOKOTA Rio, OBI Shinnosuke

    The proceedings of the JSME annual meeting   2006 ( 0 )   31 - 32   2006

     More details

    Language:Japanese   Publisher:The Japan Society of Mechanical Engineers  

    The wake of a bluff body is calculated using a 3-D Vortex Method and turbulence statistics are calculated from the results, which are then compared with measurements by Particle Image Velocimetry and a URANS calculation. It is shown that, compared to a 2-D Vortex Method calculation previously performed by the authors, the results of the 3-D calculation are much closer to the PIV measurements and URANS calculation.

    DOI: 10.1299/jsmemecjo.2006.1.0_31

    CiNii Books

    researchmap

  • Vortex flow simulation of multiple bluff bodies

    YOKOTA Rio, TOKAI Norihiko, OBI Shinnosuke

    The Proceedings of The Computational Mechanics Conference   2004 ( 0 )   731 - 732   2004

     More details

    Language:Japanese   Publishing type:Research paper (conference, symposium, etc.)   Publisher:The Japan Society of Mechanical Engineers  

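    To compare and validate numerical prediction methods for turbulence, the flow around bluff bodies placed in a uniform stream was computed with the vortex method and unsteady RANS. For the two-dimensional calculations, no large difference was found between the two methods in either accuracy or computation time.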

    DOI: 10.1299/jsmecmd.2004.17.731

    CiNii Books

    researchmap

Books

  • High Performance Parallelism Pearls

    YOKOTA Rio ( Role: Joint editor; N-body methods on Xeon Phi coprocessors )

    Morgan Kaufmann  2014.11 

     More details

  • GPU Computing Gems

    Morgan Kaufmann  2011  ( ISBN:0123849888 )

     More details

MISC

  • Attempt to improve the computational performance by multi-processes GPU execution

    大島聡史, 伊田明弘, 河合直聡, 深谷猛, 横田理央, 山崎市太郎

    計算工学講演会論文集(CD-ROM)   29   2024

  • Large-Scale Acceleration of QR Decomposition of BLR Matrices Using CUDA Fortran + MIG + UVM

    大島聡史, 伊田明弘, 河合直聡, 横田理央, 山崎市太郎

    情報処理学会研究報告(Web)   2023 ( HPC-190 )   2023

  • QR Decomposition of BLR Matrices on Multi-Instance GPUs

    大島聡史, 伊田明弘, 横田理央, 山崎市太郎

    日本応用数理学会年会講演予稿集(CD-ROM)   2022   2022

  • Evaluating the Usefulness of Stochastic Weight Averaging in Large-Batch Training

    所畑, 貴大, 長沼, 大樹, 横田, 理央

    第82回全国大会講演論文集   2020 ( 1 )   359 - 360   2020.2

     More details

    Language:Japanese  

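    Training recent deep neural network models uses enormous numbers of parameters and amounts of data, so training time tends to increase, and accelerating training is an urgent issue. Large-batch training, which seeks a speedup simply by increasing the amount of data used at once, has been shown empirically to converge to sharp solutions with poor generalization, because the noise present in small-batch training has less effect. In this study, we apply SWA (Stochastic Weight Averaging), a method that stochastically averages model parameters, to large-batch training and examine how much it mitigates the degradation of generalization performance.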

    CiNii Books

    CiNii Research

    researchmap

  • Smoothing the Objective Function for Large-Scale Parallel Deep Learning

    長沼, 大樹, 横田, 理央

    第81回全国大会講演論文集   2019 ( 1 )   315 - 316   2019.2

     More details

    Language:Japanese  

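    Deep learning achieves performance that far surpasses other machine learning methods by training on extremely large datasets, but because of the enormous computation time, shortening training time through large-scale parallelization is an urgent issue. Problems in deep learning reduce to minimizing a function that represents the error with respect to the training data, and recent studies have shown that the increase in batch size accompanying large-scale parallelization degrades the generalization performance of the trained model. This study focuses on smoothing the objective function as a solution to this problem and examines smoothing methods that do not degrade generalization performance even as the batch size increases.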

    CiNii Books

    CiNii Research

    researchmap

  • Optimization of Many Small Dense Matrix-Vector Products Toward Accelerating Hierarchical Matrix Computation on GPUs

    大島聡史, 山崎市太郎, 伊田明弘, 横田理央

    日本応用数理学会年会講演予稿集(CD-ROM)   2019   2019

  • Software Auto-Tuning for Hierarchical Matrix Computation

    大島 聡史, 山崎 市太郎, 伊田 明弘, 横田 理央

    計算工学講演会論文集 Proceedings of the Conference on Computational Engineering and Science   23   2018.6

     More details

    Language:Japanese   Publisher:日本計算工学会  

    J-GLOBAL

    researchmap

  • Preface

    Rio Yokota, Michèle Weiland, John Shalf, Sadaf Alam

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   11203 LNCS   V   2018

     More details

  • Preface

    Rio Yokota, Weigang Wu

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   10776 LNCS   V - VI   2018

     More details

  • Preface

    Rio Yokota, Michèle Weiland, David Keyes, Carsten Trinitis

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)   10876 LNCS   V - VI   2018

     More details

  • Accelerating Convolutional Neural Networks Using Low-Rank Tensor Decomposition

    117 ( 238 )   1 - 6   2017.10

     More details

    Language:Japanese  

    CiNii Books

    researchmap

  • Improvement of speed using low precision arithmetic in deep learning and performance evaluation of accelerator Reviewed

    117 ( 238 )   101 - 107   2017.10

     More details

    Authorship:Last author   Language:Japanese  

    CiNii Books

    researchmap

  • Accelerating Convolutional Neural Networks using Low-Rank Approximation

    22   5p   2017.5

     More details

    Language:Japanese  

    researchmap

  • Optimization of Hierarchical Matrix Computation for GPUs

    大島聡史, 山崎市太郎, 伊田明弘, 横田理央

    日本応用数理学会年会講演予稿集(CD-ROM)   2017   2017

  • A Matrix-free Preconditioner for the Helmholtz Equation based on the Fast Multipole Method

    Huda Ibeid, Rio Yokota, David Keyes

    2016.8

     More details

    Fast multipole methods (FMM) were originally developed for accelerating
    $N$-body problems for particle-based methods. FMM is more than an $N$-body
    solver, however. Recent efforts to view the FMM as an elliptic Partial
    Differential Equation (PDE) solver have opened the possibility to use it as a
    preconditioner for a broader range of applications. FMM can solve Helmholtz
    problems with optimal $\mathcal{O}(N \log N)$ complexity, has compute-bound
    inner kernels, and highly asynchronous communication patterns. The combination
    of these features makes FMM an interesting candidate as a preconditioner for
    sparse solvers on architectures of the future. The use of FMM as a
    preconditioner allows us to use lower order multipole expansions than would be
    required as a solver because individual solves need not be accurate. This
    reduces the amount of computation and communication significantly and makes the
    time-to-solution competitive with state-of-the-art preconditioners.
    Furthermore, the high asynchronicity of FMM allows it to scale to much larger
    core counts than factorization-based and multilevel methods. We describe our
    tests in reproducible detail with freely available codes.

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1608.02461v1

  • Asynchronous Execution of the Fast Multipole Method Using Charm++

    Mustafa AbdulJabbar, Rio Yokota, David Keyes

    2014.5

     More details

    Fast multipole methods (FMM) on distributed mem- ory have traditionally used
    a bulk-synchronous model of com- municating the local essential tree (LET) and
    overlapping it with computation of the local data. This could be perceived as
    an extreme case of data aggregation, where the whole LET is communicated at
    once. Charm++ allows a much finer control over the granularity of
    communication, and has an asynchronous execution model that fits well with the
    structure of our FMM code. Unlike previous work on asynchronous fast N-body
    methods such as ChaNGa and PEPC, the present work performs a direct comparison
    against the traditional bulk-synchronous approach and the asynchronous approach
    using Charm++. Furthermore, the serial performance of our FMM code is over an
    order of magnitude better than these previous codes, so it is much more
    challenging to hide the overhead of Charm++.

    arXiv

    researchmap

    Other Link: http://arxiv.org/pdf/1405.7487v1

  • Towards a Dataflow FMM using the OmpSs Programming Model

    2012 ( 12 )   1 - 7   2012.9

     More details

    Language:English  

    CiNii Books

    researchmap

  • Parallelizing ExaFMM with MassiveThreads Task Parallel Library and Its Evaluation

    2012 ( 13 )   1 - 13   2012.7

     More details

    Language:Japanese  

    CiNii Books

    researchmap

  • Turbulence Simulation Using 4096³ Vortex Particles on 4096 GPUs

    Yokota Rio, Barba Lorena, Narumi Tetsu

    Tsubame ESJ. : e-science journal   6   17 - 22   2012.7

     More details

    Language:English   Publisher:Global Scientific Information and Computing Center, Tokyo Institute of Technology  

    researchmap


Presentations

  • Matrices in Deep Neural Networks and How to Compute Them in Parallel Invited

    Rio Yokota

    IEEE CLUSTER 2022  2022.9 

     More details

    Event date: 2022.9

    Language:English   Presentation type:Oral presentation (keynote)  

    researchmap

  • Verification of the Generalization Performance of Second-Order Optimization in Deep Learning

    石井央, 横田理央

    The 84th National Convention of IPSJ  2022.3 

     More details

    Event date: 2022.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Effect of Batch Size on Generalization Performance in Vision Transformers

    中村秋海, 横田理央

    The 84th National Convention of IPSJ  2022.3 

     More details

    Event date: 2022.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • 42 TFlops Hierarchical N-body Simulation on GPUs with Applications in both Astrophysics and Turbulence International conference

    HAMADA Tsuyoshi, YOKOTA Rio, NITADORI Keigo, NARUMI Tetsu, YASUOKA Kenji, TAIJI Makoto, OGURI Kiyoshi

    Supercomputing 2009  2009.11 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Lagrangian Vortex Methods in Turbulent Channel Flows International conference

    YOKOTA Rio, FUKAGATA Koji, OBI Shinnosuke

    12th EUROMECH European Turbulence Conference  2009.9 

     More details

    Language:English   Presentation type:Poster presentation  

    researchmap

  • Validation of Vortex Methods in a Turbulent Channel Flow

    YOKOTA Rio, OBI Shinnosuke

    Annual meeting of the JSFM  2009.9 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Fast Multipole Methods on GPUs for the Meshfree Simulation of Turbulence International conference

    YOKOTA Rio, NARUMI Tetsu, SAKAMAKI Ryuji, YASUOKA Kenji, OBI Shinnosuke

    10th US National Congress on Computational Mechanics  2009.7 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Comparing the treecode with FMM on GPUs for vortex particle simulations of a leapfrogging vortex ring

    22nd International Conference on Parallel Computational Fluid Dynamics  2010 

     More details

  • Lagrangian simulation of turbulence using vortex methods

    2nd International Workshops on Advances in Computational Mechanics  2010 

     More details

  • Range of Applications for the Fast Multipole Method on GPUs

    Accelerated Computing  2010 

     More details

  • DNS of Homogeneous Turbulence Using Vortex Methods Accelerated by the FMM on a Cluster of GPUs International conference

    YOKOTA Rio, NARUMI Tetsu, SAKAMAKI Ryuji, KAMEOKA Shun, YASUOKA Kenji, OBI Shinnosuke

    21st International Conference on Parallel Computational Fluid Dynamics  2009.5 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • 42 TFlops Hierarchical N-body Simulation on GPUs with Applications in both Astrophysics and Turbulence

    Supercomputing 2009  2009 

     More details

  • Lagrangian Vortex Methods in Turbulent Channel Flows

    12th EUROMECH European Turbulence Conference  2009 

     More details

    Presentation type:Poster presentation  

    researchmap

  • Vortex Method Simulation of Turbulent Channel Flow International conference

    YOKOTA Rio, OBI Shinnosuke

    Annual meeting of the JSFM  2008.9 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Direct numerical simulation of homogeneous shear flow using vortex methods International conference

    YOKOTA Rio, OBI Shinnosuke

    4th International Conference on Vortex Flows and Vortex Models  2008.4 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Mesh-free simulation of the homogeneous shear flow using vortex methods

    YOKOTA Rio, OBI Shinnosuke

    23rd IIS Turbulence and Shear Flow Dynamics Symposium  2008.3 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Meshfree Simulation of Turbulence Using the Fast Multipole Methods on GPUs

    22nd Symposium on Computational Fluid Dynamics  2008 

     More details

  • Validation of Vortex Methods in a Turbulent Channel Flow

    Annual meeting of the JSFM  2009 

     More details

  • Fast Multipole Methods on GPUs for the Meshfree Simulation of Turbulence

    10th US National Congress on Computational Mechanics  2009 

     More details

  • DNS of Homogeneous Turbulence Using Vortex Methods Accelerated by the FMM on a Cluster of GPUs

    21st International Conference on Parallel Computational Fluid Dynamics  2009 

     More details

  • Meshfree Simulation of Turbulence Using the Fast Multipole Methods on GPUs

    YOKOTA Rio, NARUMI Tetsu, SAKAMAKI Ryuji, KAMEOKA Shun, YASUOKA Kenji, OBI Shinnosuke

    22nd Symposium on Computational Fluid Dynamics  2008.12 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Fast N-body Methods on Many-core and Heterogeneous Systems International conference

    YOKOTA Rio

    International Workshop on Computational Science and Numerical Analysis  2012.3 

     More details

    Language:English   Presentation type:Oral presentation (invited, special)  

    Venue:Tokyo  

    researchmap

  • Petascale Turbulence Simulation Using FMM International conference

    NARUMI Tetsu, YOKOTA Rio, BARBA Lorena, YASUOKA Kenji

    HOKKE-19  2011.11 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Petaflops Scale Turbulence Simulation on TSUBAME 2.0 International conference

    YOKOTA Rio

    GPU@BU Workshop  2011.11 

     More details

    Language:English   Presentation type:Public lecture, seminar, tutorial, course, or other speech  

    researchmap

  • Parameter Tuning of a Hybrid Treecode-FMM on GPUs International conference

    YOKOTA Rio, BARBA Lorena

    The First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems  2011.6 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • A Parallel Numerical Simulation of Dust Particles Using Direct Numerical Simulation International conference

    NGUYEN Hoang, YOKOTA Rio, Stenchikov Gera

    European Geosciences Union General Assembly  2012.4 

     More details

    Language:English   Presentation type:Poster presentation  

    researchmap

  • Large Scale Multi-GPU FMM for Bioelectrostatics

    SIAM Conference on Computational Science and Engineering  2011 

     More details

  • Fast multipole method vs. spectral methods for the simulation of isotropic turbulence on GPUs International conference

    YOKOTA Rio, BARBA Lorena

    23rd International Conference on Parallel Computational Fluid Dynamics  2011.5 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Large Scale Multi-GPU FMM for Bioelectrostatics International conference

    YOKOTA Rio, BARBA Lorena

    SIAM Conference on Computational Science and Engineering  2011.2 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • 12 Steps to a Fast Multipole Method on GPUs International conference

    YOKOTA Rio

    Pan-American Advanced Studies Institute  2011.1 

     More details

    Language:English   Presentation type:Public lecture, seminar, tutorial, course, or other speech  

    researchmap

  • Parameter Tuning of a Hybrid Treecode-FMM on GPUs

    The First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems  2011 

     More details

  • Performance of the fast multipole method on GPUs using various kernels International conference

    YOKOTA Rio, BARBA Lorena

    9th World Congress on Computational Mechanics  2010.7 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • RBF interpolation using Gaussians with domain decomposition on GPUs International conference

    YOKOTA Rio, BARBA Lorena

    SIAM annual meeting  2010.7 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Comparing the treecode with FMM on GPUs for vortex particle simulations of a leapfrogging vortex ring International conference

    YOKOTA Rio, BARBA Lorena

    22nd International Conference on Parallel Computational Fluid Dynamics  2010.5 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Lagrangian simulation of turbulence using vortex methods International conference

    YOKOTA Rio, OBI Shinnosuke

    2nd International Workshops on Advances in Computational Mechanics  2010.3 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • 12 Steps to a Fast Multipole Method on GPUs

    Pan-American Advanced Studies Institute  2011 

     More details

  • (Really) Fast macromolecular electrostatics – fast algorithms, open software and accelerated computing International conference

    YOKOTA Rio, BARDHAN Jaydeep, KNEPLEY Matt, BARBA Lorena

    ACS Division of Physical Chemistry 240th National Meeting  2010.8 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Vortex flow simulation of multipole bluff bodies

    19th Symposium on Computational Fluid Dynamics  2005 

     More details

  • Vortex flow simulation of multipole bluff bodies

    YOKOTA Rio, TOKAI Norihiko, OBI Shinnosuke

    17th Conference on Computational Mechanics  2004.11 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Vortex flow simulation of multipole bluff bodies

    17th Conference on Computational Mechanics  2004 

     More details

  • Vortex flow simulation of multipole bluff bodies

    YOKOTA Rio, OBI Shinnosuke

    19th Symposium on Computational Fluid Dynamics  2005.12 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Vortex flow simulation of multipole bluff bodies International conference

    YOKOTA Rio, OBI Shinnosuke

    3rd International Conference on Vortex Flows and Vortex Models  2005.11 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Vortex flow simulation of multipole bluff bodies

    3rd International Conference on Vortex Flows and Vortex Models  2005 

     More details

  • Vortex Method Simulation of Turbulent Channel Flow

    Annual meeting of the JSFM  2008 

     More details

  • Direct numerical simulation of homogeneous shear flow using vortex methods

    4th International Conference on Vortex Flows and Vortex Models  2008 

     More details

  • Vortex methods for the calculation of homogeneous shear flows

    YOKOTA Rio, OBI Shinnosuke

    Annual meeting of the JSFM  2007.8 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Mesh-free turbulence simulation using vortex methods

    YOKOTA Rio, OBI Shinnosuke

    56th Conference of the JSME Tokai Branch  2007.3 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Computation of wing-tip vortex by a three-dimensional vortex method

    21st Symposium on Computational Fluid Dynamics  2007 

     More details

  • Meshfree direct numerical simulation of turbulence using the vortex method on parallel MDGRAPE-3 boards along with the fast multipole method

    Next-Generation Supercomputing Symposium  2007 

     More details

    Presentation type:Poster presentation  

    researchmap

  • Mesh-free simulation of the homogeneous shear flow using vortex methods

    23rd IIS Turbulence and Shear Flow Dynamics Symposium  2008 

     More details

  • Computation of wing-tip vortex by a three-dimensional vortex method

    SATO Akira, YOKOTA Rio, OBI Shinnosuke

    21st Symposium on Computational Fluid Dynamics  2007.12 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Mesh-free direct numerical simulation of turbulence using the vortex method on parallel MDGRAPE-3 boards along with the fast multipole method

    YOKOTA Rio, NARUMI Tetsu, YASUOKA Kenji, EBISUZAKI Toshikazu, OBI Shinnosuke

    Next-Generation Supercomputing Symposium  2007.10 

     More details

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Pure Lagrangian vortex methods for the simulation of decaying isotropic turbulence International conference

    YOKOTA Rio, OBI Shinnosuke

    International Symposium on Turbulence and Shear Flow Phenomena  2007.8 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Calculation of the decay of colliding turbulent vortex rings

    4th International Conference on Vortex Flows and Vortex Models  2008 

     More details

  • Pure Lagrangian vortex methods for the simulation of decaying isotropic turbulence

    5th International Symposium on Turbulence and Shear Flow Phenomena  2007 

     More details

  • Simulation of homogeneous isotropic turbulence using the vortex method

    20th Symposium on Computational Fluid Dynamics  2006 

     More details

  • Calculation of fluid structure interaction using VEM and BEM

    Conference of the JSME Fluid Engineering Division  2006 

     More details

  • Simulation of a wake using a 3-D vortex element method

    Annual Meeting of the JSME  2006 

     More details

  • Vortex flow simulation between multiple bridge decks

    Whither Turbulence Prediction and Control  2006 

     More details

  • Simulation of homogeneous isotropic turbulence using the vortex method

    YOKOTA Rio, OBI Shinnosuke

    20th Symposium on Computational Fluid Dynamics  2006.12 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Calculation of fluid structure interaction using VEM and BEM

    YOKOTA Rio, OBI Shinnosuke

    Conference of the JSME Fluid Engineering Division  2006.10 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Simulation of a wake using a 3-D vortex element method

    YOKOTA Rio, OBI Shinnosuke

    Annual Meeting of the JSME  2006.9 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Vortex flow simulation between multiple bridge decks International conference

    YOKOTA Rio, OBI Shinnosuke

    Whither Turbulence Prediction and Control  2006.3 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Vortex methods for the calculation of homogeneous shear flows

    Annual meeting of the JSFM  2007 

     More details

  • Meshfree turbulence simulation using vortex methods

    56th Conference of the JSME Tokai Branch  2007 

     More details

  • Performance of the fast multipole method on GPUs using various kernels

    9th World Congress on Computational Mechanics  2010 

     More details

  • Fast N-body Methods as a Compute-Bound Preconditioner for Sparse Solvers on GPUs International conference

    YOKOTA Rio

    GPU Technology Conference  2014.3 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Fast Multipole Method Preconditioning International conference

    PESTANA Jennifer, YOKOTA Rio, IBEID Huda, KEYES David

    International Conference On Preconditioning Techniques For Scientific And Industrial Applications  2013.6 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Fork-Join and Data-Driven Execution Models on Multi-core Architectures: Case Study of the FMM International conference

    Abdelhalim Amer, Naoya Maruyama, Miquel Pericas, Kenjiro Taura, Rio Yokota, Satoshi Matsuoka

    International Supercomputing Conference  2013.6 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Leipzig  

    researchmap

  • Advances in Fast Multipole Methods for Scalable Electrostatics Calculations Invited International conference

    YOKOTA Rio

    Workshop: Electrostatics methods in Molecular Simulation  2013.5 

     More details

    Language:English   Presentation type:Oral presentation (invited, special)  

    researchmap

  • ExaFMM – a Testbed for Comparing Various Implementations of the FMM International conference

    YOKOTA Rio

    SIAM Conference on Computational Science and Engineering  2015.3 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Fast Multipole Preconditioners for Sparse Linear Solvers International conference

    IBEID Huda, YOKOTA Rio, KEYES David

    11th World Congress on Computational Mechanics  2014.7 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Communication Complexity of the Fast Multipole Method and its Algebraic Variants International conference

    YOKOTA Rio, KEYES David

    CBMS-NSF Conference: Fast Direct Solvers for Elliptic PDEs  2014.6 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • High Performance Numerical Algorithms for Seismic and Reservoir Simulations International conference

    LTAIEF Hatem, YOKOTA Rio

    GPU Technology Conference  2014.3 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Various Implementations of FMM and Their Performance on Future Architectures International conference

    YOKOTA Rio

    Multi-resolution Interactions Workshop  2015.8 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Fast Multipole Method as Preconditioner International conference

    Huda Ibeid, Jennifer Pestana, Rio Yokota, David Keyes

    SIAM Conference on Computational Science and Engineering  2015.3 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Salt Lake City  

    researchmap

  • Recent Trends in Hierarchical N-body Methods on GPUs International conference

    YOKOTA Rio, BARBA Lorena

    GPU Technology Conference  2012.5 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Scaling Fast Multipole Methods up to 4000 GPUs Invited International conference

    YOKOTA Rio, NARUMI Tetsu, BARBA Lorena, YASUOKA Kenji

    ATIP/A*CRC Workshop on Accelerator Technologies for High Performance Computing  2012.5 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Running Fast Multipole Method on the Full Node of TSUBAME and K computer International conference

    YOKOTA Rio

    Scalable Hierarchical Algorithms for Extreme Computing  2012.4 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Petascale Fast Multipole Methods on GPUs Invited International conference

    YOKOTA Rio

    GPU Technology Conference Japan  2012.7 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Petascale Fast Multipole Methods on GPUs Invited International conference

    YOKOTA Rio

    The 11th International Symposium on Parallel and Distributed Computing  2012.6 

     More details

    Language:English   Presentation type:Oral presentation (keynote)  

    researchmap

  • Scalable Force Directed Graph Layout Algorithms Using Fast Multipole Methods International conference

    YUNIS Enas, YOKOTA Rio, AHMADIA Aron

    The 11th International Symposium on Parallel and Distributed Computing  2012.6 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Data-Driven Fast Multipole Method on Distributed Memory Systems with Hardware Accelerators International conference

    LTAIEF Hatem, YOKOTA Rio

    21st International Conference on Domain Decomposition Methods  2012.6 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Investigating New Numerical Techniques for Reservoir Simulations on GPUs International conference

    ABDELFATTAH Ahmad, LTAIEF Hatem, YOKOTA Rio

    GPU Technology Conference  2013.3 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Fast Multipole Method as a Preconditioner International conference

    IBEID Huda, YOKOTA Rio, KEYES David

    SIAM Conference on Computational Science and Engineering  2013.2 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • A Task Parallelism Meets Fast Multipole Methods International conference

    TAURA Kenjiro, NAKASHIMA Jun, YOKOTA Rio, MARUYAMA Naoya

    Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems  2012.11 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Compute-Memory Tradeoff in Hierarchical Low-Rank Approximation Methods International conference

    Rio Yokota

    SIAM Conference on Computational Science and Engineering  2017.2 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Atlanta  

    researchmap

  • Energy Conservation of Fast Multipole Methods in Classical Molecular Dynamics Simulations Invited International conference

    Rio Yokota

    7th AICS International Symposium  2017.2 

     More details

    Language:English   Presentation type:Oral presentation (invited, special)  

    Venue:Kobe  

    researchmap

  • Communication Reducing Algorithms for Distributed Hierarchical N-Body Methods International conference

    Mustafa AbdulJabbar, George Markomanolis, Huda Ibeid, Rio Yokota, David Keyes

    32nd International Conference, ISC High Performance  2017.6 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Frankfurt  

    researchmap

  • Accelerating Convolutional Neural Networks Using Low-Rank Approximation

    Kazuki Oosawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota

    22nd Conference of Japan Computational Engineering Society  2017.5 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Omiya  

    researchmap

  • Using Low-Rank Approximation in Convolutional Neural Networks

    Yoshifumi Motoyama, Toshio Endo, Satoshi Matsuoka, Rio Yokota, Keisuke Fukuda

    158th Research Presentation Seminar in High Performance Computing  2017.3 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Atami  

    researchmap

  • Acceleration of Matrix Multiplication in Deep Learning Using Low-Rank Approximation

    Akira Sekiya, Kazuki Oosawa, Hiroki Naganuma, Rio Yokota

    158th Research Presentation Seminar in High Performance Computing  2017.3 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Atami  

    researchmap

  • Performance Evaluation of Computation and Communication Kernels of the Fast Multipole Method on Intel Manycore Architecture International conference

    Mustafa AbdulJabbar, Mohammed Al Farhan, Rio Yokota, David Keyes

    3rd International European Conference on Parallel and Distributed Computing  2017.8 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Gotingen  

    researchmap

  • Optimization of Hierarchical Matrix Computations on a Cluster of GPUs

    Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

    Summer United Workshops on Parallel, Distributed and Cooperative Processing  2017.7 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Akita  

    researchmap

  • Accelerating Matrix Multiplication in Deep Learning by Using Low-Rank Approximation International conference

    Kazuki Oosawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota

    The 2017 International Conference on High Performance Computing & Simulation  2017.7 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Genoa  

    researchmap

  • Hierarchical Low-Rank Approximations at Extreme Scale Invited International conference

    Rio Yokota

    32nd International Conference, ISC High Performance  2017.6 

     More details

    Language:English   Presentation type:Oral presentation (invited, special)  

    Venue:Frankfurt  

    researchmap

  • Fast Multipole Method as a Matrix-free Hierarchical Low-rank Approximation International conference

    YOKOTA Rio

    International Workshop on Eigenvalue Problems  2015.9 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • A Common API for Fast Multipole Methods International conference

    YOKOTA Rio

    Accelerate Data Analytics and Computing Workshop  2016.1 

     More details

    Language:English   Presentation type:Symposium, workshop panel (nominated)  

    researchmap

  • Tuning Parameters in FMM Invited

    YOKOTA Rio

    Seventh Symposium on Automatic Tuning Technology and its Application  2015.12 

     More details

    Language:Japanese   Presentation type:Oral presentation (invited, special)  

    researchmap

  • Comparison of FMM and HSS at Large Scale International conference

    YOKOTA Rio, ROUET Francois-Henri, LI Xiaoye

    SIAM Conference on Applied Linear Algebra  2015.10 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Preconditioning Sparse Matrices Using a Highly Scalable Fast Multipole Method International conference

    YOKOTA Rio, IBEID Huda, KEYES David

    3rd International Workshops on Advances in Computational Mechanics  2015.10 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Improving Data Locality of Fast Multipole Methods International conference

    YOKOTA Rio

    Third Workshop on Programming Abstractions for Data Locality  2016.10 

     More details

    Language:English   Presentation type:Symposium, workshop panel (public)  

    researchmap

  • Fast Multipole Method Library for Multiple Architectures and its Application to Molecular and Fluid Simulations

    YOKOTA Rio

    8th Symposium of the Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures  2016.7 

     More details

    Language:Japanese   Presentation type:Symposium, workshop panel (public)  

    researchmap

  • Performance Portability of FMM

    YOKOTA Rio

    21st Conference of Japan Computational Engineering Society  2016.5 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A Matrix-Free Preconditioner for Elliptic Solvers Based on the Fast Multipole Method International conference

    Huda Ibeid, Rio Yokota, David Keyes

    SIAM Conference on Parallel Processing for Scientific Computing  2016.4 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Paris  

    researchmap

  • Tapas: An Implicitly Parallel Programming Framework For Hierarchical N-body Algorithms International conference

    Keisuke Fukuda, Motohiko Matsuda, Naoya Maruyama, Rio Yokota, Kenjiro Taura, Satoshi Matsuoka

    The 22nd IEEE International Conference on Parallel And Distributed Systems  2016.12 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Wuhan  

    researchmap

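  • O(N) LU Factorization of Dense Matrices with High Parallelism Invited

    Rio Yokota

    RIMS Workshop "Next-Generation Information Society Opened Up by Numerical Analysis: From the Edge to Fugaku"  2022.12 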

     More details

    Language:Japanese   Presentation type:Oral presentation (invited, special)  

    researchmap

  • Large-Scale Pre-training of Vision Transformers Using Synthetic Images Invited

    Rio Yokota

    Symposium of the Research Initiative on Data Science and AI for Solving Social Issues  2022.9 

     More details

    Language:Japanese   Presentation type:Oral presentation (invited, special)  

    researchmap

  • Large-Scale Pre-training of Vision Transformers Using Synthetic Images Invited

    Rio Yokota

    DENSO IT LAB x TOKYO TECH Discussion Night in MIRU  2022.7 

     More details

    Language:Japanese   Presentation type:Oral presentation (invited, special)  

    researchmap

  • Smoothing of the Objective Function for Large Scale Parallel Deep Learning

    Hiroki Naganuma, Rio Yokota

    The 81st National Convention of IPSJ  2019.3 

     More details

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Fukuoka  

    researchmap

  • Scaling Deep Learning to Thousands of GPUs Invited International conference

    Rio Yokota

    HPC 2018  2018.7 

     More details

    Language:English   Presentation type:Oral presentation (invited, special)  

    Venue:Cetraro  

    researchmap

  • Energy Conserving Fast Multipole Methods for the Calculation of Long-range Interactions Invited International conference

    Rio Yokota

    Mathematics in Action: Modeling and analysis in molecular biology and electrophysiology  2018.6 

     More details

    Language:English   Presentation type:Oral presentation (invited, special)  

    Venue:Suzhou  

    researchmap

  • Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU clusters International conference

    Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack Dongarra

    32nd IEEE International Parallel & Distributed Processing Symposium  2018.5 

    Language:English   Presentation type:Oral presentation (general)

    Venue:Vancouver

  • Optimization of Hierarchical Matrix Computation on GPU International conference

    Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

    SC Asia  2018.3 

    Language:English   Presentation type:Oral presentation (general)

    Venue:Singapore

  • Variational Inference in Deep Learning Using Natural Gradient Descent

    Hikaru Nakata, Kazuki Osawa, Rio Yokota

    The 81st National Convention of IPSJ  2019.3 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Fukuoka

  • Optimization Methods for Large Scale Distributed Deep Learning Invited International conference

    Rio Yokota

    IPAM Workshop I: Big Data Meets Large-Scale Computing  2018.9 

    Language:English   Presentation type:Oral presentation (invited, special)

    Venue:Los Angeles

  • Hyper-parameter Tuning of Approximate Natural Gradient Methods for Highly Parallel Distributed Deep Learning

    Hiroki Naganuma, Shun Iwase, Linsho Kaku, Hikaru Nakata, Rio Yokota

    Forum on Information Technology 2018  2018.9 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Fukuoka

  • Early Application Results on TSUBAME 3 Invited International conference

    Rio Yokota

    Smoky Mountains Computational Sciences and Engineering Conference  2018.8 

    Language:English   Presentation type:Oral presentation (invited, special)

    Venue:Gatlinburg

  • Second Order Optimization for Large Scale Parallel Deep Learning Through Analysis of the Fisher Information Matrix

    Kazuki Osawa, Rio Yokota, Chuan-Sheng Foo, Vijay Chandrasekhar

    The 81st National Convention of IPSJ  2019.3 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Fukuoka

  • Batched QR Decomposition Using TensorCores

    Hiroyuki Ootomo, Rio Yokota

    The 81st National Convention of IPSJ  2019.3 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Fukuoka

  • Acceleration of Convolutional Neural Networks Using Low-Rank Tensor Decomposition

    K. Osawa, A. Sekiya, H. Naganuma, R. Yokota

    Pattern Recognition and Media Understanding  2017.10 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Kumamoto

  • Evaluating the Compression Efficiency of the Filters in Convolutional Neural Networks International conference

    Kazuki Oosawa, Rio Yokota

    The 26th International Conference on Artificial Neural Networks  2017.9 

    Language:English   Presentation type:Oral presentation (general)

    Venue:Sardinia

  • Acceleration of Compressed Models in Deep Learning Using Half Precision Arithmetic

    H. Naganuma, K. Osawa, A. Sekiya, R. Yokota

    Japan Society for Industrial and Applied Mathematics Annual Meeting  2017.9 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Tokyo

  • Distributed Learning of Deep Neural Networks Using the Kronecker Factorization of the Fisher Information Matrix

    Hiroyuki Otomo, Kazuki Osawa, Rio Yokota

    The 163rd Workshop on High Performance Computing  2018.3 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Ehime

  • Accelerating Convolutional Neural Networks Using Low Precision Arithmetic International conference

    Hiroki Naganuma, Rio Yokota

    HPC Asia  2018.1 

    Language:English   Presentation type:Poster presentation

    Venue:Tokyo

  • Verification of Low-precision Arithmetic for the Acceleration of Convolutional Neural Networks

    Hiroki Naganuma, Rio Yokota

    GTC Japan  2017.12 

    Language:Japanese   Presentation type:Poster presentation

    Venue:Tokyo

  • Evaluating the Performance of Deep Learning with Low Precision Arithmetic

    H. Naganuma, A. Sekiya, K. Osawa, H. Otomo, Y. Kuwamura, R. Yokota

    Pattern Recognition and Media Understanding  2017.10 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Kumamoto

  • Deep Learning Using Kronecker-factored Approximation of Fisher Matrix

    Hiroyuki Ohtomo, Kazuki Osawa, Rio Yokota

    The 80th National Convention of IPSJ  2018.3 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Tokyo

  • Can we use Hierarchical Low-Rank Approximation for Deep Learning? Invited International conference

    Rio Yokota

    HPC Saudi 2018  2018.3 

    Language:English   Presentation type:Oral presentation (keynote)

    Venue:Jeddah

  • Hyper-parameter Tuning for Approximate Natural Gradient Methods

    Yuji Kuwamura, Kazuki Osawa, Rio Yokota

    The 80th National Convention of IPSJ  2018.3 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Tokyo

  • Self-supervised Continual Pretraining for Class Incremental Image Classification

    Hikaru Nakata, Nakamasa Inoue, Rio Yokota

    Proc. CVPR CLVISION Workshop (Findings)  2021.6 

    Language:English

  • Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis

    Rise Ooi, Takeshi Iwashita, Takeshi Fukaya, Akihiro Ida, Rio Yokota

    HPC Asia 2020  2020.1 

    Language:English   Presentation type:Oral presentation (general)

  • Practical Deep Learning with Bayesian Principles International conference

    Osawa, K, Swaroop, S, Jain, A, Eschenhagen, R, Turner, R. E, Yokota, R, Khan, M. E

    The 33rd Conference on Neural Information Processing Systems  2019.12 

    Language:English   Presentation type:Poster presentation

    Venue:Vancouver

  • On Empirical Analysis of Layer-wised Learning Rate Schedule

    Hiroki Naganuma, Rio Yokota

    ACML 2019 Workshop on Statistics & Machine Learning Researchers  2019.11 

    Language:English   Presentation type:Poster presentation

  • TSQR on TensorCores

    Hiroyuki Ootomo, Rio Yokota

    The International Conference for High Performance Computing, Networking, Storage, and Analysis  2019.11 

    Language:English   Presentation type:Poster presentation

  • Randomized SVD on TensorCores

    Hiroyuki Ootomo, Rio Yokota

    ISC High Performance 2020  2020.6 

    Language:English   Presentation type:Poster presentation

  • Distributed Memory Task-Based Block Low Rank Direct Solver

    Sameer Deshmukh, Rio Yokota

    ISC High Performance 2020  2020.6 

    Language:English   Presentation type:Poster presentation

  • QR Decomposition of Block Low-Rank Matrices

    Muhammad Ridwan Apriansyah, Rio Yokota

    HPC Asia 2020  2020.1 

    Language:English   Presentation type:Poster presentation

  • Distributed Memory Task-Based Block Low Rank Direct Solver

    Sameer Deshmukh, Rio Yokota

    HPC Asia 2020  2020.1 

    Language:English   Presentation type:Poster presentation

  • Runtime System for GPU-based Hierarchical LU factorization

    Qianxing Ma, Rio Yokota

    The International Conference for High Performance Computing, Networking, Storage, and Analysis  2019.11 

    Language:English   Presentation type:Poster presentation

  • Optimization of Numerous Small Dense-Matrix–Vector Multiplications in H-matrix Arithmetic on GPU International conference

    S. Ohshima, I. Yamazaki, A. Ida, R. Yokota

    Auto-Tuning for Multicore and GPU (ATMG) In conjunction with the IEEE MCSoC-19  2019.10 

    Language:English   Presentation type:Oral presentation (general)

    Venue:Singapore

  • A Performance Improvement Approach for Second-Order Optimization in Large Mini-batch Training International conference

    Hiroki Naganuma, Rio Yokota

    2nd High Performance Machine Learning Workshop CCGrid2019 (HPML2019)  2019.5 

    Language:English   Presentation type:Oral presentation (general)

    Venue:Larnaca

  • Exhaustive Study of Hierarchical AllReduce Patterns for Large Messages Between GPUs International conference

    Yuichiro Ueno, Rio Yokota

    19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)  2019.5 

    Language:English   Presentation type:Oral presentation (general)

    Venue:Larnaca

  • Improving the Generalization Gap in Large-batch Training Using Noise Injection

    Hiroki Naganuma, Rio Yokota

    IEICE General Conference  2019.3 

    Language:Japanese   Presentation type:Poster presentation

    Venue:Tokyo

  • Second Order Optimization for Large Scale Parallel Deep Learning Invited

    Rio Yokota, Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse

    IEICE General Conference  2019.3 

    Language:Japanese   Presentation type:Oral presentation (invited, special)

    Venue:Tokyo

  • GPU Implementation of TSQR Using Tensor Cores

    H. Ootomo, R. Yokota

    The 170th Workshop on High Performance Computing  2019.7 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Kitami

  • Flexible and Simplistic Hierarchical Matrix-Based Fast Direct Solver

    P. Spalthoff, R. Yokota

    The 170th Workshop on High Performance Computing  2019.7 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Kitami

  • Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks International conference

    Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka

    IEEE/CVF Conference on Computer Vision and Pattern Recognition  2019.6 

    Language:English   Presentation type:Poster presentation

    Venue:Long Beach

  • Effectiveness of Smoothing for Large-batch Training Using Natural Gradient Descent

    Hiroki Naganuma, Rio Yokota

    The 3rd Cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSIG)  2019.5 

    Language:Japanese   Presentation type:Oral presentation (general)

    Venue:Yokohama

  • Performance Optimizations and Analysis of Distributed Deep Learning with Approximated Second-Order Optimization Method

    Yohei Tsuji, Kazuki Osawa, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka

    International Conference on Parallel Processing: The 1st Workshop on Parallel and Distributed Machine Learning, 48th International Conference on Parallel Processing  2019.8 

    Language:English   Presentation type:Oral presentation (general)

  • Range of Applications for the Fast Multipole Method on GPUs International conference

    YOKOTA Rio

    Accelerated Computing  2010.1 

    Language:English   Presentation type:Oral presentation (general)

  • (Really) Fast macromolecular electrostatics – fast algorithms, open software and accelerated computing

    ACS Division of Physical Chemistry 240th National Meeting  2010 

  • RBF interpolation using Gaussians with domain decomposition on GPUs

    SIAM annual meeting  2010 

Awards

  • ACM Gordon Bell Prize (price/performance)

    2009  

Research Projects

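  • Renewal of Scientific Software to Unlock the Potential of Next-Generation Computers

    Grant number:25H01109  2025.4 - 2028.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Rio Yokota

    Grant amount:¥46,020,000 (Direct Cost: ¥35,400,000, Indirect Cost: ¥10,620,000)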

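  • Research toward Establishing a Methodology for Outcome Extraction Using Large Language Models (LLMs) to Evaluate the Efficacy and Safety of Pharmaceuticals and Related Products

    Grant number:24AC0401  2024.4 - 2027.3

    Ministry of Health, Labour and Welfare  Health, Labour and Welfare Sciences Research Grants  Applied Research

    Manabu Muto, Shigemi Matsumoto, Takako Nakajima, Tomohiro Kuroda, Hiroyuki Yoshihara, Shinji Kobayashi, Naoto Kume, Rio Yokota, Yasuyuki Kato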

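  • Expanding the Applicability of Low-Rank Structured Matrix Methods and Exploiting Diverse Computing Architectures

    Grant number:24K02949  2024.4 - 2027.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Akihiro Ida, Rio Yokota, Toshihiro Hanawa, Takeshi Iwashita, Satoshi Ohshima, Tetsuya Hoshino, Tasuku Hiraishi, Masatoshi Kawai

    Grant amount:¥18,590,000 (Direct Cost: ¥14,300,000, Indirect Cost: ¥4,290,000)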

  • Next-generation high-performance linear solver for future computational science and engineering

    Grant number:23H00462  2023.4 - 2027.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Grant amount:¥45,500,000 (Direct Cost: ¥35,000,000, Indirect Cost: ¥10,500,000)

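  • Compositional Pattern Recognition and Understanding Using Deep Generative Models

    Grant number:23H00490  2023.4 - 2026.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (A)

    Koichi Shinoda, Nakamasa Inoue, Rio Yokota, Rei Kawakami, Ikuro Sato

    Grant amount:¥47,190,000 (Direct Cost: ¥36,300,000, Indirect Cost: ¥10,890,000)

    This project treats the object to be classified (an instance) as a set (bundle) of attributes and decomposes its features by attribute in feature space. By optimizing these attribute features while re-synthesizing the instance from them, it aims to realize a classification method that identifies each attribute with high accuracy and is robust to outliers. To this end, we develop learning methods based on deep generative models and dense attribute annotation. Whereas most prior work assumed a flat semantic structure in which an object and its attribute correspond one-to-one, this work makes it possible to identify multiple attributes simultaneously for objects in which many attributes are intricately intertwined. The emergence of new attributes and classes is also within scope. More specifically, we establish a methodology for compositional pattern recognition and understanding through a "classification by synthesis" approach using deep learning. We evaluate it across three tasks (human action recognition, speaker and emotion recognition, and multimodal recognition) and aim for higher classification performance than conventional methods. In this first year, we built evaluation databases and developed baseline methods for each of human action recognition, speaker and emotion recognition, and multimodal recognition. In parallel, on relatively small-scale tasks, we developed methods that perform classification using generative models such as diffusion models. We also developed methods for training generative models efficiently using techniques such as neural architecture search. In particular, we built a basic method and a database for multimodal recognition from sensors and video, and developed basic methods for human gait recognition and for multimodal emotion recognition.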

  • Functional improvement of mixing promotion in fluid equipment by elucidating the universal statistical law of two-phase turbulence

    Grant number:22H01403  2022.4 - 2026.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Grant amount:¥17,290,000 (Direct Cost: ¥13,300,000, Indirect Cost: ¥3,990,000)

  • Functional improvement of mixing promotion in fluid equipment by elucidating the universal statistical law of two-phase turbulence

    Grant number:23K22674  2022.4 - 2026.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Grant amount:¥17,290,000 (Direct Cost: ¥13,300,000, Indirect Cost: ¥3,990,000)

  • Fast and Accurate Eigenvalue Calculations by Hierarchical Low-rank Approximation and its Application to Large-scale Electronic Structure Calculations

    Grant number:22H03598  2022.4 - 2025.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Rio Yokota, Akihiro Ida, Takeshi Ogita, Takeo Hoshi

    Authorship:Principal investigator

    Grant amount:¥17,680,000 (Direct Cost: ¥13,600,000, Indirect Cost: ¥4,080,000)

  • Fast and accurate eigenvalue calculations by hierarchical low-rank approximation and its application to large-scale electronic structure calculations

    Grant number:23K24854  2022.4 - 2025.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Grant amount:¥17,680,000 (Direct Cost: ¥13,600,000, Indirect Cost: ¥4,080,000)

  • A New Bayes Duality Principle for Adaptive, Robust, Life-long Learning of AI

    Grant number:JY210177nn  2021.10 - 2027.3

    JST  CREST 

    Emtiyaz Khan, Kenichi Bannai, Rio Yokota, Julyan Arbel

    Authorship:Coinvestigator(s)

  • Construction of numerical linear algebra based on lattice H-matrices and its high-performance implementation on modern architectures

    Grant number:21H03447  2021.4 - 2024.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Grant amount:¥17,290,000 (Direct Cost: ¥13,300,000, Indirect Cost: ¥3,990,000)

  • Application of Unconventional Linear Algebra Techniques to Continuous Learning in Supergiant Neural Networks

    Grant number:20K20624  2020.7 - 2023.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Challenging Research (Pioneering)

    Grant amount:¥25,350,000 (Direct Cost: ¥19,500,000, Indirect Cost: ¥5,850,000)

  • Life-Long Deep Learning using Bayesian Principles

    Grant number:20H04247  2020.4 - 2023.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Grant amount:¥18,200,000 (Direct Cost: ¥14,000,000, Indirect Cost: ¥4,200,000)

  • Linear Solvers for Machine Learning Hardware

    Grant number:18H03248  2018.4 - 2021.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Yokota Rio

    Grant amount:¥16,900,000 (Direct Cost: ¥13,000,000, Indirect Cost: ¥3,900,000)

    The trend in computer architecture has now shifted from general purpose accelerators to specialized hardware for machine learning. The present work focuses on the affinity between hierarchical low-rank approximation methods, and low-precision arithmetic units and tensor product accelerators in machine learning processors to develop a suitable linear algebra library for future architectures. In FY2018, we ported our H-matrix library to use batched MAGMA operations in order to take advantage of the tensor product accelerators. In FY2019, we optimized the inner kernels of the H-matrix by making use of TensorCores. In FY2020, we extended this work to recover the accuracy when using TensorCores and measured the energy efficiency.

  • Enhancement of H-matrix library and optimization for next generation supercomputers

    Grant number:17H01749  2017.4 - 2020.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Ida Akihiro

    Grant amount:¥18,850,000 (Direct Cost: ¥14,500,000, Indirect Cost: ¥4,350,000)

    In this study, we enhanced HACApK, a library for H-matrices: we introduced a dynamic load-balancing technique for H-matrix generation, and developed and implemented H-matrix-vector multiplication algorithms for GPU computing and mixed-precision computing. These new implementations are several to ten times faster than the existing HACApK. We proposed a novel variant of low-rank structured matrices, called “lattice H-matrices”, which allow more efficient operation and communication patterns than conventional H-matrices. In numerical experiments on H-matrix-vector multiplication, lattice H-matrices are several tens of times faster than normal H-matrices when several thousand processes are used. Moreover, we developed an LU decomposition method based on lattice H-matrices, and a QR decomposition method for BLR matrices, which are a simple version of lattice H-matrices. We also proposed parallelization algorithms for both.

  • Acceleration and economization of deep learning algorithms for image processing in social infrastructure

    2016.11 - 2022.3

    Japan Science and Technology Agency  CREST 

    SHINODA Koichi

    Grant type:Competitive

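  • A Hierarchical Particle Method Framework for Exascale Computers That Achieves Both Performance and Productivity

    Grant number:16H02827  2016.4 - 2019.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (B)

    Naoya Maruyama, Rio Yokota, Kenjiro Taura

    Grant amount:¥16,900,000 (Direct Cost: ¥13,000,000, Indirect Cost: ¥3,900,000)

    Exascale supercomputers, which target a thousand times the performance of today's petascale systems, will inevitably bring qualitative and quantitative changes in computer architecture, and existing applications will need substantial rewriting as a result. This research focused on particle methods, a frequently used basic numerical technique, and pursued research and development aimed at establishing software infrastructure technology that achieves high performance without modifying the application every time the architecture changes. The aim is to automate parallelization and performance optimization on the basis of a programming framework that allows architecture-independent application development. This fiscal year, we developed the first version of the framework, which supports CPUs and GPUs. The framework is based on C++ template metaprogramming and provides a programming model in which hierarchical particle methods such as FMM can be described concisely. Because user programs are converted automatically through template expansion into parallel code for CPUs or GPU code using CUDA, there is no need to write a separate program for each target processor. Multi-node parallelization using MPI is also performed automatically by the framework, so a single user program can run uniformly on everything from a single processor to large supercomputer-class systems. In addition, the implementation draws on research results on high-performance implementation techniques for the FMM algorithm and on MassiveThreads, a lightweight multithreading runtime, and automatically achieves performance close to that of hand-written implementations.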

  • Grant-in-Aid for Scientific Research (B)

    2016.4 - 2019.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research 

    MARUYAMA Naoya

    Grant type:Competitive

  • Large scale iterative solvers by combining FMM and H-matrices

    Grant number:16H05859  2016.4 - 2018.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (A)

    Yokota Rio, Li Xiaoye S., Keyes David E.

    Grant amount:¥6,630,000 (Direct Cost: ¥5,100,000, Indirect Cost: ¥1,530,000)

    In FY2016, we extended the FMM to H-matrices and developed an LU decomposition code using H-matrices. The dual tree traversal of exaFMM was used to determine the block cluster tree for arbitrary admissibility conditions, which allowed task-based parallelization of the compression part of the H-matrix code. In FY2017, we further optimized the inner kernels of the H-matrix code and compared H-matrices with multigrid for real applications. The use of batched MAGMA enabled us to maximize the performance of GPUs even for small matrices. The advantage of H-matrices over multigrid depends on the condition number of the matrix, while the H-matrix becomes advantageous as the degree of parallelism increases.

  • Grant-in-Aid for Encouragement of Young Scientists (A)

    2016.4 - 2018.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research 

    YOKOTA Rio

    Authorship:Principal investigator  Grant type:Competitive

  • Grant-in-Aid for Research Activity start-up

    2015.8 - 2017.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research 

    YOKOTA Rio

    Authorship:Principal investigator  Grant type:Competitive

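  • Algebraic Extension of FMM as a Preconditioner for Exa-scalable Large-Scale Systems of Linear Equations

    Grant number:15H06196  2015.8 - 2016.3

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Research Activity Start-up

    Rio Yokota

    Grant amount:¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)

    Starting from hierarchical N-body algorithms, which are already known to perform well on next-generation computers, we gradually extended them into a solver that can handle arbitrary systems of linear equations. In FY2015, we extended the current FMM, which can solve only the Poisson equation, to the Helmholtz and Stokes equations, making it applicable not only to fluid analysis but also to structural, electromagnetic, and acoustic analysis. We also compared it directly with the multigrid method and HSS matrices for each equation under equivalent computational conditions and computing environments, providing a quantitative evaluation of the relative advantages of the methods that had not been carried out before. In the comparison with multigrid, the advantage of FMM was more pronounced for the Helmholtz equation than for the Poisson equation. This is because the convergence of multigrid degrades markedly when the Helmholtz equation contains high frequencies, whereas the convergence of FMM does not degrade much. In the comparison with HSS matrices, FMM, whose setup overhead is small, came out ahead of HSS in total computation time. The difference was especially pronounced for the 2D Laplace equation, where FMM was about 1000 times faster. In scalability benchmarks, FMM achieved good parallel efficiency on 131,072 cores of a Cray XC40 and completed a computation with 400 billion points in a few seconds. This is believed to be the largest FMM computation in the world, and also the fastest.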

Teaching Experience

  • Computer Networks

    Institution:Tokyo Institute of Technology, School of Computing, Department of Computer Science

  • 5th Academic Group Literacy

  • High Performance Scientific Computing

  • Computer Networks

  • 5th Academic Group Literacy

    Institution:Tokyo Institute of Technology, School of Computing, Department of Computer Science

  • High Performance Scientific Computing

    Institution:Tokyo Institute of Technology, School of Computing, Department of Computer Science
