Faculty Profiles - YOKOTA RIO

YOKOTA RIO

Organization

Institute of Integrated Research Supercomputing Research Center Professor

Homepage

http://www.rio.scrc.iir.isct.ac.jp/en/index.html

Degree

Ph.D. (Engineering) ( Keio University )

Research Interests

Numerical Analysis
Deep Learning
High Performance Computing
GPU

Research Areas

Manufacturing Technology (Mechanical Engineering, Electrical and Electronic Engineering, Chemical Engineering) / Fluid engineering
Informatics / Intelligent informatics
Informatics / Mathematical informatics
Informatics / Computational science
Informatics / High performance computing

Education

Keio University Graduate School of Science and Technology School of Science for Open and Environmental Systems (PhD)

2005.4 - 2009.3

　 More details

researchmap
Keio University Graduate School of Science and Technology School of Science for Open and Environmental Systems (Masters)

2003.4 - 2005.3

　 More details

Country： Japan

researchmap
Keio University Faculty of Science and Technology Department of Mechanical Engineering

1997.4 - 2003.3

　 More details

Country： Japan

researchmap

Research History

RIKEN Center for Computational Science, AI for Science Platform Division, AI for Science Foundation Model Research Team Team Principal

2025.4

　 More details

researchmap
Institute of Science Tokyo Institute of Integrated Research, Supercomputing Research Center Professor

2024.10

　 More details

researchmap
National Institute of Informatics Research and Development Center for Large Language Models Scientific Director

2024.4

　 More details

researchmap
Tokyo Institute of Technology Global Scientific Information and Computing Center Professor

2023.1 - 2024.9

　 More details

researchmap
Tokyo Institute of Technology Global Scientific Information and Computing Center Associate Professor

2015.4 - 2022.12

　 More details

researchmap
King Abdullah University of Science and Technology Applied Mathematics and Computer Science Research Scientist

2011.9 - 2015.3

　 More details

researchmap
Boston University Department of Mechanical Engineering Post-doctoral Research Associate

2010.9 - 2011.8

　 More details

researchmap
University of Bristol Department of Mathematics Post-doctoral Research Associate

2009.2 - 2010.8

　 More details

researchmap

Professional Memberships

THE JAPAN SOCIETY FOR INDUSTRIAL AND APPLIED MATHEMATICS

　 More details

researchmap
THE JAPAN SOCIETY FOR COMPUTATIONAL ENGINEERING AND SCIENCE

　 More details

researchmap
INFORMATION PROCESSING SOCIETY OF JAPAN

　 More details

researchmap
American Society of Mechanical Engineers

　 More details

researchmap
Japan Society of Mechanical Engineers

　 More details

researchmap
Society for Industrial and Applied Mathematics

　 More details

researchmap
Association for Computing Machinery

　 More details

researchmap
Institute of Electrical and Electronics Engineers

　 More details

researchmap
THE JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE

　 More details

researchmap

Committee Memberships

IEEE International Conference on Cluster Computing (IEEE CLUSTER 2025) track co-chair

2025

　 More details

Committee type：Academic society

researchmap
International Conference on Machine Learning (ICML 2025) reviewer

2025

　 More details

Committee type：Academic society

researchmap
ISC High Performance (ISC 2025) program committee

2025

　 More details

Committee type：Academic society

researchmap
Platform for Advanced Scientific Computing (PASC 2025) domain co-chair

2025

　 More details

Committee type：Academic society

researchmap
39th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2025) program committee, best OSS judge

2025

　 More details

Committee type：Academic society

researchmap
International Conference on Parallel Architectures and Compilation Techniques (PACT 2025) publicity chair

2025

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2025) proceedings vice-chair, program committee, AI4S committee

2025

　 More details

Committee type：Academic society

researchmap
Conference on Neural Information Processing Systems (NeurIPS 2025) reviewer

2025

　 More details

Committee type：Academic society

researchmap
SIAM Conference on Linear Algebra (LA24), scientific committee

2024

　 More details

researchmap
ACM International Symposium on High-Performance, Parallel and Distributed Computing (HPDC 2024), program committee

2024

　 More details

researchmap
SC Asia (SCA 2024), program committee

2024

　 More details

researchmap
Platform for Advanced Scientific Computing (PASC 2024), program committee

2024

　 More details

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2024), poster, ML track, AI4S/TPC WS, LLMs for HPC BoF

2024

　 More details

researchmap
International Conference on Preconditioning Techniques for Scientific and Industrial Applications (Precond 2024), program committee

2024

　 More details

researchmap
38th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2024), track co-chair

2024

　 More details

researchmap
ISC High Performance (ISC 2024), posters chair

2024

　 More details

researchmap
The 12th International Conference on Learning Representations (ICLR 2024), reviewer

2024

　 More details

researchmap
The 30th International Conference on Parallel Processing (Euro-Par 2024), program committee

2024

　 More details

researchmap
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2024), reviewer

2024

　 More details

researchmap
38th Conference on Neural Information Processing Systems (NeurIPS 2024), reviewer

2024

　 More details

researchmap
The Eleventh International Conference on Learning Representations (ICLR 2023), reviewer

2023

　 More details

Committee type：Academic society

researchmap
Platform for Advanced Scientific Computing (PASC 2023), mini symposia and posters committee, program committee

2023

　 More details

researchmap
The 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2023), program committee

2023

　 More details

researchmap
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2023), reviewer

2023

　 More details

researchmap
The 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT 2023), publicity chair Asia

2023

　 More details

researchmap
IEEE International Conference on Cluster Computing (IEEE CLUSTER 2023), publicity chair

2023

　 More details

researchmap
Principles and Practice of Parallel Programming (PPoPP 2023), publicity chair

2023

　 More details

researchmap
37th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2023), program committee

2023

　 More details

researchmap
ISC High Performance (ISC 2023), posters deputy chair

2023

　 More details

researchmap
SIAM Conference on Computational Science and Engineering (CSE23), poster judge

2023

　 More details

researchmap
10th International Congress on Industrial and Applied Mathematics (ICIAM 2023), program committee

2023

　 More details

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2023), posters, workshops, ML tech paper, panelist, best paper

2023

　 More details

researchmap
37th Conference on Neural Information Processing Systems (NeurIPS 2023), reviewer

2023

　 More details

researchmap
IEEE CLUSTER (IEEE CLUSTER 2022), program committee

2022

　 More details

Committee type：Academic society

researchmap
ISC High Performance (ISC 2022), poster committee

2022

　 More details

Committee type：Academic society

researchmap
The 36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022), program committee

2022

　 More details

Committee type：Academic society

researchmap
International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2022), Applications track co-chair

2022

　 More details

Committee type：Academic society

researchmap
European Conference on Computer Vision (ECCV 2022), reviewer

2022

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2022), program committee, workshop chair

2022

　 More details

Committee type：Academic society

researchmap
51st International Conference on Parallel Processing (ICPP 2022), program committee

2022

　 More details

Committee type：Academic society

researchmap
IEEE CLUSTER (IEEE CLUSTER 2021), program committee

2021

　 More details

Committee type：Academic society

researchmap
ISC High Performance (ISC 2021), program committee, best paper committee, steering committee

2021

　 More details

Committee type：Academic society

researchmap
The 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), program committee

2021

　 More details

Committee type：Academic society

researchmap
International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2021), organizing committee

2021

　 More details

Committee type：Academic society

researchmap
SIAM Computational Science and Engineering (SIAM CSE 2021), organizing committee

2021

　 More details

Committee type：Academic society

researchmap
IEEE International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2021), program committee

2021

　 More details

Committee type：Academic society

researchmap
50th International Conference on Parallel Processing (ICPP 2021), program committee

2021

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2021), program committee

2021

　 More details

Committee type：Academic society

researchmap
SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP 2020), program committee

2020

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2020), ML track Chair

2020

　 More details

Committee type：Academic society

researchmap
The 26th International Conference on Parallel Processing (Euro-Par 2020), program committee

2020

　 More details

Committee type：Academic society

researchmap
International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020), program committee

2020

　 More details

Committee type：Academic society

researchmap
48th International Conference on Parallel Processing (ICPP 2019), Applications track chair, Workshop co-organizer

2019

　 More details

Committee type：Academic society

researchmap
SIAM Conference on Computational Science and Engineering (SIAM CSE19), organizing committee

2019

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2019), program committee

2019

　 More details

Committee type：Academic society

researchmap
ISC High Performance (ISC 2019), program committee

2019

　 More details

Committee type：Academic society

researchmap
IEEE CLUSTER (IEEE CLUSTER 2019), program committee

2019

　 More details

Committee type：Academic society

researchmap
Platform for Advanced Scientific Computing Conference (PASC 2019), program committee

2019

　 More details

Committee type：Academic society

researchmap
The 3rd cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSIG 2019), program committee

2019

　 More details

Committee type：Academic society

researchmap
IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC 2019), ATMG Track Chair

2019

　 More details

Committee type：Academic society

researchmap
SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP 2018), organizing committee

2018

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2018), proceedings chair

2018

　 More details

Committee type：Academic society

researchmap
ISC High Performance (ISC 2018), proceedings chair

2018

　 More details

Committee type：Academic society

researchmap
IEEE CLUSTER (IEEE CLUSTER 2018), program committee

2018

　 More details

Committee type：Academic society

researchmap
The 32nd IEEE International Parallel & Distributed Processing Symposium (IPDPS 2018), program committee

2018

　 More details

Committee type：Academic society

researchmap
18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2018), program committee

2018

　 More details

Committee type：Academic society

researchmap
SC Asia (SCA 2018), program committee

2018

　 More details

Committee type：Academic society

researchmap
The 23rd International Conference on Parallel Processing (Euro-Par 2017), program committee

2017

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2017), program committee

2017

　 More details

Committee type：Academic society

researchmap
Platform for Advanced Scientific Computing Conference (PASC 2017), program committee

2017

　 More details

Committee type：Academic society

researchmap
The 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS 2017), program committee

2017

　 More details

Committee type：Academic society

researchmap
International Conference on Supercomputing (ICS 2017), program committee

2017

　 More details

Committee type：Academic society

researchmap
IEEE CLUSTER (IEEE CLUSTER 2017), program committee

2017

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
International Conference on Supercomputing (ICS 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
Platform for Advanced Scientific Computing Conference (PASC 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
ISC High Performance (ISC 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
The 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
The 23rd annual IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
IEEE CLUSTER (IEEE CLUSTER 2016), program committee

2016

　 More details

Committee type：Academic society

researchmap
IEEE CLUSTER (IEEE CLUSTER 2015), program committee

2015

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2015), program committee

2015

　 More details

Committee type：Academic society

researchmap
The 22nd annual IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2015), program committee

2015

　 More details

Committee type：Academic society

researchmap
15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2015), program committee

2015

　 More details

Committee type：Academic society

researchmap
14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014), program committee

2014

　 More details

Committee type：Academic society

researchmap
The annual IEEE International Conference on High Performance Computing (HiPC 2014), program committee

2014

　 More details

Committee type：Academic society

researchmap
The 20th International Conference on Parallel Processing (Euro-Par 2014), program committee

2014

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2014), program committee

2014

　 More details

Committee type：Academic society

researchmap
The International Meeting on High-Performance Computing for Computational Science (VECPAR 2014), program committee

2014

　 More details

Committee type：Academic society

researchmap
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2013), program committee

2013

　 More details

Committee type：Academic society

researchmap
The 27th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013), program committee

2013

　 More details

Committee type：Academic society

researchmap
18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2013), program committee

2013

　 More details

Committee type：Academic society

researchmap
The annual IEEE International Conference on High Performance Computing (HiPC 2013), program committee

2013

　 More details

Committee type：Academic society

researchmap
15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2013), program committee

2013

　 More details

Committee type：Academic society

researchmap
The Third International Workshop on Frontier of GPU Computing, program committee

2012

　 More details

Committee type：Academic society

researchmap
The 12th International Symposium on Parallel and Distributed Computing (ISPDC 2012), program committee

2012

　 More details

Committee type：Academic society

researchmap

Papers

Loss and Optimizer as Two Essential Mechanisms Behind Knowledge Distillation Reviewed

Satoki Ishikawa, Sameer Satish Deshmukh, Sakina Fatima, Takumi Honda, Rio Yokota

ICML Workshop HiLD 2026.7

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Monotonic and Non-Monotonic Length-Performance Relationships in Reasoning RL Reviewed

Daisuke Nohara, Taishi Nakamura, Rio Yokota

ICML 2026 Workshop AdaptFM 2026.7

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
PowerCLIP: Powerset Alignment for Fine-Grained Contrastive Pre-Training Reviewed

Masaki Kawamura, Nakamasa Inoue, Rintaro Yanagi, Hirokatsu Kataoka, Rio Yokota

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026.6

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Building Effective Japanese Medical LLMs with an Open Recipe for Domain Adaptation through Continued Pre-training Reviewed

Akiko Aizawa, Yuki Arase, Fei Cheng, Jiahao Huang, Zhiyi Huang, Junfeng Jiang, Teruhito Kanazawa, Daisuke Kawahara, Kazuma Kobayashi, Takashi Kodama, Sadao Kurohashi, Yusuke Oda, Yuma Tsuta, Zhen Wan, Zhishen Yang, Rio Yokota

International Conference on Language Resources and Evaluation (LREC) 10405 - 10423 2026.5

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.63317/47uvbxqph5ph

researchmap
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks Reviewed

Taishi Nakamura, Satoki Ishikawa, Masaki Kawamura, Takumi Okamoto, Daisuke Nohara, Jun Suzuki, Rio Yokota

International Conference on Learning Representations (ICLR) 2026.4

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Qwen3-Swallow: Continual Pre-Training of Japanese–English Thinking LLMs with Megatron-LM Reviewed

Kazuki Fujii, Yukito Tajima, Masaki Kawamura, Sakae Mizuki, Taishi Nakamura, Hinari Shimada, Masanari Ohi, Taihei Shiotani, Koshiro Saito, Takumi Okamoto, Shigeki Ishida, Youmi Ma, Hiroya Takamura, Rio Yokota, Naoaki Okazaki

GPU Technology Conference, San Jose 2026.3

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Swallow LLM: Continual Pre-Training and RL for Sovereign AI Reviewed

Kazuki Fujii, Rio Yokota

GPU Technology Conference (GTC) 2026.3

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Tensor-Core-Optimized Strategies for BLR × Tall-Skinny Matrix Multiplication in BEM Reviewed

Akihiro Ida, Kazuya Goto, Rio Yokota, Tasuku Hiraishi, Toshihiro Hanawa, Takeshi Iwashita, Masatoshi Kawai, Satoshi Ohshima, Tetsuya Hoshino

Proceedings of the Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region 153 - 164 2026.1

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ACM

DOI： 10.1145/3773656.3773678

researchmap
FedPM: Federated Learning Using Preconditioned Mixing of Local Parameters Reviewed

Hiro Ishii, Kenta Niwa, Hiroshi Sawada, Akinori Fujino, Noboru Harada, Rio Yokota

The 40th Annual AAAI Conference on Artificial Intelligence 2026.1

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
A Study on the Performance and Usability of Managed Memory and Unified Memory for Accelerating Numerical Calculation Program Reviewed

Satoshi Ohshima, Akihiro Ida, Masatoshi Kawai, Takeshi Fukaya, Rio Yokota

2025 IEEE 18th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) 41 - 48 2025.12

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/mcsoc67473.2025.00017

researchmap
Variational Learning Finds Flatter Solutions at the Edge of Stability Reviewed International journal

Rio Yokota

Annual Conference on Neural Information Processing Systems (NeurIPS) 2025.12

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Masked Gated Linear Unit Reviewed

Yukito Tajima, Nakamasa Inoue, Yusuke Sekikawa, Ikuro Sato, Rio Yokota

Annual Conference on Neural Information Processing Systems (NeurIPS) 2025.12

　More details

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Leveraging High-Resource English Corpora for Cross-lingual Domain Adaptation in Low-Resource Japanese Medicine via Continued Pre-training Reviewed International journal

Rio Yokota

EMNLP Findings 11469 - 11488 2025.11

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.18653/v1/2025.findings-emnlp.615

researchmap
Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models Reviewed

Youmi Ma, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Koki Maeda, Kakeru Hattori, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki

Conference on Language Modeling (COLM) 2025.10

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks Reviewed

Taishi Nakamura, Satoki Ishikawa, Masaki Kawamura, Takumi Okamoto, Daisuke Nohara, Jun Suzuki, Rio Yokota

ICML 2025 2nd AI for Math Workshop 2025.7

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Lion Cub: Minimizing Communication Overhead in Distributed Lion Reviewed

Satoki Ishikawa, Tal Ben-Nun, Brian Van Essen, Rio Yokota, Nikoli Dryden

ICML 2025 Workshop TTODLer-FM 2025.7

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers Reviewed

Chen Zhuang, Lingqi Zhang, Du Wu, Peng Chen, Jiajun Huang, Xin Liu, Rio Yokota, Nikoli Dryden, Toshio Endo, Satoshi Matsuoka, Mohamed Wahib

Proceedings of the 39th ACM International Conference on Supercomputing 57 - 72 2025.6

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ACM

DOI： 10.1145/3721145.3730422

researchmap
Improving LoRA with Variational Learning

Bai Cong, Nico Daheim, Yuesong Shen, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff

CoRR abs/2506.14280 2025.6

　More details

Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2506.14280

researchmap
On the Interplay Between Precision, Rank, Admissibility, and Iterative Refinement for Hierarchical Low-Rank Matrix Solvers Reviewed

Thomas Spendlhofer, Qianxiang Ma, Yasuhiro Matsumoto, Rio Yokota

ISC High Performance 2025.6

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Rewriting Pre-Training Data Boosts LLM Performance in Math and Code Reviewed

Kazuki Fujii, Yukito Tajima, Sakae Mizuki, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Masanari Ohi, Masaki Kawamura, Taishi Nakamura, Takumi Okamoto, Shigeki Ishida, Kakeru Hattori, Youmi Ma, Hiroya Takamura, Rio Yokota, Naoaki Okazaki

CoRR abs/2505.02881 2025.5

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2505.02881

researchmap

Other Link： https://dblp.uni-trier.de/db/journals/corr/corr2505.html#abs-2505-02881
Quantum Turbulence Coupled with Externally Driven Normal-Fluid Turbulence in Counterflow of Superfluid ⁴He Reviewed

Satoshi Yui, Hiromichi Kobayashi, Makoto Tsubota, Rio Yokota

Journal of the Physical Society of Japan 94 ( 4 ) 043601 2025.4

　More details

Language：English Publishing type：Research paper (scientific journal) Publisher：Physical Society of Japan

DOI： 10.7566/jpsj.94.043601

researchmap
Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation Reviewed

Satoki Ishikawa, Ryo Karakida, Rio Yokota

The 13th International Conference on Learning Representations (ICLR) 2025.4

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Drop Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization Reviewed

Taishi Nakamura, Takuya Akiba, Kazuki Fujii, Yusuke Oda, Rio Yokota, Jun Suzuki

The 13th International Conference on Learning Representations (ICLR) 2025.4

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
SuperGCN: General and Scalable Framework for GCN Training on CPU-powered Supercomputers Reviewed

Chen Zhuang, Peng Chen, Xin Liu, Nikoli Dryden, Rio Yokota, Toshio Endo, Satoshi Matsuoka, Mohamed Wahib

ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP) 2025.3

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Building a Japanese LLM Strong in Current Affairs and Society from Newspaper Articles

Kakeru Hattori, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Taihei Shiotani, Kai Ueki, Takuro Niitsuma, Akira Kawabata, Hideaki Tamori, Youmi Ma, Koki Maeda, Masanari Ohi, Koshiro Saito, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki

2025.3

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Instruction Tuning of Large Language Models via Imitation Learning

Youmi Ma, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Koki Maeda, Kakeru Hattori, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki

2025.3

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Reviewed

Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, et al.

The 31st International Conference on Computational Linguistics (COLING), Industry Track 2025.1

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code Reviewed

Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo

The 31st International Conference on Computational Linguistics (COLING), Industry Track, Abu Dhabi, UAE 2025.1

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Scaling Backwards: Minimal Synthetic Pre-Training? Reviewed

Ryo Nakamura, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka

European Conference on Computer Vision (ECCV) 153 - 171 2025

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-031-72633-0_9

researchmap
On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process Reviewed

Shun Iwase, Shuya Takahashi, Nakamasa Inoue, Rio Yokota, Ryo Nakamura, Hirokatsu Kataoka, Eisaku Maeda

27th International Conference on Pattern Recognition (ICPR) 95 - 109 2025

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-031-78389-0_7

researchmap
Rethinking Image Super-Resolution from Training Data Perspectives Reviewed

Go Ohtani, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Yoshimitsu Aoki

European Conference on Computer Vision (ECCV) 19 - 36 2025

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-031-72643-9_2

researchmap
A Framework for Seamless Integration and Efficient Continual Pre-Training of Large Language Models Reviewed

Kazuki Fujii, Taishi Nakamura, Rio Yokota

SC’24 TPC Workshop 2024.11

Language：English Publishing type：Research paper (conference, symposium, etc.)

researchmap
Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities Reviewed

Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Shota Hirai, Sakae Mizuki, Rio Yokota, Naoaki Okazaki

Conference on Language Modeling (COLM) 2024.10

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Building a Large Japanese Web Corpus for Large Language Models Reviewed

Naoaki Okazaki, Kakeru Hattori, Shota Hirai, Hiroki Iida, Masanari Ohi, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Rio Yokota, Sakae Mizuki

Conference on Language Modeling (COLM) 2024.10

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Formula-Supervised Visual-Geometric Pre-training Reviewed

Ryosuke Yamada, Kensho Hara, Hirokatsu Kataoka, Koshi Makihara, Nakamasa Inoue, Rio Yokota, Yutaka Satoh

European Conference on Computer Vision (ECCV) 57 - 74 2024.9

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-031-72670-5_4

researchmap
Rethinking Training Data Construction for Image Super-Resolution (画像超解像における学習データ構築の再考) Reviewed

Go Ohtani, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Yoshimitsu Aoki

2024.8

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Dense Bundle Adjustment for Real-Time Change Detection in 3D Maps Using a Monocular Camera (単眼カメラを用いたリアルタイムな３次元マップの変化検出を目的とした密なバンドル調整) Reviewed

Kai Okawa, Ken Sakurada, Rio Yokota

2024.8

Authorship：Last author Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Scaling Backwards: Minimal Synthetic Pre-training? Reviewed

Ryu Tadokoro, Ryo Nakamura, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka

The 27th Meeting on Image Recognition and Understanding (MIRU) 2024.8

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
An Inherently Parallel H^2-ULV Factorization for Solving Dense Linear Systems on GPUs Reviewed

Qianxiang Ma, Rio Yokota

International Journal of High Performance Computing Applications 38 ( 4 ) 314 - 336 2024.7

Authorship：Last author Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1177/10943420241242021

researchmap
Variational Learning is Effective for Large Deep Networks Reviewed

Yuesong Shen, Nico Daheim, Gian Maria Marconi, Peter Nickl, Bai Cong, Bazan Clement Emile Marcel Raoul, Rio Yokota, Iryna Gurevych, Daniel Cremers, Mohammad Emtiyaz Khan, Thomas Möllenhoff

The 41st International Conference on Machine Learning (ICML) 2024.7

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Interaction between quantum turbulence and normal-fluid turbulence in superfluid helium Reviewed

Hiromichi Kobayashi, Satoshi Yui, Makoto Tsubota, Tomokazu Saito, Rio Yokota

Thirteenth International Symposium on Turbulence and Shear Flow Phenomena (TSFP13) 2024.6

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
SIFTer: Self-improving Synthetic Datasets for Pre-training Classification Models Reviewed

Ryo Hayamizu, Shota Nakamura, Sora Takashima, Hirokatsu Kataoka, Ikuro Sato, Nakamasa Inoue, Rio Yokota

CVPR SynData Workshop 2024.6

Authorship：Last author,　Corresponding author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
When Does Second-Order Optimization Speed Up Training? Reviewed

Satoki Ishikawa, Rio Yokota

The 12th International Conference on Learning Representations (ICLR), Tiny paper 2024.5

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
DGEMM on integer matrix multiplication unit Reviewed

Hiroyuki Ootomo, Katsuhisa Ozaki, Rio Yokota

The International Journal of High Performance Computing Applications 38 ( 4 ) 297 - 313 2024.3

Authorship：Last author Language：English Publishing type：Research paper (scientific journal) Publisher：SAGE Publications

Deep learning hardware achieves high throughput and low power consumption by reducing computing precision and specializing in matrix multiplication. For machine learning inference, fixed-point value computation is commonplace, where the input and output values and the model parameters are quantized. Thus, many processors are now equipped with fast integer matrix multiplication units (IMMU). It is of significant interest to find a way to harness these IMMUs to improve the performance of HPC applications while maintaining accuracy. We focus on computing double-precision equivalent matrix multiplication using the Ozaki scheme, which computes a high-precision matrix multiplication by using lower-precision computing units, and show the advantages and disadvantages of using IMMU. The experiment using integer Tensor Cores shows that we can compute double-precision matrix multiplication faster than cuBLAS and an existing Ozaki scheme implementation on FP16 Tensor Cores on NVIDIA consumer GPUs. Furthermore, we demonstrate accelerating a quantum circuit simulation by up to 4.85× while maintaining the FP64 accuracy.

DOI： 10.1177/10943420241239588

researchmap

Other Link： https://journals.sagepub.com/doi/full-xml/10.1177/10943420241239588
Variational Low-Rank Adaptation Using IVON Reviewed

Bai Cong, Nico Daheim, Yuesong Shen, Daniel Cremers, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff

CoRR abs/2411.04421 2024

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2411.04421

researchmap

Other Link： https://dblp.uni-trier.de/db/journals/corr/corr2411.html#abs-2411-04421
Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs Reviewed

Koshiro Saito, Sakae Mizuki, Masanari Ohi, Taishi Nakamura, Taihei Shiotani, Koki Maeda, Youmi Ma, Kakeru Hattori, Kazuki Fujii, Takumi Okamoto, Shigeki Ishida, Hiroya Takamura, Rio Yokota, Naoaki Okazaki

CoRR abs/2412.14471 2024

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.48550/arXiv.2412.14471

researchmap

Other Link： https://dblp.uni-trier.de/db/journals/corr/corr2412.html#abs-2412-14471
Natural Gradient Primal-Dual Method for Decentralized Learning Reviewed

Kenta Niwa, Hiro Ishii, Hiroshi Sawada, Akinori Fujino, Noboru Harada, Rio Yokota

IEEE Trans. Signal Inf. Process. over Networks 10 417 - 433 2024

Authorship：Last author Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1109/TSIPN.2024.3388948

researchmap
SegRCDB: Semantic Segmentation via Formula-Driven Supervised Learning Reviewed

Risa Shinoda, Ryo Hayamizu, Kodai Nakashima, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka

IEEE/CVF International Conference on Computer Vision (ICCV) 19997 - 20006 2023.10

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/iccv51070.2023.01835

researchmap
Pre-training Vision Transformers with Very Limited Synthesized Images Reviewed

Ryo Nakamura, Sora Takashima, Edgar Josafat Martinez-Noriega, Nakamasa Inoue, Hirokatsu Kataoka, Rio Yokota

IEEE/CVF International Conference on Computer Vision (ICCV) 20303 - 20312 2023.10

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/iccv51070.2023.01862

researchmap
Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors Reviewed

Sameer Deshmukh, Rio Yokota, George Bosilca

ACM Transactions on Mathematical Software 49 ( 3 ) 1 - 29 2023.9

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1145/3595178

researchmap
Computing Eigenvalue of Symmetric H^2-Matrices in Linear Time with Slicing the Spectrum Reviewed

Muhammad Ridwan Apriansyah, Rio Yokota

International Conference on Parallel Processing (ICPP) 11 - 20 2023.8

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/3605573.3605607

researchmap
O(N) Distributed Direct Factorization of Structured Dense Matrices Using Runtime Systems Reviewed

Sameer Deshmukh, Rio Yokota, George Bosilca

International Conference on Parallel Processing (ICPP) 1 - 10 2023.8

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/3605573.3605606

researchmap
Semantic Segmentation via Formula-Driven Supervised Learning (数式ドリブン教師あり学習によるセマンティックセグメンテーション) Reviewed

Risa Shinoda, Ryo Hayamizu, Kodai Nakashima, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka

2023.7

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
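On the Relationship Between the Evolution of Shape/Texture Bias During Training and the Pre-training Dataset (学習過程における形状・テクスチャ偏重度の推移と事前学習データセットとの関係について) Reviewed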

Shuya Takahashi, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Eisaku Maeda

2023.7

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Dense Bundle Adjustment for Real-Time Change Detection in 3D Maps Using a Monocular Camera (単眼カメラを用いたリアルタイムな3次元マップの変化検出を目的とした密なバンドル調整) Reviewed

Kai Okawa, Ken Sakurada, Rio Yokota

2023.7

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Formula-Supervised Visual-Geometric Pre-training Reviewed

Ryosuke Yamada, Kensho Hara, Hirokatsu Kataoka, Koshi Makihara, Nakamasa Inoue, Rio Yokota, Yutaka Satoh

The 26th Meeting on Image Recognition and Understanding (MIRU), Long Oral 2023.7

Language：English Publishing type：Research paper (conference, symposium, etc.)

researchmap
Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves Reviewed

Sora Takashima, Ryo Hayamizu, Nakamasa Inoue, Hirokatsu Kataoka, Rio Yokota

IEEE/CVF Conference on Computer Vision and Pattern Recognition 18579 - 18588 2023.6

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/cvpr52729.2023.01782

researchmap
A Regularization Method for Suppressing Generalization Degradation in Large-Batch Training (ラージバッチ学習における汎化性能の低下を抑制する正則化手法) Reviewed

Shota Nakamura, Rio Yokota

2023.6

Authorship：Last author Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Development of a Mode-Adaptive Transformer via Automatic Optimization of the Receptive Field (受容野の自動最適化によるモードに適応的な Transformer の開発) Reviewed

Takuya Asakura, Nakamasa Inoue, Rio Yokota, Koichi Shinoda

JSAI2023 4I3OS1b05 2023.6

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

DOI： 10.11517/pjsai.JSAI2023.0_4I3OS1b05

researchmap
Mixed-Precision Random Projection for RandNLA on Tensor Cores Reviewed

Hiroyuki Ootomo, Rio Yokota

Platform for Advanced Scientific Computing (PASC) 1 - 11 2023.6

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/3592979.3593413

researchmap
Quantum Circuit Simulation by SGEMM Emulation on Tensor Cores and Automatic Precision Selection Reviewed

Hiroyuki Ootomo, Hidetaka Manabe, Kenji Harada, Rio Yokota

ISC High Performance 259 - 276 2023.5

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-031-32041-5_14

researchmap
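Contrastive Learning for Image Recognition Robust to Feature Variation Using an Adversarial Metric Learning Module (敵対的距離学習モジュールを用いた特徴変動に頑健な画像認識のための対照学習)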

Yoshifumi Sugiyama, Hirokatsu Kataoka, Rio Yokota, Nakamasa Inoue

2023.3

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
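On the Relationship Between Shape/Texture Bias and the Double Descent Phenomenon in Image Classification (画像識別における形状・テクスチャ偏重度と二重降下現象の関係について) Reviewed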

Shuya Takahashi, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Eisaku Maeda

2023.3

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Reducing Shared Memory Footprint to Leverage High Throughput on Tensor Cores and its Flexible API Extension Library Reviewed

Hiroyuki Ootomo, Rio Yokota

HPC Asia 1 - 8 2023.2

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/3578178.3578238

researchmap
Improving Continual Learning by Accurate Gradient Reconstructions of the Past Reviewed

Erik Daxberger, Siddharth Swaroop, Kazuki Osawa, Rio Yokota, Richard E. Turner, Jose Miguel Hernandez-Lobato, Mohammad Emtiyaz Khan

Transactions on Machine Learning Research 2023.2

Language：English Publishing type：Research paper (scientific journal)

researchmap
Empirical Study on Optimizer Selection for Out-of-Distribution Generalization Reviewed

Hiroki Naganuma, Kartik Ahuja, Shiro Takagi, Tetsuya Motokawa, Rio Yokota, Kohta Ishikawa, Ikuro Sato, Ioannis Mitliagkas

Transactions on Machine Learning Research 2023.1

Language：English Publishing type：Research paper (scientific journal)

researchmap
A Study of the Pre-training Effect of Distilled Images (蒸留画像による事前学習効果についての検討) Reviewed

Ryu Tadokoro, Hirokatsu Kataoka, Rei Kawakami, Rio Yokota, Nakamasa Inoue

2022.12

Language：Japanese Publishing type：Research paper (international conference proceedings)

researchmap
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch Reviewed

Kazuki Osawa, Satoki Ishikawa, Rio Yokota, Shigang Li, Torsten Hoefler

NeurIPS Workshop Order up! The Benefits of Higher-Order Optimization in Machine Learning 2022.12

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Empirical Study on Optimizer Selection for Out-of-Distribution Generalization Reviewed

Hiroki Naganuma, Kartik Ahuja, Ioannis Mitliagkas, Shiro Takagi, Tetsuya Motokawa, Rio Yokota, Kohta Ishikawa, Ikuro Sato

NeurIPS Workshop DistShift 2022.12

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.48550/arXiv.2211.08583

researchmap

Other Link： https://dblp.uni-trier.de/db/journals/corr/corr2211.html#abs-2211-08583
QR Factorization of Block Low-Rank Matrices on Multi-Instance GPU Reviewed

Satoshi Ohshima, Akihiro Ida, Rio Yokota, Ichitaro Yamazaki

The 23rd International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT’22) 359 - 369 2022.12

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-031-29927-8_28

researchmap
Informative Sample-Aware Proxy for Deep Metric Learning Reviewed

Aoyu Li, Ikuro Sato, Kohta Ishikawa, Rei Kawakami, Rio Yokota

ACM Multimedia Asia 1 - 11 2022.12

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/3551626.3564942

researchmap
Scalable Linear Time Dense Direct Solver for 3-D Problems Without Trailing Sub-Matrix Dependencies Reviewed

Qianxiang Ma, Sameer Deshmukh, Rio Yokota

The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22) 1 - 12 2022.11

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/sc41404.2022.00088

researchmap
Parallel QR Factorization of Block Low-Rank Matrices Reviewed

Muhammad Ridwan Apriansyah, Rio Yokota

ACM Transactions on Mathematical Software 48 ( 3 ) 1 - 28 2022.9

Authorship：Last author Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1145/3538647

researchmap
Recovering Single Precision Accuracy from Tensor Cores While Surpassing the FP32 Theoretical Peak Performance Reviewed

Hiroyuki Ootomo, Rio Yokota

The International Journal of High Performance Computing Applications 36 ( 4 ) 475 - 491 2022.7

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1177/10943420221090256

researchmap
Replacing Labeled Real-image Datasets with Auto-generated Contours Reviewed

Hirokatsu Kataoka, Ryo Hayamizu, Ryosuke Yamada, Kodai Nakashima, Sora Takashima, Xinyu Zhang, Edgar Josafat Martinez-Noriega, Nakamasa Inoue, Rio Yokota

IEEE/CVF Conference on Computer Vision and Pattern Recognition 21200 - 21209 2022.6

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/cvpr52688.2022.02055

researchmap
OPIRL: Sample Efficient Off-Policy Inverse Reinforcement Learning via Distribution Matching Reviewed

Hana Hoshino, Kei Ota, Asako Kanezaki, Rio Yokota

Proceedings of IEEE International Conference on Robotics and Automation 448 - 454 2022.5

Authorship：Last author Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1109/icra46639.2022.9811660

researchmap
Scalable and Practical Natural Gradient for Large-Scale Deep Learning Reviewed

Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Chuan-Sheng Foo, Rio Yokota

IEEE Transactions on Pattern Analysis and Machine Intelligence 44 ( 1 ) 404 - 415 2022.1

Language：English Publishing type：Research paper (scientific journal) Publisher：Institute of Electrical and Electronics Engineers (IEEE)

DOI： 10.1109/tpami.2020.3004354

researchmap
RePOSE: Real-Time Iterative Rendering and Refinement for 6D Object Pose Estimation Reviewed

Shun Iwase, Xingyu Liu, Rawal Khirodkar, Rio Yokota, Kris M. Kitani

Proceedings of the International Conference on Computer Vision 2021.10

Language：English Publishing type：Research paper (scientific journal)

researchmap
RePOSE: Fast 6D Object Pose Refinement via Deep Texture Rendering Reviewed

Shun Iwase, Xingyu Liu, Rawal Khirodkar, Rio Yokota, Kris M. Kitani

International Conference on Computer Vision 3283 - 3292 2021.10

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/iccv48922.2021.00329

researchmap
Self-supervised Continual Pretraining for Class Incremental Image Classification Reviewed

Hikaru Nakata, Nakamasa Inoue, Rio Yokota

Proceedings CVPR CLVISION Workshop (Findings) 2021.6

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
ExaFMM: a high-performance fast multipole method library with C++ and Python interfaces Reviewed

Tingyu Wang, Rio Yokota, Lorena A. Barba

The Journal of Open Source Software 6 ( 61 ) 3145 - 3145 2021.5

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.21105/joss.03145

researchmap
Rich Information is Affordable: A Systematic Performance Analysis of Second-order Optimization Using K-FAC Reviewed

Yuichiro Ueno, Kazuki Osawa, Yohei Tsuji, Akira Naruse, Rio Yokota

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2145 - 2153 2020.8

Language：English Publishing type：Research paper (international conference proceedings) Publisher：ACM

DOI： 10.1145/3394486.3403265

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/conf/kdd/2020
Distributed Memory Task-Based Block Low Rank Direct Solver Reviewed

Sameer Deshmukh, Rio Yokota

ISC High Performance 2020 2020.6

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Randomized SVD on TensorCores Reviewed

Hiroyuki Ootomo, Rio Yokota

ISC High Performance 2020 2020.6

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Huge Matrices and AI (巨大行列とAI) Reviewed

Rio Yokota

数学セミナー 59 ( 2 ) 29 - 33 2020.2

Language：English Publisher：日本評論社

researchmap
Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis Reviewed

Rise Ooi, Takeshi Iwashita, Takeshi Fukaya, Akihiro Ida, Rio Yokota

ACM International Conference Proceeding Series 92 - 101 2020.1

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/3368474.3368479

Web of Science

Scopus

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1911.00093v1
QR Decomposition of Block Low-Rank Matrices Reviewed

Muhammad Ridwan Apriansyah, Rio Yokota

HPC Asia 2020 2020.1

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Supercomputing Contest 2019 (スーパーコンピューティングコンテスト2019) Reviewed

Rio Yokota

数学セミナー 59 ( 1 ) 44 - 49 2020.1

Language：English Publisher：日本評論社

researchmap
Distributed Memory Task-Based Block Low Rank Direct Solver Reviewed

Sameer Deshmukh, Rio Yokota

HPC Asia 2020 (poster) 2020.1

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Measuring the Effects to Beneficial Batch Size and Required Iteration by LARS on Neural Network Training

NAGANUMA Hiroki, IDE Tatsuro, YOKOTA Rio

Proceedings of the Annual Conference of JSAI JSAI2020 4Rin169 - 4Rin169 2020

Language：Japanese Publisher：The Japanese Society for Artificial Intelligence

Deep neural networks (DNNs), which have extremely large numbers of parameters, outperform other machine learning methods when trained on enormous volumes of data. Because training a DNN takes a significant amount of computation time, large-scale parallelization is used to reduce the training time. Large-batch training increases the batch size to reduce the number of required iterations and hence speeds up training. However, recent research has shown that this speedup hits a limit as the batch size becomes very large. In this paper, we conduct experiments to study the relationship between the batch size and the number of required iterations as the batch size increases up to the full batch, using LARS, a commonly used method for adjusting the learning rate. Our results experimentally verify that LARS is superior to other optimization methods both in reducing the number of iterations and in generalization performance.

DOI： 10.11517/pjsai.jsai2020.0_4rin169

CiNii Research

researchmap
Regularizing the fast multipole method for use in molecular simulation Reviewed

D. S. Shamshirgar, R. Yokota, A. K. Tornberg, B. Hess

Journal of Chemical Physics 151 ( 23 ) 234113 2019.12

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1063/1.5122859

Scopus

PubMed

researchmap
Practical deep learning with Bayesian principles Reviewed

Kazuki Osawa, Siddharth Swaroop, Anirudh Jain, Runa Eschenhagen, Richard E. Turner, Rio Yokota, Mohammad Emtiyaz Khan

Advances in Neural Information Processing Systems 32 4289 - 4301 2019.12

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

Scopus

researchmap

Other Link： http://papers.nips.cc/paper/8681-practical-deep-learning-with-bayesian-principles
QR factorization of block low-rank matrices with weak admissibility condition Reviewed

Akihiro Ida, Hiroshi Nakashima, Tasuku Hiraishi, Ichitaro Yamazaki, Rio Yokota, Takeshi Iwashita

Journal of Information Processing 27 831 - 839 2019.11

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.2197/ipsjjip.27.831

Scopus

researchmap
TSQR on TensorCores Reviewed

Hiroyuki Ootomo, Rio Yokota

The International Conference for High Performance Computing, Networking, Storage, and Analysis 2019.11

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
On Empirical Analysis of Layer-wised Learning Rate Schedule Reviewed

Hiroki Naganuma, Rio Yokota

ACML 2019 Workshop on Statistics & Machine Learning Researchers 2019.11

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Runtime System for GPU-based Hierarchical LU factorization Reviewed

Qianxiang Ma, Rio Yokota

The International Conference for High Performance Computing, Networking, Storage, and Analysis 2019.11

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Optimization of numerous small dense-matrix-vector multiplications in H-matrix arithmetic on GPU Reviewed

Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

Proceedings - 2019 IEEE 13th International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2019 9 - 16 2019.10

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/MCSoC.2019.00009

Web of Science

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/mcsoc/mcsoc2019.html#OhshimaYIY19
Distributed-memory lattice H-matrix factorization Reviewed

Ichitaro Yamazaki, Akihiro Ida, Rio Yokota, Jack Dongarra

International Journal of High Performance Computing Applications 33 ( 5 ) 1046 - 1063 2019.9

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1177/1094342019861139

Web of Science

Scopus

researchmap
TSQR Using Tensor Cores (Tensorコアを用いたTSQR) Reviewed

Hiroyuki Ootomo, Rio Yokota

2019.9

Authorship：Last author Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Performance optimizations and analysis of distributed deep learning with approximated second-order optimization method Reviewed

Yohei Tsuji, Kazuki Osawa, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka

ACM International Conference Proceeding Series ( 21 ) 21 - 8 2019.8

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/3339186.3339202

Web of Science

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/icppw/icppw2019.html#TsujiOUNYM19
Extreme scale FMM-accelerated boundary integral equation solver for wave scattering Reviewed

Mustafa Abduljabbar, Mohammed Al Farhan, Noha Al-Harthi, Rui Chen, Rio Yokota, Hakan Bagci, David Keyes

SIAM Journal on Scientific Computing 41 ( 3 ) C245 - C268 2019.6

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1137/18M1173599

Scopus

researchmap
Large-scale distributed second-order optimization using Kronecker-factored approximate curvature for deep convolutional neural networks Reviewed

Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2019-June 12351 - 12359 2019.6

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/CVPR.2019.01264

Web of Science

Scopus

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1811.12019v5
Effectiveness of Smoothing for Large-batch Training Using Natural Gradient Descent Reviewed

Hiroki Naganuma, Rio Yokota

The 3rd Cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSIG) 2019.5

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
A Performance Improvement Approach for Second-Order Optimization in Large Mini-batch Training Reviewed

Hiroki Naganuma, Rio Yokota

19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) 696 - 703 2019.5

Language：English Publishing type：Research paper (international conference proceedings) Publisher：IEEE

DOI： 10.1109/CCGRID.2019.00092

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2019.html#NaganumaY19
Exhaustive study of hierarchical allreduce patterns for large messages between GPUs Reviewed

Yuichiro Ueno, Rio Yokota

Proceedings - 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2019 430 - 439 2019.5

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/CCGRID.2019.00057

Web of Science

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2019.html#UenoY19
Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis Reviewed

Rise Ooi, Takeshi Iwashita, Takeshi Fukaya, Akihiro Ida, Rio Yokota

CoRR abs/1911.00093 2019

researchmap
Highly productive, high-performance application frameworks for Post-Petascale computing Reviewed

Naoya Maruyama, Takayuki Aoki, Kenjiro Taura, Rio Yokota, Mohamed Wahib, Motohiko Matsuda, Keisuke Fukuda, Takashi Shimokawabe, Naoyuki Onodera, Michel Müller, Shintaro Iwasaki

Advanced Software Technologies for Post-Peta Scale Computing: The Japanese Post-Peta CREST Research Project 77 - 98 2018.12

Language：English Publishing type：Part of collection (book)

DOI： 10.1007/978-981-13-1924-2_5

Scopus

researchmap
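Hyperparameter Optimization in Large-Scale Parallel Deep Learning Using Approximate Natural Gradient Methods (自然勾配近似法を用いた大規模並列深層学習におけるハイパーパラメータ最適化) Reviewed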

Hiroki Naganuma, Shun Iwase, Linsho Kaku, Hikaru Nakata, Rio Yokota

2018.9

Authorship：Last author Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Performance of Hierarchical-matrix BiCGStab Solver on GPU Clusters Reviewed

Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack Dongarra

Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018 930 - 939 2018.8

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/IPDPS.2018.00102

Web of Science

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/ipps/ipdps2018.html#YamazakiAIOTYD18
Fast multipole preconditioners for sparse matrices arising from elliptic equations Reviewed

Huda Ibeid, Rio Yokota, Jennifer Pestana, David Keyes

Computing and Visualization in Science 18 ( 6 ) 213 - 229 2018.3

Language：English Publishing type：Research paper (scientific journal) Publisher：Springer Verlag

DOI： 10.1007/s00791-017-0287-5

Scopus

arXiv

researchmap
Accelerating Convolutional Neural Networks Using Low Precision Arithmetic Reviewed

Hiroki Naganuma, Rio Yokota

HPC Asia 2018.1

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Optimization of hierarchical matrix computation on GPU Reviewed

Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10776 LNCS 274 - 292 2018

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-319-69953-0_16

Web of Science

Scopus

researchmap

Other Link： https://dblp.uni-trier.de/db/conf/scfa/scfa2018.html#OhshimaYIY18
Verification of Low-precision Arithmetic for the Acceleration of Convolutional Neural Networks Reviewed

Hiroki Naganuma, Rio Yokota

GTC Japan 2017.12

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Accelerating Convolutional Neural Networks Using Low-Rank Tensor Decomposition Reviewed

Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota

117 ( 238 ) 1 - 6 2017.10

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Accelerating matrix multiplication in deep learning by using low-rank approximation Reviewed

Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota

Proceedings - 2017 International Conference on High Performance Computing and Simulation, HPCS 2017 186 - 192 2017.9

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/HPCS.2017.37

Web of Science

Scopus

researchmap
Acceleration of Compressed Models in Deep Learning Using Half Precision Arithmetic Reviewed

H. Naganuma, K. Osawa, A. Sekiya, R. Yokota

Japan Society for Industrial and Applied Mathematics Annual Meeting 2017.9

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Optimization of Hierarchical Matrix Computations on a Cluster of GPUs Reviewed

Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

Summer United Workshops on Parallel, Distributed and Cooperative Processing 2017.7

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Communication Reducing Algorithms for Distributed Hierarchical N-Body Methods Reviewed

Mustafa AbdulJabbar, George Markomanolis, Huda Ibeid, Rio Yokota, David Keyes

32nd International Conference, ISC High Performance 2017.6

　More details

researchmap
Accelerating Convolutional Neural Networks Using Low-Rank Approximation Reviewed

Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota

22nd Conference of Japan Computational Engineering Society 22 2017.5

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Communication Reducing Algorithms for Distributed Hierarchical N-Body Problems with Boundary Distributions Reviewed

Mustafa Abduljabbar, George Markomanolis, Huda Ibeid, Rio Yokota, David Keyes

Lecture Notes in Computer Science 10266 79 - 96 2017.2

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Springer Verlag

DOI： 10.1007/978-3-319-58667-0_5

Scopus

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1702.05459v1
Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation Reviewed

Rio Yokota, Huda Ibeid, David Keyes

EIGENVALUE PROBLEMS: ALGORITHMS, SOFTWARE AND APPLICATIONS IN PETASCALE COMPUTING (EPASA 2015) 117 267 - 286 2017

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-319-62426-6_17

Web of Science

Scopus

researchmap
Performance evaluation of computation and communication kernels of the fast multipole method on intel manycore architecture Reviewed

Mustafa Abduljabbar, Mohammed Al Farhan, Rio Yokota, David Keyes

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10417 553 - 564 2017

　More details

Language：English Publishing type：Research paper (international conference proceedings) Publisher：Springer Verlag

DOI： 10.1007/978-3-319-64203-1_40

Scopus

researchmap
Evaluating the compression efficiency of the filters in convolutional neural networks Reviewed

Kazuki Osawa, Rio Yokota

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10614 LNCS 459 - 466 2017

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-319-68612-7_52

Web of Science

Scopus

researchmap
Tapas: An Implicitly Parallel Programming Framework For Hierarchical N-body Algorithms Reviewed

Fukuda Keisuke, Maruyama Naoya, Yokota Rio, Taura Kenjiro, MATSUOKA SATOSHI

The 22nd IEEE International Conference on Parallel And Distributed Systems 1100 - 1109 2016.12

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ICPADS.2016.143

DOI： 10.1109/icpads.2016.0145

Web of Science

researchmap
Communication Optimization of Distributed Memory FMM for Large Scale Boundary Element Methods Invited Reviewed

Yokota Rio

Simulation 35 ( 3 ) 147 - 153 2016.9

　More details

Language：Japanese Publishing type：Research paper (scientific journal) Publisher：Japan Society for Simulation Technology

researchmap
Fast Multipole Method as a Matrix-free Hierarchical Low-rank Approximation Reviewed

Rio Yokota

International Workshop on Eigenvalue Problems 2016.9

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
A Performance Model for the Communication in Fast Multipole Methods on HPC Platforms Reviewed

Huda Ibeid, Rio Yokota, David Keyes

International Journal of High Performance Computing Applications 30 ( 4 ) 423 - 437 2016.6

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1177/1094342016634819

Web of Science

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1405.6362v1
On the Trade-off between FMM and H^2 (HSS) Matrices Reviewed

Rio Yokota

計算工学 21 ( 4 ) 3498 - 3501 2016

　More details

Language：Japanese Publishing type：Research paper (scientific journal)

CiNii Books

researchmap
Scaling FMM with data-driven OpenMP tasks on multicore architectures Reviewed

Amer, A., Matsuoka, S., Pericàs, M., Maruyama, N., Taura, K., Rio Yokota, Balaji, P.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9903 LNCS 156 - 170 2016

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1007/978-3-319-45550-1_12

Web of Science

researchmap
Preconditioning Sparse Matrices Using a Highly Scalable Fast Multipole Method Reviewed

Rio Yokota, Huda Ibeid, David Keyes

3rd International Workshops on Advances in Computational Mechanics 2015.10

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Multi-Level Restricted Maximum Likelihood Covariance Estimation and Kriging for Large Non-Gridded Spatial Datasets Reviewed

Julio E. Castrillon-Candas, Marc G. Genton, Rio Yokota

Spatial Statistics 18 ( 18 ) 105 - 124 2015.4

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.spasta.2015.10.006

Web of Science

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1504.00302v2
N-body methods Reviewed

Mustafa AbdulJabbar, Rio Yokota

2014.11

　More details

Language：English Publisher：Morgan Kaufmann

DOI： 10.1016/B978-0-12-802118-7.00010-8

researchmap
Fast Multipole Preconditioners for Sparse Linear Solvers Reviewed

Huda Ibeid, Rio Yokota, Jennifer Pestana, David Keyes

Proceedings of the 11th World Congress on Computational Mechanics 2014.7

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Communication Complexity of the Fast Multipole Method and its Algebraic Variants Invited Reviewed

Rio Yokota, George Turkiyyah, David Keyes

Supercomputing Frontiers and Innovations 1 ( 1 ) 63 - 84 2014.6

　More details

Language：English Publishing type：Research paper (scientific journal)

A combination of hierarchical tree-like data structures and data access
patterns from fast multipole methods and hierarchical low-rank approximation of
linear operators from H-matrix methods appears to form an algorithmic path
forward for efficient implementation of many linear algebraic operations of
scientific computing at the exascale. The combination provides asymptotically
optimal computational and communication complexity and applicability to large
classes of operators that commonly arise in scientific computing applications.
A convergence of the mathematical theories of the fast multipole and H-matrix
methods has been underway for over a decade. We recap this mathematical
unification and describe implementation aspects of a hybrid of these two
compelling hierarchical algorithms on hierarchical distributed-shared memory
architectures, which are likely to be the first to reach the exascale. We
present a new communication complexity estimate for fast multipole methods on
such architectures. We also show how the data structures and access patterns of
H-matrices for low-rank operators map onto those of fast multipole, leading to
an algebraically generalized form of fast multipole that compromises none of
its architecturally ideal properties.

DOI： 10.14529/jsfi140104

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1406.1974v1
Petascale molecular dynamics simulation using the fast multipole method on K computer Reviewed

Ohno Yousuke, Yokota Rio, Koyama Hiroshi, Morimoto Gentaro, Hasegawa Aki, Masumoto Gen, Okimoto Noriaki, Hirano Yoshinori, Ibeid Huda, Narumi Tetsu, Taiji Makoto

Computer Physics Communications 185 ( 10 ) 2575 - 2585 2014.6

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.cpc.2014.06.004

Web of Science

researchmap
Fast N-body Methods as a Compute-Bound Preconditioner for Sparse Solvers on GPUs Reviewed

YOKOTA Rio

GPU Technology Conference 2014.3

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
High Performance Numerical Algorithms for Seismic and Reservoir Simulations Reviewed

LTAIEF Hatem, YOKOTA Rio

GPU Technology Conference 2014.3

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Fast Multipole Method Preconditioning Reviewed

PESTANA Jennifer, YOKOTA Rio, IBEID Huda, KEYES David

International Conference On Preconditioning Techniques For Scientific And Industrial Applications 2013.6

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Investigating New Numerical Techniques for Reservoir Simulations on GPUs Reviewed

ABDELFETTAH Ahmad, LTAIEF Hatem, YOKOTA Rio

GPU Technology Conference 2013.3

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM Reviewed

Abdelhalim Amer, Naoya Maruyama, Miquel Pericàs, Kenjiro Taura, Rio Yokota, Satoshi Matsuoka

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7905 255 - 266 2013

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-642-38750-0_19

Scopus

researchmap
A Task Parallelism Meets Fast Multipole Methods Reviewed

TAURA Kenjiro, NAKASHIMA Jun, YOKOTA Rio, MARUYAMA, Naoya

Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems 2012.11

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
An FMM Based on Dual Tree Traversal for Many-core Architectures Reviewed

Rio Yokota

Journal of Algorithms and Computational Technology 7 ( 3 ) 301 - 324 2012.9

　More details

Language：English

The present work attempts to integrate the independent efforts in the fast
N-body community to create the fastest N-body library for many-core and
heterogeneous architectures. Focus is placed on low accuracy optimizations, in
response to the recent interest to use FMM as a preconditioner for sparse
linear solvers. A direct comparison with other state-of-the-art fast N-body
codes demonstrates that orders of magnitude increase in performance can be
achieved by careful selection of the optimal algorithm and low-level
optimization of the code. The current N-body solver uses a fast multipole
method with an efficient strategy for finding the list of cell-cell
interactions by a dual tree traversal. A task-based threading model is used to
maximize thread-level parallelism and intra-node load-balancing. In order to
extract the full potential of the SIMD units on the latest CPUs, the inner
kernels are optimized using AVX instructions. Our code -- exaFMM -- is an order
of magnitude faster than the current state-of-the-art FMM codes, which are
themselves an order of magnitude faster than the average FMM code.

DOI： 10.1260/1748-3018.7.3.301

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1209.3516v3
Petascale turbulence simulation using a highly parallel fast multipole method Reviewed

Yokota Rio, Barba Lorena, Narumi Tetsu, Yasuoka Kenji

Computer Physics Communications 184 ( 3 ) 445 - 455 2012.9

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.cpc.2012.09.011

Web of Science

Scopus

researchmap
Vortex method analysis of homogeneous isotropic turbulence at 4096³ scale using 4096 GPUs

Rio Yokota, Lorena Barba, Tetsu Narumi

Tsubame ESJ. : e-science journal 6 1 - 6 2012.7

　More details

Language：Japanese Publisher：Global Scientific Information and Computing Center, Tokyo Institute of Technology

researchmap
Petascale Fast Multipole Methods on GPUs Reviewed

YOKOTA Rio

The 11th International Symposium on Parallel and Distributed Computing 2012.6

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Data-Driven Fast Multipole Method on Distributed Memory Systems with Hardware Accelerators Reviewed

Hatem Ltaief, Rio Yokota

21st International Conference on Domain Decomposition Methods 2012.6

　More details

Authorship：Last author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Scalable Force Directed Graph Layout Algorithms Using Fast Multipole Methods Reviewed

Enas Yunis, Rio Yokota, Aron Ahmadia

Proceedings of the 11th International Symposium on Parallel and Distributed Computing 180 - 187 2012.6

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/ispdc.2012.32

researchmap
Recent Trends in Hierarchical N-body Methods on GPUs Reviewed

YOKOTA Rio, BARBA Lorena

GPU Technology Conference 2012.5

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
A Parallel Numerical Simulation of Dust Particles Using Direct Numerical Simulation Reviewed

Hoang Vu Nguyen, Rio Yokota, Georgiy Stenchikov

Proceedings of the European Geosciences Union General Assembly 2012.4

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Ensemble Reviewed

Rio Yokota

(2012) 14 ( 2 ) 85 - 89 2012.4

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.11436/mssj.14.85

researchmap
Data-Driven Execution of Fast Multipole Methods Reviewed

Hatem Ltaief, Rio Yokota

Concurrency and Computation: Practice and Experience 26 ( 11 ) 1935 - 1946 2012.3

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1002/cpe.3132

Web of Science

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1203.0889v1
Optimization of Molecular Dynamics Core Program on the K computer Reviewed

Yousuke Ohno, Rio Yokota, Hiroshi Koyama, Gentaro Morimoto, Aki Hasegawa, Gen Masumoto, Tetsu Narumi, Makoto Taiji

Proceedings of JSST 2012 International Conference on Simulation Technology 2012

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
FMM Tree Construction on GPUs Reviewed

YOKOTA Rio

Ensemble 14 ( 2 ) 85 - 89 2012

　More details

Language：Japanese Publishing type：Research paper (scientific journal)

DOI： 10.11436/mssj.14.85

researchmap
A Task Parallel Implementation of Fast Multipole Methods Reviewed

Kenjiro Taura, Jun Nakashima, Rio Yokota, Naoya Maruyama

2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC) 617 - 625 2012

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1109/SC.Companion.2012.86

Web of Science

researchmap
Scalable Fast Multipole Methods for Vortex Element Methods Reviewed

Qi Hu, Nail A. Gumerov, Rio Yokota, Lorena Barba, Ramani Duraiswami

2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC) 2012

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Scalable Fast Multipole Methods for Vortex Element Methods Reviewed

Qi Hu, Nail A. Gumerov, Rio Yokota, Lorena Barba, Ramani Duraiswami

2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC) 1408 - + 2012

　More details

Language：English Publishing type：Research paper (international conference proceedings)

Web of Science

researchmap
Petascale Turbulence Simulation Using FMM Reviewed

Rio Yokota, Tetsu Narumi, Lorena Barba, Kenji Yasuoka

IPSJ SIG Notes 2011 ( 29 ) 1 - 8 2011.11

　More details

Language：Japanese Publishing type：Research paper (international conference proceedings) Publisher：Information Processing Society of Japan (IPSJ)

Fast multipole methods (FMM) were originally developed for accelerating N-body problems in astrophysics and other particle based methods. A recent trend in HPC has been to use FMMs in unconventional application areas. We have performed a 2048³ turbulence calculation using an FMM designed for large scale GPU systems. The proposed method uses a hybridization of the treecode and FMM, and combines the data-parallel treecode with the O(N) FMM. The run on TSUBAME 2.0 using 4096 GPUs achieved 74% parallel efficiency, and the sustained performance reached 1.01 PFlops.

CiNii Books

researchmap
FMM-based vortex method for simulation of isotropic turbulence on GPUs, compared with a spectral method Reviewed

Rio Yokota, L. A. Barba

Computers and Fluids 80 ( 80 ) 17 - 27 2011.10

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.compfluid.2012.08.002

Web of Science

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1110.2921v4
Hierarchical N-body simulations with auto-tuning for heterogeneous systems Reviewed

Rio Yokota, Lorena A. Barba

Computing in Science and Engineering 14 ( 3 ) 30 - 39 2011.8

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1109/MCSE.2012.1

Web of Science

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1108.5815v2
A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems Reviewed

Rio Yokota, Lorena Barba

International Journal of High Performance Computing Applications 26 ( 4 ) 337 - 346 2011.6

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1177/1094342011429952

Web of Science

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1106.2176v2
Parameter Tuning of a Hybrid Treecode-FMM on GPUs Reviewed

YOKOTA Rio, BARBA Lorena

The First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems 2011.6

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Comparing the Treecode with FMM on GPUs for Vortex Particle Simulations of a Leapfrogging Vortex Ring Reviewed

Rio Yokota, L. A. Barba

COMPUTERS & FLUIDS 45 ( 1 ) 155 - 161 2011.6

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.compfluid.2010.11.029

Web of Science

researchmap
Fast Multipole Method vs. Spectral Methods for the Simulation of Isotropic Turbulence on GPUs Reviewed

Rio Yokota, Lorena Barba

Proceedings of the 23rd International Conference on Parallel Computational Fluid Dynamics 2011.5

　More details

Authorship：Lead author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Treecode and fast multipole method for N-body simulation with CUDA Reviewed

Rio Yokota, Lorena Barba

2011.2

　More details

Language：English Publisher：Morgan Kaufmann

DOI： 10.1016/B978-0-12-384988-5.00009-7

researchmap
Vortex methods for the simulation of turbulent flows Reviewed

Rio Yokota, Shinnosuke Obi

Journal of Fluid Science and Technology 6 ( 1 ) 14 - 29 2011.1

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1299/jfst.6.14

researchmap
N-body Simulation and FMM on the Large-scale GPU Cluster at Nagasaki University Reviewed

YOKOTA Rio, HAMADA Tsuyoshi

Journal of the Japan Society for Computational Engineering and Science 15 ( 4 ) 2416 - 2419 2010.10

　More details

Language：Japanese Publishing type：Research paper (scientific journal)

CiNii Books

researchmap
(Really) Fast macromolecular electrostatics – fast algorithms, open software and accelerated computing Reviewed

YOKOTA Rio, BARDHAN Jaydeep, KNEPLEY Matt, BARBA Lorena

ACS Division of Physical Chemistry 240th National Meeting 2010.8

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUs and a billion unknowns Reviewed

Rio Yokota, Jaydeep P. Bardhan, Matthew G. Knepley, L. A. Barba, Tsuyoshi Hamada

COMPUTER PHYSICS COMMUNICATIONS 182 ( 6 ) 1272 - 1283 2010.7

　More details

Language：English

DOI： 10.1016/j.cpc.2011.02.013

Web of Science

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1007.4591v3
Performance of the Fast Multipole Method on GPUs Using Various Kernels Reviewed

Rio Yokota, Lorena Barba

Proceedings of the 9th World Congress on Computational Mechanics 2010.7

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Comparing vortex methods and finite difference methods in a homogeneous turbulent shear flow Reviewed

R. Yokota, S. Obi

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS 63 ( 7 ) 828 - 846 2010.7

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1002/fld.2102

Web of Science

researchmap
Comparing the Treecode with FMM on GPUs for Vortex Particle Simulations of a Leapfrogging Vortex Ring Reviewed

Rio Yokota, Lorena Barba

22nd International Conference on Parallel Computational Fluid Dynamics 2010.5

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Lagrangian simulation of turbulence using vortex methods Reviewed

YOKOTA Rio, OBI Shinnosuke

2nd International Workshops on Advances in Computational Mechanics 2010.3

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
42 TFlops Hierarchical N-body Simulation on GPUs with Applications in both Astrophysics and Turbulence Reviewed

HAMADA Tsuyoshi, YOKOTA Rio, NITADORI Keigo, NARUMI Tetsu, YASUOKA Kenji, TAIJI Makoto, OGURI Kiyoshi

Supercomputing 2009 1 - 12 2009.11

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1145/1654059.1654123

researchmap
PetRBF--A parallel O(N) algorithm for radial basis function interpolation Reviewed

Rio Yokota, L. A. Barba, Matthew G. Knepley

COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING 199 ( 25 ) 1793 - 1804 2009.9

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.cma.2010.02.008

Web of Science

arXiv

researchmap

Other Link： http://arxiv.org/pdf/0909.5413v1
Fast Multipole Methods on GPUs for the Meshfree Simulation of Turbulence Reviewed

Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Kenji Yasuoka, Shinnosuke Obi

Proceedings of the 10th US National Congress on Computational Mechanics 2009.7

　More details

Authorship：Lead author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Fast multipole methods on a cluster of GPUs for the meshless simulation of turbulence Reviewed

Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Shun Kameoka, Shinnosuke Obi, Kenji Yasuoka

Computer Physics Communications 180 ( 11 ) 2066 - 2078 2009.6

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.cpc.2009.06.009

researchmap
DNS of Homogeneous Turbulence Using Vortex Methods Accelerated by the FMM on a Cluster of GPUs Reviewed

Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Shun Kameoka, Kenji Yasuoka, Shinnosuke Obi

Proceedings of the 21st International Conference on Parallel Computational Fluid Dynamics 2009.5

　More details

Authorship：Lead author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Lagrangian Vortex Methods in Turbulent Channel Flows Reviewed

R. Yokota, K. Fukagata, S. Obi

ADVANCES IN TURBULENCE XII - PROCEEDINGS OF THE 12TH EUROMECH EUROPEAN TURBULENCE CONFERENCE 132 893 - 893 2009

　More details

Language：English Publishing type：Research paper (international conference proceedings)

DOI： 10.1007/978-3-642-03085-7_214

Web of Science

researchmap
2002 Numerical simulation of biological cell behavior in a micro channel by using CIP-Level Set method Reviewed

TAMURA Shuichi, YOKOTA Rio, FUKAGATA Koji

The Proceedings of The Computational Mechanics Conference 2009 ( 0 ) 546 - 547 2009

　More details

Language：Japanese Publisher：The Japan Society of Mechanical Engineers

We developed a numerical method based on the Level Set method, which can simulate biological cell behavior in micro channels. The isotropic compliant wall model is used to model the cell membrane, and the surface force is calculated from the local displacement of the membrane computed by using the Level Set function. The behavior of the biological cell model is investigated in a channel with sudden expansion and contraction. The shape of the cell model is found to be elongated in the contracted region and semicircular in the expanded region. The collision between two cell models in a T-junction is also investigated. When the cell models touch each other, their boundaries overlap. A repulsion model is proposed to avoid this overlap.

DOI： 10.1299/jsmecmd.2009.22.546

CiNii Books

researchmap
Validation of Vortex Methods for a Turbulent Channel Flow Reviewed

Yokota Rio, Obi Shinnosuke

2009 253 - 253 2009

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

The vortex method is applied to the calculation of a turbulent channel flow of Re_b=5600, and the results are compared with a finite difference calculation. The fast multipole method was modified for the two-way periodic boundary condition. The particle strength exchange was selected as the viscous diffusion scheme. The wall vorticity flux is calculated exactly, using a Neumann condition for the vorticity equation at the wall. The mean velocity profile agrees quantitatively between the vortex method and finite difference method.

CiNii Books

researchmap
Meshfree Simulation of Turbulence Using the Fast Multipole Methods on GPUs Reviewed

Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Shun Kameoka, Kenji Yasuoka, Shinnosuke Obi

Proceedings of the 22nd Symposium on Computational Fluid Dynamics 2008.12

　More details

Authorship：Lead author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Direct Numerical Simulation of Homogeneous Shear Flow Using Vortex Methods Reviewed

Rio Yokota, Shinnosuke Obi

Proceedings of the 4th International Conference on Vortex Flows and Vortex Models 2008.4

　More details

Authorship：Lead author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
Mesh-Free Simulation of the Homogeneous Shear Flow Using Vortex Methods Reviewed

Rio Yokota, Shinnosuke Obi

Proceedings of the 23rd IIS Turbulence and Shear Flow Dynamics Symposium 2008.3

　More details

Authorship：Lead author Language：English Publishing type：Research paper (international conference proceedings)

researchmap
The Study of Colliding Vortex Rings Using a Special-Purpose Computer and FMM Reviewed

Tarun K. Sheel, Rio Yokota, Kenji Yasuoka, Shinnosuke Obi

Transactions of the Japan Society for Computational Engineering and Science 20080003 2008.1

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.11421/jsces.2008.20080003

researchmap
Vortex Method Calculation of a Turbulent Channel Flow Reviewed

Yokota Rio, Fukagata Koji, Obi Shinnosuke

2008 337 - 337 2008

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

The vortex method is applied to the calculation of a turbulent channel flow of Re_b=5600, and the results are compared with a finite difference calculation. The mean velocity profile agrees quantitatively between the vortex method and finite difference method for the duration of one walkout time.

CiNii Books

researchmap
Computation of wing-tip vortex by a three-dimensional vortex method Reviewed

SATO Akira, YOKOTA Rio, OBI Shinnosuke

21st Symposium on Computational Fluid Dynamics 2007.12

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Calculation of isotropic turbulence using a pure Lagrangian vortex method Reviewed

R. Yokota, T. K. Sheel, S. Obi

JOURNAL OF COMPUTATIONAL PHYSICS 226 ( 2 ) 1589 - 1606 2007.10

　More details

Language：English Publishing type：Research paper (scientific journal)

DOI： 10.1016/j.jcp.2007.06.003

Web of Science

researchmap
Pure Lagrangian vortex methods for the simulation of decaying isotropic turbulence Reviewed

YOKOTA RIO, OBI SHINNOSUKE

Proceedings of 5th Int. Symp. Turbulent Shear Flow Phenomena 365 - 370 2007.8

　More details

Language：English Publishing type：Research paper (conference, symposium, etc.)

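Homogeneous isotropic turbulence was computed using the vortex method. Through a direct comparison with DNS, which had not previously been carried out, the two methods were compared, and properties such as energy conservation, which had not been sufficiently evaluated for vortex methods, were examined.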

DOI： 10.1615/tsfp5.560

researchmap
Vortex Methods for the Calculation of Homogeneous Shear Flows Reviewed

Yokota Rio, Obi Shinnosuke

2007 195 - 195 2007

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

The vortex method is applied to the calculation of a homogeneous shear flow of Re_λ=25, and the results are compared with a finite difference calculation. The fast multipole method was modified for the shear periodic boundary condition. The core spreading method and particle strength exchange were selected as the viscous diffusion schemes. The time evolution of the energy spectrum, kinetic energy, and enstrophy is shown for these different cases. The component energy ratio and particle density distribution were also examined for all cases.

CiNii Books

researchmap
662 Mesh-free Turbulence Simulation Using Vortex Methods Reviewed

YOKOTA Rio, OBI Shinnosuke

The Proceedings of Conference of Tokai Branch 2007 ( 0 ) 323 - 324 2007

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.) Publisher：The Japan Society of Mechanical Engineers

DOI： 10.1299/jsmetokai.2007.56.323

CiNii Books

researchmap
Vortex flow simulation between multiple bridge decks Reviewed

YOKOTA Rio, OBI Shinnosuke

Whither Turbulence Prediction and Control 2006.3

　More details

Language：English Publishing type：Research paper (international conference proceedings)

researchmap
2016 Simulation of a Wake using a 3-D Vortex Element Method Reviewed

YOKOTA Rio, OBI Shinnosuke

The proceedings of the JSME annual meeting 2006 ( 0 ) 31 - 32 2006

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.) Publisher：The Japan Society of Mechanical Engineers

The wake of a bluff body is calculated using a 3-D Vortex Method and turbulence statistics are calculated from the results, which are then compared with measurements by Particle Image Velocimetry and a URANS calculation. It is shown that, compared to a 2-D Vortex Method calculation previously performed by the authors, the results of the 3-D calculation are much closer to the PIV measurements and URANS calculation.

DOI： 10.1299/jsmemecjo.2006.1.0_31

CiNii Books

researchmap
1202 Calculation of Fluid Structure Interaction using VEM and BEM(1) Reviewed

Yokota Rio, Obi Shinnosuke

The Proceedings of the Fluids engineering conference 2006 ( 0 ) _1202 - a_ 2006

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.) Publisher：The Japan Society of Mechanical Engineers

Fluid structure interaction of a circular cylinder is simulated by coupling a Vortex Element Method and a Boundary Element Method. Both methods are volume-mesh-free, and thus alleviate the burden of handling moving boundary problems in general. The rigid vibration of the cylinder is examined first and compared with experimental studies. Then, the elastic deformation is considered.

DOI： 10.1299/jsmefed.2006._1202-a_

CiNii Books

researchmap
Vortex Flow Simulation around Multiple Bluff Bodies Reviewed

Rio Yokota, Shinnosuke Obi

2005.12

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.)

researchmap
Vortex flow simulation of multiple bluff bodies

YOKOTA Rio, TOKAI Norihiko, OBI Shinnosuke

The Proceedings of The Computational Mechanics Conference 2004 ( 0 ) 731 - 732 2004

　More details

Language：Japanese Publishing type：Research paper (conference, symposium, etc.) Publisher：The Japan Society of Mechanical Engineers

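To compare and validate numerical prediction methods for turbulence, the flow around bluff (non-streamlined) bodies placed in a uniform flow was computed with the vortex method and with unsteady RANS. For the two-dimensional calculations, no large difference between the two was observed in either accuracy or computation time.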

DOI： 10.1299/jsmecmd.2004.17.731

CiNii Books

researchmap


Books

Huge Matrices and AI

Rio Yokota（ Role： Contributor）

2020.2

　More details

Responsible for pages：29-33 Language：Japanese

researchmap
Supercomputing Contest 2019

Rio Yokota（ Role： Contributor）

2020.1

　More details

Responsible for pages：44-49 Language：Japanese

researchmap
High Performance Parallelism Pearls

YOKOTA Rio（ Role： Joint editor, N-body methods on Xeon Phi coprocessors）

Morgan Kaufmann 2014.11 （ ISBN:9780128021187 ）

　More details

Language：English

DOI： 10.1016/B978-0-12-802118-7.00010-8

researchmap
GPU Computing Gems

Rio Yokota, Lorena Barba（ Role： Contributor, Treecode and fast multipole method for N-body simulation with CUDA）

Morgan Kaufmann 2011 （ ISBN:0123849888 ）

　More details

Language：English

DOI： 10.1016/B978-0-12-384988-5.00009-7

researchmap

MISC

Efficient Logical Reasoning via Reinforcement Learning of Large Language Models

Daisuke Nohara, Taishi Nakamura, Rio Yokota

2026.3

　More details

Language：Japanese

researchmap
Improving Japanese OCR Performance in Vision-Language Models

Shungo Yasuda, Rio Yokota

2026.3

　More details

Language：Japanese

researchmap
Model Compression of BERT Using One-Shot NAS

Takumi Okamoto, Rio Yokota

2024.5

　More details

Language：Japanese

DOI： 10.11517/pjsai.JSAI2024.0_2M5OS2405

researchmap
Building a Large Language Model with Strong Japanese Capabilities through Continual Pre-training

Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Shota Hirai, Sakae Mizuki, Rio Yokota, Naoaki Okazaki

2024.3

　More details

Language：Japanese

researchmap
Development of an Efficient Multilingual Multi-Expert Model Using Continual Learning

Taishi Nakamura, Rio Yokota

2024.3

　More details

Language：Japanese

researchmap
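Efficiently Enhancing the Japanese Capabilities of Large Language Models: Vocabulary Expansion and Use of Parallel Corpora in Continual Pre-training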

Sakae Mizuki, Hiroki Iida, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Masanari Ohi, Kakeru Hattori, Shota Hirai, Rio Yokota, Naoaki Okazaki

2024.3

　More details

Language：Japanese

researchmap
Swallow Corpus: A Large-Scale Japanese Web Corpus

Naoaki Okazaki, Kakeru Hattori, Shota Hirai, Hiroki Iida, Masanari Ohi, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Rio Yokota, Sakae Mizuki

2024.3

　More details

Language：Japanese

researchmap
Architecture Search for Large Language Models

Takumi Okamoto, Rio Yokota

2024.3

　More details

Language：Japanese

researchmap
Distributed Parallel Training of Large Language Models

Kazuki Fujii, Rio Yokota

2024.3

　More details

Language：Japanese

researchmap
Dense Bundle Adjustment with Depth Consistency for Visual SLAM

Kai Okawa, Ken Sakurada, Rio Yokota

2024.1

　More details

Language：Japanese

researchmap
Attempt to improve the computational performance by multi-processes GPU execution

大島聡史, 伊田明弘, 河合直聡, 深谷猛, 横田理央, 山崎市太郎

Proceedings of the Conference on Computational Engineering and Science (CD-ROM) 29 2024

　More details

Language：Japanese

J-GLOBAL

researchmap
Elucidating the Core Elements of Super-Resolution Pre-training

Go Ohtani, Ryu Tadokoro, Hirokatsu Kataoka, Nakamasa Inoue, Rio Yokota, Yoshimitsu Aoki

2023.12

　More details

Language：Japanese

researchmap
Two-way coupled simulation of quantum turbulence and normal-fluid turbulence in superfluid helium-4

Hiromichi Kobayashi, Satoshi Yui, Makoto Tsubota, Tomokazu Saito, Rio Yokota

APS DFD 76th Annual Meeting of the Division of Fluid Dynamics 2023.11

　More details

Language：English

researchmap
Two-phase flow of quantum turbulence and normal-fluid turbulence in superfluid helium-4

Hiromichi Kobayashi, Satoshi Yui, Makoto Tsubota, Rio Yokota

ERCOFTAC Symposium on Engineering Turbulence Modeling and Measurements (ETMM) 2023.9

　More details

Language：English

researchmap
Quantum Turbulence in Superfluid Helium-4: Formation of Vortex-Filament Bundles by Normal-Fluid Turbulence

Satoshi Yui, Hiromichi Kobayashi, Makoto Tsubota, Tomokazu Saito, Rio Yokota

2023.9

　More details

Language：Japanese

DOI： 10.11316/jpsgaiyo.78.2.0_1275

researchmap
Vortex-Filament Bundle Induced by Normal-Fluid Turbulence in Turbulent Superfluid Helium-4

Satoshi Yui, Hiromichi Kobayashi, Makoto Tsubota, Tomokazu Saito, Rio Yokota

International Symposium on Quantum Fluids and Solids (QFS) 2023.8

　More details

Language：English

researchmap
Vortex-filament bundle induced by normal-fluid turbulence in turbulent superfluid helium-4

Hiromichi Kobayashi, Satoshi Yui, Makoto Tsubota, Tomokazu Saito, Rio Yokota

International Symposium on Quantum Fluids and Solids (QFS) 2023.8

　More details

Language：English

researchmap
Quantum Circuit Simulation Using Automatic Precision Selection for Single-Precision Matrix Multiplication Emulation on Tensor Cores

Hiroyuki Ootomo, Hidetaka Manabe, Kenji Harada, Rio Yokota

2023.3

　More details

Language：Japanese

researchmap
Performance Comparison of Transformers on GPUs and A64FX

Shukai Nakamura, Rio Yokota

2023.3

　More details

Language：Japanese

researchmap
Pre-training Effect of Newton Fractal Images

Toshiki Omi, Ryo Nakamura, Hirokatsu Kataoka, Nakamasa Inoue, Rio Yokota

2023.3

　More details

Language：Japanese

researchmap
A Study of Gradient Preconditioning Methods in Deep Learning

Satoki Ishikawa, Rio Yokota

2023.3

　More details

Language：Japanese

researchmap
Accelerating Quantum Vortex Calculations Using the Fast Multipole Method

Tomokazu Saito, Rio Yokota

2023.3

　More details

Language：Japanese

researchmap
O(N) Factorization of Dense Matrices on GPUs Without Trailing Submatrix Dependencies

Qianxiang Ma, Rio Yokota

SIAM Conference on Computational Science and Engineering (CSE) 2023.2

　More details

Language：English

researchmap
Parallel QR Factorization of Block Low-Rank Matrices

Muhammad Ridwan Apriansyah, Rio Yokota

SIAM Conference on Computational Science and Engineering (CSE) 2023.2

　More details

Language：English

researchmap
Large-Scale Acceleration of QR Factorization of BLR Matrices Using CUDA Fortran+MIG+UVM

Satoshi Ohshima, Akihiro Ida, Masatoshi Kawai, Rio Yokota, Ichitaro Yamazaki

2023 ( HPC-190 ) 2023

　More details

Language：Japanese

J-GLOBAL

researchmap
Verified Eigenvalue Solver for Symmetric Block Low-Rank Matrices

Akihiro Ida, Takeshi Ogita, Rio Yokota

2022.9

　More details

Language：Japanese

researchmap
Evaluation in Applications of Single-Precision Matrix Multiplication Emulation Using Tensor Cores

Hiroyuki Ootomo, Rio Yokota

2022.7

　More details

Language：Japanese

researchmap
Study and Planning of Large-Scale Self-Supervised Learning on Driving Videos

Tomoya Takahashi, Shingo Yashima, Kohta Ishikawa, Ikuro Sato, Rio Yokota

2022.7

　More details

Language：Japanese

researchmap
Model Reduction Effect of NAS During Fine-Tuning of ViT

Xinyu Zhang, Sora Takashima, Rio Yokota

2022.6

　More details

Language：Japanese

DOI： 10.11517/pjsai.JSAI2022.0_3J4OS3b03

researchmap
Effect of Batch Size on Generalization Performance in Vision Transformers

Shukai Nakamura, Rio Yokota

2022.3

　More details

Language：Japanese

researchmap
Verification of the Generalization Performance of Second-Order Optimization in Deep Learning

Hiro Ishii, Rio Yokota

2022.3

　More details

Language：Japanese

researchmap
QR Factorization of BLR Matrices on Multi-Instance GPUs

大島聡史, 伊田明弘, 横田理央, 山崎市太郎

日本応用数理学会年会講演予稿集(CD-ROM) 2022 2022

　More details

J-GLOBAL

researchmap
Precision-Corrected Single-Precision Matrix Multiplication Using TensorCore

Hiroyuki Ootomo, Rio Yokota

2021.7

　More details

Language：Japanese

researchmap
Verification of the Robustness of Unsupervised Representation Learning in Continual Pre-training for Image Classification

Hikaru Nakata, Rio Yokota

2020.6

　More details

Language：Japanese

DOI： 10.11517/pjsai.JSAI2020.0_2J5GS201

researchmap
Development of an Extension Library Based on Structural Analysis of the Tensor Core API

Hiroyuki Ootomo, Rio Yokota

2020.3

　More details

Language：Japanese

researchmap
Verification of the Usefulness of Stochastic Weight Averaging in Large-Batch Training

Takahiro Shohata, Hiroki Naganuma, Rio Yokota

2020 ( 1 ) 359 - 360 2020.2

　More details

Language：Japanese

CiNii Books

CiNii Research

researchmap
Predicting Early Stopping Timing: Change-Point Detection in the Distribution of Stochastic Gradients in Deep Learning

Keita Yashima, Kohta Ishikawa, Ikuro Sato, Tetsuhiro Nomura, Rio Yokota, Satoshi Matsuoka

2019.11

　More details

Language：Japanese

researchmap
Flexible and Simplistic Hierarchical Matrix-Based Fast Direct Solver

P. Spalthoff, R. Yokota

The 170th Workshop on High Performance Computing 2019.7

　More details

Language：Japanese

researchmap
GPU Implementation of TSQR Using Tensor Cores

H. Ootomo, R. Yokota

The 170th Workshop on High Performance Computing 2019.7

　More details

Language：Japanese

researchmap
Improving the Generalization Gap in Large-batch Training Using Noise Injection

Hiroki Naganuma, Rio Yokota

IEICE General Conference 2019.3

　More details

Language：Japanese

researchmap
Second Order Optimization for Large Scale Parallel Deep Learning

Rio Yokota, Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse

IEICE General Conference 2019.3

　More details

Language：Japanese

researchmap
Batched QR Decomposition Using TensorCores

Hiroyuki Ootomo, Rio Yokota

The 81st National Convention of IPSJ 2019.3

　More details

Language：Japanese

researchmap
Variational Inference in Deep Learning Using Natural Gradient Descent

Hikaru Nakata, Kazuki Osawa, Rio Yokota

The 81st National Convention of IPSJ 2019.3

　More details

Language：Japanese

researchmap
Second Order Optimization for Distributed Data-parallel Deep Learning on 4000 GPUs

Rio Yokota, Yohei Tsuji, Kazuki Osawa

I2R-TokyoTech Co-workshop on DL 2.0 2019.3

　More details

Language：English

researchmap
Second-Order Optimization Method for Large-Scale Deep Learning Based on Analysis of the Fisher Information Matrix

Kazuki Osawa, Rio Yokota, Chuan-Sheng Foo, Vijay Chandrasekhar

2019.3

　More details

Language：Japanese

researchmap
Smoothing of the Objective Function for Large-Scale Parallel Deep Learning

Hiroki Naganuma, Rio Yokota

2019 ( 1 ) 315 - 316 2019.2

　More details

Language：Japanese

CiNii Books

CiNii Research

researchmap
Optimization of Many Small Dense Matrix-Vector Products toward Accelerating Hierarchical Matrix Computation on GPUs

Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

2019 2019

　More details

Language：Japanese

J-GLOBAL

researchmap
Accelerating Deep Learning with K-FAC and Distributed Processing

Yohei Tsuji, Kazuki Osawa, Rio Yokota, Satoshi Matsuoka

2018.7

　More details

Language：Japanese

researchmap
Software Auto-Tuning for Hierarchical Matrix Computation

大島聡史, 山崎市太郎, 伊田明弘, 横田理央

計算工学講演会論文集 Proceedings of the Conference on Computational Engineering and Science 23 2018.6

　More details

Language：Japanese Publisher：日本計算工学会

J-GLOBAL

researchmap
Deep Learning Using Kronecker-factored Approximation of Fisher Matrix

Hiroyuki Ohtomo, Kazuki Osawa, Rio Yokota

The 80th National Convention of IPSJ 2018.3

　More details

Language：Japanese

researchmap
Hyper-parameter Tuning for Approximate Natural Gradient Methods

Yuji Kuwamura, Kazuki Osawa, Rio Yokota

The 80th National Convention of IPSJ 2018.3

　More details

Language：Japanese

researchmap
Distributed Learning of Deep Neural Networks Using the Kronecker Factorization of the Fisher Information Matrix

Hiroyuki Otomo, Kazuki Osawa, Rio Yokota

The 163rd Workshop on High Performance Computing 2018.3

　More details

Language：Japanese

researchmap
Preface

Rio Yokota, Weigang Wu

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10776 LNCS V - VI 2018

　More details

Scopus

researchmap
Preface

Rio Yokota, Michèle Weiland, David Keyes, Carsten Trinitis

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 10876 LNCS V - VI 2018

　More details

Scopus

researchmap
Preface

Rio Yokota, Michèle Weiland, John Shalf, Sadaf Alam

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 11203 LNCS V 2018

　More details

Scopus

researchmap
Acceleration Using Low-Precision Arithmetic and Performance Evaluation of Accelerators in Deep Learning

Hiroki Naganuma, Akira Sekiya, Kazuki Osawa, Hiroyuki Ootomo, Yuji Kuwamura, Rio Yokota

2017.10

　More details

Language：Japanese

researchmap
Using Low-Rank Approximation in Convolutional Neural Networks

Yoshifumi Motoyama, Toshio Endo, Satoshi Matsuoka, Rio Yokota, Keisuke Fukuda

158th Research Presentation Seminar in High Performance Computing 2017-HPC-158 ( 25 ) 2017.3

　More details

Language：Japanese

researchmap
GPU Optimization of Hierarchical Matrix Computation

大島聡史, 山崎市太郎, 伊田明弘, 横田理央

日本応用数理学会年会講演予稿集(CD-ROM) 2017 2017

　More details

J-GLOBAL

researchmap
A Matrix-free Preconditioner for the Helmholtz Equation based on the Fast Multipole Method

Huda Ibeid, Rio Yokota, David Keyes

2016.8

　More details

Fast multipole methods (FMM) were originally developed for accelerating
$N$-body problems for particle-based methods. FMM is more than an $N$-body
solver, however. Recent efforts to view the FMM as an elliptic Partial
Differential Equation (PDE) solver have opened the possibility to use it as a
preconditioner for a broader range of applications. FMM can solve Helmholtz
problems with optimal $\mathcal{O}(N \log N)$ complexity, has compute-bound
inner kernels, and highly asynchronous communication patterns. The combination
of these features makes FMM an interesting candidate as a preconditioner for
sparse solvers on architectures of the future. The use of FMM as a
preconditioner allows us to use lower order multipole expansions than would be
required as a solver because individual solves need not be accurate. This
reduces the amount of computation and communication significantly and makes the
time-to-solution competitive with state-of-the-art preconditioners.
Furthermore, the high asynchronicity of FMM allows it to scale to much larger
core counts than factorization-based and multilevel methods. We describe our
tests in reproducible details with freely available codes.

arXiv

researchmap

Other Link： http://arxiv.org/pdf/1608.02461v1
A Matrix-Free Preconditioner for Elliptic Solvers Based on the Fast Multipole Method

Huda Ibeid, Rio Yokota, David Keyes

SIAM Conference on Parallel Processing for Scientific Computing 2016.4

　More details

Language：English

researchmap
Comparison of FMM and HSS at Large Scale

YOKOTA Rio, ROUET, Francois-Henri, LI Xiaoye

SIAM Conference on Applied Linear Algebra 2015.10

　More details

Language：English

researchmap
Fast Multipole Method as Preconditioner

Huda Ibeid, Jennifer Pestana, Rio Yokota, David Keyes

SIAM Conference on Computational Science and Engineering 2015.3

　More details

Language：English

researchmap
Asynchronous Execution of the Fast Multipole Method Using Charm++

Mustafa AbdulJabbar, Rio Yokota, David Keyes

arXiv:1405.7487 2014.5

　More details

Language：English

Fast multipole methods (FMM) on distributed memory have traditionally used
a bulk-synchronous model of communicating the local essential tree (LET) and
overlapping it with computation of the local data. This could be perceived as
an extreme case of data aggregation, where the whole LET is communicated at
once. Charm++ allows a much finer control over the granularity of
communication, and has an asynchronous execution model that fits well with the
structure of our FMM code. Unlike previous work on asynchronous fast N-body
methods such as ChaNGa and PEPC, the present work performs a direct comparison
between the traditional bulk-synchronous approach and the asynchronous approach
using Charm++. Furthermore, the serial performance of our FMM code is over an
order of magnitude better than these previous codes, so it is much more
challenging to hide the overhead of Charm++.

arXiv

researchmap

Other Link： http://arxiv.org/abs/1405.7487v1
Fast Multipole Method as a Preconditioner

IBEID Huda, YOKOTA Rio, KEYES David

SIAM Conference on Computational Science and Engineering 2013.2

　More details

Language：English

researchmap
Towards a Dataflow FMM using the OmpSs Programming Model

Miquel Pericas, Abdelhalim Amer, Keisuke Fukuda, Naoya Maruyama, Rio Yokota, Satoshi Matsuoka

研究報告ハイパフォーマンスコンピューティング(HPC) 2012 ( 12 ) 1 - 7 2012.9

　More details

Language：English

CiNii Books

researchmap
Parallelizing ExaFMM with MassiveThreads Task Parallel Library and Its Evaluation

Kenjiro Taura, Jun Nakashima, Rio Yokota, Naoya Maruyama

2012 ( 13 ) 1 - 13 2012.7

　More details

Language：Japanese

CiNii Books

researchmap
Turbulence Simulation Using 4096³ Vortex Particles on 4096 GPUs

Yokota Rio, Barba Lorena, Narumi Tetsu

Tsubame ESJ. : e-science journal 6 17 - 22 2012.7

　More details

Language：English Publisher：Global Scientific Information and Computing Center, Tokyo Institute of Technology

researchmap
Scaling Fast Multipole Methods up to 4000 GPUs

YOKOTA Rio, NARUMI Tetsu, BARBA Lorena, YASUOKA Kenji

ATIP/A*CRC Workshop on Accelerator Technologies for High Performance Computing 2012.5

　More details

Language：English

researchmap
Mesh-free direct numerical simulation of turbulence using the vortex method on parallel MDGRAPE-3 boards along with the fast multipole method

YOKOTA Rio, NARUMI Tetsu, YASUOKA Kenji, EBISUZAKI Toshikazu, OBI Shinnosuke

Next-Generation Supercomputing Symposium 2007.10

　More details

Language：Japanese

researchmap


Presentations

Matrices in Deep Neural Networks and How to Compute Them in Parallel Invited International conference

Rio Yokota

IEEE CLUSTER 2022 2022.9

　More details

Event date： 2022.9

Language：English Presentation type：Oral presentation (keynote)

Venue：Heidelberg, Germany

researchmap
Fast N-body Methods on Many-core and Heterogeneous Systems International conference

YOKOTA Rio

International Workshop on Computational Science and Numerical Analysis 2012.3

　More details

Language：English Presentation type：Oral presentation (invited, special)

Venue：Tokyo

researchmap
Petaflops Scale Turbulence Simulation on TSUBAME 2.0 International conference

YOKOTA Rio

GPU@BU Workshop 2011.11

　More details

Language：English Presentation type：Public lecture, seminar, tutorial, course, or other speech

researchmap
Large Scale Multi-GPU FMM for Bioelectrostatics International conference

YOKOTA Rio, BARBA Lorena

SIAM Conference on Computational Science and Engineering 2011.2

　More details

Language：English Presentation type：Oral presentation (general)

researchmap
12 Steps to a Fast Multipole Method on GPUs International conference

YOKOTA Rio

Pan-American Advanced Studies Institute 2011.1

　More details

Language：English Presentation type：Public lecture, seminar, tutorial, course, or other speech

researchmap
RBF interpolation using Gaussians with domain decomposition on GPUs International conference

YOKOTA Rio, BARBA Lorena

SIAM annual meeting 2010.7

　More details

Language：English Presentation type：Oral presentation (general)

researchmap
Calculation of the decay of colliding turbulent vortex rings

4th International Conference on Vortex Flows and Vortex Models 2008

　More details

researchmap
Simulation of a wake using 3-D vortex element method

Annual Meeting of the JSME 2006

　More details

researchmap
Vortex flow simulation between multiple bridge decks

Whither Turbulence Prediction and Control 2006

　More details

researchmap
Simulation of homogeneous isotropic turbulence using the vortex method

YOKOTA Rio, OBI Shinnosuke

20th Symposium on Computational Fluid Dynamics 2006.12

　More details

Language：Japanese Presentation type：Oral presentation (general)

researchmap
Range of Applications for the Fast Multipole Method on GPUs International conference

YOKOTA Rio

Accelerated Computing 2010.1

　More details

Language：English Presentation type：Oral presentation (general)

researchmap
Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis

Rise Ooi, Takeshi Iwashita, Takeshi Fukaya, Akihiro Ida, Rio Yokota

HPC Asia 2020 2020.1

　More details

Language：English Presentation type：Oral presentation (general)

researchmap
Advances in Fast Multipole Methods for Scalable Electrostatics Calculations Invited International conference

YOKOTA Rio

Workshop: Electrostatics methods in Molecular Simulation 2013.5

　More details

Language：English Presentation type：Oral presentation (invited, special)

researchmap
ExaFMM – a Testbed for Comparing Various Implementations of the FMM International conference

YOKOTA Rio

SIAM Conference on Computational Science and Engineering 2015.3

　More details

Language：English Presentation type：Oral presentation (general)

researchmap
Communication Complexity of the Fast Multipole Method and its Algebraic Variants International conference

YOKOTA Rio, KEYES David

CBMS-NSF Conference: Fast Direct Solvers for Elliptic PDEs 2014.6

　More details

Language：English Presentation type：Oral presentation (general)

researchmap
Various Implementations of FMM and Their Performance on Future Architectures International conference

YOKOTA Rio

Multi-resolution Interactions Workshop 2015.8

　More details

Language：English Presentation type：Oral presentation (general)

researchmap
Running Fast Multipole Method on the Full Node of TSUBAME and K computer International conference

YOKOTA Rio

Scalable Hierarchical Algorithms for Extreme Computing 2012.4

　More details

Language：English Presentation type：Oral presentation (general)

researchmap
Petascale Fast Multipole Methods on GPUs Invited International conference

YOKOTA Rio

The 11th International Symposium on Parallel and Distributed Computing 2012.6

　More details

Language：English Presentation type：Oral presentation (keynote)

researchmap
Compute-Memory Tradeoff in Hierarchical Low-Rank Approximation Methods International conference

Rio Yokota

SIAM Conference on Computational Science and Engineering 2017.2

　More details

Language：English Presentation type：Oral presentation (general)

Venue：Atlanta

researchmap
Energy Conservation of Fast Multipole Methods in Classical Molecular Dynamics Simulations Invited International conference

Rio Yokota

7th AICS International Symposium 2017.2

　More details

Language：English Presentation type：Oral presentation (invited, special)

Venue：Kobe

researchmap
Hierarchical Low-Rank Approximations at Extreme Scale Invited International conference

Rio Yokota

32nd International Conference, ISC High Performance 2017.6

　More details

Language：English Presentation type：Oral presentation (invited, special)

Venue：Frankfurt

researchmap
A Common API for Fast Multipole Methods International conference

YOKOTA Rio

Accelerate Data Analytics and Computing Workshop 2016.1

　More details

Language：English Presentation type：Symposium, workshop panel (nominated)

researchmap
Tuning Parameters in FMM Invited

YOKOTA Rio

Seventh Symposium on Automatic Tuning Technology and its Application 2015.12

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Improving Data Locality of Fast Multipole Methods International conference

YOKOTA Rio

Third Workshop on Programming Abstractions for Data Locality 2016.10

　More details

Language：English Presentation type：Symposium, workshop panel (public)

Venue：Kobe

researchmap
Fast Multipole Method Library for Multiple Architectures and its Application to Molecular and Fluid Simulations

YOKOTA Rio

8th Symposium of the Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures 2016.7

　More details

Language：Japanese Presentation type：Symposium, workshop panel (public)

researchmap
Performance Portability of FMM

YOKOTA Rio

21st Conference of Japan Computational Engineering Society 2016.5

　More details

Language：Japanese Presentation type：Oral presentation (general)

researchmap
Highly Parallel O(N) LU Factorization of Dense Matrices Invited

Rio Yokota

2022.12

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Large-Scale Pre-training of Vision Transformers Using Synthetic Images Invited

Rio Yokota

2022.9

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Large-Scale Pre-training of Vision Transformers Using Synthetic Images Invited

Rio Yokota

2022.7

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Scaling Deep Learning to Thousands of GPUs Invited International conference

Rio Yokota

HPC 2018 2018.7

　More details

Language：English Presentation type：Oral presentation (invited, special)

Venue：Cetraro

researchmap
Energy Conserving Fast Multipole Methods for the Calculation of Long-range Interactions Invited International conference

Rio Yokota

Mathematics in Action: Modeling and analysis in molecular biology and electrophysiology 2018.6

　More details

Language：English Presentation type：Oral presentation (invited, special)

Venue：Suzhou

researchmap
Optimization Methods for Large Scale Distributed Deep Learning Invited International conference

Rio Yokota

IPAM Workshop I: Big Data Meets Large-Scale Computing 2018.9

　More details

Language：English Presentation type：Oral presentation (invited, special)

Venue：Los Angeles

researchmap
Early Application Results on TSUBAME 3 Invited International conference

Rio Yokota

Smoky Mountains Computational Sciences and Engineering Conference 2018.8

　More details

Language：English Presentation type：Oral presentation (invited, special)

Venue：Gatlinburg

researchmap
Can we use Hierarchical Low-Rank Approximation for Deep Learning? Invited International conference

Rio Yokota

HPC Saudi 2018 2018.3

　More details

Language：English Presentation type：Oral presentation (keynote)

Venue：Jeddah

researchmap
Recent Trends in Hierarchical Low-Rank Approximation Methods Invited International conference

Rio Yokota

Tokyo Institute of Technology and Stony Brook University Joint Science and Technology Meeting 2019.5

　More details

Language：English Presentation type：Oral presentation (invited, special)

researchmap
Latest Research Trends and Applications of Hierarchical Low-Rank Approximation Methods Invited

Rio Yokota

2019.5

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Large-Scale Parallel Deep Learning on the ImageNet Benchmark Invited

Rio Yokota

2019.7

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Acceleration and Large-Scale Parallelization of Deep Learning Invited

Rio Yokota

2019.9

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Structure of Dense Matrices Appearing in Deep Learning Invited

Rio Yokota

2019.10

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Approximate Matrix Factorization and Distributed Deep Learning Invited

Rio Yokota

2019.11

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Degree of Approximation and Overhead of Computing Curvature, Information, and Noise Matrices Invited International conference

Rio Yokota

ICML Workshop "Beyond first order methods in machine learning systems" 2020.7

　More details

Language：English Presentation type：Oral presentation (invited, special)

researchmap
On the Auto-Tunable Parameters of FMM Invited

Rio Yokota

2015.12

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Portability of the Performance of FMM Invited

Rio Yokota

2016.5

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Kronecker Factorization for Second Order Optimization in Deep Learning Invited International conference

Rio Yokota

SIAM CSE 2019.2

　More details

Language：English Presentation type：Oral presentation (invited, special)

researchmap
Overview of Distributed Memory Parallelism in Deep Learning Invited International conference

Rio Yokota

DD26 MS01: Learning, Algorithms, Domain Decomposition Methods, and Applications 2020.12

　More details

Language：English Presentation type：Oral presentation (invited, special)

researchmap
Large-Scale Parallel Distributed Deep Learning Using Supercomputers Invited

Rio Yokota

2021.3

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Overview of structured low-rank approximation methods Invited International conference

Rio Yokota

IUTAM Symposium on Computational Methods for Large-Scale and Complex Wave Problems 2021.6

　More details

Language：English Presentation type：Oral presentation (invited, special)

researchmap
A Review of Hierarchical Low-Rank Approximation Methods Invited

Rio Yokota

2021.9

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Approximations of Natural Gradient Descent in Distributed Training Invited International conference

Rio Yokota

INFORMS Annual Meeting Session: Beyond first order methods in machine learning systems I 2021.10

　More details

Language：English Presentation type：Oral presentation (invited, special)

researchmap
Fundamentals of Large-Scale Parallel Deep Learning Invited

Rio Yokota

2021.10

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Large-Scale Distributed Deep Learning on ImageNet Using Second-Order Optimization Invited

Rio Yokota

2020.1

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Training of Huge Language Models Using Second-Order Optimization and Plasma Behavior Prediction Using FRNN Invited

Rio Yokota

2020.2

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap
Distributed Deep Learning with Second Order Information Invited International conference

Rio Yokota

SPCL_Bcast(COMM_WORLD) 2020.10

　More details

Language：English Presentation type：Oral presentation (invited, special)

researchmap
Fast Approximate Methods for Hessian, Fisher, and Covariance Matrices in Deep Learning Invited

Rio Yokota

2020.10

　More details

Language：Japanese Presentation type：Oral presentation (invited, special)

researchmap


Awards

ACM Gordon Bell Prize (price/performance)

2009

　More details

researchmap

Research Projects

High-Performance Urban Model Computation

2026 - 2029

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

　 More details

researchmap
Renewal of Scientific and Engineering Software to Unlock the Potential of Next-Generation Computers

Grant number：25H01109 2025.4 - 2028.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)

横田理央

　 More details

Grant amount：¥46020000 （ Direct Cost: ¥35400000 、 Indirect Cost：¥10620000 ）

researchmap
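Creation of GPU Educational Materials and Implementation of Educational Programs, and Promotion of Advanced Utilization of GPU Computing Resources Centered on Generative AI Technology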

2025 - 2030

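Ministry of Education, Culture, Sports, Science and Technology (MEXT) Next-Generation HPC and AI Development Support Center Program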

　 More details

researchmap
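Research toward Establishing a Methodology for Outcome Extraction Using Large Language Models (LLMs) for Evaluating the Efficacy and Safety of Pharmaceuticals and Related Products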

Grant number：24AC0401 2024.4 - 2027.3

Ministry of Health, Labour and Welfare Health and Labour Sciences Research Grants Applied Research

武藤学, 松本繁巳, 中島貴子, 黒田知宏, 吉原博幸, 小林慎治, 粂直人, 横田理央, 加藤康之

　 More details

researchmap
Expanding the Scope of Application of Low-Rank Structured Matrix Methods and Utilizing Diverse Computing Architectures

Grant number：24K02949 2024.4 - 2027.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

伊田明弘, 横田理央, 塙敏博, 岩下武史, 大島聡史, 星野哲也, 平石拓, 河合直聡

　 More details

Grant amount：¥18590000 （ Direct Cost: ¥14300000 、 Indirect Cost：¥4290000 ）

researchmap
Red-Teaming Platform for Misalignment of Large Language Models

2024 - 2030

Japan Science and Technology Agency JST K-program

　 More details

researchmap
Research on Conducting Distributed Parallel Pre-training of Large Language Models

2024 - 2029

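Japan Science and Technology Agency JST Program for Forming Research and Development Centers for Next-Generation Artificial Intelligence Technologies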

　 More details

researchmap
Building AI Foundation Models under Limited Resources through Continual Pre-training on Synthetic Data and Small Amounts of Real Data

2024 - 2028

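Japan Society for the Promotion of Science Fund for the Promotion of Joint International Research (International Collaborative Research)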

　 More details

researchmap
Next-generation high-performance linear solver for future computational science and engineering

Grant number：23H00462 2023.4 - 2027.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)

　 More details

Grant amount：¥45500000 （ Direct Cost: ¥35000000 、 Indirect Cost：¥10500000 ）

researchmap
Compositional Pattern Recognition and Understanding Using Deep Generative Models

Grant number：23H00490 2023.4 - 2026.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)

篠田浩一, 井上中順, 横田理央, 川上玲, 佐藤育郎

　 More details

Grant amount：¥47190000 （ Direct Cost: ¥36300000 、 Indirect Cost：¥10890000 ）

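This project regards the object to be classified (an instance) as a set (bundle) of attributes and decomposes its features attribute by attribute in feature space. By optimizing the attribute features in the process of re-synthesizing the instance from them, it aims to realize a classification method that identifies each attribute with high accuracy and is robust to outliers. To this end, we develop learning methods based on deep generative models and dense attribute annotations. Whereas most previous studies assumed a flat semantic structure in which an object and its attribute correspond one-to-one, this research makes it possible to identify multiple attributes simultaneously for objects in which many attributes are intricately intertwined. The emergence of new attributes and classes is also within scope. More specifically, we establish a methodology for compositional pattern recognition and understanding through a deep-learning-based "recognition by synthesis" approach. The approach is evaluated across three tasks (human action recognition, speaker and emotion recognition, and multimodal recognition), aiming for higher classification performance than before. In this first year, we constructed evaluation databases and developed baseline methods for each of the three tasks. In parallel, we developed methods that perform classification with generative models such as diffusion models on relatively small-scale tasks. We also developed methods for efficient training of generative models using techniques such as neural architecture search. In particular, we built a basic method and a database for multimodal recognition from sensors and video, and developed basic methods for human gait recognition and multimodal emotion recognition.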

Functional improvement of mixing promotion in fluid equipment by elucidating the universal statistical law of two-phase turbulence

Grant number：22H01403 2022.4 - 2026.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Grant amount：¥17,290,000 (Direct Cost: ¥13,300,000, Indirect Cost: ¥3,990,000)

Functional improvement of mixing promotion in fluid equipment by elucidating the universal statistical law of two-phase turbulence

Grant number：23K22674 2022.4 - 2026.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Grant amount：¥17,290,000 (Direct Cost: ¥13,300,000, Indirect Cost: ¥3,990,000)

Fast and Accurate Eigenvalue Calculations by Hierarchical Low-rank Approximation and its Application to Large-scale Electronic Structure Calculations

Grant number：22H03598 2022.4 - 2025.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Rio Yokota, Akihiro Ida, Takeshi Ogita, Takeo Hoshi

Authorship：Principal investigator

Grant amount：¥17,680,000 (Direct Cost: ¥13,600,000, Indirect Cost: ¥4,080,000)

Fast and accurate eigenvalue calculations by hierarchical low-rank approximation and its application to large-scale electronic structure calculations

Grant number：23K24854 2022.4 - 2025.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Grant amount：¥17,680,000 (Direct Cost: ¥13,600,000, Indirect Cost: ¥4,080,000)

A New Bayes Duality Principle for Adaptive, Robust, Life-long Learning of AI

Grant number：JY210177nn 2021.10 - 2027.3

JST CREST

Emtiyaz Khan, Kenichi Bannai, Rio Yokota, Julyan Arbel

Authorship：Coinvestigator(s)

Construction of numerical linear algebra based on lattice H-matrices and its high-performance implementation on modern architectures

Grant number：21H03447 2021.4 - 2024.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Grant amount：¥17,290,000 (Direct Cost: ¥13,300,000, Indirect Cost: ¥3,990,000)

Application of Unconventional Linear Algebra Techniques to Continuous Learning in Supergiant Neural Networks

Grant number：20K20624 2020.7 - 2023.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Challenging Research (Pioneering)

Grant amount：¥25,350,000 (Direct Cost: ¥19,500,000, Indirect Cost: ¥5,850,000)

Life-Long Deep Learning using Bayesian Principles

Grant number：20H04247 2020.4 - 2023.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Grant amount：¥18,200,000 (Direct Cost: ¥14,000,000, Indirect Cost: ¥4,200,000)

Linear Solvers for Machine Learning Hardware

Grant number：18H03248 2018.4 - 2021.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Yokota Rio

Grant amount：¥16,900,000 (Direct Cost: ¥13,000,000, Indirect Cost: ¥3,900,000)

The trend in computer architecture has shifted from general-purpose accelerators to specialized hardware for machine learning. This work exploits the affinity between hierarchical low-rank approximation methods and the low-precision arithmetic units and tensor product accelerators in machine learning processors to develop a linear algebra library suited to future architectures. In FY2018, we ported our H-matrix library to batched MAGMA operations in order to take advantage of the tensor product accelerators. In FY2019, we optimized the inner kernels of the H-matrix code using Tensor Cores. In FY2020, we extended this work to recover accuracy when using Tensor Cores and measured the energy efficiency.

Enhancement of H-matrix library and optimization for next generation supercomputers

Grant number：17H01749 2017.4 - 2020.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)

Ida Akihiro

Grant amount：¥18,850,000 (Direct Cost: ¥14,500,000, Indirect Cost: ¥4,350,000)

In this study, we enhanced HACApK, a library for H-matrices: we introduced a dynamic load balancing technique for H-matrix generation, and developed and implemented H-matrix-vector multiplication algorithms for GPU computing and mixed-precision computing. These new implementations are several to ten times faster than the existing HACApK. We proposed a novel variant of low-rank structured matrices, called “lattice H-matrices”, which allow more efficient operation and communication patterns than conventional H-matrices. In numerical experiments on H-matrix-vector multiplication, lattice H-matrices are several tens of times faster than normal H-matrices when several thousand processes are used. Moreover, we developed an LU decomposition method based on lattice H-matrices and a QR decomposition method for BLR matrices, which are a simple version of lattice H-matrices. We also proposed parallel algorithms for these methods.

Acceleration and economization of deep learning algorithms for image processing in social infrastructure

2016.11 - 2022.3

Japan Science and Technology Agency CREST

SHINODA Koichi

Grant type：Competitive

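A hierarchical particle method framework for exascale computers that achieves both performance and productivity

Grant number：16H02827 2016.4 - 2019.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)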

丸山直也, 横田理央, 田浦健次朗

Grant amount：¥16,900,000 (Direct Cost: ¥13,000,000, Indirect Cost: ¥3,900,000)

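Exascale supercomputers, which target a thousand times the performance of today's petascale systems, will inevitably bring qualitative and quantitative changes in computer architecture, which in turn will require substantial rewriting of existing applications. This research focused on particle methods, a frequently used basic numerical technique, and pursued research and development aimed at establishing software infrastructure technology that achieves high performance without modifying the application each time the architecture changes. The aim is to automate parallelization and performance optimization on the basis of a programming framework that allows architecture-independent application development. This fiscal year, we developed a first version of the framework that supports CPUs and GPUs. The framework is based on C++ template metaprogramming and provides a programming model in which hierarchical particle methods such as the FMM can be described concisely. User programs are automatically converted through template expansion into parallel code for CPUs or CUDA code for GPUs, so there is no need to write a separate program for each target processor. Parallelization across multiple nodes using MPI is also performed automatically by the framework, so a single user program can run uniformly on anything from a single processor to large supercomputer-class systems. In addition, to achieve high performance, the framework's implementation draws on research results on high-performance implementation techniques for the FMM algorithm and on MassiveThreads, a lightweight multithreading runtime, and it automatically achieves performance close to that of hand-written implementations.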

Grant-in-Aid for Scientific Research (B)

2016.4 - 2019.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

MARUYAMA Naoya

Grant type：Competitive

Large scale iterative solvers by combining FMM and H-matrices

Grant number：16H05859 2016.4 - 2018.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Young Scientists (A)

Yokota Rio, Li Xiaoye S., Keyes David E.

Grant amount：¥6,630,000 (Direct Cost: ¥5,100,000, Indirect Cost: ¥1,530,000)

In FY2016, we extended the FMM to H-matrices and developed an LU decomposition code using H-matrices. The dual tree traversal of exaFMM was used to determine the block cluster tree for arbitrary admissibility conditions, which allowed task-based parallelization of the compression part of the H-matrix code. In FY2017, we further optimized the inner kernels of the H-matrix code and compared H-matrices with multigrid for real applications. The use of batched MAGMA enabled us to maximize the performance of GPUs even for small matrices. The advantage of H-matrices over multigrid depends on the condition number of the matrix, and H-matrices become more advantageous as the degree of parallelism increases.

Grant-in-Aid for Encouragement of Young Scientists (A)

2016.4 - 2018.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

YOKOTA Rio

Authorship：Principal investigator Grant type：Competitive

Grant-in-Aid for Research Activity Start-up

2015.8 - 2017.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research

YOKOTA Rio

Authorship：Principal investigator Grant type：Competitive

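Algebraic extension of the FMM as an exascalable preconditioner for large systems of linear equations

Grant number：15H06196 2015.8 - 2016.3

Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Grant-in-Aid for Research Activity Start-up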

横田理央

Grant amount：¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)

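Starting from hierarchical N-body algorithms, which are already known to perform well on next-generation computers, we gradually extended them into a solver that can handle arbitrary systems of linear equations. In FY2015, we extended the current FMM, which could solve only the Poisson equation, to the Helmholtz and Stokes equations, making it applicable not only to fluid analysis but also to structural, electromagnetic, and acoustic analysis. We also compared it directly with multigrid methods and HSS matrices for each equation under equivalent computational conditions and computing environments, providing a quantitative evaluation of the relative advantages of these methods that had not previously been carried out. In the comparison with multigrid, the advantage of the FMM was more pronounced for the Helmholtz equation than for the Poisson equation. This is because the convergence of multigrid degrades markedly when the Helmholtz equation contains high frequencies, whereas the convergence of the FMM degrades little. In the comparison with HSS matrices, the FMM, which has a smaller setup overhead, was faster than HSS in total computation time. The difference was especially pronounced for the two-dimensional Laplace equation, where the FMM was about 1000 times faster. In the scalability benchmark, the FMM achieved good parallel efficiency on 131,072 cores of a Cray XC40 and performed a calculation with 400 billion points in a few seconds. This is believed to be both the largest and the fastest FMM calculation in the world.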


Teaching Experience

High Performance Scientific Computing

Computer Networks

5th Academic Group Literacy

Institution：Tokyo Institute of Technology, School of Computing, Department of Computer Science

High Performance Scientific Computing

Institution：Tokyo Institute of Technology, School of Computing, Department of Computer Science

Computer Networks

Institution：Tokyo Institute of Technology, School of Computing, Department of Computer Science

5th Academic Group Literacy


Media Coverage

“Japan-specific LLMs Are Absolutely Necessary”: A Supercomputer Researcher Pursues Open AI Models Internet

Nikkei xTECH Nikkei xTECH online 2023.7

Alarm over Japan Falling Behind in Domestic Generative AI Newspaper, magazine

Nikkei (The Nikkei) Nikkei (The Nikkei) morning edition, page 2 2023.6

Growing Participation in Japanese Homegrown Generative AI Newspaper, magazine

Yomiuri Shimbun Yomiuri Shimbun morning edition, page 10 2023.6

Tokyo Tech and Others to Develop Domestic Generative AI Newspaper, magazine

Nikkei Sangyo Shimbun Nikkei Sangyo Shimbun page 7 2023.6

Pioneering Text-Generating AI with a Domestic Model Newspaper, magazine

Nikkei (The Nikkei) Nikkei (The Nikkei) morning edition, page 14 2023.5

Toward the Development of a Japanese Homegrown Conversational AI: Tokyo Tech, RIKEN, and Others Newspaper, magazine

Asahi Shimbun Asahi Shimbun morning edition, page 27 2023.5

Japanese Homegrown Generative AI on Fugaku Newspaper, magazine

Nikkei (The Nikkei) Nikkei (The Nikkei) morning edition, page 1 2023.5

Developing an LLM on Fugaku Newspaper, magazine

Nikkan Kogyo Shimbun Nikkan Kogyo Shimbun morning edition, page 23 2023.5

Toward Developing Domestic Generative AI on the Fugaku Supercomputer Newspaper, magazine

Tokyo Shimbun Tokyo Shimbun morning edition, page 2 2023.5

Toward Developing Domestic Generative AI Technology Using Fugaku Newspaper, magazine

Mainichi Shimbun Mainichi Shimbun morning edition, page 18 2023.5

Japanese-Language AI Development Begins This Month Newspaper, magazine

Yomiuri Shimbun Yomiuri Shimbun morning edition, page 29 2023.5

Domestic Generative AI on Fugaku Newspaper, magazine

Sankei Shimbun Sankei Shimbun morning edition, page 2 2023.5

Developing Japanese-Language AI on Fugaku Newspaper, magazine

Yomiuri Shimbun Yomiuri Shimbun evening edition, page 10 2023.5
