Hi! I’m speed.
I am a second-year graduate student at Kyoto University majoring in Computer Science.
I like to learn by building things.
Links
Hobby
Interests
- Large language models
- Multimodal models
Education
Newer ↑
- Master’s degree, Course of Communications and Computer Engineering, Graduate School of Informatics, Kyoto University, Japan. April 2024 – March 2026 (expected)
- Bachelor’s degree, Software Science Course, Department of Information and Computer Sciences, School of Engineering Science, Osaka University, Japan. GPA: 3.57/4.00. April 2020 – March 2024
Internships / Employment
Newer ↑
| Date | Organization | Position | Description |
| --- | --- | --- | --- |
| 2024/12 ~ present | Sakana AI | Student Intern | 🐟🐠🐡 |
| 2024/4 ~ present | Research and Development Center for Large Language Models, National Institute of Informatics | Research Assistant | I am involved in research on the development and evaluation of multimodal models. |
| 2024/2 ~ 2024/4 | LLM-jp, National Institute of Informatics | Student Intern | I was engaged in research on memorization in LLMs. |
| 2022/3 ~ 2024/4 | Center for Quantum Information and Quantum Biology at Osaka University | Technical Assistant (software development) | I developed numerical software for quantum computation and quantum chemistry. |
Research
International Conference
- Developing Japanese CLIP Models Leveraging an Open-weight LLM for Large-scale Dataset Translation. NAACL Student Research Workshop 2025, April 2025. Issa Sugiura, Shuhei Kurita, Yusuke Oda, Daisuke Kawahara, Naoaki Okazaki.
[Paper]
- Constructing Multimodal Datasets from Scratch for Rapid Development of a Japanese Visual Language Model. NAACL 2025 Demo Track, April 2025. Keito Sasagawa, Koki Maeda, Issa Sugiura, Shuhei Kurita, Naoaki Okazaki, Daisuke Kawahara.
[Paper to appear]
- A Comprehensive Analysis of Memorization in Large Language Models. The 17th International Natural Language Generation Conference (INLG 2024), September 2024. Hirokazu Kiyomaru*, Issa Sugiura*, Daisuke Kawahara, Sadao Kurohashi.
[Paper] | [Code]
Journal
- Removing Mislabeled Data from Trained Models via Machine Unlearning. IEICE Transactions on Information and Systems, August 2025. Issa Sugiura, Shingo Okamura, Naoto Yanai.
[Paper]
Preprints
- EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements. arXiv, 2025. Issa Sugiura, Takashi Ishida, Taro Makino, Chieko Tazuke, Takanori Nakagawa, Kosuke Nakago, David Ha.
[Paper] | [Dataset] | [Code]
- llm-jp-modernbert: A ModernBERT Model Trained on a Large-Scale Japanese Corpus with Long Context Length. arXiv, 2025. Issa Sugiura, Kouta Nakayama, Yusuke Oda.
[Paper] | [Model] | [Code]
- LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs. 2024. LLM-jp: Akiko Aizawa, Eiji Aramaki, …, Issa Sugiura, …, Koichiro Yoshino (80 authors, listed in alphabetical order).
[Paper]
Domestic Conference
Personal Projects
- pqcat
  - A fast command-line tool for inspecting Parquet files.
- jaccard
  - A simple web app to compute Jaccard similarity between two texts by tokenizing them using OpenAI’s tiktoken.
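
As a rough illustration of the idea behind jaccard (not the app's actual code), the Jaccard similarity of two texts is the size of the intersection of their token sets divided by the size of their union. The sketch below is a minimal version under stated assumptions: the function names are hypothetical, and a whitespace tokenizer stands in for tiktoken so the example has no external dependencies.

```python
from typing import Callable, Hashable, Iterable


def whitespace_tokenize(text: str) -> list[str]:
    """Stand-in tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def jaccard_similarity(
    text_a: str,
    text_b: str,
    tokenize: Callable[[str], Iterable[Hashable]] = whitespace_tokenize,
) -> float:
    """Return |A ∩ B| / |A ∪ B| over the sets of tokens in each text."""
    tokens_a = set(tokenize(text_a))
    tokens_b = set(tokenize(text_b))
    union = tokens_a | tokens_b
    if not union:
        # Convention chosen for this sketch: two empty texts count as identical.
        return 1.0
    return len(tokens_a & tokens_b) / len(union)


if __name__ == "__main__":
    print(jaccard_similarity("the cat sat", "the cat ran"))
```

To tokenize with tiktoken instead, one option is to pass `tiktoken.get_encoding("cl100k_base").encode` as the `tokenize` argument; the choice of encoding here is an assumption, not something the project description specifies.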
Certifications / Qualifications
Newer ↑
- TOEIC L&R: 450 + 390 = 840 (October 2, 2022)
- Security Camp organized by IPA (Information-technology Promotion Agency), Web Security Course. August 2022