Kohei Uehara's Website


About me
I am Kohei Uehara (上原 康平), currently working as a research engineer at SB Intuitions Corp.
I received my Ph.D. in Information Science and Technology from the University of Tokyo in March 2023.
My research interests include machine learning across vision and language, large language models (LLMs), vision-language models (VLMs), accessibility, and human-computer interaction (HCI).
Current Positions

Work Experience

Education

Projects
Asagi - Japanese Vision&Language Model

Asagi is a Japanese Vision&Language Model. Its architecture is based on LLaVA and consists of a vision encoder, a language decoder, and a 2-layer MLP that projects visual features into the language feature space.
We used Japanese LLMs as the language decoder and a SigLIP-based model as the vision encoder.
We synthesized a large-scale Japanese vision-language dataset of approximately 20 million image-text pairs.
The model is publicly available on the Hugging Face Model Hub.
Please check the project page for more details.
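
To illustrate the LLaVA-style design described above, here is a minimal sketch of how a 2-layer MLP can map vision-encoder patch features into the language embedding space so that they can be fed to the decoder alongside text tokens. It is not Asagi's actual implementation: the dimensions, the random weights, the GELU activation, and the choice to place visual tokens before the text tokens are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
import math
import random


def linear(x, weight, bias):
    """Apply y = W x + b to a single feature vector (plain Python lists)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def gelu(x):
    """Element-wise GELU (assumed activation; the source does not name one)."""
    return [0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0))) for v in x]


def init_linear(in_dim, out_dim, rng):
    """Random weights for illustration only; a real projector is trained."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def project(patch_features, layer1, layer2):
    """2-layer MLP: vision feature space -> language embedding space."""
    return [linear(gelu(linear(p, *layer1)), *layer2) for p in patch_features]


# Hypothetical toy sizes; real models use far larger dimensions.
VISION_DIM, TEXT_DIM, NUM_PATCHES, NUM_TEXT_TOKENS = 8, 16, 4, 3

rng = random.Random(0)
layer1 = init_linear(VISION_DIM, TEXT_DIM, rng)
layer2 = init_linear(TEXT_DIM, TEXT_DIM, rng)

# Stand-ins for the vision encoder output and the decoder's token embeddings.
patch_features = [[rng.gauss(0, 1) for _ in range(VISION_DIM)] for _ in range(NUM_PATCHES)]
text_embeddings = [[rng.gauss(0, 1) for _ in range(TEXT_DIM)] for _ in range(NUM_TEXT_TOKENS)]

# Projected visual tokens now share the text embedding width, so both can be
# joined into one sequence for the language decoder (ordering is an assumption).
visual_tokens = project(patch_features, layer1, layer2)
decoder_input = visual_tokens + text_embeddings

print(len(decoder_input), len(decoder_input[0]))  # 7 16
```

The point of the projector is only to match widths: once each image patch is a vector of the same size as a text embedding, the language decoder can treat it as one more token in its input sequence.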


Publications
Journal and International Conference
Domestic Conference
Others

Competitions

Lectures
Invited Talks

Grants & Fellowships

Professional Activities

Links
Google Scholar Citations

Last update: April 5, 2025