NVIDIA Nemotron powers sovereign AI by providing not only open models but also datasets, libraries, recipes, and cookbooks that allow developers to customize models and adapt them to diverse use cases and languages.
Today, NVIDIA released NVIDIA Nemotron-Nano-9B-v2-Japanese, which achieved state-of-the-art performance (SOTA) with Nejumi Leaderboard 4 parameters of 10B or less.
This model is an important milestone in Japan’s enterprise AI development, achieving advanced Japanese understanding and powerful agent functions in a lightweight size that is easy to introduce.
By customizing the currently published Nemotron 2 Nano model for the Japanese language, we aim to be in the community to develop and publish custom cutting-edge models that support a variety of use cases and languages. The Nemotron team will incorporate the learnings gained from this customization into future Nemotron releases and strengthen our theoretical capabilities in Japanese.
Importance of SLM (Small Language Model) in Japanese Enterprise
Important gap in Japanese enterprise AI: The current Japanese enterprise AI environment has an issue in which there are almost no SLMs with “advanced Japanese language ability” and “ability to execute tasks as an agentic AI.” This creates barriers to adoption, particularly in the following areas:
On-premises deployment requirements: Companies with sensitive data may prefer to operate the model within a private network. Models with less than 10B (10 billion) parameters can significantly reduce infrastructure preparation while maintaining practical level performance.
Streamline customization: Start with a powerful Japanese-based model with proven agent capabilities to shorten fine-tuning cycles. It allows computational resources to be focused on adapting to specific domains rather than building basic capabilities.
Accelerate agent development: The model’s architecture and performance enable rapid prototyping of multi-agent systems and complex workflows without the overhead of larger models.
Utilize proven infrastructure
Nemotron 2 Nano: an outstanding architecture
Nemotron-Nano-9B-v2-Japanese is built on the NVIDIA Nemotron-Nano-9B-v2, which had an excellent size-to-performance ratio in the English benchmark. Based on this efficient architecture, we customized it and strengthened our Japanese language proficiency.
Advanced inference capabilities and optimized parameter efficiency Solid foundation for multilingual adaptation Proven ability to perform interactive tasks
By adapting this validated architecture to Japanese, we maintain the strengths of the base model while achieving superior Japanese proficiency.
Nemotron-personas-japan: Seed set for high-quality synthetic data generation
The model’s data strategy focuses on leveraging the open source (CC BY 4.0) dataset “Nemotron-Persona-Japan” as a high-quality seed for synthetic data generation (SDG). This dataset consists of synthetically generated personas based on Japan’s real-world demographics, geographic distribution, and distribution of personality traits, capturing the diversity and richness of the population. Based on Lusona, we have built a highly diverse, scalable, and robust training pipeline. With a rich set of seed data personas, we were able to efficiently expand diverse scenarios and subtly synthetic datasets. This approach allows the augmented data to maintain the exact cultural integrity of its original character while achieving the scale required for cutting-edge training.
In particular, for Nemotron-Nano-9B-v2-Japanese, we utilized these personas as the basis for generating training data for tool calls. This ensures that the capabilities the model acquires do not persist in consistent tool call functionality, but are rooted in culturally correct Japanese dialogue and real-world use cases.
The Nemotron-Persona collection also includes datasets from the United States, India, Singapore, and Brazil, making it possible to replicate the same methodology across regions.
training pipeline
Nemotron-Nano-9B-v2-Japanese was built using a combination of Japanese open source corpus and NVIDIA’s Nemotron stack for continuous pre-learning, synthetic data generation, and post-learning processes.
Continuing prior learning
Japanese OSS corpus: Wikipedia, fineweb-2 Japanese, Aozora Bunko, sip3-ja-general-web-corpus Nemotron-CC-v2.1 Nemotron-Pretraining-Specialized-v1
SFT
Tool call dataset Nemotron-Post-Training-v3 with Nemotron-Persona-Japan as seed set
Nemotron-Nano-9B-v2-Software used for Japanese
In order to maximize the model’s Japanese language ability, we conducted continuous preliminary training. Here, we are making full use of the assets of LLM-jp, Japan’s leading open source LLM community. At the same time, we leveraged Nemotron Pre-training Datasets to maintain the model’s agent functionality.
The Nemotron-Persona-Japan seeded tool call dataset used for SFT was very powerful. As a result, we succeeded in ensuring that a variety of real-world progressions were achieved while minimizing duplication.
Model training inherits the training recipe established in Nemotron Nano 2. This allowed us to prioritize training instability and increase throughput.
This approach achieves the performance of a powerful Japanese language model while maintaining robust tool invocation functionality and reasonableness.
benchmark performance

Nemotron-Nano-9B-v2-Japanese was ranked 1st in the <10B model category on "Nejumi Leaderboard 4", the most included LLM evaluation platform in Japan. Nejumi Leaderboard evaluates approximately 40 benchmark mark models across the following areas from multiple angles.
Basic language skills: Japanese understanding and generative agent skills: code generation, mathematical reasoning, tool usage, etc. Alignment: ability to follow directions, vigilance, toxicity, veracity, stability, etc.
These multi-dimensional evaluations make the Nejumi Leaderboard a trusted source of thought for developers selecting a base model for customization and deployment in the Japanese environment.

The benchmark results confirm that Nemotron-Nano-9B-v2-Japanese was able to integrate strong Japanese language ability into the base model Nemotron-Nano-9B-v2. These improvements extend beyond Japanese language knowledge and question-answering abilities to solid tasks such as tool invocation, pointing, and alignment. Notably, it outperforms the similarly sized Qwen3-8B, achieving an excellent size-to-performance ratio.
technological superiority

Inference efficiency: By inheriting the Nemotron 2 Nano (Transformer-Mamba) architecture, it delivers up to 6x higher throughput compared to open source alternatives while being deployable on edge GPUs. The figure above shows the results measured in the Nemotron 2 Nano paper. Contextual processing: Optimized for multi-turn conversations and tool interactions. It has powerful structured data generation capabilities for API calls and function execution. Fine-tuning efficiency: The number of parameters that can be fully fine-tuned even with affordable computational infrastructure.
Deployment options
Direct deployment
For applications that require a high level of Japanese understanding and agentic skills, the model can be deployed and utilized as is.
Customization to your own domain
Benchmark-proven good performance on Japanese and agentic tasks provides a solid starting point for professional application development. NeMo Framework (NeMo Megatron-Bridge, NeMo AutoModel, and NeMo-RL) is available for specific customizations.
Use it now
Nemotron-Nano-9B-v2-Japanese is available now for AI application developers. Whether your application is a user-facing agent, an in-house automation tool, or a domain-specific assistant, this model provides an excellent size-to-performance ratio for production deployment.
The combination of Nemotron 2 Nano’s proven architecture and Nemotron-Persona-Japan’s high-quality seeded dataset provides an efficient starting point for Japan’s sovereign AI development.
We encourage the community to take advantage of Nemotron models, datasets, recipes, and libraries, and to customize Nemotron models for even more languages and use cases. We can’t wait to see what you build!
Stay up to date on NVIDIA Nemotron by subscribing to NVIDIA News and following NVIDIA AI on the Nemotron channel on LinkedIn, X, YouTube, and Discord.
Access the open Nemotron model at Hugging Face and our collection of NIM microservices and developer samples at build.nvidia.com.

