NVIDIA has announced the general availability of Google DeepMind’s Gemma 3n models on both NVIDIA Jetson and RTX platforms. This development marks a significant milestone in bringing advanced multimodal AI capabilities—including text, vision, and audio—to a wide range of devices, from edge robotics to AI-powered PCs.
Gemma 3n: A Leap in Multimodal AI
Gemma 3n introduces two new models optimized for on-device, multimodal applications. Building on the Gemma 3 foundation, Gemma 3n adds audio processing alongside text and vision, integrating leading research models for each modality:
- Audio: Universal Speech Model
- Vision: MobileNet v4
- Text: MatFormer
Per-Layer Embeddings: Efficient Memory Usage
A standout feature of Gemma 3n is the introduction of Per-Layer Embeddings (PLE). This innovation substantially reduces the RAM required to hold model parameters: per-layer embedding parameters can be kept outside the accelerator's memory, so the Gemma 3n E4B model, with 8 billion raw parameters, operates with a memory footprint comparable to a 4B model. This allows developers to deploy higher-quality models in resource-constrained environments.
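As a rough illustration of those savings, the back-of-the-envelope arithmetic below compares the accelerator-resident footprint of the E4B model with and without offloading roughly half its parameters via PLE. The fp16 assumption and the 8B/4B split are illustrative, not measured figures:

```python
def footprint_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Rough memory needed to hold the weights (fp16 = 2 bytes per parameter)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# E4B: 8B raw parameters; with Per-Layer Embeddings, roughly half can live
# outside the accelerator (illustrative split, not an official spec).
raw = footprint_gb(8)        # all 8B parameters resident
effective = footprint_gb(4)  # ~4B parameters resident after PLE offload

print(f"resident without PLE: {raw:.0f} GB")    # 16 GB at fp16
print(f"resident with PLE:    {effective:.0f} GB")  # 8 GB at fp16
```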
Model Specifications
| Model Name | Raw Parameters | Input Context Length | Output Context Length | Size on Disk |
|---|---|---|---|---|
| E2B | 5B | 32K | 32K minus request | 1.55GB |
| E4B | 8B | 32K | 32K minus request | 2.82GB |

Table: Gemma 3n model specifications.
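The "32K minus request" output limit means the prompt and the response share one context window: the longer the input, the less room remains for generation. A small sketch of that token budget (treating 32K as 32,768 tokens, an assumption about how the limit is counted):

```python
CONTEXT_WINDOW = 32 * 1024  # 32K window shared by prompt and response (assumed 32,768 tokens)

def max_output_tokens(prompt_tokens: int) -> int:
    """Tokens left for the model's reply after the prompt fills part of the window."""
    return max(CONTEXT_WINDOW - prompt_tokens, 0)

print(max_output_tokens(2_000))   # 30768 tokens available for output
print(max_output_tokens(32_768))  # 0 -- the prompt consumed the whole window
```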
Powering Robotics and Edge AI with Jetson
The Gemma 3n models are well-suited for NVIDIA Jetson devices, which are designed for edge applications such as robotics, smart cameras, and industrial automation. The combination of lightweight architecture and dynamic memory management enables these models to function efficiently in resource-constrained environments.
Gemma 3n Impact Challenge
Developers working with Jetson can participate in the Gemma 3n Impact Challenge on Kaggle. This initiative encourages the creation of impactful solutions in fields like accessibility, education, healthcare, environmental sustainability, and crisis response. Cash prizes, starting at $10,000, are awarded for top submissions and for leveraging on-device deployment technologies such as Jetson.
NVIDIA RTX: AI for Windows Developers and Enthusiasts
NVIDIA RTX AI PCs make it easy for developers and AI enthusiasts to deploy Gemma 3n models using the Ollama platform. These models can be integrated into popular applications like AnythingLLM and LM Studio, benefiting from RTX acceleration.
Quick Start with Ollama
Deploying Gemma 3n locally on RTX and Jetson devices is straightforward:

- Download and install Ollama for Windows.
- Open a terminal and run:

```shell
ollama pull gemma3n:e4b
ollama run gemma3n:e4b "Summarize Shakespeare's Hamlet"
```
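Beyond the CLI, Ollama also exposes a local REST API (on port 11434 by default), so the same model can be called from application code. A minimal sketch using only the standard library, assuming an Ollama server is running locally with `gemma3n:e4b` already pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint; stream=False returns one JSON object."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """POST the prompt to the local Ollama server and return the generated text."""
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires the Ollama server to be running):
# print(generate("gemma3n:e4b", "Summarize Shakespeare's Hamlet"))
```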
NVIDIA collaborates with Ollama to optimize performance for RTX GPUs, leveraging the GGML library for efficient model execution.
Customizing Gemma with NVIDIA NeMo Framework
Developers can further tailor Gemma 3n models using the open-source NVIDIA NeMo Framework, available on GitHub. NeMo provides an end-to-end workflow for post-training Gemma models, enabling fine-tuning with enterprise-specific data for improved accuracy.
NeMo Workflow Overview
- Data Curation (NeMo Curator): Prepares high-quality datasets for pretraining or fine-tuning by extracting, filtering, and deduplicating large data volumes.
- Fine-Tuning (NeMo): Supports parameter-efficient fine-tuning (PEFT) techniques such as LoRA (Low-Rank Adaptation), as well as full-parameter tuning, for comprehensive model customization.
- Model Evaluation (NeMo Evaluator): Assesses the performance of adapted models using custom tests and benchmarking.
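LoRA, mentioned in the fine-tuning step above, freezes the pretrained weight matrix W and learns only a low-rank update BA, so just r·(d_in + d_out) parameters are trained instead of d_in·d_out. The NumPy sketch below illustrates that math in isolation; it is not the NeMo API, and the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 128, 4                # adapter rank r << min(d_in, d_out)

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x: np.ndarray) -> np.ndarray:
    """y = (W + B @ A) x: the frozen base path plus the low-rank adapter path."""
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapter starts as a no-op on the base model:
assert np.allclose(lora_forward(x), W @ x)

full = d_out * d_in                         # 8192 params in a full fine-tune
adapter = r * (d_in + d_out)                # 768 trainable params with LoRA
print(f"trainable params: {adapter} vs full fine-tune: {full}")
```

The zero-initialized B matrix is the standard LoRA trick: training starts exactly at the pretrained model and the adapter only gradually perturbs it.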
Advancing Open-Source AI and Community Collaboration
NVIDIA actively contributes to the open-source AI ecosystem and has released hundreds of projects under open licenses. By supporting open models like Gemma, NVIDIA promotes AI transparency and encourages collaborative progress in AI safety and resilience.
Conclusion
The availability of Gemma 3n on NVIDIA Jetson and RTX platforms empowers developers to bring advanced multimodal AI capabilities to both edge and desktop environments. With innovations in memory efficiency, robust developer tools, and a commitment to open-source collaboration, this partnership sets a new standard for accessible, high-performance AI deployment.