An ADK Java agent powered by Gemma 4
Today, DeepMind announced the release of Gemma 4, an impressive and powerful new version of the Gemma family of models. As I’ve been contributing to ADK Java quite a bit recently, I was curious to see how to configure ADK Java agents to work with Gemma 4.
In this article, we’ll explore two paths:
- Calling the AI Studio API surface directly,
- Calling Gemma 4 hosted via a vLLM instance thanks to the LangChain4j bridge.
With the appropriate model weights format, we’ll also be able to run Gemma 4 locally via Ollama. But that’s for another day.
The Easy Case: Gemma 4 on AI Studio
If you’re using Gemma 4 via the Google AI Studio API surface, it’s simple: specify the model name as gemma-4-31b-it instead of, for example, gemini-2.5-flash.
```java
LlmAgent agent = LlmAgent.builder()
    .name("gemma-agent")
    .model("gemma-4-31b-it")
    // the API key is picked up from the GOOGLE_API_KEY environment variable
    // ... instructions and tools
    .build();
```
It’s also possible to use the Gemini model builder and reference the model name:
```java
Gemini gemma4 = Gemini.builder()
    .modelName("gemma-4-31b-it")
    .apiKey(System.getenv("GEMINI_API_KEY"))
    .build();

LlmAgent agent = LlmAgent.builder()
    .model(gemma4)
    // ... instructions and tools
    .build();
```
Here, Gemma 4 is exposed the same way as the Gemini models, via the same API surface.
That’s why the model is an instance of Gemini.
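To try the agent out, you can run it with ADK’s InMemoryRunner. Here’s a minimal sketch; the agent name, instruction, user ID, and prompt are illustrative:

```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.genai.types.Content;
import com.google.genai.types.Part;

public class GemmaAgentRunner {

    // Builds a minimal Gemma 4 agent; name and instruction are illustrative.
    static LlmAgent buildAgent() {
        return LlmAgent.builder()
            .name("gemma-agent")
            .model("gemma-4-31b-it")
            .instruction("You're a concise, helpful assistant.")
            .build();
    }

    public static void main(String[] args) {
        LlmAgent agent = buildAgent();

        // InMemoryRunner wires the agent with in-memory session storage.
        InMemoryRunner runner = new InMemoryRunner(agent);
        Session session = runner.sessionService()
            .createSession(runner.appName(), "user-1")
            .blockingGet();

        Content message = Content.fromParts(Part.fromText("Why is the sky blue?"));
        runner.runAsync("user-1", session.id(), message)
            .blockingForEach(event -> System.out.println(event.stringifyContent()));
    }
}
```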
Calling a vLLM-hosted Gemma 4 via LangChain4j
During the beta testing period, internally at Google, my colleague Vlad was exposing the Gemma 4 model weights via vLLM, running inside a Google Cloud Run instance with GPU. And I was using his endpoint to test Gemma 4 😉
However, vLLM exposes an OpenAI-compatible API, so Gemma 4 on vLLM must be called through that API surface, not the Gemini one.
Fortunately, the LangChain4j bridge I developed last year lets you configure OpenAI-compatible models, using LangChain4j’s OpenAiChatModel (or its streaming variant) to connect to the vLLM server.
Creating a Simple Agent
First, we need to configure the OpenAiChatModel (or OpenAiStreamingChatModel):
```java
ChatModel model = OpenAiChatModel.builder()
    .modelName("gg-hf-gg/gemma-4-31b-it")
    .apiKey("Gemma4TW") // a dummy key, if not required by your vLLM setup
    .baseUrl("https://your-vllm-instance/v1")
    .timeout(Duration.ofMinutes(5))
    .customParameters(
        Map.of("chat_template_kwargs", Map.of("enable_thinking", true)))
    .build();
```
For function calling (tool use) to work correctly with Gemma 4 on vLLM, as we’ll see in the tool example below, you must enable the thinking capability in the chat template. This is done via the chat_template_kwargs / enable_thinking parameter, which enables both thinking and function calling at the same time.
I’ve defined a long timeout, as the cold start to load the weights in memory can take up to 4 minutes! But once the Cloud Run instance is hot, Gemma 4 replies instantly.
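As an aside, customParameters() merges these extra fields into the JSON body of each chat completion request sent to vLLM. Here’s a plain-Java sketch of that fragment (the helper method is mine, not part of LangChain4j):

```java
import java.util.Map;

public class ThinkingParams {

    // The extra request parameters passed to customParameters(); vLLM forwards
    // chat_template_kwargs to the model's chat template, where enable_thinking
    // switches on thinking (and, with it, reliable function calling).
    static Map<String, Object> thinkingParams(boolean enableThinking) {
        return Map.of("chat_template_kwargs",
            Map.of("enable_thinking", enableThinking));
    }

    public static void main(String[] args) {
        // Serialized into the request body as:
        // {"chat_template_kwargs": {"enable_thinking": true}}
        System.out.println(thinkingParams(true));
    }
}
```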
Let’s have a look at a simple science teacher agent:
```java
LlmAgent teacherAgent = LlmAgent.builder()
    .name("science-teacher")
    .model(LangChain4j.builder()
        .chatModel(model)
        .modelName("gg-hf-gg/gemma-4-31b-it")
        .build())
    .instruction("""
        You're a friendly science teacher
        who explains concepts simply.
        """)
    .build();
```
We use LangChain4j.builder() to wrap the OpenAI-compatible chat model in a Java class extending ADK’s BaseLlm class, the parent class of all LLMs supported by ADK.
Adding Tools (Local Java Functions)
Gemma 4’s reasoning capabilities shine when you add tools.
You can expose any Java method as a tool using ADK’s FunctionTool.
```java
LlmAgent orderAgent = LlmAgent.builder()
    .name("order-agent")
    .model(LangChain4j.builder()
        .chatModel(model)
        .modelName("gg-hf-gg/gemma-4-31b-it")
        .build())
    .instruction(
        "Use the `lookup_order` tool to retrieve order details.")
    .tools(FunctionTool.create(this, "retrieveOrder"))
    .build();

@Annotations.Schema(name = "lookup_order",
    description = "Retrieve order details by ID")
public Map<String, Object> retrieveOrder(String orderId) {
    // Your database logic here...
    return Map.of("status", "out_for_delivery");
}
```
In this example, we reference a local Java function to look up order details, so Gemma 4 can call it should the user ask for the status of their order.
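To see the tool call end to end, you can run the order agent with InMemoryRunner, as before. Here’s a self-contained sketch; the endpoint URL, dummy API key, user ID, and order ID are placeholders, and the import path of the LangChain4j bridge may vary with your version of the project:

```java
import com.google.adk.agents.LlmAgent;
import com.google.adk.models.langchain4j.LangChain4j;
import com.google.adk.runner.InMemoryRunner;
import com.google.adk.sessions.Session;
import com.google.adk.tools.Annotations;
import com.google.adk.tools.FunctionTool;
import com.google.genai.types.Content;
import com.google.genai.types.Part;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import java.time.Duration;
import java.util.Map;

public class OrderAgentDemo {

    @Annotations.Schema(name = "lookup_order",
        description = "Retrieve order details by ID")
    public Map<String, Object> retrieveOrder(String orderId) {
        // Stubbed database lookup, as in the article.
        return Map.of("status", "out_for_delivery");
    }

    public static void main(String[] args) {
        ChatModel model = OpenAiChatModel.builder()
            .modelName("gg-hf-gg/gemma-4-31b-it")
            .apiKey("dummy-key") // dummy key, if vLLM doesn't check it
            .baseUrl("https://your-vllm-instance/v1") // placeholder endpoint
            .timeout(Duration.ofMinutes(5))
            .customParameters(
                Map.of("chat_template_kwargs", Map.of("enable_thinking", true)))
            .build();

        OrderAgentDemo demo = new OrderAgentDemo();
        LlmAgent orderAgent = LlmAgent.builder()
            .name("order-agent")
            .model(LangChain4j.builder()
                .chatModel(model)
                .modelName("gg-hf-gg/gemma-4-31b-it")
                .build())
            .instruction("Use the `lookup_order` tool to retrieve order details.")
            .tools(FunctionTool.create(demo, "retrieveOrder"))
            .build();

        InMemoryRunner runner = new InMemoryRunner(orderAgent);
        Session session = runner.sessionService()
            .createSession(runner.appName(), "user-1")
            .blockingGet();

        // The event stream shows the function call, the function response,
        // and the model's final answer.
        runner.runAsync("user-1", session.id(),
                Content.fromParts(Part.fromText("Where is my order A-123?")))
            .blockingForEach(event -> System.out.println(event.stringifyContent()));
    }
}
```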
Wrapping up
That’s about it for today! With ADK Java and Gemma 4, you have a powerful, flexible, and open-weight foundation for your next AI agent project! 🤖 Thanks to the LangChain4j / ADK bridge, it’s even possible to invoke Gemma via different API surfaces than Gemini’s.
As a reminder, we’ve just announced ADK Java 1.0; check out that announcement if you want a refresher on the latest features and enhancements to the project.
And you can watch this YouTube video I recorded that walks through the new features, as well as a concrete ADK agent called “Comic Trip” that transforms travel photography into vintage pop-art comic illustrations. Go check out the behind-the-scenes article on how I built it.
