Gemini

Driving a web browser with Gemini's Computer Use model in Java

📅 November 2, 2025 — by Guillaume Laforge

In this article, I’ll guide you through the process of programmatically interacting with a web browser using the new Computer Use model in Gemini 2.5 Pro. We’ll accomplish this in Java ☕ leveraging Microsoft’s powerful Playwright Java SDK to handle the browser automation.

The New Computer Use Model

Unveiled in this announcement article and made available in public preview last month, via the Gemini API on Google AI Studio and Vertex AI, Gemini 2.5 Pro introduces a pretty powerful “Computer Use” feature.

Creative Java AI agents with ADK and Nano Banana 🍌

📅 September 22, 2025 — by Guillaume Laforge

java agent-development-kit generative-ai ai-agents large-language-models gemini

Large Language Models (LLMs) are all becoming “multimodal”. They can process text, but also other “modalities” as input, like pictures, videos, or audio files. But models that output more than just text are less common…

Recently, I wrote about my experiments with Nano Banana 🍌 (in Java), a Gemini chat model flavor that can create and edit images. This is particularly handy for interactive creative tasks, such as a marketing assistant that helps you design a new product by describing it, further tweaking its look, showing it in different settings for marketing ads, etc.

MCP Client and Server with the Java MCP SDK and LangChain4j

📅 April 4, 2025 — by Guillaume Laforge

model-context-protocol langchain4j java gemini large-language-models

MCP (Model Context Protocol) is making a buzz these days! MCP is a protocol invented last November by Anthropic, integrated in Claude Desktop and in more and more tools and frameworks, to expand LLMs’ capabilities by giving them access to various external tools and functions.

My colleague Philipp Schmid gave a great introduction to MCP recently, so if you want to learn more about MCP, this is the place for you.

In this article, I’d like to guide you through the implementation of an MCP server, and an MCP client, in Java. As I’m contributing to LangChain4j, I’ll be using LangChain4j’s mcp module for the client.

The power of large context windows for your documentation efforts

📅 February 15, 2025 — by Guillaume Laforge

generative-ai large-language-models machine-learning langchain4j gemini

My colleague Jaana Dogan was pointing at Anthropic’s MCP (Model Context Protocol) documentation pages, which described how to build MCP servers and clients. The interesting twist was that the documentation was prepared so that Claude could assist you in building those MCP servers & clients, rather than clearly documenting how to do so.

MCP tutorials are great. There are no tutorials really.

"Copy these resources to Claude, and start asking some questions like..." pic.twitter.com/GG50DMWNLW

A Generative AI Agent with a real declarative workflow

📅 January 31, 2025 — by Guillaume Laforge

generative-ai ai-agents large-language-models gemini machine-learning workflows

In my previous article, I detailed how to build an AI-powered short story generation agent using Java, LangChain4j, Gemini, and Imagen 3, deployed on Cloud Run jobs.

This approach involved writing explicit Java code to orchestrate the entire workflow, defining each step programmatically. This follow-up article explores an alternative, declarative approach using Google Cloud Workflows.

I’ve written extensively on Workflows in the past, and for AI agents that follow a very explicit plan and orchestration, I believe Workflows is also a great approach.

An AI agent to generate short sci-fi stories

📅 January 27, 2025 — by Guillaume Laforge

generative-ai ai-agents large-language-models gemini machine-learning langchain4j java

This project demonstrates how to build a fully automated short story generator using Java, LangChain4j, Google Cloud’s Gemini and Imagen 3 models, and a serverless deployment on Cloud Run.

Every night at midnight UTC, a new story is created, complete with AI-generated illustrations, and published via Firebase Hosting. So if you want to read a new story every day, head over to:

→ short-ai-story.web.app ←

The code of this agent is available on GitHub. So don’t hesitate to check out the code:

Let's think with Gemini Flash 2.0's experimental thinking mode and LangChain4j

📅 December 20, 2024 — by Guillaume Laforge

java large-language-models machine-learning langchain4j generative-ai gemini

Yesterday, Google released yet another cool Gemini model update, with Gemini 2.0 Flash thinking mode. By natively and transparently integrating chain-of-thought techniques, the model takes more thinking time, automatically decomposes a complex task into smaller steps, and explores various paths in its thinking process. Thanks to this approach, Gemini 2.0 Flash is able to solve more complex problems than Gemini 1.5 Pro or the recent Gemini 2.0 Flash experiment.

Detecting objects with Gemini 2.0 and LangChain4j

📅 December 13, 2024 — by Guillaume Laforge

java large-language-models machine-learning langchain4j generative-ai gemini

Hot on the heels of the announcement of Gemini 2.0, I played with the new experimental model both from within Google AI Studio, and with LangChain4j.

Google released Gemini 2.0 Flash, with new modalities, including interleaving images, audio, text, video, both in input and output. Even a live bidirectional speech-to-speech mode, which is really exciting!

When experimenting with AI Studio, what attracted my attention was AI Studio’s new starter apps section. There are 3 examples (including links to GitHub projects showing how they were implemented):

Lots of new cool Gemini stuff in LangChain4j 0.35.0

📅 September 25, 2024 — by Guillaume Laforge

generative-ai langchain4j java google-cloud large-language-model gemini

While LangChain4j 0.34 introduced my new Google AI Gemini module, a new 0.35.0 version is already here today, with some more cool stuff for Gemini and Google Cloud!

Let’s have a look at what’s in store!

Gemini 1.5 Pro 002 and Gemini 1.5 Flash 002

This week, Google announced the release of the new versions of the Gemini 1.5 models:

gemini-1.5-pro-002
gemini-1.5-flash-002

Of course, both models are supported by LangChain4j! The Google AI Gemini module also supports the gemini-1.5-flash-8b-exp-0924 8-billion parameter model.

New Gemini model in LangChain4j

📅 September 5, 2024 — by Guillaume Laforge

generative-ai langchain4j java google-cloud large-language-model gemini

A new version of LangChain4j, the super powerful LLM toolbox for Java developers, was released today. In 0.34.0, a new Gemini model has been added. This time, this is not the Gemini flavor from Google Cloud Vertex AI, but the Google AI variant.

It was a frequently requested feature by LangChain4j users, so I took a stab at developing a new chat model for it, during my summer vacation break.

Gemini, show me the code!

Let’s dive into some code examples to see it in action!
