❯ Guillaume Laforge

Large-Language-Models

An ADK Java GitHub template for your first Java AI agent

With the unveiling of the Java version of Agent Development Kit (ADK), which lets you build AI agents in Java, I recently covered how to get started developing your first agent.

The installation and quickstart documentation also helps with the first steps, but I realized that it would be handy to provide a template project, to further accelerate your time-to-first-conversation with your Java agents! This led me to play with GitHub’s template project feature, which allows you to create a copy of the template project in your own account or organization. It comes with a ready-made project structure, a configured pom.xml file, and a first Java agent you can customize at will, and run from either the command line or the ADK Dev UI.

Read more...

Things you never dared to ask about LLMs — Take 2

Recently, I had the chance to deliver this talk on the mysteries of LLMs at Devoxx France, with my good friend Didier Girard. It was fun to uncover the oddities of LLMs, and better understand where they thrive or fail, and why.

In this post, I’d like to share an update of the presentation deck, with a few additional slides here and there, to cover for example

  • the difficulty LLMs have with acronyms, scientific molecule names, plant names, and other uncommon vocabulary, which require more tokens and weaken attention,
  • the difference between deterministic and probabilistic problems, and why predictive models are still important,
  • some limits of LLMs with regard to understanding dates, data ownership, or the fact they can’t easily forget what they learned.

It was fun delivering the talk with Didier, as a friendly dialogue makes things more entertaining! We were lucky that this talk was recorded (in French 🇫🇷, however) and you can watch the video below:

Read more...

Beyond the chatbot or AI sparkle: a seamless AI integration

When I talk about Generative AI, whether it’s with developers at conferences or with customers, I often find myself saying the same thing: chatbots are just one way to use Large Language Models (LLMs).

Unfortunately, I see many articles or presentations that focus only on demonstrating LLMs at work within the context of chatbots. I’m guilty of showing the traditional chat interfaces too. But there’s so much more to it!

Read more...

Vibe coding an MCP server with Micronaut, LangChain4j, and Gemini

Unlike Quarkus and Spring Boot, Micronaut doesn’t (yet?) provide a module to facilitate the implementation of MCP servers (Model Context Protocol). But being my favorite framework, I decided to see what it takes to build a quick implementation, by vibe coding it, with the help of Gemini!

In a recent article, I explored how to use the MCP reference implementation for Java to implement an MCP server, served as a servlet via Jetty, and to call that server from LangChain4j’s great MCP support. One approach with Micronaut may have been to somehow integrate the servlet I had built via Micronaut’s servlet support, but that didn’t really feel like a genuine and native way to implement a server, so I decided to do it from scratch.

Read more...

LLMs.txt to help LLMs grok your content

Since I started my career, I’ve been sharing what I’ve learned along the way in this blog. It makes me happy when developers find solutions to their problems, or discover new things, thanks to articles I’ve written here. So it’s important for me that readers are able to find those posts. Of course, my blog is indexed by search engines, and people usually find out about it from Google or other engines, or they discover it via the links I share on social media. But with LLM-powered tools (like Gemini, ChatGPT, Claude, etc.) you can make your content more easily grokkable by such tools.

Read more...

Advanced RAG — Sentence Window Retrieval

Retrieval Augmented Generation (RAG) is a great way to expand the knowledge of Large Language Models to let them know about your own data and documents. With RAG, LLMs can ground their answers on the information you provide, which reduces the chances of hallucinations.

Implementing RAG is fairly trivial with a framework like LangChain4j. However, the results may not be on par with your quality expectations. Often, you’ll need to further tweak different aspects of the RAG pipeline, like the document preparation phase (in particular document chunking), or the retrieval phase to find the best information in your vector database.
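To make the technique named in the title a bit more concrete, here is a minimal sketch of the sentence window idea in plain Java: a single sentence is what gets matched against the query, but the surrounding sentences are what get handed to the LLM as context. This is an illustration under assumptions, not the implementation from the article and not a LangChain4j API: the class `SentenceWindowSketch`, the method `windowAround`, and the `radius` parameter are all hypothetical names, and the vector search itself is left out (the matched index is simply passed in).

```java
import java.util.List;

public class SentenceWindowSketch {

    /**
     * Hypothetical helper: returns the matched sentence together with up to
     * `radius` neighbouring sentences on each side, joined by single spaces.
     * The window is clamped at the start and end of the document.
     */
    public static String windowAround(List<String> sentences, int matchIndex, int radius) {
        if (matchIndex < 0 || matchIndex >= sentences.size()) {
            throw new IndexOutOfBoundsException("matchIndex: " + matchIndex);
        }
        if (radius < 0) {
            throw new IllegalArgumentException("radius must be >= 0");
        }
        int from = Math.max(0, matchIndex - radius);
        int to = Math.min(sentences.size(), matchIndex + radius + 1); // exclusive
        return String.join(" ", sentences.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> sentences = List.of(
            "RAG grounds answers in your own documents.",
            "Chunking splits documents into smaller pieces.",
            "Small chunks match queries precisely.",
            "But small chunks often lack context.",
            "A window of neighbouring sentences restores it.");

        // Assumption: a vector search (not shown) matched the sentence at index 2.
        int matchedIndex = 2;

        // The wider window, not the single sentence, would go into the prompt.
        System.out.println(windowAround(sentences, matchedIndex, 1));
    }
}
```

The design choice being sketched is the split between what is embedded (one sentence, for a precise match) and what is returned (a wider window, for enough context); in a real pipeline the window would typically be stored alongside each sentence's embedding rather than recomputed at query time.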

Read more...

The power of large context windows for your documentation efforts

My colleague Jaana Dogan was pointing at Anthropic’s MCP (Model Context Protocol) documentation pages, which describe how to build MCP servers and clients. The interesting twist was about preparing the documentation in order to have Claude assist you in building those MCP servers & clients, rather than clearly documenting how to do so.


A Generative AI Agent with a real declarative workflow

In my previous article, I detailed how to build an AI-powered short story generation agent using Java, LangChain4j, Gemini, and Imagen 3, deployed on Cloud Run jobs.

This approach involved writing explicit Java code to orchestrate the entire workflow, defining each step programmatically. This follow-up article explores an alternative, declarative approach using Google Cloud Workflows.

I’ve written extensively on Workflows in the past, so for those AI agents that exhibit a very explicit plan and orchestration, I believe Workflows is also a great approach for such declarative AI agents.

Read more...

An AI agent to generate short sci-fi stories

This project demonstrates how to build a fully automated short story generator using Java, LangChain4j, Google Cloud’s Gemini and Imagen 3 models, and a serverless deployment on Cloud Run.

Every night at midnight UTC, a new story is created, complete with AI-generated illustrations, and published via Firebase Hosting. So if you want to read a new story every day, head over to:

short-ai-story.web.app

The code of this agent is available on GitHub. So don’t hesitate to check out the code:

Read more...

Analyzing trends and topics from Bluesky's Firehose with generative AI

First article of the year, so let me start by wishing you all, my dear readers, a very happy new year! And what is the subject of this new piece of content? For a while, I’ve been interested in analyzing trends and topics in social media streams. I recently joined Bluesky (you can follow me at @glaforge.dev), and unlike X, it’s possible to access its Firehose (the stream of all the messages sent by its users) pretty easily, and even for free. So let’s see what we can learn from the Firehose!

Read more...