Guillaume Laforge

Generative-Ai

Generating videos in Java with Veo 3

Yesterday, we went bananas 🍌 creating and editing images with Nano Banana, in Java. Now, what about generating videos as well, still in Java, with Veo 3?

Especially since this week, Google announced that Veo 3 became generally available, with reduced pricing, a new 9:16 aspect ratio (nice for those vertical viral videos) and even with resolution up to 1080p!

In today’s article, we’ll see how to create videos, in Java, with the GenAI Java SDK. We’ll create videos either:

Read more...

Generating and editing images with Nano Banana in Java

By now, you’ve all probably seen the incredible images generated by the Nano Banana model (also known as Gemini 2.5 Flash Image preview)? If you haven’t, I encourage you to play with it within Google AI Studio and from the Gemini app, or have a look at the @NanoBanana X/Twitter account, which shares some of its greatest creations.

As a Java developer, you may be wondering how you can integrate Nano Banana in your own LLM-powered apps. This is what this article is about! I’ll show you how you can use this model to:

Read more...

The Sci-Fi naming problem: Are LLMs less creative than we think?

Like many developers, I’ve been exploring the creative potential of Large Language Models (LLMs). At the beginning of the year, I crafted a project to build an AI agent that could generate short science-fiction stories. I used LangChain4j to create a deterministic workflow to drive Gemini for the story generation, and Imagen for the illustrations. The initial results were fascinating. The model could weave narratives, describe futuristic worlds, and create characters with seemingly little effort. But as I generated more stories, a strange and familiar pattern began to emerge…

Read more...

AI Agents, the New Frontier for LLMs

I recently gave a talk titled “AI Agents, the New Frontier for LLMs”. The session explored how we can move beyond simple request-response interactions with Large Language Models to build more sophisticated and autonomous systems.

If you’re already familiar with LLMs and Retrieval Augmented Generation (RAG), the next logical step is to understand and build AI agents.

What makes a system “agentic”?

An agent is more than just a clever prompt. It’s a system that uses an LLM as its core reasoning engine to operate autonomously. The key characteristics that make a system “agentic” include:

Read more...

Advanced RAG — Using Gemini and long context for indexing rich documents (PDF, HTML...)

A very common question I get when presenting and talking about advanced RAG (Retrieval Augmented Generation) techniques is how to best index and search rich documents like PDFs (or web pages) that contain both text and rich elements, like pictures or diagrams.

Another question people frequently ask me is about RAG versus long context windows. Indeed, models with long context windows usually have a more global understanding of a document, and of each excerpt in its overall context. But of course, you can’t feed all the documents of your users or customers into one single augmented prompt. Also, RAG has other advantages: it offers much lower latency and is generally cheaper.

Read more...

Advanced RAG — Hypothetical Question Embedding

In the first article of this Advanced RAG series, I talked about an approach I called sentence window retrieval, where we calculate vector embeddings per sentence, but the chunk of text returned (and added to the context of the LLM) also contains the surrounding sentences, to add more context to that embedded sentence. Embedding the single sentence tends to give a better vector similarity than embedding the whole surrounding context. It is one of the techniques I’m covering in my talk on advanced RAG techniques.
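To make the sentence window idea more concrete, here is a minimal sketch in plain Java. It is not the code from the article: the class and method names are hypothetical, and the `embed` method is a toy bag-of-words stand-in for a real embedding model, used only so the example runs without any dependency. The point it illustrates is that the similarity is computed per sentence, while the text returned is the matching sentence plus its neighbors.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of sentence window retrieval (names are assumptions).
class SentenceWindowRetrieval {

    // Toy stand-in for an embedding model: a bag-of-words term-frequency vector.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> vector = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                vector.merge(token, 1, Integer::sum);
            }
        }
        return vector;
    }

    static double norm(Map<String, Integer> v) {
        double sum = 0;
        for (int count : v.values()) {
            sum += count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity between two sparse vectors (0 if either one is empty).
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
    }

    // Scores each sentence on its own, then returns the best match
    // together with `window` sentences before and after it.
    static String retrieve(List<String> sentences, String query, int window) {
        Map<String, Integer> queryVector = embed(query);
        int best = 0;
        double bestScore = -1;
        for (int i = 0; i < sentences.size(); i++) {
            double score = cosine(embed(sentences.get(i)), queryVector);
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        int from = Math.max(0, best - window);
        int to = Math.min(sentences.size(), best + window + 1);
        return String.join(" ", sentences.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> sentences = List.of(
            "Gemini is a family of models from Google.",
            "It accepts text, images and video as input.",
            "Its context window can hold very long documents.",
            "LangChain4j is a Java framework for LLM apps.");
        // Matches the third sentence, returns it with one neighbor on each side.
        System.out.println(retrieve(sentences, "context window", 1));
    }
}
```

In a real pipeline, the toy `embed` method would be replaced by calls to an embedding model, and the per-sentence vectors would be stored in a vector database along with the position of each sentence, so the window can be rebuilt at query time.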

Read more...

Things you never dared to ask about LLMs — Take 2

Recently, I had the chance to deliver this talk on the mysteries of LLMs at Devoxx France, with my good friend Didier Girard. It was fun to uncover the oddities of LLMs, and better understand where they thrive or fail, and why. I also delivered this talk alone at Devoxx Poland.

In this post, I’d like to share an update of the presentation deck, with a few additional slides here and there, to cover for example

Read more...

Beyond the chatbot or AI sparkle: a seamless AI integration

When I talk about Generative AI, whether it’s with developers at conferences or with customers, I often find myself saying the same thing: chatbots are just one way to use Large Language Models (LLMs).

Unfortunately, I see many articles or presentations that just focus on demonstrating LLMs at work within the context of chatbots. I feel guilty of showing the traditional chat interfaces too. But there’s so much more to it!

Read more...

LLMs.txt to help LLMs grok your content

Since I started my career, I’ve been sharing what I’ve learned along the way in this blog. It makes me happy when developers find solutions to their problems, or discover new things, thanks to articles I’ve written here. So it’s important for me that readers are able to find those posts. Of course, my blog is indexed by search engines, and people usually find out about it from Google or other engines, or they discover it via the links I share on social media. But with LLM-powered tools (like Gemini, ChatGPT, Claude, etc.) you can make your content more easily grokkable by such tools.

Read more...

Advanced RAG — Sentence Window Retrieval

Retrieval Augmented Generation (RAG) is a great way to expand the knowledge of Large Language Models to let them know about your own data and documents. With RAG, LLMs can ground their answers on the information you provide, which reduces the chances of hallucinations.

Implementing RAG is fairly trivial with a framework like LangChain4j. However, the results may not be on par with your quality expectations. Often, you’ll need to further tweak different aspects of the RAG pipeline, like the document preparation phase (in particular document chunking), or the retrieval phase to find the best information in your vector database.

Read more...