Large-Language-Models

A Generative AI Agent with a real declarative workflow

📅 January 31, 2025 — by Guillaume Laforge

In my previous article, I detailed how to build an AI-powered short story generation agent using Java, LangChain4j, Gemini, and Imagen 3, deployed on Cloud Run jobs.

This approach involved writing explicit Java code to orchestrate the entire workflow, defining each step programmatically. This follow-up article explores an alternative, declarative approach using Google Cloud Workflows.

I’ve written extensively about Workflows in the past, and I believe it is also a great fit for AI agents that follow a very explicit plan and orchestration.

An AI agent to generate short sci-fi stories

📅 January 27, 2025 — by Guillaume Laforge

generative-ai agents large-language-models machine-learning langchain4j java

This project demonstrates how to build a fully automated short story generator using Java, LangChain4j, Google Cloud’s Gemini and Imagen 3 models, and a serverless deployment on Cloud Run.

Every night at midnight UTC, a new story is created, complete with AI-generated illustrations, and published via Firebase Hosting. So if you want to read a new story every day, head over to:

→ short-ai-story.web.app ←

The code of this agent is available on GitHub, so don’t hesitate to check it out:

Analyzing trends and topics from Bluesky's Firehose with generative AI

📅 January 6, 2025 — by Guillaume Laforge

generative-ai large-language-models machine-learning clustering langchain4j java

First article of the year, so let me start by wishing you all, my dear readers, a very happy new year! And what is the subject of this new piece of content? For a while, I’ve been interested in analyzing trends and topics in social media streams. I recently joined Bluesky (you can follow me at @glaforge.dev), and unlike X, it lets you access its Firehose (the stream of all the messages sent by its users) pretty easily, and even for free. So let’s see what we can learn from the firehose!

Let's think with Gemini Flash 2.0's experimental thinking mode and LangChain4j

📅 December 20, 2024 — by Guillaume Laforge

java large-language-models machine-learning langchain4j generative-ai

Yesterday, Google released yet another cool Gemini model update, with Gemini 2.0 Flash thinking mode. By natively and transparently integrating chain-of-thought techniques, the model takes more thinking time, automatically decomposes a complex task into smaller steps, and explores various paths in its thinking process. Thanks to this approach, Gemini 2.0 Flash thinking is able to solve more complex problems than Gemini 1.5 Pro or the recent experimental Gemini 2.0 Flash.

Detecting objects with Gemini 2.0 and LangChain4j

📅 December 13, 2024 — by Guillaume Laforge

java large-language-models machine-learning langchain4j generative-ai

Hot on the heels of the announcement of Gemini 2.0, I played with the new experimental model both from within Google AI Studio, and with LangChain4j.

Google released Gemini 2.0 Flash, with new modalities, including interleaving images, audio, text, video, both in input and output. Even a live bidirectional speech-to-speech mode, which is really exciting!

When experimenting with AI Studio, what attracted my attention was AI Studio’s new starter apps section. There are 3 examples (including links to GitHub projects showing how they were implemented):

Semantic code search for Programming Idioms with LangChain4j and Vertex AI embedding models

📅 December 2, 2024 — by Guillaume Laforge

java large-language-models machine-learning langchain4j generative-ai

By Guillaume Laforge & Valentin Deleplace

The Programming Idioms community website created by Valentin lets developers share typical implementations in various programming languages for usual tasks like printing the famous “Hello World!” message, counting the characters in a string, sorting collections, or formatting dates, to name a few. And many more: there are currently 350 idioms, covering 32 programming languages. It’s a nice way to discover how various languages implement such common tasks!

Redacting sensitive information when using Generative AI models

📅 November 25, 2024 — by Guillaume Laforge

java large-language-models machine-learning langchain4j generative-ai security

As we are making our apps smarter with the help of Large Language Models, we must keep in mind that we are often dealing with potentially sensitive information coming from our users. In particular, in the context of chatbots, our application users have the ability to input any text in the conversation.

Personally Identifiable Information (PII) should be handled with the highest level of attention, because we care about our users, we don’t want to leak their personal details, and we must comply with all sorts of laws or regulations. In a word, we are responsible AI developers.

Data extraction: The many ways to get LLMs to spit JSON content

📅 November 18, 2024 — by Guillaume Laforge

java large-language-models machine-learning langchain4j generative-ai

Data extraction from unstructured text is a very important task where LLMs shine, as they understand human languages well. Rumor has it that 80% of the world’s knowledge and data comes in the form of unstructured text (vs 20% for data stored in databases, spreadsheets, JSON/XML, etc.). Let’s see how we can get access to that trove of information thanks to LLMs.

In this article, we’ll have a look at different techniques to make LLMs generate JSON output and extract data from text. This applies to most LLMs and frameworks, but for illustration purposes, we’ll use Gemini and LangChain4j in Java.

Things you never dared to ask about LLMs

📅 October 24, 2024 — by Guillaume Laforge

generative-ai large-language-models

Along my learning journey about generative AI, lots of questions popped up in my mind. I was very curious to learn how things worked under the hood in Large Language Models (at least having an intuition rather than knowing the maths inside out). Sometimes, I would wonder about how tokens are created, or how hyperparameters influence text generation.

Before the dotAI conference, I was invited to talk at the meetup organised by DataStax. I talked about all those things you never dared to ask about LLMs, sharing both the questions I came up with while learning about generative AI, and the answers I found along the way.

Advanced RAG Techniques

📅 October 14, 2024 — by Guillaume Laforge

generative-ai large-language-models java langchain4j retrieval-augmented-generation

Retrieval Augmented Generation (RAG) is a pattern that lets you prompt a large language model (LLM) about your own data: through in-context learning, you provide the model with extracts of documents found in a vector database (or potentially other sources too).
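
To make the pattern more concrete, here is a minimal sketch in plain Java. It is not the LangChain4j code from the presentations: it assumes a toy word-count "embedding" and an in-memory list standing in for the vector database, and the class name `RagSketch` and its methods `embed`, `retrieve`, and `buildPrompt` are hypothetical names chosen for illustration. A real system would call an embedding model and a vector store instead.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration of the RAG pattern: retrieve extracts, then augment the prompt.
class RagSketch {

    // Toy stand-in for an embedding model (assumption): lower-cased word counts.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean norm of a sparse word-count vector.
    static double norm(Map<String, Integer> vector) {
        double sum = 0;
        for (int count : vector.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity between two sparse word-count vectors (0 if either is empty).
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0 || nb == 0) ? 0 : dot / (na * nb);
    }

    // Stand-in for a vector database lookup: the k documents most similar to the question.
    static List<String> retrieve(String question, List<String> documents, int k) {
        Map<String, Integer> queryVector = embed(question);
        return documents.stream()
                .sorted(Comparator.comparingDouble(
                        (String doc) -> cosine(queryVector, embed(doc))).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    // In-context learning step: the retrieved extracts are placed in the prompt itself.
    static String buildPrompt(String question, List<String> extracts) {
        return "Answer the question using only the extracts below.\n\n"
                + "Extracts:\n- " + String.join("\n- ", extracts)
                + "\n\nQuestion: " + question;
    }

    public static void main(String[] args) {
        List<String> documents = List.of(
                "Cloud Run jobs execute containers on a schedule.",
                "Firebase Hosting serves static web content.",
                "Imagen 3 generates illustrations from text prompts.");
        String question = "Which service serves static content?";
        List<String> extracts = retrieve(question, documents, 1);
        // The resulting prompt is what would be sent to the LLM.
        System.out.println(buildPrompt(question, extracts));
    }
}
```

The LLM call itself is deliberately left out: the point of the sketch is only the two steps that define RAG, retrieving relevant extracts and injecting them into the prompt.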

Implementing RAG isn’t very complicated, but the results you get are not necessarily up to your expectations. In the presentations below, I explore various advanced techniques to improve the quality of the responses returned by your RAG system:
