Talks

AI Agents, the New Frontier for LLMs

📅 July 16, 2025 — by Guillaume Laforge

I recently gave a talk titled “AI Agents, the New Frontier for LLMs”. The session explored how we can move beyond simple request-response interactions with Large Language Models to build more sophisticated and autonomous systems.

If you’re already familiar with LLMs and Retrieval Augmented Generation (RAG), the next logical step is to understand and build AI agents.

What makes a system “agentic”?

An agent is more than just a clever prompt. It’s a system that uses an LLM as its core reasoning engine to operate autonomously. The key characteristics that make a system “agentic” include:

Things you never dared to ask about LLMs — Take 2

📅 May 26, 2025 — by Guillaume Laforge

generative-ai large-language-models

Recently, I had the chance to deliver this talk on the mysteries of LLMs at Devoxx France, with my good friend Didier Girard. It was fun to uncover the oddities of LLMs, and to better understand where they thrive or fail, and why. I also delivered this talk alone at Devoxx Poland.

In this post, I’d like to share an update of the presentation deck, with a few additional slides here and there, to cover for example

Things you never dared to ask about LLMs

📅 October 24, 2024 — by Guillaume Laforge

generative-ai large-language-models

Along my learning journey about generative AI, lots of questions popped up in my mind. I was very curious to learn how things worked under the hood in Large Language Models (at least having an intuition rather than knowing the maths in and out). Sometimes, I would wonder about how tokens are created, or how hyperparameters influence text generation.

Before the dotAI conference, I was invited to speak at the meetup organised by DataStax. I presented all those things you never dared to ask about LLMs, sharing both the questions I came up with while learning about generative AI and the answers I found along the way.

Advanced RAG Techniques

📅 October 14, 2024 — by Guillaume Laforge

generative-ai large-language-models java langchain4j retrieval-augmented-generation

Retrieval Augmented Generation (RAG) is a pattern to let you prompt a large language model (LLM) about your own data, via in-context learning by providing extracts of documents found in a vector database (or potentially other sources too).
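
As a rough illustration of the pattern described above, here is a minimal sketch of the retrieval and prompt-building steps. It uses plain Python to stay dependency-free, even though this post is tagged java and langchain4j. Everything in it is an assumption made for illustration: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `store` list stands in for a vector database, and the names `retrieve` and `build_prompt` are hypothetical. No LLM is called; the sketch only shows how retrieved extracts end up in the prompt.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding (assumption): bag-of-words counts instead of a real embedding model."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return Counter(w for w in words if w)

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, store, top_k=2):
    """Return the top_k document extracts most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(question, extracts):
    """Place the retrieved extracts in the prompt (in-context learning)."""
    context = "\n".join(f"- {e}" for e in extracts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical in-memory "vector store": (extract, embedding) pairs.
documents = [
    "Workflows connectors call Google Cloud APIs.",
    "Gemini has a context window of up to 1 million tokens.",
    "Mastodon belongs to the Fediverse.",
]
store = [(doc, embed(doc)) for doc in documents]

question = "How large is the Gemini context window?"
prompt = build_prompt(question, retrieve(question, store, top_k=1))
print(prompt)
```

In a real system, the prompt built this way would be sent to the LLM, and the toy similarity search would be replaced by a query against an actual vector database.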

Implementing RAG isn’t very complicated, but the results you get are not necessarily up to your expectations. In the presentations below, I explore various advanced techniques to improve the quality of the responses returned by your RAG system:

Gemini, Google's Large Language Model, for Java Developers

📅 May 3, 2024 — by Guillaume Laforge

google-cloud generative-ai large-language-models java langchain4j

As a follow-up to my talk on generative AI for Java developers, I’ve developed a new presentation that focuses more on the Gemini large multimodal model by Google.

In this talk, we cover the multimodal capabilities of the model: it can ingest code, PDFs, audio, and video, and reason about them. Another distinctive feature of Gemini is its huge context window of up to 1 million tokens! This opens interesting perspectives, especially in multimodal scenarios.

Generative AI in practice: Concrete LLM use cases in Java, with the PaLM API

📅 November 13, 2023 — by Guillaume Laforge

java generative-ai large-language-model langchain4j

Large Language Models, available through easy-to-use APIs, put powerful machine learning tools into the hands of developers. Although Python is usually seen as the lingua franca of everything ML, with LLM APIs and LLM orchestration frameworks, complex tasks become easier to implement for enterprise developers.

Abstract

Large language models (LLMs) are a powerful new technology that can be used for a variety of tasks, including generating text, translating languages, and writing different kinds of creative content. However, LLMs can be difficult to use, especially for developers who are not proficient in Python, the lingua franca for AI. So what about us Java developers? How can we make use of Generative AI?

From Bird to Elephant: Starting a New Journey on Mastodon

📅 June 9, 2023 — by Guillaume Laforge

mastodon twitter social-media

At Devoxx France and Devoxx Greece, I had the pleasure to talk about my new social media journey on Mastodon. After a quick introduction to Mastodon and the Fediverse, I contrasted the key differences between Twitter and Mastodon. Then I shared some advice on how to get started, how to choose an instance, and which clients you can pick from.

I moved on to important tips to get the best experience on the platform, and to gather a great following:

Google Cloud Workflows API automation, patterns, and best practices

📅 February 1, 2023 — by Guillaume Laforge

google-cloud workflows best-practices web-api patterns

Workflows at a glance, benefits, key features, use cases
User interface in the Google Cloud console
Deep dive into the Workflows syntax
Workflows connectors
Demos
Patterns and best practices

Choreography vs orchestration in microservices and best practices

📅 October 20, 2022 — by Guillaume Laforge

google-cloud workflows serverless microservices orchestration choreography best-practices patterns

We went from a single monolith to a set of microservices that are small, lightweight, and easy to implement. Microservices enable reusability and make it easier to change and scale apps on demand, but they also introduce new problems. How do microservices interact with each other toward a common goal? How do you figure out what went wrong when a business process composed of several microservices fails? Should there be a central orchestrator controlling all interactions between services, or should each service work independently, in a loosely coupled way, and only interact through shared events? In this talk, we’ll explore the Choreography vs Orchestration question and see demos of some of the tools that can help. We’ll also explore some best practices and patterns to apply when adopting an orchestration approach.

Reuse old smartphones to monitor 3D prints, with WebRTC, WebSockets and serverless

📅 October 13, 2022 — by Guillaume Laforge

3d-printing webrtc websockets serverless micronaut

Monitoring my 3D prints in my basement means climbing lots of stairs back and forth! So here’s my story about how I reused an old smartphone to check the status of my prints. I built a small web app that uses WebRTC to exchange video streams between my broadcasting smartphone and viewers, with WebSockets for signaling, and a serverless platform for easily deploying and hosting my containerized app.
