Guillaume Laforge

Large Language Models

Semantic code search for Programming Idioms with LangChain4j and Vertex AI embedding models

By Guillaume Laforge & Valentin Deleplace

The Programming Idioms community website created by Valentin lets developers share typical implementations, in various programming languages, of common tasks like printing the famous “Hello World!” message, counting the characters in a string, sorting collections, or formatting dates, to name a few. And there are many more: currently 350 idioms, covering 32 programming languages. It’s a nice way to discover how various languages implement such common tasks! Read more...
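Out of curiosity, here’s a minimal sketch of what such a semantic search could look like with LangChain4j’s Vertex AI embedding module and its in-memory embedding store. The project ID, model name, and the two indexed idioms are placeholders, not the article’s actual setup:

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.vertexai.VertexAiEmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

import java.util.List;

public class IdiomSearch {
    public static void main(String[] args) {
        // Vertex AI embedding model (project/location/model are placeholders)
        var embeddingModel = VertexAiEmbeddingModel.builder()
            .endpoint("us-central1-aiplatform.googleapis.com:443")
            .project("my-gcp-project")
            .location("us-central1")
            .publisher("google")
            .modelName("textembedding-gecko@003")
            .build();

        var store = new InMemoryEmbeddingStore<TextSegment>();

        // Index a couple of idiom implementations
        for (String idiom : List.of(
                "Java: for (char c : s.toCharArray()) count++;",
                "Go: n := utf8.RuneCountInString(s)")) {
            TextSegment segment = TextSegment.from(idiom);
            store.add(embeddingModel.embed(segment).content(), segment);
        }

        // Search by meaning, not by keyword
        Embedding query = embeddingModel.embed("count the characters in a string").content();
        List<EmbeddingMatch<TextSegment>> matches = store.findRelevant(query, 2);
        matches.forEach(m -> System.out.println(m.score() + " -> " + m.embedded().text()));
    }
}
```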

Redacting sensitive information when using Generative AI models

As we are making our apps smarter with the help of Large Language Models, we must keep in mind that we are often dealing with potentially sensitive information coming from our users. In particular, in the context of chatbots, users can type any text into the conversation. Personally Identifiable Information (PII) must be handled with the highest level of attention, because we care about our users, we don’t want to leak their personal details, and we must comply with all sorts of laws and regulations. Read more...
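As an illustration of the idea, here’s a deliberately naive sketch that masks email addresses and phone numbers with regular expressions before the text ever reaches the model. A production system would rely on a dedicated de-identification service rather than hand-rolled patterns like these:

```java
import java.util.Map;
import java.util.regex.Pattern;

public class PiiRedactor {
    // Naive, illustrative patterns only; real PII detection is much harder
    private static final Map<Pattern, String> PATTERNS = Map.of(
        Pattern.compile("\\b[\\w.+-]+@[\\w-]+\\.[\\w.]+\\b"), "[EMAIL]",
        Pattern.compile("\\+?\\d[\\d .-]{7,}\\d"), "[PHONE]"
    );

    // Replace every detected PII span with a neutral placeholder
    public static String redact(String userInput) {
        String result = userInput;
        for (var entry : PATTERNS.entrySet()) {
            result = entry.getKey().matcher(result).replaceAll(entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(redact("Contact me at jane.doe@example.com or +1 555 123 4567"));
        // -> Contact me at [EMAIL] or [PHONE]
    }
}
```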

Data extraction: The many ways to get LLMs to spit JSON content

Data extraction from unstructured text is an important task where LLMs shine, as they understand human languages well. Rumor has it that 80% of the world’s knowledge and data comes in the form of unstructured text (vs. 20% for data stored in databases, spreadsheets, JSON/XML, etc.). Let’s see how we can get access to that trove of information thanks to LLMs. In this article, we’ll have a look at different techniques to make LLMs generate JSON output and extract data from text. Read more...
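One popular technique in the Java world is to let LangChain4j map the model’s JSON output onto a plain object via its AiServices mechanism. A minimal sketch, where the Gemini model configuration and the Person record are illustrative choices, not the article’s exact code:

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.UserMessage;

public class ExtractionDemo {

    // The target structure the LLM's JSON output is mapped onto
    record Person(String firstName, String lastName, int age) {}

    interface PersonExtractor {
        @UserMessage("Extract the person described in this text: {{it}}")
        Person extract(String text);
    }

    public static void main(String[] args) {
        // Placeholder project/location/model values
        ChatLanguageModel model = VertexAiGeminiChatModel.builder()
            .project("my-gcp-project")
            .location("us-central1")
            .modelName("gemini-1.5-flash")
            .build();

        PersonExtractor extractor = AiServices.create(PersonExtractor.class, model);

        Person p = extractor.extract("Anna Smith, who turned 42 last week, joined the project.");
        System.out.println(p);
    }
}
```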

Things you never dared to ask about LLMs

Along my learning journey about generative AI, lots of questions popped up in my mind. I was very curious to learn how things work under the hood in Large Language Models (at least to have an intuition, rather than knowing the maths inside out). Sometimes, I would wonder how tokens are created, or how hyperparameters influence text generation. Before the dotAI conference last week, I was invited to talk at the meetup organised by DataStax. Read more...
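To give an intuition of one such hyperparameter, here is a small self-contained sketch of temperature-scaled sampling, which is how temperature influences which token gets generated next. The three logits are made-up numbers for a toy three-token vocabulary:

```java
import java.util.Arrays;
import java.util.Random;

public class SamplingDemo {

    // Convert raw logits to probabilities with temperature scaling:
    // lower temperature sharpens the distribution, higher flattens it
    static double[] softmax(double[] logits, double temperature) {
        double max = Arrays.stream(logits).max().orElseThrow();
        double[] probs = new double[logits.length];
        double sum = 0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp((logits[i] - max) / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) probs[i] /= sum;
        return probs;
    }

    // Pick one token index at random, weighted by the distribution
    static int sample(double[] probs, Random rnd) {
        double r = rnd.nextDouble(), acc = 0;
        for (int i = 0; i < probs.length; i++) {
            acc += probs[i];
            if (r < acc) return i;
        }
        return probs.length - 1;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.1};  // scores for 3 candidate tokens
        System.out.println(Arrays.toString(softmax(logits, 0.5))); // sharper, more deterministic
        System.out.println(Arrays.toString(softmax(logits, 2.0))); // flatter, more creative
        System.out.println("picked token #" + sample(softmax(logits, 1.0), new Random()));
    }
}
```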

Advanced RAG Techniques

Retrieval Augmented Generation (RAG) is a pattern that lets you prompt a large language model (LLM) about your own data, via in-context learning, by providing extracts of documents found in a vector database (or potentially other sources too). Implementing RAG isn’t very complicated, but the results you get are not necessarily up to your expectations. In the presentations below, I explore various advanced techniques to improve the quality of the responses returned by your RAG system: Read more...
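As a baseline for comparison, here’s a sketch of the naive RAG flow that such advanced techniques improve on, written against LangChain4j’s generic interfaces. The prompt wording and the choice of 3 results are arbitrary; it’s a building block to be wired with any concrete model and store:

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.EmbeddingStore;

import java.util.List;
import java.util.stream.Collectors;

public class NaiveRag {

    private final EmbeddingModel embeddingModel;
    private final EmbeddingStore<TextSegment> store;
    private final ChatLanguageModel chatModel;

    NaiveRag(EmbeddingModel em, EmbeddingStore<TextSegment> st, ChatLanguageModel cm) {
        this.embeddingModel = em; this.store = st; this.chatModel = cm;
    }

    String answer(String question) {
        // 1. Embed the question and retrieve the most similar document chunks
        Embedding query = embeddingModel.embed(question).content();
        List<EmbeddingMatch<TextSegment>> matches = store.findRelevant(query, 3);

        // 2. Stuff the retrieved extracts into the prompt (in-context learning)
        String context = matches.stream()
            .map(m -> m.embedded().text())
            .collect(Collectors.joining("\n---\n"));

        // 3. Ask the LLM to answer using only the provided context
        return chatModel.generate(
            "Answer the question using only this context:\n" + context +
            "\n\nQuestion: " + question);
    }
}
```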

A Gemini and Gemma tokenizer in Java

It’s always interesting to know how the sausage is made, don’t you think? That’s why, a while ago, I looked at embedding model tokenization, and I implemented a little visualization to see the tokens in a colorful manner. Yet, I was still curious to see how Gemini would tokenize text… Both LangChain4j Gemini modules (from Vertex AI and from Google AI Labs) can count the tokens included in a piece of text. Read more...
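For example, counting tokens could look like the sketch below, assuming the Vertex AI Gemini chat model exposes LangChain4j’s TokenCountEstimator interface; the project, location, and model name are placeholders:

```java
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;

public class TokenCountDemo {
    public static void main(String[] args) {
        // Placeholder project/location; the model name is an assumption
        var model = VertexAiGeminiChatModel.builder()
            .project("my-gcp-project")
            .location("us-central1")
            .modelName("gemini-1.5-flash")
            .build();

        // Asks the Gemini tokenizer how many tokens this text splits into
        int tokens = model.estimateTokenCount("Hello, LangChain4j tokenization!");
        System.out.println(tokens + " tokens");
    }
}
```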

Some advice and good practices when integrating an LLM in your application

When integrating an LLM into your application to extend it and make it smarter, it’s important to be aware of the pitfalls and the best practices you need to follow to avoid common problems and integrate it successfully. This article will guide you through some key best practices that I’ve come across.

Understanding the Challenges of Implementing LLMs in Real-World Applications

One of the first challenges is that LLMs are constantly being improved. Read more...
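One concrete way to cope with constantly evolving models is to pin an explicit model version rather than a floating alias. A sketch with LangChain4j’s Vertex AI Gemini module, where the version string and project values are placeholders:

```java
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;

public class PinnedModel {
    public static void main(String[] args) {
        // Pinning a fully qualified model version shields the app from
        // silent behavior changes when the default alias gets upgraded
        var model = VertexAiGeminiChatModel.builder()
            .project("my-gcp-project")           // placeholder
            .location("us-central1")
            .modelName("gemini-1.5-pro-002")     // explicit version, not just "gemini-1.5-pro"
            .temperature(0.0f)                   // more reproducible output for testing
            .build();

        System.out.println(model.generate("Say hello in one word."));
    }
}
```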

Let's make Gemini Groovy!

The happy users of Gemini Advanced, the powerful AI web assistant powered by the Gemini model, can execute some Python code, thanks to a built-in Python interpreter. So, for math, logic, and calculation questions, the assistant can let Gemini write a Python script and execute it, so that users get a more accurate answer to their queries. But with my Apache Groovy hat on, I wondered if I could get Gemini to invoke some Groovy scripts as well, for advanced math questions! Read more...
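On the JVM, executing such a generated script is straightforward with GroovyShell. The sketch below evaluates a script as if Gemini had produced it; the factorial script is a stand-in for LLM output, and real LLM-generated code should of course be sandboxed:

```java
import groovy.lang.GroovyShell;

public class GroovyExec {
    public static void main(String[] args) {
        // Imagine this script was generated by Gemini for a math question
        String scriptFromLlm = """
            def factorial(n) { n <= 1 ? 1G : n * factorial(n - 1) }
            factorial(25)
            """;

        // GroovyShell evaluates the script and returns its last expression.
        // Never run LLM-generated code with full privileges in a real app.
        Object result = new GroovyShell().evaluate(scriptFromLlm);
        System.out.println(result); // exact BigInteger arithmetic, beyond long's range
    }
}
```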

Grounding Gemini with Web Search results in LangChain4j

The latest release of LangChain4j (version 0.31) added the capability of grounding large language models with results from web searches. There’s an integration with Google Custom Search Engine, and also Tavily. Grounding an LLM’s response with results from a search engine lets the model draw on relevant, up-to-date information about the query, including information published after the model’s training cut-off date, which it won’t have seen during training. Read more...
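Wiring this up could look roughly like the sketch below. The builder names are recalled from the 0.31 web search integration and should be treated as assumptions, and the API key and project values are placeholders:

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;
import dev.langchain4j.rag.content.retriever.WebSearchContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.web.search.WebSearchEngine;
import dev.langchain4j.web.search.tavily.TavilyWebSearchEngine;

public class GroundedAssistant {

    interface Assistant {
        String chat(String userMessage);
    }

    public static void main(String[] args) {
        WebSearchEngine searchEngine = TavilyWebSearchEngine.builder()
            .apiKey(System.getenv("TAVILY_API_KEY"))
            .build();

        // Retrieved web snippets get injected into the prompt before generation
        var retriever = WebSearchContentRetriever.builder()
            .webSearchEngine(searchEngine)
            .maxResults(3)
            .build();

        ChatLanguageModel model = VertexAiGeminiChatModel.builder()
            .project("my-gcp-project")    // placeholder
            .location("us-central1")
            .modelName("gemini-1.5-flash")
            .build();

        Assistant assistant = AiServices.builder(Assistant.class)
            .chatLanguageModel(model)
            .contentRetriever(retriever)
            .build();

        System.out.println(assistant.chat("Who won the latest Ballon d'Or?"));
    }
}
```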

Gemini, Google's Large Language Model, for Java Developers

As a follow-up to my talk on generative AI for Java developers, I’ve developed a new presentation that focuses more on the Gemini large multimodal model by Google. In this talk, we cover the multimodal capabilities of the model, as it can ingest code, PDFs, audio, and video, and reason about them. Another distinctive feature of Gemini is its huge context window of up to 1 million tokens! This opens interesting perspectives, especially in multimodal scenarios. Read more...
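To illustrate multimodality, here’s a sketch of a single prompt mixing an image and text, using LangChain4j’s message content types; the bucket URI and project values are placeholders:

```java
import dev.langchain4j.data.message.ImageContent;
import dev.langchain4j.data.message.TextContent;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;

public class MultimodalDemo {
    public static void main(String[] args) {
        var gemini = VertexAiGeminiChatModel.builder()
            .project("my-gcp-project")        // placeholder
            .location("us-central1")
            .modelName("gemini-1.5-pro")      // the large-context-window variant
            .build();

        // One user message combining an image and a text instruction
        UserMessage message = UserMessage.from(
            ImageContent.from("gs://my-bucket/architecture-diagram.png"),  // placeholder URI
            TextContent.from("Describe this diagram and list its components.")
        );

        System.out.println(gemini.generate(message).content().text());
    }
}
```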