As a follow-up to my talk on generative AI for Java developers, I’ve developed a new presentation that focuses more on
the Gemini large multimodal model by Google.
In this talk, we cover the multimodal capabilities of the model: it can ingest code, PDFs, audio, and video, and reason about them.
Another distinctive feature of Gemini is its huge context window of up to 1 million tokens!
This opens interesting perspectives, especially in multimodal scenarios.
Lately, for my Generative AI powered Java apps, I’ve used the Gemini multimodal large language model from Google.
But there’s also Gemma, its little sister model.
Gemma is a family of lightweight, state-of-the-art open models built from the same research
and technology used to create the Gemini models. Gemma is available in two sizes: 2B and 7B.
Its weights are freely available, and its small size means you can run it on your own, even on your laptop.
So I was curious to give it a run with LangChain4j.
No need to be a Python developer to do Generative AI!
If you’re a Java developer, you can take advantage of LangChain4j
to implement some advanced LLM integrations in your Java applications.
And if you’re interested in using Gemini, one of the best models available,
I invite you to have a look at the following “codelab” that I worked on:
As I was working on tweaking the Vertex AI text embedding model in LangChain4j,
I wanted to better understand how the textembedding-gecko model
tokenizes the text, in particular when we implement the
Retrieval Augmented Generation approach.
The various PaLM-based models offer a computeTokens endpoint, which returns a list of tokens (encoded in Base64)
and their respective IDs.
Note
At the time of this writing, there’s no equivalent endpoint for Gemini models.
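Since the tokens come back Base64-encoded, they need decoding before you can read them. Here is a minimal sketch of that decoding step using only the JDK. The class name `TokenDecoder` and the sample values are made up for illustration; they are not actual output from the computeTokens endpoint, and the HTTP call itself is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;

final class TokenDecoder {

    // Decodes one Base64-encoded token into UTF-8 text.
    // Caveat: a token could hold only part of a multi-byte character,
    // in which case the decoded string contains replacement characters.
    static String decode(String base64Token) {
        byte[] bytes = Base64.getDecoder().decode(base64Token);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample tokens, hand-encoded for this example.
        List<String> tokens = List.of("SGVsbG8=", "IHdvcmxk");
        for (String token : tokens) {
            // Brackets make leading spaces inside a token visible.
            System.out.println("[" + decode(token) + "]");
        }
    }
}
```

Printing each token between brackets makes it easy to see where a token carries a leading space, which is often how word boundaries show up in a tokenizer's output.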
This week LangChain4j, the LLM orchestration framework for Java developers, released version
0.26.1, which contains my first significant contribution to the open source project:
support for the Imagen image generation model.
Imagen is a text-to-image diffusion model that was announced last year.
And it was recently upgraded to Imagen v2, with even higher quality image generation.
As I was curious to integrate it in some of my generative AI projects, I thought it would be a great first
contribution to LangChain4j.
My go-to framework when developing Java apps or microservices is Micronaut.
For the apps that should have a web frontend, I rarely use Micronaut Views and its templating support.
Instead, I prefer to just serve static assets from my resource folder,
and have some JavaScript framework (usually Vue.js) populate my HTML content
(often using Shoelace for its nice Web Components).
However, the static asset documentation
is a bit light on explanations.
So, since I always forget how to configure Micronaut to serve static assets,
I thought it would be useful to document it here.
In Java, builders are a pretty classical pattern for creating complex objects with lots of attributes.
A nice aspect of builders is that they help reduce the number of constructors you need to create,
in particular when not all attributes are required to be set (or if they have default values).
However, I’ve always found builders a bit verbose with their newBuilder() / build() method combos,
especially when you work with deeply nested object graphs, leading to lines of code of builders of builders of…
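To make the verbosity concrete, here is a small sketch of the classical pattern with one level of nesting. The `Person` and `Address` classes are invented for this example and don't come from any particular library.

```java
final class BuilderDemo {

    // Immutable value object with its own builder.
    static final class Address {
        final String city;
        final String country;

        private Address(Builder b) {
            this.city = b.city;
            this.country = b.country;
        }

        static Builder newBuilder() { return new Builder(); }

        static final class Builder {
            // Default values, so callers only set what they need.
            private String city = "unknown";
            private String country = "unknown";

            Builder city(String city) { this.city = city; return this; }
            Builder country(String country) { this.country = country; return this; }
            Address build() { return new Address(this); }
        }
    }

    // A second object that nests the first one.
    static final class Person {
        final String name;
        final Address address;

        private Person(Builder b) {
            this.name = b.name;
            this.address = b.address;
        }

        static Builder newBuilder() { return new Builder(); }

        static final class Builder {
            private String name = "anonymous";
            private Address address = Address.newBuilder().build();

            Builder name(String name) { this.name = name; return this; }
            Builder address(Address address) { this.address = address; return this; }
            Person build() { return new Person(this); }
        }
    }

    // Builders of builders: every nested object needs its own newBuilder() / build() pair.
    static Person sample() {
        return Person.newBuilder()
                .name("Ada")
                .address(Address.newBuilder()
                        .city("Paris")
                        .country("France")
                        .build())
                .build();
    }

    public static void main(String[] args) {
        Person p = sample();
        System.out.println(p.name + " lives in " + p.address.city + ", " + p.address.country);
    }
}
```

With only two levels the call site is still readable, but each additional level adds another `newBuilder()` / `build()` pair and one more step of indentation.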
In this article, we’ll figure out how to create slugs.
Not the slobbery kind of little gastropods that crawl on the ground. Instead,
we’ll see how to create the short hyphenated text you can see in the URL of your web browser,
which is often a URL-friendly variation of the title of the article.
Interestingly, one of the most popular posts on my blog is an almost 20-year-old article that explains
how to remove accents from a string.
And indeed, in slugs you would like to remove accents, among other things.
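As a rough illustration, one common way to do this with the JDK alone is to decompose accented characters with `java.text.Normalizer`, drop the combining marks, and then replace everything that isn't a letter or digit with hyphens. The `Slugs` class and `slugify` method names are made up for this sketch, and it only handles Latin text; it is not necessarily the approach the full article ends up with.

```java
import java.text.Normalizer;
import java.util.Locale;

final class Slugs {

    static String slugify(String title) {
        // NFD splits an accented letter into its base letter plus combining marks.
        String decomposed = Normalizer.normalize(title, Normalizer.Form.NFD);
        // \p{M} matches the combining marks (the accents), which we drop.
        String withoutAccents = decomposed.replaceAll("\\p{M}+", "");
        return withoutAccents
                .toLowerCase(Locale.ROOT)
                // Any run of characters other than a-z and 0-9 becomes one hyphen.
                .replaceAll("[^a-z0-9]+", "-")
                // Trim hyphens left at the start or the end.
                .replaceAll("^-+|-+$", "");
    }

    public static void main(String[] args) {
        // Unicode escapes spell "Crème brûlée à la carte!" regardless of source encoding.
        String title = "Cr\u00e8me br\u00fbl\u00e9e \u00e0 la carte!";
        System.out.println(slugify(title)); // prints creme-brulee-a-la-carte
    }
}
```

Using `Locale.ROOT` for lowercasing avoids locale-specific surprises, such as the Turkish dotless i.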
A promising feature of the Gemini large language model released recently by Google DeepMind
is the support for function calls.
It’s a way to supplement the model, by letting it know that external functions or APIs can be called.
So you’re not limited by the knowledge cut-off of the model: instead, in the flow of the conversation with the model,
you can pass a list of functions the model will know are available to get the information it needs,
to complete the generation of its answer.
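The application side of that flow can be sketched in plain Java. This is not the Gemini SDK: the `FunctionCall` type, the `getCurrentWeather` function, and the canned JSON reply are all hypothetical stand-ins, and the model's request is simulated. The idea is only to show the dispatch step: the model asks for a function by name with arguments, the application runs it, and the result is sent back so the model can finish its answer.

```java
import java.util.Map;
import java.util.function.Function;

final class FunctionCallingSketch {

    // What a model might send back instead of text: a function name plus named arguments.
    static final class FunctionCall {
        final String name;
        final Map<String, String> args;

        FunctionCall(String name, Map<String, String> args) {
            this.name = name;
            this.args = args;
        }
    }

    // Functions the application declares as available (hypothetical example with a canned reply).
    static final Map<String, Function<Map<String, String>, String>> FUNCTIONS = Map.of(
            "getCurrentWeather",
            args -> "{\"location\":\"" + args.get("location") + "\",\"forecast\":\"sunny\"}");

    // Runs the requested function and returns its result, to be passed back to the model.
    static String dispatch(FunctionCall call) {
        Function<Map<String, String>, String> fn = FUNCTIONS.get(call.name);
        if (fn == null) {
            throw new IllegalArgumentException("Unknown function: " + call.name);
        }
        return fn.apply(call.args);
    }

    public static void main(String[] args) {
        // Simulated model request; a real one would come from the model's response.
        FunctionCall request = new FunctionCall("getCurrentWeather", Map.of("location", "Paris"));
        System.out.println(dispatch(request));
    }
}
```

Rejecting unknown function names explicitly matters here, since the name comes from generated output and shouldn't be trusted blindly.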
Hot on the heels of the release of Gemini,
I’d like to share a couple of resources I created to get your hands on large language models,
using LangChain4j, and the PaLM 2 model.
Later on, I’ll also share with you articles and codelabs that take advantage of Gemini, of course.
The PaLM 2 model supports 2 modes:
text generation,
and chat.
For both codelabs, you’ll need a Google Cloud account and a project.
The codelabs will guide you through the steps to set up the environment,
and show you how to use the Google Cloud built-in shell and code editor to develop in the cloud.