Guillaume Laforge

Gemini

Gemini Nano running locally in your browser

Generative AI use cases are usually about running large language models somewhere in the cloud. However, with the advent of smaller models and open models, you can run them locally on your machine, with projects like llama.cpp or Ollama.

And what about in the browser? With MediaPipe and TensorFlow.js, you can train and run small neural networks for tons of fun and useful tasks (like recognising hand movements through the webcam of your computer), and it’s also possible to run Gemma 2B and even 7B models.

Read more...

Analyzing video, audio and PDF files with Gemini and LangChain4j

Certain models like Gemini are multimodal. This means that they accept more than just text as input. Some models support text and images, but Gemini goes further and also supports audio, video, and PDF files. So you can mix and match text prompts and different multimedia files or PDF documents.

Until LangChain4j 0.32, models could only accept text and images, but since my PR was merged into the newly released 0.33 version, you can use all those file types with the LangChain4j Gemini module!
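To give an idea of what happens under the hood, here's a minimal sketch (plain JDK, no LangChain4j dependency) of how a media file is typically packaged for a multimodal request: the raw bytes are base64-encoded and paired with their MIME type, in the spirit of Gemini's inline-data request parts. The JSON field names follow the Gemini REST convention, but treat this as an illustration rather than the module's actual wire format.

```java
import java.util.Base64;

public class InlineMediaPart {

    // Build an inline-data part: base64-encoded bytes plus a MIME type.
    static String inlinePart(String mimeType, byte[] data) {
        return "{\"inlineData\":{\"mimeType\":\"" + mimeType
                + "\",\"data\":\"" + Base64.getEncoder().encodeToString(data) + "\"}}";
    }

    public static void main(String[] args) {
        // In a real application, the bytes would come from an audio, video, or PDF file.
        System.out.println(inlinePart("application/pdf", "demo".getBytes()));
    }
}
```

The same shape works for any of the supported media types, only the MIME type changes (for example `audio/mp3` or `video/mp4`).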

Read more...

Latest Gemini features support in LangChain4j 0.32.0

LangChain4j 0.32.0 was released yesterday, including my pull request adding support for many new Gemini features:

  • JSON output mode, to force Gemini to reply using JSON, without any markup,
  • JSON schema, to control and constrain the JSON output to comply with a schema,
  • Response grounding with Google Search web results and with private data in Vertex AI datastores,
  • Easier debugging, thanks to new builder methods to log requests and responses,
  • Function calling mode (none, automatic, or a subset of functions),
  • Safety settings to catch harmful prompts and responses.

Let’s explore those new features together, thanks to some code examples! And at the end of the article, if you make it through, you’ll also discover 2 extra bonus points.
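As a quick taste, configuring JSON output mode and request/response logging might look like the following. This is a sketch: the builder method names are assumed from the features listed above, so double-check them against the actual LangChain4j 0.32.0 Javadoc, and the project ID, location, and model name are placeholders.

```java
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;

// Sketch of the builder-based configuration (method names assumed,
// placeholders for project, location, and model name):
VertexAiGeminiChatModel model = VertexAiGeminiChatModel.builder()
    .project("my-project-id")
    .location("us-central1")
    .modelName("gemini-1.5-flash")
    .responseMimeType("application/json") // JSON output mode
    .logRequests(true)                    // easier debugging
    .logResponses(true)
    .build();
```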

Read more...

Let's make Gemini Groovy!

The happy users of Gemini Advanced, the powerful AI web assistant powered by the Gemini model, can execute some Python code, thanks to a built-in Python interpreter. So, for math, logic, or calculation questions, the assistant can have Gemini write a Python script and execute it, giving users a more accurate answer to their queries.

But with my Apache Groovy hat on, I wondered if I could get Gemini to invoke some Groovy scripts as well, for advanced math questions!

Read more...

Gemini codelab for Java developers using LangChain4j

No need to be a Python developer to do Generative AI! If you’re a Java developer, you can take advantage of LangChain4j to implement some advanced LLM integrations in your Java applications. And if you’re interested in using Gemini, one of the best models available, I invite you to have a look at the following “codelab” that I worked on:

Codelab — Gemini for Java Developers using LangChain4j

In this workshop, you’ll find various examples covering the following use cases, in a crescendo of complexity:

Read more...

Gemini Function Calling

A promising feature of the Gemini large language model, recently released by Google DeepMind, is its support for function calling. It’s a way to supplement the model, by letting it know that external functions or APIs can be called. So you’re not limited by the knowledge cutoff of the model: instead, in the flow of the conversation, you can pass a list of functions the model knows are available, so it can request the information it needs to complete the generation of its answer.
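The flow can be sketched in plain Java. The type names below are purely illustrative, not the actual Gemini SDK API: instead of answering directly, the model returns a request to call one of the declared functions; the application executes it and hands the result back so the model can finish its answer.

```java
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch of the function-calling loop. FunctionCall and the
// tool registry are hypothetical names, not real Gemini SDK types.
public class FunctionCallingSketch {

    record FunctionCall(String name, Map<String, String> args) {}

    // The functions the model has been told about, keyed by name.
    static final Map<String, Function<Map<String, String>, String>> TOOLS = Map.of(
            "getWeather", args -> "sunny in " + args.get("location"));

    // The app executes the call the model asked for.
    static String execute(FunctionCall call) {
        return TOOLS.get(call.name()).apply(call.args());
    }

    public static void main(String[] args) {
        // Pretend the model replied with a call to getWeather(location=Paris):
        String toolResult = execute(new FunctionCall("getWeather", Map.of("location", "Paris")));
        // In a real conversation, toolResult would be sent back to the model
        // so it can generate the final, grounded answer.
        System.out.println(toolResult);
    }
}
```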

Read more...

Hands on Codelabs to dabble with Large Language Models in Java

Hot on the heels of the release of Gemini, I’d like to share a couple of resources I created to get your hands on large language models, using LangChain4j and the PaLM 2 model. Later on, I’ll also share articles and codelabs that take advantage of Gemini, of course.

The PaLM 2 model supports two modes:

  • text generation,
  • and chat.

For both codelabs, you’ll need to have a Google Cloud account and a project created. The codelabs will guide you through the steps to set up the environment, and show you how to use the Google Cloud built-in shell and code editor to develop in the cloud.

Read more...

Get Started with Gemini in Java

Google announced today the availability of Gemini, its latest and most powerful large language model. Gemini is multimodal, which means it’s able to consume not only text, but also images and videos.

I had the pleasure of working on the Java samples and helping with the Java SDK, alongside wonderful engineer colleagues, and I’d like to share some examples of what you can do with Gemini, using Java!

First of all, you’ll need a Google Cloud account and a project. The Vertex AI API must be enabled in order to access the Generative AI services, and in particular the Gemini large language model. Be sure to check out the instructions.
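If you use the gcloud CLI, enabling the API typically looks like this (the project ID is a placeholder):

```shell
# Enable the Vertex AI API on your project (placeholder project ID)
gcloud services enable aiplatform.googleapis.com --project=your-project-id
```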

Read more...