Gemini Nano running locally in your browser

📅 August 7, 2024 — by Guillaume Laforge

Generative AI use cases are usually about running large language models somewhere in the cloud. However, with the advent of smaller models and open models, you can run them locally on your machine, with projects like llama.cpp or Ollama.

And what about in the browser? With MediaPipe and TensorFlow.js, you can train and run small neural networks for tons of fun and useful tasks (like recognising hand movements through the webcam of your computer), and it’s also possible to run Gemma 2B and even 7B models.

Sentiment analysis with few-shot prompting

📅 July 30, 2024 — by Guillaume Laforge

generative-ai langchain4j java google-cloud large-language-model

In a recent article, we talked about text classification using Gemini and LangChain4j.

A typical example of text classification is the case of sentiment analysis.

In my LangChain4j-powered Gemini workshop, I used this use case to illustrate the classification problem:

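import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.input.Prompt;
import dev.langchain4j.model.input.PromptTemplate;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.model.vertexai.VertexAiGeminiChatModel;
import java.util.Map;

ChatLanguageModel model = VertexAiGeminiChatModel.builder()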
    .project(System.getenv("PROJECT_ID"))
    .location(System.getenv("LOCATION"))
    .modelName("gemini-1.5-flash-001")
    .maxOutputTokens(10)
    .maxRetries(3)
    .build();

PromptTemplate promptTemplate = PromptTemplate.from("""
    Analyze the sentiment of the text below.
    Respond only with one word to describe the sentiment.

    INPUT: This is fantastic news!
    OUTPUT: POSITIVE

    INPUT: Pi is roughly equal to 3.14
    OUTPUT: NEUTRAL

    INPUT: I really disliked the pizza. Who would use pineapples as a pizza topping?
    OUTPUT: NEGATIVE

    INPUT: {{text}}
    OUTPUT:
    """);

Prompt prompt = promptTemplate.apply(
    Map.of("text", "I love strawberries!"));

Response<AiMessage> response = model.generate(prompt.toUserMessage());

System.out.println(response.content().text());

I used a PromptTemplate to craft the prompt, with a {{text}} placeholder that gets replaced by the text whose sentiment we want to analyze.
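
To make the mechanics of the few-shot template more concrete, here is a minimal sketch in plain Java, without LangChain4j, of how such a prompt could be assembled from labelled examples and how a {{text}} placeholder could be filled in. The class FewShotPromptSketch, the Example record, and the buildTemplate and apply helpers are hypothetical names introduced for illustration. This is not how PromptTemplate is implemented internally, just a simple string substitution that shows the idea.

```java
import java.util.List;
import java.util.Map;

public class FewShotPromptSketch {

    // One labelled example shown to the model before the real input (hypothetical type).
    record Example(String input, String label) {}

    // Builds the few-shot template: instructions, labelled examples, then a {{text}} placeholder.
    static String buildTemplate(List<Example> examples) {
        StringBuilder sb = new StringBuilder();
        sb.append("Analyze the sentiment of the text below.\n");
        sb.append("Respond only with one word to describe the sentiment.\n\n");
        for (Example e : examples) {
            sb.append("INPUT: ").append(e.input()).append('\n');
            sb.append("OUTPUT: ").append(e.label()).append("\n\n");
        }
        sb.append("INPUT: {{text}}\n");
        sb.append("OUTPUT:");
        return sb.toString();
    }

    // Replaces each {{name}} placeholder with its value, using plain string substitution.
    static String apply(String template, Map<String, String> variables) {
        String result = template;
        for (Map.Entry<String, String> v : variables.entrySet()) {
            result = result.replace("{{" + v.getKey() + "}}", v.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Example> examples = List.of(
            new Example("This is fantastic news!", "POSITIVE"),
            new Example("Pi is roughly equal to 3.14", "NEUTRAL"),
            new Example("I really disliked the pizza. Who would use pineapples as a pizza topping?", "NEGATIVE"));

        String prompt = apply(buildTemplate(examples), Map.of("text", "I love strawberries!"));
        // Prints the full prompt that would be sent to the model.
        System.out.println(prompt);
    }
}
```

Keeping the examples in a list rather than hard-coding them in the template makes it easy to add or swap labelled examples without touching the instructions.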

Analyzing video, audio and PDF files with Gemini and LangChain4j

📅 July 25, 2024 — by Guillaume Laforge

generative-ai langchain4j java google-cloud large-language-model

Certain models like Gemini are multimodal. This means that they accept more than just text as input. Some models support text and images, but Gemini goes further and also supports audio, video, and PDF files. So you can mix and match text prompts and different multimedia files or PDF documents.

Until LangChain4j 0.32, the models could only support text and images, but since my PR got merged into the newly released 0.33 version, you can use all those files with the LangChain4j Gemini module!

Text classification with Gemini and LangChain4j

📅 July 11, 2024 — by Guillaume Laforge

generative-ai langchain4j java google-cloud large-language-model

Generative AI has potential applications far beyond chatbots and Retrieval Augmented Generation. For example, text classification is a nice use case.

I had the chance to meet some customers and prospects who needed to triage incoming requests, or to label existing data. In the first case, a government entity was tasked with routing citizen requests to access undisclosed information to the right governmental service that could grant or reject that access. In the second case, a company needed to sort out tons of existing internal documents that were not properly organized, and they wanted to quickly start structuring this trove of information better, by labelling each of these docs with a category.

Latest Gemini features support in LangChain4j 0.32.0

📅 July 5, 2024 — by Guillaume Laforge

generative-ai langchain4j java google-cloud large-language-model

LangChain4j 0.32.0 was released yesterday, including my pull request with the support for lots of new Gemini features:

JSON output mode, to force Gemini to reply using JSON, without any markup,
JSON schema, to control and constrain the JSON output to comply with a schema,
Response grounding with Google Search web results and with private data in Vertex AI datastores,
Easier debugging, thanks to new builder methods to log requests and responses,
Function calling mode (none, automatic, or a subset of functions),
Safety settings to catch harmful prompts and responses.

Let’s explore those new features together, thanks to some code examples! And at the end of the article, if you make it through, you’ll also discover 2 extra bonus points.

The power of embeddings: How numbers unlock the meaning of data

📅 July 2, 2024 — by Guillaume Laforge

generative-ai machine-learning

Prelude

As I’m focusing a lot on Generative AI, I’m curious about how things work under the hood, to better understand what I’m using in my gen-ai powered projects. A topic I’d like to focus on more is: vector embeddings, to explain more clearly what they are, how they are calculated, and what you can do with them.
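
As a small illustration of what you can do with vector embeddings, here is a minimal sketch that compares vectors with cosine similarity, a common way to measure how close two embeddings are. The three-dimensional vectors below are made-up toy values, not the output of a real embedding model, and the class and method names are hypothetical.

```java
public class CosineSimilaritySketch {

    // Cosine similarity = dot(a, b) / (|a| * |b|).
    // Values close to 1.0 mean the vectors point in nearly the same direction.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy "embeddings" invented for illustration; real ones have hundreds of dimensions.
        double[] cat = {0.9, 0.1, 0.0};
        double[] kitten = {0.8, 0.2, 0.1};
        double[] car = {0.0, 0.2, 0.9};

        System.out.println("cat vs kitten: " + cosineSimilarity(cat, kitten));
        System.out.println("cat vs car:    " + cosineSimilarity(cat, car));
    }
}
```

With these toy values, the cat and kitten vectors score much higher than cat and car, which is the property that semantic search and clustering rely on.
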
A colleague of mine, André, was showing me a cool experiment he’s been working on: using an AI to help people prepare an interview, and to shape the structure of the final article to write.

Functional builders in Java with Jilt

📅 June 17, 2024 — by Guillaume Laforge

java golang design-pattern

A few months ago, I shared an article about what I called Java functional builders, inspired by an equivalent pattern found in Go. The main idea was to have builders that looked like this example:

LanguageModel languageModel = new LanguageModel(
    name("cool-model"),
    project("my-project"),
    temperature(0.5),
    description("This is a generative model")
);
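
For readers wondering how a constructor call like the one above can work, here is a minimal sketch of one possible implementation. It assumes each option is a Consumer that sets a field on the instance being built. The class name FunctionalBuilderSketch and the accessor methods are hypothetical, and this is not necessarily how the original article or Jilt implements the pattern.

```java
import java.util.function.Consumer;

public class FunctionalBuilderSketch {

    static class LanguageModel {
        private String name;
        private String project;
        private double temperature;
        private String description;

        // Each option is a function that configures the instance under construction.
        @SafeVarargs
        LanguageModel(Consumer<LanguageModel>... options) {
            for (Consumer<LanguageModel> option : options) {
                option.accept(this);
            }
        }

        String name() { return name; }
        String project() { return project; }
        double temperature() { return temperature; }
        String description() { return description; }
    }

    // Static factory methods returning options; the enclosing class may set the private fields.
    static Consumer<LanguageModel> name(String name) {
        return model -> model.name = name;
    }

    static Consumer<LanguageModel> project(String project) {
        return model -> model.project = project;
    }

    static Consumer<LanguageModel> temperature(double temperature) {
        return model -> model.temperature = temperature;
    }

    static Consumer<LanguageModel> description(String description) {
        return model -> model.description = description;
    }

    public static void main(String[] args) {
        LanguageModel languageModel = new LanguageModel(
            name("cool-model"),
            project("my-project"),
            temperature(0.5),
            description("This is a generative model")
        );
        System.out.println(languageModel.name() + " / " + languageModel.project());
    }
}
```

In this sketch, options that are not passed simply leave the corresponding field at its default value, so a real implementation would likely add validation for mandatory parameters.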

Compared to the more traditional builder approach:

You’re using the new keyword again to construct instances.
There’s no more build() method, which felt a bit verbose.

Compared to using constructors with tons of parameters:

Let's make Gemini Groovy!

📅 June 3, 2024 — by Guillaume Laforge

groovy google-cloud generative-ai large-language-models java langchain4j

The happy users of Gemini Advanced, the powerful AI web assistant powered by the Gemini model, can execute some Python code, thanks to a built-in Python interpreter. So, for math, logic, calculation questions, the assistant can let Gemini invent a Python script, and execute it, to let users get a more accurate answer to their queries.

But with my Apache Groovy hat on, I wondered if I could get Gemini to invoke some Groovy scripts as well, for advanced math questions!

Grounding Gemini with Web Search results in LangChain4j

📅 May 28, 2024 — by Guillaume Laforge

google-cloud generative-ai large-language-models java langchain4j

The latest release of LangChain4j (version 0.31) added the capability of grounding large language models with results from web searches. There’s an integration with Google Custom Search Engine, and also Tavily.

Grounding an LLM’s response with the results from a search engine allows the LLM to find relevant information about the query from web searches, which will likely include up-to-date information that the model didn’t see during its training because it was published after the training cut-off date.

Calling Gemma with Ollama, TestContainers, and LangChain4j

📅 April 3, 2024 — by Guillaume Laforge

google-cloud generative-ai large-language-models java containers langchain4j

Lately, for my Generative AI powered Java apps, I’ve used the Gemini multimodal large language model from Google. But there’s also Gemma, its little sister model.

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Gemma is available in two sizes: 2B and 7B. Its weights are freely available, and its small size means you can run it on your own, even on your laptop. So I was curious to give it a run with LangChain4j.
