❯ Guillaume Laforge

Tech Watch #2 - Oct 06, 2023

  • Generative AI exists because of the transformer
    I confess I rarely read the Financial Times, but they have a really neat article, with animations, on how large language models work thanks to the transformer neural network architecture, which Google invented in 2017. They talk about text vector embeddings and how self-attention helps LLMs understand the relationships between words and their surrounding context. They also don’t forget to mention hallucinations, and how “grounding” and RLHF (Reinforcement Learning from Human Feedback) can help mitigate them to some extent.

    Read more...

Client-side consumption of a rate-limited API in Java

In the literature, you’ll easily find information on how to rate-limit your API. I even talked about Web API rate limitation years ago at a conference, covering the usage of HTTP headers like X-RateLimit-*.

Rate limiting is important to help your service cope with too much load, or to implement a tiered pricing scheme (the more you pay, the more requests you’re allowed to make in a certain amount of time). There are useful libraries like Resilience4j that you can configure for Micronaut web controllers, or Bucket4j for your Spring controllers.
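
To make the idea of "a number of requests allowed in a certain amount of time" a bit more concrete, here is a minimal token bucket sketch in plain Java. It is only an illustration: the `SimpleTokenBucket` class, its constructor parameters, and the injected clock are hypothetical names chosen for this example, and it is not the API of Resilience4j or Bucket4j.

```java
import java.util.function.LongSupplier;

/** Hypothetical minimal token bucket (illustration only, not a library API). */
class SimpleTokenBucket {
    private final long capacity;          // maximum number of tokens the bucket can hold
    private final long refillPeriodNanos; // time needed to earn back one token
    private final LongSupplier clock;     // injected time source, in nanoseconds
    private long tokens;
    private long lastRefill;

    SimpleTokenBucket(long capacity, long refillPeriodNanos, LongSupplier clock) {
        if (capacity <= 0 || refillPeriodNanos <= 0) {
            throw new IllegalArgumentException("capacity and refill period must be positive");
        }
        this.capacity = capacity;
        this.refillPeriodNanos = refillPeriodNanos;
        this.clock = clock;
        this.tokens = capacity;           // start with a full bucket
        this.lastRefill = clock.getAsLong();
    }

    /** Consumes one token if available; returns false when the caller should back off. */
    synchronized boolean tryAcquire() {
        refill();
        if (tokens == 0) {
            return false;
        }
        tokens--;
        return true;
    }

    /** Adds the tokens earned since the last refill, never exceeding the capacity. */
    private void refill() {
        long now = clock.getAsLong();
        long earned = (now - lastRefill) / refillPeriodNanos;
        if (earned > 0) {
            tokens = Math.min(capacity, tokens + earned);
            lastRefill += earned * refillPeriodNanos;
        }
    }

    public static void main(String[] args) {
        // Assumed policy for the demo: bursts of 2 requests, one new token per second.
        SimpleTokenBucket bucket = new SimpleTokenBucket(2, 1_000_000_000L, System::nanoTime);
        for (int i = 1; i <= 3; i++) {
            System.out.println("request " + i + (bucket.tryAcquire() ? " allowed" : " rejected"));
        }
    }
}
```

The clock is passed in rather than read directly so that the refill logic can be exercised with a fake time source. A server could use such a bucket to reject excess calls, and a client could use the same logic to pace its own requests.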

Read more...

Tech Watch #1 - Sept 29, 2023

Inspired by my super boss Richard Seroter and his regular daily reading list, I decided to record and share my tech watch every week (or so). I always take notes on interesting articles I read, for my own curiosity and to remember them when I need those references later on, but also to share them with the Les Cast Codeurs podcast! So I hope it’ll be interesting to my readers too!

  • LLMs Demand Observability-Driven Development
    A great opinion piece from Charity Majors on the importance of observability-driven development, in the wake of large language models. Developing LLM-based solutions is typically not something you can do with a classical test-driven approach, as you only really get proper test data once it comes from production usage. Furthermore, LLMs are pretty much unpredictable and nondeterministic. But with observability in place, you can better understand why there’s latency in some scenarios and why the LLM came to certain solutions, and this will help you improve as you learn along the way.

    Read more...

Discovering LangChain4J, the Generative AI orchestration library for Java developers

As I started my journey with Generative AI and Large Language Models, I’ve been overwhelmed by the omnipresence of Python. Tons of resources are available with Python front and center. However, I’m a Java developer (with a penchant for Apache Groovy, of course). So what is there for me to create cool new Generative AI projects?

When I built my first experiment with the PaLM API, using the integration within Google Cloud’s Vertex AI offering, I called the available REST API from my Micronaut application. I used Micronaut’s built-in mechanism to marshal / unmarshal the REST API constructs to proper classes. Pretty straightforward.

Read more...

Custom Environment Variables in Workflows

In addition to the built-in environment variables available by default in Google Cloud Workflows (like the project ID, the location, the workflow ID, etc.) it’s now possible to define your own custom environment variables!

Why is it useful and important? It’s particularly handy when you want to read information that is dependent on the deployment of your workflow, like, for example, information about the environment you’re running in. Is my workflow running in development, staging, or production environment? Then you can read your custom MY_ENVIRONMENT variable, like you read the existing built-in environment variables. And you define such variables at deployment time.

Read more...

Creating kids stories with Generative AI

Last week, I wrote about how to get started with the PaLM API in the Java ecosystem, and particularly, how to overcome the lack of Java client libraries (at least for now) for the PaLM API, and how to properly authenticate. However, what I didn’t explain was what I was building! Let’s fix that today, by telling you a story, a kids’ story! Yes, I was using the trendy Generative AI approach to generate bedtime stories for kids.

Read more...

Just a handy command-line tool

When developing new projects on my laptop, I often run some commands over and over again. Regardless of how far you’ve gone with your CI/CD pipelines, running commands locally without resorting to becoming a bash ninja can be pretty easy with… just!

just is a handy way to save and run project-specific commands

It’s a command-line tool that lets you define some commands to run (called recipes), in a Makefile-inspired syntax. It even allows you to define dependencies between the various tasks of your justfile. It runs across all environments (Mac, Linux, Windows), and is quick to install. It loads .env files in which you can define variables specific to your project (other developers can have the same justfile but have variables specific to their projects).

Read more...

Getting started with the PaLM API in the Java ecosystem

Large Language Models (LLMs for short) are taking the world by storm, and things like ChatGPT have become very popular and are used by millions of users daily. Google came up with its own chatbot called Bard, which is powered by its ground-breaking PaLM 2 model and API. You can also find and use the PaLM API from within Google Cloud (as part of the Vertex AI Generative AI products) and thus create your own applications based on that API. However, if you look at the documentation, you’ll only find Python tutorials or notebooks, or explanations on how to make cURL calls to the API. But since I’m a Java (and Groovy) developer at heart, I was interested in seeing how to do this from the Java world.

Read more...

Exploring Open Location Code

When using Google Maps, you might have seen those strange little codes, as in the screenshot above. This is a plus code, or to use the more official name, an Open Location Code. It’s a way to encode a location in a short and (somewhat) memorable form.

In countries like France, every house has an official address, so you can easily receive letters or get some parcel delivered. But there are countries where no such location system exists, so you have to resort to describing where you live (take this road, turn right after the red house, etc.)
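
To give a feel for how a location can be turned into such a short code, here is a simplified sketch of the standard 10-digit encoding. It assumes the usual Open Location Code alphabet of 20 characters and a grid refined in five latitude/longitude digit pairs; the `PlusCodeSketch` class and its `encode` method are hypothetical names for this illustration, and the sketch leaves out short codes, padding, and extra precision digits.

```java
/** Hypothetical helper: simplified 10-digit Open Location Code (plus code) encoder. */
class PlusCodeSketch {
    // The 20 characters used by plus codes (no vowels, no easily confused characters).
    private static final String ALPHABET = "23456789CFGHJMPQRVWX";
    private static final int PAIRS = 5;                // 5 latitude/longitude digit pairs
    private static final long STEPS_PER_DEGREE = 8000; // finest cell is 1/8000 of a degree

    static String encode(double latitude, double longitude) {
        // Shift latitude to [0, 180) and count how many finest-grid steps it spans.
        long lat = (long) Math.floor((latitude + 90.0) * STEPS_PER_DEGREE);
        lat = Math.max(0, Math.min(lat, 180 * STEPS_PER_DEGREE - 1));

        // Normalize longitude to [0, 360) before converting it to grid steps.
        double normalizedLng = ((longitude + 180.0) % 360.0 + 360.0) % 360.0;
        long lng = (long) Math.floor(normalizedLng * STEPS_PER_DEGREE);

        // Extract base-20 digits from least to most significant, interleaving lat and lng.
        char[] digits = new char[PAIRS * 2];
        for (int pair = PAIRS - 1; pair >= 0; pair--) {
            digits[pair * 2] = ALPHABET.charAt((int) (lat % 20));
            digits[pair * 2 + 1] = ALPHABET.charAt((int) (lng % 20));
            lat /= 20;
            lng /= 20;
        }

        // The '+' separator always comes after the eighth digit.
        String code = new String(digits);
        return code.substring(0, 8) + "+" + code.substring(8);
    }

    public static void main(String[] args) {
        System.out.println(encode(47.365590, 8.524997)); // prints 8FVC9G8F+6X
    }
}
```

Each pair of characters narrows the area down by a factor of 20 in both directions, which is why nearby places share the same leading characters.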

Read more...

cURL's --json flag

As cURL was celebrating its 25th birthday, I was reading Daniel Stenberg’s story behind the project, and discovered a neat little feature I hadn’t heard of before: the --json flag! Daniel even blogged about it when it landed in cURL 7.82.0 last year.

So what’s so cool about it? If you’re like me, you’re used to posting some JSON data with the following verbose approach:

curl --data '{"msg": "hello"}' \
    --header "Content-Type: application/json" \
    --header "Accept: application/json" \
    https://example.com

You have to pass the data, and also pass headers to specify the content-type. You can make it slightly shorter with the one-letter flags:

Read more...