Guillaume Laforge

A retryable JUnit 5 extension for flaky tests

As I work a lot with Large Language Models (LLMs), I often have to deal with flaky test cases, because LLMs are not always consistent and deterministic in their responses. A test may pass a few times in a row and then, once in a while, fail.

Tweaking the prompt, lowering the temperature, or using techniques like few-shot prompting may help the model better understand what it has to do, and make the test pass more consistently. But in some circumstances, you can’t find a way around those weird failures, and the sole solution I found was to make the test retryable.

If a test fails, let’s retry a few more times (2 or 3 times) until it passes. But if it fails every time in spite of the retries, then it’ll just fail as expected.

I wrote JUnit Rules in the past for such situations, but that was in the JUnit 4 days. Now, I’m using JUnit 5, and although it’s possible to make JUnit 4 tests run under JUnit 5, I thought it was a great opportunity to try creating a JUnit 5 extension, which is the more powerful mechanism that replaces JUnit 4 rules.

It all starts with a failing test case

Let’s say you have a hypothetical flaky test that fails a few times in a row:

    private static int count = 1;
    @Test
    void test_custom_junit_retry_extension() {
        assertThat(count++).isEqualTo(4);
    }

The first 3 executions fail with an assertion error, but the 4th succeeds, as the counter is then equal to 4.

I’d like to annotate this test method with a custom annotation that indicates the number of times I’m ready to retry that test:

    private static int count = 1;
    @Test
    @ExtendWith(RetryExtension.class)
    @Retry(4)
    void test_custom_junit_retry_extension() {
        assertThat(count++).isEqualTo(4);
    }

This @ExtendWith() annotation indicates that I’m registering a JUnit 5 extension. And @Retry(4) is a custom annotation that I’ve created.

Note that @ExtendWith() can be at the class-level, but it can also live at the method level.

Let’s have a look at the @Retry annotation:

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

@Retention(RetentionPolicy.RUNTIME)
public @interface Retry {
    int value() default 3;
}

By default, I attempt the test 3 times, if no number is provided for the annotation value.
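
To see how the annotation value can be read back at runtime, here’s a minimal plain-Java sketch using reflection. The RetryAnnotationDemo class, its three methods, and the maxExecutions() helper are hypothetical names I made up for the illustration, and the annotation is redeclared as a nested type only so that the snippet compiles on its own. Note that the RUNTIME retention matters: without it, getAnnotation() would return null.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Hypothetical demo class, not part of the extension itself
class RetryAnnotationDemo {

    // Same shape as the @Retry annotation above, nested here
    // so that the sketch is self-contained
    @Retention(RetentionPolicy.RUNTIME)
    @interface Retry {
        int value() default 3;
    }

    @Retry
    void usesDefault() {}

    @Retry(4)
    void usesExplicitValue() {}

    void notAnnotated() {}

    // Number of executions allowed for a method: the annotation value,
    // or a single execution when the method carries no @Retry
    static int maxExecutions(String methodName) {
        try {
            Method method =
                RetryAnnotationDemo.class.getDeclaredMethod(methodName);
            Retry retry = method.getAnnotation(Retry.class);
            return retry != null ? retry.value() : 1;
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(
                "No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(maxExecutions("usesDefault"));       // 3
        System.out.println(maxExecutions("usesExplicitValue")); // 4
        System.out.println(maxExecutions("notAnnotated"));      // 1
    }
}
```

The extension does the same lookup on the failing test method to decide how many executions are allowed.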

Now it’s time to see how the extension code works:

import org.junit.jupiter.api.extension.ExtensionContext;
import org.junit.jupiter.api.extension.TestExecutionExceptionHandler;

import java.lang.reflect.Method;

public class RetryExtension implements TestExecutionExceptionHandler {

    private void printError(int attempt, Throwable e) {
        System.err.println(
            "Attempt test execution #" + attempt +
            " failed (" + e.getClass().getName() +
            " thrown): " + e.getMessage());
    }

    @Override
    public void handleTestExecutionException(
        ExtensionContext extensionContext, Throwable throwable)
        throws Throwable {

        // Local counter: an extension registered at the class level may be
        // shared by several test methods, so a field could leak state
        int attempt = 1;
        printError(attempt, throwable);

        Method method = extensionContext.getRequiredTestMethod();
        Retry retry = method.getAnnotation(Retry.class);
        // Without @Retry, the test gets a single execution
        int maxExecutions = retry != null ? retry.value() : 1;

        // Rethrown at the end if every attempt fails, so that a failure
        // is never silently swallowed
        Throwable lastFailure = throwable;

        while (++attempt <= maxExecutions) {
            try {
                extensionContext.getExecutableInvoker().invoke(
                    method,
                    extensionContext.getRequiredTestInstance());
                return; // the retry passed
            } catch (Throwable t) {
                printError(attempt, t);
                lastFailure = t;
            }
        }
        throw lastFailure;
    }
}

Let’s go through the code step by step:

  • The extension keeps a counter of the number of executions
  • A printError() method reports the assertion failure or exception
  • The class implements the TestExecutionExceptionHandler interface
  • That interface requires the method handleTestExecutionException() to be implemented
  • This method is invoked when a test throws an exception
  • When that happens, let’s see if the test method is annotated with the @Retry annotation
  • And let’s retrieve the number of attempts demanded by the developer
  • Then let’s loop to execute the test method some more, until it passes or the number of attempts is reached
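
Stripped of the JUnit plumbing, the loop described in these steps can be sketched in plain Java. This is only an illustration under my own assumptions: RetryLoopSketch and runWithRetries() are hypothetical names, and a Runnable stands in for the test method that the extension invokes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, standing in for the extension's retry loop
class RetryLoopSketch {

    // Runs the action up to maxExecutions times.
    // Returns the number of the attempt that passed,
    // or rethrows the last failure if every attempt failed.
    static int runWithRetries(Runnable action, int maxExecutions) {
        for (int attempt = 1; attempt <= maxExecutions; attempt++) {
            try {
                action.run();
                return attempt;
            } catch (RuntimeException | Error failure) {
                System.err.println("Attempt #" + attempt +
                    " failed: " + failure.getMessage());
                if (attempt == maxExecutions) {
                    throw failure; // out of attempts: report the failure
                }
            }
        }
        // Only reachable when the loop body never ran
        throw new IllegalArgumentException(
            "maxExecutions must be at least 1");
    }

    public static void main(String[] args) {
        AtomicInteger count = new AtomicInteger(1);
        // Same flaky behavior as the test above:
        // it only passes on the 4th execution
        Runnable flaky = () -> {
            if (count.getAndIncrement() != 4) {
                throw new AssertionError("not the 4th execution yet");
            }
        };
        // prints "Passed on attempt #4"
        System.out.println(
            "Passed on attempt #" + runWithRetries(flaky, 4));
    }
}
```

With a maximum of 3 executions instead of 4, the same flaky action would never pass, and the last AssertionError would be rethrown, which is the "fail as expected" behavior I want from the extension.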

Missing standard JUnit 5 extension?

I thought a @Retry extension would be pretty common, and that it would be integrated in JUnit 5 directly. Or at least, some library would provide common JUnit 5 extensions? But my search didn’t yield anything meaningful. Did I overlook or miss something?

At least now, I have a solution to work around some flaky tests, thanks to this retryable extension!

Going further

If you want to learn more about JUnit 5 extensions, here are a few resources that helped me develop this extension. First of all, two articles from Baeldung: Migrating from JUnit 4 to JUnit 5, to understand the changes since JUnit 4, and this Guide to JUnit 5 Extensions. And of course, the JUnit 5 documentation on extensions.

Update

I’m glad I shared this article on Twitter, because I immediately got a response! Thanks @donal_tweets for your answer!

The JUnit Pioneer library provides a JUnit 5 extension pack, which includes a powerful retrying extension. Replace the usual @Test annotation with @RetryingTest. You can specify the number of attempts, the minimum number of successes, or some wait time before retries.

There’s also a rerunner extension that is quite similar.

My friend @aheritier also suggested that Maven Surefire can be configured to automatically retry failing tests a few times, thanks to a special flag:

mvn -Dsurefire.rerunFailingTestsCount=2 test

In my case, I don’t want to retry all failing tests, but only a specific one that I know is flaky.

For those who prefer Gradle over Maven, there’s a Gradle plugin as well: test-retry. You can configure the behavior in your build.gradle file:

test {
   retry {
       maxRetries = 2
       maxFailures = 20
       failOnPassedAfterRetry = true
   }
}

Someone also suggested that I use fuzzy assertions, but my test is very binary, as it either fails or succeeds. There’s no threshold, or value that would fit within some bounds.