Evaluating Claude's abilities in concurrent programming.
After Anthropic released its newest model last week, it received mostly positive feedback. Some compared it to GPT-4, while others suspected that Claude might even be self-aware. With that as a premise, I decided to test it on one of the most challenging areas of programming: concurrency.
The problem
There are 3 services that capture temperature: analog, infrared, and digital:
private final AnalogSensor analogSensor;
private final DigitalSensor digitalSensor;
private final InfraredSensor infraredSensor;
Each sensor can return a temperature by calling the getTemp method. This method is a blocking call, which means that data retrieval may take some time (you can imagine that it's a network request to some online temperature service).
The task is to implement readTemperature(), which has a total timeout of 500ms and must return a result if at least two of the three services respond within that timeout.
The entire code:
import java.math.BigDecimal;
import java.util.concurrent.TimeoutException;

public class TemperatureService {
    private final AnalogSensor analogSensor;
    private final DigitalSensor digitalSensor;
    private final InfraredSensor infraredSensor;

    public TemperatureService(AnalogSensor analogSensor,
                              DigitalSensor digitalSensor,
                              InfraredSensor infraredSensor) {
        this.analogSensor = analogSensor;
        this.digitalSensor = digitalSensor;
        this.infraredSensor = infraredSensor;
    }

    /**
     * Invokes three distinct services to retrieve a snapshot
     * of the current temperature. The method aims to provide a response
     * within 500 milliseconds. If the time limit is exceeded,
     * it is deemed futile to await the response, and a
     * {@link TimeoutException} is thrown.
     * <p>
     * The resulting TemperatureResult is expected
     * to encompass a minimum of two internal responses, with greater diversity
     * being preferable. For instance, combinations
     * such as analog + digital, analog + infrared, or digital + infrared are considered
     * the minimum acceptable configurations.
     */
    TemperatureResult readTemperature() throws TimeoutException {
        // Starting point: the three blocking calls run one after another
        BigDecimal analog = analogSensor.getTemp();
        BigDecimal digital = digitalSensor.getTemp();
        BigDecimal infrared = infraredSensor.getTemp();
        return new TemperatureResult(analog, digital, infrared);
    }

    static class TemperatureResult {
        final BigDecimal analog;
        final BigDecimal digital;
        final BigDecimal infrared;

        TemperatureResult(BigDecimal analog, BigDecimal digital, BigDecimal infrared) {
            this.analog = analog;
            this.digital = digital;
            this.infrared = infrared;
        }
    }

    interface AnalogSensor extends TempSensor {}
    interface DigitalSensor extends TempSensor {}
    interface InfraredSensor extends TempSensor {}

    interface TempSensor {
        BigDecimal getTemp();
    }
}
Another requirement was to use Java 8, to see how well Claude can distinguish between different API versions.
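To make the blocking behaviour of getTemp concrete, here is a minimal sketch of a stub sensor that sleeps before answering. It is not part of the original exercise: the class DelayedSensorDemo and the helpers delayed and sequentialElapsedMillis are hypothetical names introduced only for illustration, and the TempSensor interface is copied from the listing above. It shows that calling the sensors one after another, as the starting-point readTemperature() does, makes the individual delays add up.

```java
import java.math.BigDecimal;

class DelayedSensorDemo {
    // Same shape as the TempSensor interface in the article's listing.
    interface TempSensor {
        BigDecimal getTemp();
    }

    // Hypothetical stub: blocks for delayMillis, then returns the given value.
    static TempSensor delayed(String value, long delayMillis) {
        return () -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("sensor read interrupted", e);
            }
            return new BigDecimal(value);
        };
    }

    // Reads the sensors sequentially and returns the elapsed time in milliseconds.
    static long sequentialElapsedMillis(TempSensor... sensors) {
        long start = System.nanoTime();
        for (TempSensor sensor : sensors) {
            sensor.getTemp();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = sequentialElapsedMillis(
                delayed("21.5", 100), delayed("21.7", 100), delayed("21.6", 300));
        // The delays add up because each call blocks before the next one starts.
        System.out.println("sequential read took about " + elapsed + " ms");
    }
}
```

A stub like this could also be plugged into a test of any of the implementations below, for example with two sensors at 100ms and one at 3s.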
First prompt
For the evaluation, I used the claude-3-opus-20240229 model, which was the most advanced at the time of writing.
System prompt
You are a professional Java developer with deep knowledge of
concurrent programming. You have been given some code.
[THE INSERTED CODE]
User
Implement readTemperature method following the discription from
Java Doc
Output
1. Call the three underlying services concurrently to get the temperature
results.
2. Wait for a maximum of 500ms for the responses.
3. Ensure that at least two internal responses are available before
returning the `TemperatureResult`.
4. Throw a `TimeoutException` if the method takes longer than 500ms
to complete.
The explanation is correct, but a bit abstract. The generated code was more interesting:
TemperatureResult readTemperature() throws TimeoutException {
    long startTime = System.currentTimeMillis();
    ExecutorService executor = Executors.newFixedThreadPool(3);
    Future<BigDecimal> analogFuture = executor.submit(analogSensor::getTemp);
    Future<BigDecimal> digitalFuture = executor.submit(digitalSensor::getTemp);
    Future<BigDecimal> infraredFuture = executor.submit(infraredSensor::getTemp);
    try {
        BigDecimal analog = null;
        BigDecimal digital = null;
        BigDecimal infrared = null;
        try {
            analog = analogFuture.get(500, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            // Ignore and proceed with other sensors
        }
        try {
            digital = digitalFuture.get(500 -
                    (System.currentTimeMillis() - startTime),
                    TimeUnit.MILLISECONDS);
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            // Ignore and proceed with other sensors
        }
        try {
            infrared = infraredFuture.get(500 -
                    (System.currentTimeMillis() - startTime),
                    TimeUnit.MILLISECONDS);
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            // Ignore and proceed with other sensors
        }
        if (
            (analog != null && digital != null) ||
            (analog != null && infrared != null) ||
            (digital != null && infrared != null)
        ) {
            return new TemperatureResult(analog, digital, infrared);
        } else {
            throw new TimeoutException("Could not retrieve " +
                    "at least two temperature results within 500ms");
        }
    } finally {
        executor.shutdownNow();
    }
}
The interesting aspect here is that the services are called in separate threads, but the results are retrieved sequentially through blocking get calls. This was incorrect and would result in a TimeoutException if the first service takes 3 seconds to return. I clicked the Add to Conversation button and attempted to be more specific this time.
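The effect of retrieving results sequentially can be seen in isolation. The following is a minimal sketch under my own assumptions, not the generated code: SequentialGetDemo and millisUntilFastResult are hypothetical names, and the sensors are replaced by two sleeping tasks. It measures how long the caller waits before it holds the fast result when it asks for the slow one first.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SequentialGetDemo {
    // Submits a slow and a fast task, waits on the slow one first (with a timeout),
    // and returns how many milliseconds passed before the fast result was in hand.
    static long millisUntilFastResult(long slowMillis, long fastMillis, long timeoutMillis)
            throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            long start = System.nanoTime();
            Future<String> slow = executor.submit(() -> {
                Thread.sleep(slowMillis);
                return "slow";
            });
            Future<String> fast = executor.submit(() -> {
                Thread.sleep(fastMillis);
                return "fast";
            });
            try {
                // Blocks for up to timeoutMillis even though "fast" finishes much earlier.
                slow.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The slow task missed the deadline; carry on with the other one.
            }
            fast.get(); // only now does the caller pick up the fast result
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            executor.shutdownNow(); // interrupts the slow task if it is still sleeping
        }
    }

    public static void main(String[] args) throws Exception {
        // The fast task needs 20 ms, yet the caller waits roughly the full 200 ms timeout.
        System.out.println("waited about " + millisUntilFastResult(1000, 20, 200) + " ms");
    }
}
```

The order of the get calls, not the speed of the tasks, decides how long the caller waits, which is why collecting results in completion order is the more natural fit for this problem.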
Second prompt
System prompt
[COPIED FROM ABOVE]
User
It works for the majority of cases but fails for one specific use
case: when two services return in 100ms, but one takes 3s. This is a
valid use case by definition and should return the values of those two
results without throwing a timeout exception. Please adjust the code
accordingly.
Output
Apologies for the oversight. You're right, the previous implementation would throw a TimeoutException even if two services return within the specified time limit. Here's the adjusted code to handle the case where two services return quickly, and one takes longer:
TemperatureResult readTemperature() throws TimeoutException {
    ExecutorService executor = Executors.newFixedThreadPool(3);
    Future<BigDecimal> analogFuture = executor.submit(analogSensor::getTemp);
    Future<BigDecimal> digitalFuture = executor.submit(digitalSensor::getTemp);
    Future<BigDecimal> infraredFuture = executor.submit(digitalSensor::getTemp);
    try {
        BigDecimal analog = null;
        BigDecimal digital = null;
        BigDecimal infrared = null;
        long startTime = System.currentTimeMillis();
        long remainingTime = 500;
        while (remainingTime > 0 && hasPendingResults(analog, digital, infrared)) {
            if (analog == null) {
                try {
                    analog = analogFuture.get(remainingTime, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if (digital == null) {
                try {
                    digital = digitalFuture.get(remainingTime, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if (infrared == null) {
                try {
                    infrared = infraredFuture.get(remainingTime, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            remainingTime = 500 - (System.currentTimeMillis() - startTime);
        }
        if ((analog != null && digital != null)
                || (analog != null && infrared != null)
                || (digital != null && infrared != null)) {
            return new TemperatureResult(analog, digital, infrared);
        } else {
            throw new TimeoutException("Could not retrieve at least two temperature " +
                    "results within 500ms");
        }
    } finally {
        executor.shutdownNow();
    }
}
In this iteration, Claude introduced a while loop that runs for up to 500ms, but the implementation still didn't handle the case where one of the service calls takes more than 500ms. This time, I tried to be more precise about why this approach wouldn't work.
Third prompt. We got it
System prompt
[COPIED FROM ABOVE]
User
The code still doesn't work because the `get` method is a blocking call, and you receive results sequentially. If the first call returns in 3 seconds, you will automatically fail the result, but this is incorrect. Please fix the code.
Output
You're absolutely right. I apologize for the confusion. Here's the corrected implementation that avoids the blocking get calls and waits for the futures to complete concurrently:
TemperatureResult readTemperature() throws TimeoutException {
    ExecutorService executor = Executors.newFixedThreadPool(3);
    Future<BigDecimal> analogFuture = executor.submit(analogSensor::getTemp);
    Future<BigDecimal> digitalFuture = executor.submit(digitalSensor::getTemp);
    Future<BigDecimal> infraredFuture = executor.submit(infraredSensor::getTemp);
    try {
        BigDecimal analog = null;
        BigDecimal digital = null;
        BigDecimal infrared = null;
        long startTime = System.currentTimeMillis();
        long remainingTime = 500;
        while (remainingTime > 0 &&
                (analog == null || digital == null || infrared == null)) {
            if (analog == null && analogFuture.isDone()) {
                try {
                    analog = analogFuture.get();
                } catch (InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if (digital == null && digitalFuture.isDone()) {
                try {
                    digital = digitalFuture.get();
                } catch (InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if (infrared == null && infraredFuture.isDone()) {
                try {
                    infrared = infraredFuture.get();
                } catch (InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if ((analog != null && digital != null)
                    || (analog != null && infrared != null)
                    || (digital != null && infrared != null)) {
                return new TemperatureResult(analog, digital, infrared);
            }
            remainingTime = 500 - (System.currentTimeMillis() - startTime);
            if (remainingTime > 0) {
                Thread.sleep(10); // Sleep for a short duration to avoid busy waiting
            }
        }
        throw new TimeoutException("Could not retrieve at least " +
                "two temperature results within 500ms");
    } finally {
        executor.shutdownNow();
    }
}
Even though the code didn't compile (Thread.sleep throws the checked InterruptedException, which the method neither catches nor declares), it did what was expected: it polls each future until the 500ms run out and returns as soon as two results are available. A neat addition was the Thread.sleep call, which avoids busy-waiting between polls.
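For comparison, polling with isDone and Thread.sleep is not the only Java 8 compatible way to take results in completion order. The sketch below is an alternative I am adding for illustration, not something Claude produced: it uses ExecutorCompletionService from java.util.concurrent, and the names FirstTwoDemo and firstCompleted are hypothetical. To stay short it returns the values in completion order and drops the information about which sensor produced which value, so mapping them back into a TemperatureResult is left out.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FirstTwoDemo {
    // Hypothetical helper: runs all tasks concurrently and returns the first
    // `needed` successful results, or throws if they do not arrive in time.
    static List<BigDecimal> firstCompleted(List<Callable<BigDecimal>> tasks,
                                           int needed,
                                           long timeoutMillis)
            throws TimeoutException, InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(tasks.size());
        CompletionService<BigDecimal> completion = new ExecutorCompletionService<>(executor);
        try {
            for (Callable<BigDecimal> task : tasks) {
                completion.submit(task);
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            List<BigDecimal> results = new ArrayList<>();
            int outstanding = tasks.size();
            while (results.size() < needed && outstanding > 0) {
                long remainingNanos = Math.max(deadline - System.nanoTime(), 0);
                // poll hands back whichever task finishes next, or null at the deadline
                Future<BigDecimal> done = completion.poll(remainingNanos, TimeUnit.NANOSECONDS);
                if (done == null) {
                    break;
                }
                outstanding--;
                try {
                    results.add(done.get()); // already complete, so this does not block
                } catch (ExecutionException e) {
                    // A failed task is skipped; keep waiting for the others.
                }
            }
            if (results.size() < needed) {
                // Also reached when too many tasks fail; reported as a timeout for simplicity.
                throw new TimeoutException("Fewer than " + needed + " results within "
                        + timeoutMillis + "ms");
            }
            return results;
        } finally {
            executor.shutdownNow(); // interrupt tasks that are still running
        }
    }
}
```

With two tasks that finish in tens of milliseconds and one that takes 3 seconds, a call such as firstCompleted(tasks, 2, 500) should return the two fast values without waiting for the slow one.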
Next prompt. Further improvements
At this point I had a working result, but I didn't like its style. For the next iteration, I wanted to see more use of the concurrency API available in Java 8.
System prompt
[COPIED FROM ABOVE]
User
This code is correct, thanks. Could you simplify or rework it using
only Java 8 functionality?
Output
Certainly! Here's a simplified version of the code using Java 8 features, specifically the CompletableFuture class:
TemperatureResult readTemperature() throws TimeoutException {
    CompletableFuture<BigDecimal> analogFuture =
            CompletableFuture.supplyAsync(analogSensor::getTemp);
    CompletableFuture<BigDecimal> digitalFuture =
            CompletableFuture.supplyAsync(digitalSensor::getTemp);
    CompletableFuture<BigDecimal> infraredFuture =
            CompletableFuture.supplyAsync(infraredSensor::getTemp);
    CompletableFuture<TemperatureResult> temperatureResultFuture = CompletableFuture
            .allOf(analogFuture, digitalFuture, infraredFuture)
            .thenApply(v -> {
                BigDecimal analog = analogFuture.getNow(null);
                BigDecimal digital = digitalFuture.getNow(null);
                BigDecimal infrared = infraredFuture.getNow(null);
                return new TemperatureResult(analog, digital, infrared);
            });
    try {
        return temperatureResultFuture.get(500, TimeUnit.MILLISECONDS);
    } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException("Error retrieving temperature result", e);
    } catch (TimeoutException e) {
        TemperatureResult result = temperatureResultFuture.getNow(null);
        if (result != null &&
                atLeastTwoNonNull(result.analog, result.digital, result.infrared)) {
            return result;
        } else {
            throw new TimeoutException("Could not retrieve " +
                    "at least two temperature results within 500ms");
        }
    }
}
This solution again fails with the notorious 3s-long service call. In that situation allOf does not complete within the timeout, so thenApply is never reached and the method ends up throwing a TimeoutException. I'm not sure why Claude forgot about this use case, but I decided to mention it again.
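The behaviour of allOf that causes this can be checked on its own. Here is a minimal sketch with hypothetical names (AllOfDemo, after, allOfCompletesWithin) and sleeping tasks standing in for the sensors: the combined future only completes once every input has completed, so a single slow input holds it past the deadline.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class AllOfDemo {
    // Hypothetical stand-in for a sensor call: yields `value` after `millis`.
    static CompletableFuture<String> after(long millis, String value) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return value;
        });
    }

    // Returns true if allOf over the given futures completes within the timeout.
    static boolean allOfCompletesWithin(long timeoutMillis, CompletableFuture<?>... futures) {
        try {
            CompletableFuture.allOf(futures).get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // at least one input was still running at the deadline
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two fast inputs and one slow input: the slow one decides the outcome.
        System.out.println(allOfCompletesWithin(200,
                after(10, "analog"), after(10, "digital"), after(1500, "infrared")));
    }
}
```

In the generated method this means temperatureResultFuture is still incomplete when the timeout fires, getNow(null) returns null, and the fallback branch throws.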
Last prompt. We got it a second time
System prompt
[COPIED FROM ABOVE]
User
This solution fails for the case when the first service returns in 3 seconds.
Output
Certainly! Here's a simplified version of the code using Java 8 features, specifically the CompletableFuture class:
TemperatureResult readTemperature() throws TimeoutException {
    CompletableFuture<BigDecimal> analogFuture = CompletableFuture.supplyAsync(analogSensor::getTemp);
    CompletableFuture<BigDecimal> digitalFuture = CompletableFuture.supplyAsync(digitalSensor::getTemp);
    CompletableFuture<BigDecimal> infraredFuture = CompletableFuture.supplyAsync(infraredSensor::getTemp);
    CompletableFuture<TemperatureResult> temperatureResultFuture = CompletableFuture
            .anyOf(
                    CompletableFuture.allOf(analogFuture, digitalFuture),
                    CompletableFuture.allOf(analogFuture, infraredFuture),
                    CompletableFuture.allOf(digitalFuture, infraredFuture))
            .thenApply(v -> {
                BigDecimal analog = analogFuture.getNow(null);
                BigDecimal digital = digitalFuture.getNow(null);
                BigDecimal infrared = infraredFuture.getNow(null);
                return new TemperatureResult(analog, digital, infrared);
            });
    try {
        return temperatureResultFuture.get(500, TimeUnit.MILLISECONDS);
    } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException("Error retrieving temperature result", e);
    } catch (TimeoutException e) {
        TemperatureResult result = temperatureResultFuture.getNow(null);
        if (result != null &&
                atLeastTwoNonNull(result.analog, result.digital, result.infrared)) {
            return result;
        } else {
            throw new TimeoutException("Could not retrieve " +
                    "at least two temperature results within 500ms");
        }
    }
}
Here, I was surprised for the second time. Apart from showing general knowledge of the Java concurrency package, Claude composed anyOf with a list of allOf calls, something I probably wouldn't have come up with myself, and I really liked it.
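The composition can be tried in isolation with manually completed futures. This is a minimal sketch, not Claude's output: TwoOfThreeDemo and anyTwoOf are hypothetical names, and since the generated method calls atLeastTwoNonNull without defining it, the version here is only my guess at what such a helper might look like.

```java
import java.util.concurrent.CompletableFuture;

class TwoOfThreeDemo {
    // Assumed shape of the undefined helper: true if at least two arguments are non-null.
    static boolean atLeastTwoNonNull(Object... values) {
        int count = 0;
        for (Object value : values) {
            if (value != null) {
                count++;
            }
        }
        return count >= 2;
    }

    // Completes as soon as any pair of the three futures has completed.
    static CompletableFuture<Object> anyTwoOf(CompletableFuture<?> a,
                                              CompletableFuture<?> b,
                                              CompletableFuture<?> c) {
        return CompletableFuture.anyOf(
                CompletableFuture.allOf(a, b),
                CompletableFuture.allOf(a, c),
                CompletableFuture.allOf(b, c));
    }

    public static void main(String[] args) {
        CompletableFuture<String> analog = new CompletableFuture<>();
        CompletableFuture<String> digital = new CompletableFuture<>();
        CompletableFuture<String> infrared = new CompletableFuture<>();
        CompletableFuture<Object> twoReady = anyTwoOf(analog, digital, infrared);

        analog.complete("21.5");
        System.out.println(twoReady.isDone()); // false: only one input has completed
        infrared.complete("21.6");
        System.out.println(twoReady.isDone()); // true: the analog + infrared pair is complete
    }
}
```

One caveat worth keeping in mind: a sensor that fails with an exception can make the combined future complete exceptionally, so a more defensive version would probably convert failures to null (for example with exceptionally) before composing.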
Conclusion
Even though the generated code was not the most idiomatic Java, and Claude wasn't able to produce a correct solution on the first try, I still came away with a really good impression. After trying Phind and IntelliJ IDEA AI Assistant on the same problem, I would say that Claude is probably the most capable programming model available on the market. It's not ideal, but at this pace, we will be there really soon.