Spring Cloud Sleuth

Introduction

In a microservices architecture, a single user request might traverse dozens of different services before generating a response. When an error or performance issue occurs, it can be extremely challenging to pinpoint where the problem originated. This is where Spring Cloud Sleuth comes into play.

Spring Cloud Sleuth is a distributed tracing solution for Spring Cloud applications. It provides functionality to track requests as they propagate through complex distributed systems. By assigning unique identifiers to requests, Sleuth enables developers to correlate log entries across multiple services, making debugging and performance analysis significantly easier.

Understanding Distributed Tracing Concepts

Before diving into Sleuth, let's familiarize ourselves with some key terminology:

  • Trace: Represents the complete journey of a request as it moves through the distributed system
  • Span: A named, timed operation representing a piece of the workflow
  • Trace ID: A unique identifier shared by all spans in a trace
  • Span ID: A unique identifier for a specific operation within a trace
  • Parent Span ID: Identifies the parent operation that triggered the current span

Sleuth works by automatically adding these identifiers to your application's logs, allowing you to track a request's journey across your microservices.
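
To make the relationship between these identifiers concrete, here is a minimal, self-contained Java sketch. The SpanModel class and its methods are hypothetical and are not part of Sleuth. The sketch only illustrates the idea that a child span keeps the trace ID, gets a fresh span ID, and records its parent's span ID.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical model for illustration only; not a Sleuth class.
final class SpanModel {
    final String name;
    final String traceId;
    final String spanId;
    final String parentSpanId; // null for the root span

    private SpanModel(String name, String traceId, String spanId, String parentSpanId) {
        this.name = name;
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    // Random 64-bit value rendered as 16 lowercase hex characters.
    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    // The root span starts the trace; in this sketch it reuses its span ID as the trace ID.
    static SpanModel root(String name) {
        String id = newId();
        return new SpanModel(name, id, id, null);
    }

    // A child shares the trace ID, gets its own span ID, and points back to its parent.
    SpanModel child(String childName) {
        return new SpanModel(childName, traceId, newId(), spanId);
    }

    public static void main(String[] args) {
        SpanModel gateway = SpanModel.root("GET /products/{id}");
        SpanModel lookup = gateway.child("product-lookup");
        System.out.println("trace:  " + lookup.traceId);
        System.out.println("span:   " + lookup.spanId);
        System.out.println("parent: " + lookup.parentSpanId);
    }
}
```

Following the parentSpanId links from any span leads back to the root, which is how a tracing backend can rebuild the call tree of a request.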

Getting Started with Spring Cloud Sleuth

Adding Dependencies

To use Spring Cloud Sleuth, add its starter to your Spring Boot project. The version is omitted here because it is normally managed by the Spring Cloud BOM (spring-cloud-dependencies):

xml
<!-- For Maven -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>

Or if you're using Gradle:

groovy
// For Gradle
implementation 'org.springframework.cloud:spring-cloud-starter-sleuth'

Basic Configuration

Spring Cloud Sleuth requires minimal configuration. Once the dependency is on the classpath, it automatically instruments common components such as:

  • Spring MVC controllers
  • RestTemplate (when registered as a Spring bean)
  • WebClient (when registered as a Spring bean)
  • Scheduled actions (@Scheduled methods)
  • Message channels (Spring Integration)
  • Feign clients

Log Format

After adding Sleuth, your log entries will be automatically enriched with tracing information. For example:

2023-10-15 12:34:56.789 INFO [service-name,5d0e2a0d86a3f235,9e71a7bfd551daec,true] 23479 --- [nio-8080-exec-1] c.e.DemoController : Processing request

In this log entry:

  • service-name: The name of your application
  • 5d0e2a0d86a3f235: The trace ID
  • 9e71a7bfd551daec: The span ID
  • true: Indicates whether the span should be exported to a tracing system such as Zipkin (this flag is printed by Sleuth 2.x; Sleuth 3.x omits it from the log pattern)
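
Because the fields sit in a fixed bracketed position, they are easy to pull out of a log line when you need to correlate entries by hand. The following is a minimal sketch using only the JDK. The SleuthLogFields class is a hypothetical helper, not part of Sleuth, and it assumes the comma-separated layout shown above with an optional fourth field.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Sleuth.
final class SleuthLogFields {
    // Matches "[app,traceId,spanId]" with an optional trailing ",true" or ",false".
    private static final Pattern FIELDS = Pattern.compile(
            "\\[([^,\\]]*),([0-9a-f]+),([0-9a-f]+)(?:,(true|false))?\\]");

    // Returns {application, traceId, spanId}, or null if the line has no tracing fields.
    static String[] parse(String logLine) {
        Matcher m = FIELDS.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "2023-10-15 12:34:56.789 INFO "
                + "[service-name,5d0e2a0d86a3f235,9e71a7bfd551daec,true] 23479 --- "
                + "[nio-8080-exec-1] c.e.DemoController : Processing request";
        String[] fields = parse(line);
        // prints "trace=5d0e2a0d86a3f235 span=9e71a7bfd551daec"
        System.out.println("trace=" + fields[1] + " span=" + fields[2]);
    }
}
```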

Working with Spring Cloud Sleuth

Customizing Sampling Rate

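By default, Sleuth exports only a fraction of traces. Depending on the release, the default is either a 10% probability or a rate limit on traces per second, so check the documentation for your version. You can set the probability explicitly in your application.properties: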

properties
spring.sleuth.sampler.probability=1.0

Setting the probability to 1.0 means 100% of requests will be traced, which is useful for development but might be too resource-intensive for production.
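
To illustrate what a sampling probability means, here is a minimal sketch of a probabilistic sampling decision in plain Java. ProbabilitySamplerSketch is a hypothetical class written for this tutorial. It is not Sleuth's sampler, which decides once per trace and may use a different algorithm.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sampler for illustration only; not Sleuth's implementation.
final class ProbabilitySamplerSketch {
    private final double probability;

    ProbabilitySamplerSketch(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.probability = probability;
    }

    // nextDouble() returns a value in [0.0, 1.0), so 1.0 always samples and 0.0 never does.
    boolean shouldSample() {
        return ThreadLocalRandom.current().nextDouble() < probability;
    }

    public static void main(String[] args) {
        ProbabilitySamplerSketch sampler = new ProbabilitySamplerSketch(0.1);
        int sampled = 0;
        for (int i = 0; i < 10_000; i++) {
            if (sampler.shouldSample()) {
                sampled++;
            }
        }
        // The count varies from run to run but should be close to 1000.
        System.out.println("Sampled " + sampled + " of 10000 requests");
    }
}
```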

Manual Instrumentation

If you need to create custom spans for specific operations, Sleuth provides a Tracer bean:

java
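import org.springframework.cloud.sleuth.Span;
import org.springframework.cloud.sleuth.Tracer;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Uses the Sleuth 3.x API (package org.springframework.cloud.sleuth)
@RestController
public class ExampleController {

    private final Tracer tracer;

    public ExampleController(Tracer tracer) {
        this.tracer = tracer;
    }

    @GetMapping("/example")
    public String example() {
        // Create and start a child of the current span (or a new trace if there is none)
        Span span = tracer.nextSpan().name("custom-operation").start();
        // Put the span in scope so that log entries written here carry its IDs
        try (Tracer.SpanInScope ws = tracer.withSpan(span)) {
            // Your custom operation code
            return "Operation completed";
        } finally {
            // Always end the span, even if the operation throws
            span.end();
        }
    }
}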

Baggage Items (Context Propagation)

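Sometimes you need to pass additional context across service boundaries. Sleuth supports this through baggage items. Unlike a tag, which is recorded only on the current span, a baggage item travels with the trace context. For a field to be sent to other services, its name must also be listed in spring.sleuth.baggage.remote-fields (for example, spring.sleuth.baggage.remote-fields=user.id):

java
@GetMapping("/process")
public void processRequest() {
    Span currentSpan = tracer.currentSpan();
    if (currentSpan != null) {
        // A tag annotates this span only; it is not propagated downstream
        currentSpan.tag("customerId", "123");
    }

    // Baggage: in the Sleuth 3.x API, createBaggage returns a BaggageInScope
    BaggageInScope baggage = this.tracer.createBaggage("user.id", "user-123");
    try (Tracer.SpanInScope ws = this.tracer.withSpan(currentSpan)) {
        // Call other services, the baggage will be propagated
        restTemplate.getForEntity("http://another-service/api", String.class);
    }
}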

Integration with Zipkin

While Sleuth adds tracing information to your logs, Zipkin provides visualization and analysis capabilities for these traces. Here's how to integrate Spring Cloud Sleuth with Zipkin:

Adding Zipkin Dependency

xml
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>

Configuring Zipkin

Add the following to your application.properties:

properties
spring.zipkin.base-url=http://localhost:9411

This assumes you have Zipkin running locally. For a quick start with Zipkin, you can use Docker:

bash
docker run -d -p 9411:9411 openzipkin/zipkin

After configuration, you'll be able to visualize traces in the Zipkin UI:

  1. Start your application
  2. Make a few requests
  3. Open http://localhost:9411 in your browser
  4. Search for traces and analyze request flow

Practical Example: Microservices Communication Tracing

Let's look at a practical example with two services: an API Gateway and a Product Service.

API Gateway Service

java
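import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class GatewayController {
    // Must be injected as a Spring bean so that Sleuth can add its tracing interceptor
    private final RestTemplate restTemplate;
    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    public GatewayController(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @GetMapping("/products/{id}")
    public Product getProduct(@PathVariable Long id) {
        logger.info("Request received for product {}", id);
        // Resolving the logical name "product-service" assumes a load-balanced RestTemplate
        return restTemplate.getForObject("http://product-service/products/" + id, Product.class);
    }
}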

Product Service

java
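import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ProductController {
    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @GetMapping("/products/{id}")
    public Product getProduct(@PathVariable Long id) {
        logger.info("Fetching product details for id {}", id);
        // Simulate database lookup (Product is a plain data class, not shown here)
        return new Product(id, "Sample Product", 29.99);
    }
}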

Resulting Logs

When a request passes through these services with Sleuth enabled, the logs will look like:

API Gateway:

2023-10-15 14:23:45.123 INFO [api-gateway,6f4d28ac34cf8e76,6f4d28ac34cf8e76,true] 23481 --- [nio-8080-exec-2] c.e.GatewayController : Request received for product 42

Product Service:

2023-10-15 14:23:45.268 INFO [product-service,6f4d28ac34cf8e76,9af3b7d8c6345f12,true] 23482 --- [nio-8081-exec-1] c.e.ProductController : Fetching product details for id 42

Notice both logs share the same trace ID (6f4d28ac34cf8e76), but have different span IDs. This allows you to track the request across services.
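
The shared trace ID works because the caller sends its tracing context along with the HTTP request. The sketch below illustrates the idea in plain Java, assuming B3 multi-header propagation with the X-B3-TraceId, X-B3-SpanId, and X-B3-Sampled headers. B3PropagationSketch is a hypothetical class, and the sketch is simplified: a real instrumented client also creates its own span for the outgoing call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch for illustration only; Sleuth does this for you.
final class B3PropagationSketch {

    // Caller side: copy the current tracing context into outgoing request headers.
    static Map<String, String> inject(String traceId, String spanId, boolean sampled) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", traceId);
        headers.put("X-B3-SpanId", spanId);
        headers.put("X-B3-Sampled", sampled ? "1" : "0");
        return headers;
    }

    // Callee side: returns {traceId, parentSpanId} read from the incoming headers.
    // The callee keeps the trace ID and treats the received span ID as its parent.
    static String[] extract(Map<String, String> headers) {
        return new String[] { headers.get("X-B3-TraceId"), headers.get("X-B3-SpanId") };
    }

    public static void main(String[] args) {
        Map<String, String> headers = inject("6f4d28ac34cf8e76", "6f4d28ac34cf8e76", true);
        String[] context = extract(headers);
        System.out.println("product-service continues trace " + context[0]
                + " with parent span " + context[1]);
    }
}
```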

Common Challenges and Solutions

Dealing with Asynchronous Operations

Sleuth handles most asynchronous operations automatically, but for custom executors, you need to ensure context propagation:

java
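import java.util.concurrent.Executor;
import org.springframework.beans.factory.BeanFactory;
import org.springframework.cloud.sleuth.instrument.async.LazyTraceExecutor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class ThreadPoolConfig {

    @Bean
    public Executor executor(BeanFactory beanFactory) {
        ThreadPoolTaskExecutor delegate = new ThreadPoolTaskExecutor();
        // The delegate is not a managed bean, so it must be initialized before use
        delegate.initialize();
        return new LazyTraceExecutor(beanFactory, delegate);
    }
}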
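
The reason a wrapper is needed is that tracing context is typically bound to the current thread, so a task handed to another thread does not see it unless something carries it over. The following plain-Java sketch shows that mechanism with a ThreadLocal. ContextPropagationSketch and its methods are hypothetical, and this is not how LazyTraceExecutor is implemented; it only demonstrates the capture-and-restore idea.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch for illustration only; not Sleuth's implementation.
final class ContextPropagationSketch {
    // Stand-in for the tracing context that is bound to the current thread.
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Captures the caller's trace ID now and restores it around the task on the worker thread.
    static Runnable wrap(Runnable task) {
        final String captured = TRACE_ID.get();
        return () -> {
            String previous = TRACE_ID.get();
            TRACE_ID.set(captured);
            try {
                task.run();
            } finally {
                TRACE_ID.set(previous);
            }
        };
    }

    // Runs a task on a worker thread and returns the trace ID it observed there.
    static String observedOnWorker(boolean wrapped) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            final String[] seen = new String[1];
            Runnable task = () -> seen[0] = TRACE_ID.get();
            pool.submit(wrapped ? wrap(task) : task).get();
            return seen[0];
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        TRACE_ID.set("6f4d28ac34cf8e76");
        System.out.println("plain task sees:   " + observedOnWorker(false)); // null
        System.out.println("wrapped task sees: " + observedOnWorker(true));  // 6f4d28ac34cf8e76
    }
}
```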

Avoiding Trace ID Collisions in High-Load Systems

In high-load systems, you might want to customize the ID generation strategy:

properties
spring.sleuth.trace-id128=true

This will generate 128-bit trace IDs instead of the default 64-bit ones, significantly reducing the risk of collisions. Span IDs remain 64-bit.
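
In log output and HTTP headers the difference shows up as the length of the hexadecimal trace ID: 16 characters for 64 bits versus 32 characters for 128 bits. The sketch below generates both forms in plain Java. TraceIdSketch is a hypothetical class and does not reproduce how Sleuth generates its IDs.

```java
import java.security.SecureRandom;

// Hypothetical sketch for illustration only; not Sleuth's ID generator.
final class TraceIdSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns a random ID of 64 or 128 bits as lowercase hex (16 or 32 characters).
    static String hexId(int bits) {
        if (bits != 64 && bits != 128) {
            throw new IllegalArgumentException("bits must be 64 or 128");
        }
        StringBuilder id = new StringBuilder();
        for (int i = 0; i < bits / 64; i++) {
            // Each long contributes exactly 16 zero-padded hex characters.
            id.append(String.format("%016x", RANDOM.nextLong()));
        }
        return id.toString();
    }

    public static void main(String[] args) {
        System.out.println("64-bit:  " + hexId(64));
        System.out.println("128-bit: " + hexId(128));
    }
}
```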

Summary

Spring Cloud Sleuth is an essential tool for implementing distributed tracing in Spring Cloud applications. It provides:

  • Automatic instrumentation of common Spring components
  • Consistent tracing information across service boundaries
  • Easy integration with visualization tools like Zipkin
  • Support for custom spans and contextual baggage

With Sleuth, debugging and monitoring microservices becomes significantly easier, as you can trace requests end-to-end across your distributed system.

Exercise

  1. Create a simple microservices application with at least three services that communicate with each other.
  2. Add Spring Cloud Sleuth to all services and configure them to send traces to a local Zipkin instance.
  3. Implement a custom span for a critical business operation.
  4. Add a baggage item that carries user information across service calls.
  5. Analyze the traces in Zipkin and identify potential performance bottlenecks.
