I’ve spent the better part of three decades watching teams chase performance problems the wrong way. Someone adds an index. Someone else tweaks the connection pool. A third person rewrites a query. Six weeks later, the API is still slow—or worse, it’s fast in staging and slow in production, and nobody can explain why.

The pattern is always the same: we optimize before we understand. And that’s expensive. In time, in morale, and in the trust of the people who use our systems.

Here’s what I’ve learned. When an API that used to respond in 200 milliseconds suddenly takes three seconds, the first question isn’t “what should we change?” It’s “where is the time going?” Answer that, and the fix often becomes obvious. Miss it, and you’re throwing darts in the dark.

Step 1: Measure Before You Optimize

This sounds obvious. In practice, it’s the step most teams skip. We see a slow endpoint, we have a hunch, and we act on it. I’ve done it myself. It rarely ends well.

Spring Boot Actuator gives you HTTP request metrics out of the box. Add spring-boot-starter-actuator, expose the metrics endpoint (with appropriate security—don’t open it to the world), and you’ll get http.server.requests with tags for URI, method, and status. That tells you which endpoints are slow. Not “probably slow.” Actually slow. Spring Boot metrics docs cover the full picture.
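A minimal exposure config, assuming you’re using application.properties (the property name is standard Spring Boot Actuator):

```properties
# Expose only the endpoints you need; keep them behind auth in production.
management.endpoints.web.exposure.include=health,metrics
```

After that, GET /actuator/metrics/http.server.requests shows the timer, and you can drill into a single endpoint with a tag filter such as ?tag=uri:/api/orders.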

Micrometer—which Actuator uses under the hood—can export to Prometheus, Datadog, and most other metrics backends; Grafana then sits on top for dashboards. Once you have dashboards, you’ll see p50, p95, p99. Pay attention to p99. The average might look fine while a small fraction of requests are timing out. I’ve seen systems where p50 was 100ms and p99 was 8 seconds. The average hid the problem.
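To see why the average misleads, here is a tiny self-contained illustration with synthetic numbers and a nearest-rank percentile (not Micrometer’s estimator, just the arithmetic):

```java
import java.util.Arrays;

public class LatencyStats {

    // Nearest-rank percentile: p in (0, 100], sample must be sorted ascending.
    public static double percentile(double[] sorted, double p) {
        int idx = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
        return sorted[Math.max(0, idx)];
    }

    public static void main(String[] args) {
        // 95 requests at 100 ms and 5 timeouts at 8000 ms.
        double[] latencies = new double[100];
        Arrays.fill(latencies, 100.0);
        Arrays.fill(latencies, 95, 100, 8000.0);
        Arrays.sort(latencies);

        double mean = Arrays.stream(latencies).average().orElse(0);
        // mean = 495 ms looks tolerable; p99 = 8000 ms tells the real story.
        System.out.println("mean=" + mean
                + " p50=" + percentile(latencies, 50)
                + " p99=" + percentile(latencies, 99));
    }
}
```

Five bad requests out of a hundred barely move the mean; they dominate the p99.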

One more thing: measure in production, or in an environment that mirrors it. Staging often has different data volumes, different network latency, different everything. The bottleneck in production might not exist in staging. I’ve learned that lesson the hard way.

Step 2: Is It the Database?

In my experience, most slow APIs in Spring Boot and JPA applications are slow because of the database layer. Not all of them. But most. The trick is to confirm it before you start tuning.

Enable SQL logging—temporarily, in dev or a staging environment that can handle the noise:

spring.jpa.show-sql=true
spring.jpa.properties.hibernate.format_sql=true
logging.level.org.hibernate.SQL=DEBUG

Trigger the slow request. Watch the logs. Are there dozens of queries for a single API call? That’s your first red flag. I’ve seen endpoints that fired 200 queries for one response. The database wasn’t slow. We were just asking it to do too much.

Hibernate statistics give you numbers instead of gut feel:

spring.jpa.properties.hibernate.generate_statistics=true

Expose this via Actuator or log it periodically. You’ll see query count, cache hit rates, and connection acquisition time. If Session or transaction time is 80 or 90 percent of your request time, the bottleneck is the database. If it’s 20 percent, look elsewhere.
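If you’d rather log the counters yourself, they’re available programmatically. A sketch, assuming an injected EntityManagerFactory (the logging call is illustrative):

```java
// Unwrap Hibernate's SessionFactory from the JPA EntityManagerFactory.
SessionFactory sessionFactory = entityManagerFactory.unwrap(SessionFactory.class);
Statistics stats = sessionFactory.getStatistics();

long queries = stats.getQueryExecutionCount();     // how many queries ran
long maxTimeMs = stats.getQueryExecutionMaxTime(); // slowest query, in ms
long cacheHits = stats.getSecondLevelCacheHitCount();

log.info("queries={} maxQueryMs={} l2CacheHits={}", queries, maxTimeMs, cacheHits);
```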

One word of caution: generate_statistics has a small overhead. I wouldn’t recommend leaving it on in production at full blast forever. Use it to diagnose, then turn it down or off once you’ve found the issue.

Step 3: The N+1 Problem—Your Most Likely Culprit

If you’ve worked with JPA for any length of time, you’ve run into this. I’ve fixed it in more codebases than I can count. It’s the single most common performance anti-pattern in ORM-based applications.

Here’s what happens. You fetch a list of entities—say, 50 orders. Each order has a @OneToMany relationship to order items. With lazy loading (the default for @OneToMany), the first query retrieves the orders. Then your code iterates and calls order.getItems(). Hibernate, doing exactly what you asked, runs a query for each order. One query for the list, plus N queries for the related data. Hence: N+1.

With 50 orders, that’s 51 queries. With 500, it’s 501. Each one has round-trip latency. Each one acquires a connection. The math adds up fast.
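In code, the shape that triggers this looks roughly like the following. The Order entity, repository, and service code here are hypothetical, a sketch of the pattern rather than a complete application:

```java
@Entity
public class Order {
    @Id @GeneratedValue
    private Long id;

    private Long customerId;

    // Lazy is the default for @OneToMany: items are not loaded until touched.
    @OneToMany(mappedBy = "order")
    private List<OrderItem> items;

    public List<OrderItem> getItems() { return items; }
}

// Elsewhere, in a service method:
List<Order> orders = orderRepository.findByCustomerId(customerId); // 1 query
for (Order order : orders) {
    total += order.getItems().size(); // +1 query per order: the N in N+1
}
```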

How to spot it: With SQL logging enabled, look for the same query pattern repeated many times with different parameter values. SELECT * FROM order_items WHERE order_id = ? appearing 50 times? That’s N+1.

How to fix it: Use JOIN FETCH so you load the relationship in a single query:

@Query("SELECT o FROM Order o LEFT JOIN FETCH o.items WHERE o.customerId = :customerId")
List<Order> findByCustomerWithItems(@Param("customerId") Long customerId);

LEFT JOIN FETCH ensures the items are loaded with the orders. One round trip. The Baeldung guide on N+1 walks through this and other approaches in detail.

Entity graphs are another option when you need more flexibility—different endpoints might need different subsets of relationships. You define a graph, attach it to the query, and Hibernate fetches what you’ve declared.
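In Spring Data JPA that can be as small as an annotation on the repository method (hypothetical repository, assuming the same Order-to-items mapping):

```java
public interface OrderRepository extends JpaRepository<Order, Long> {

    // Tells the provider to fetch items together with the orders,
    // without writing the JOIN FETCH by hand.
    @EntityGraph(attributePaths = "items")
    List<Order> findByCustomerId(Long customerId);
}
```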

Batch fetching is a middle ground. Set hibernate.default_batch_fetch_size to 20 or 50. Instead of one query per order for items, Hibernate will load them in batches: “give me items for orders 1–20,” then “21–40,” and so on. It’s not as efficient as a single JOIN FETCH, but it’s a configuration change. No query rewrites. I’ve used it as a stopgap many times when we needed a quick win and couldn’t refactor immediately.
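As a config fragment (the spring.jpa.properties prefix passes the setting through to Hibernate):

```properties
# Load lazy associations in batches of 20 instead of one at a time.
spring.jpa.properties.hibernate.default_batch_fetch_size=20
```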

Step 4: Is PostgreSQL Doing Its Job?

Sometimes the query count is fine. One or two queries per request. But each one takes two seconds. That’s when you look at the database itself.

EXPLAIN ANALYZE is your primary tool. Take the slow query—copy it from the logs, with the actual parameter values substituted—and run it in psql or your DB client:

EXPLAIN (ANALYZE, BUFFERS) SELECT ...

You’ll get the execution plan, actual row counts, and timing. PostgreSQL’s EXPLAIN docs explain the output. A few things I always look for:

  • Seq Scan on a large table—often means a missing index. The planner chose a full table scan because it thought it would be faster. Sometimes that’s correct for small tables. For large ones, it’s usually a sign you need an index on the filter or join columns.

  • Nested Loop with a huge row estimate on the inner side—the join order might be wrong, or statistics might be stale. Run ANALYZE on the tables involved. The planner depends on up-to-date stats.

  • High Buffers numbers—the query is touching a lot of pages. If most of them are reads rather than hits, you’re I/O bound: the query is doing too much work, or the working set doesn’t fit in memory.

Missing indexes are incredibly common. If you’re filtering on customer_id, created_at, or status, there should be an index. If you’re joining on a column, there should be an index. Composite indexes help when you filter on several columns at once, and the column order within the index determines whether the planner can use it: put equality-filtered columns first, range-filtered columns after.
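For instance, if the plan shows a Seq Scan for a query filtering on customer_id and created_at, a composite index along these lines is the usual fix (table and column names here are hypothetical):

```sql
-- Leading column matches the equality filter; created_at supports the range.
CREATE INDEX idx_orders_customer_created
    ON orders (customer_id, created_at);
```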

One nuance: indexes aren’t free. They slow down writes. Add them where reads benefit, but don’t index every column “just in case.” I’ve seen tables with 15 indexes where three would have sufficed. Write performance suffered.

Connection pool exhaustion can masquerade as slow queries. If every request is waiting for a connection from the pool, latency spikes. Check HikariCP metrics—Actuator exposes them. hikaricp.connections.active and hikaricp.connections.pending. If pending is consistently high, you’re starving for connections. Either increase the pool size (with care—more connections mean more load on the database) or find and fix the leaks. Connection leaks—failing to return connections to the pool—are another pattern I’ve seen repeatedly. They show up under load.
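The relevant HikariCP knobs, as Spring Boot properties (the values are illustrative, not recommendations for your workload):

```properties
# Pool size: start modest; more connections also mean more load on Postgres.
spring.datasource.hikari.maximum-pool-size=20
# Warn when a connection is held longer than 60 s. Catches leaks in the act.
spring.datasource.hikari.leak-detection-threshold=60000
```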

Step 5: Application Logic and External Calls

If the database layer looks healthy—few queries, fast execution plans, no connection pressure—the bottleneck is elsewhere. In your code.

Blocking calls to external APIs, message queues, or file systems will block the request thread. One slow external service can stall your entire thread pool. Use WebClient for non-blocking HTTP when you can. Move work to @Async if it doesn’t need to complete before you respond. I’ve seen APIs that called five external services synchronously. Each took 200ms. That’s a second of latency before we’d even touched the database.
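A sketch of issuing two of those calls concurrently with WebClient. The Customer and Inventory types, the URIs, and OrderView are hypothetical:

```java
Mono<Customer> customer = webClient.get()
        .uri("/customers/{id}", id)
        .retrieve()
        .bodyToMono(Customer.class);

Mono<Inventory> inventory = webClient.get()
        .uri("/inventory/{id}", id)
        .retrieve()
        .bodyToMono(Inventory.class);

// zip subscribes to both at once: total latency is roughly the slower
// of the two calls, not their sum.
Mono<OrderView> view = Mono.zip(customer, inventory)
        .map(t -> new OrderView(t.getT1(), t.getT2()));
```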

Heavy serialization—converting large object graphs to JSON—can surprise you. A response with hundreds of nested objects might spend more time in Jackson than in the database. Profile with JProfiler, VisualVM, or IntelliJ’s built-in profiler. See where CPU time goes. Sometimes the fix is to trim the response, or to add @JsonView to avoid serializing fields you don’t need.
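A sketch of the @JsonView approach; the view classes, DTO fields, and controller are hypothetical:

```java
public class Views {
    public interface Summary {}
    public interface Detail extends Summary {}
}

public class OrderDto {
    @JsonView(Views.Summary.class)
    public Long id;

    @JsonView(Views.Summary.class)
    public String status;

    // The expensive nested graph only serializes for the detail view.
    @JsonView(Views.Detail.class)
    public List<OrderItemDto> items;
}

// List endpoint returns the trimmed view; a detail endpoint can use Detail.
@JsonView(Views.Summary.class)
@GetMapping("/orders")
public List<OrderDto> listOrders() {
    return orderService.findAll();
}
```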

Tracing ties it together. OpenTelemetry with Spring Boot gives you automatic spans for HTTP, JPA, and JDBC. A single trace shows the full breakdown: 50ms in the controller, 2.5 seconds in the database, 10ms in serialization. No guessing. You see exactly where the time went. I recommend instrumenting early. The first time you need it, you’ll be glad it’s there.
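With Spring Boot 3 and Micrometer Tracing, a minimal configuration looks something like this. The OTLP endpoint is illustrative, and 100% sampling is for diagnosis, not steady state:

```properties
# Sample every request while diagnosing; lower this once you're done.
management.tracing.sampling.probability=1.0
# Where to ship spans (any OTLP-compatible collector).
management.otlp.tracing.endpoint=http://localhost:4318/v1/traces
```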

A Practical Checklist

When you’re in the trenches, it helps to have a sequence. Here’s the one I follow:

  1. Identify the slow endpoint—Actuator metrics. Know which one, and what the latency distribution looks like.
  2. Count the queries—Hibernate SQL logging and statistics. If it’s dozens per request, you’ve likely found the problem.
  3. Fix N+1—JOIN FETCH, EntityGraph, or batch fetch. This alone has fixed more performance issues for me than any other single change.
  4. Inspect query plans—EXPLAIN ANALYZE in PostgreSQL. Find the slow queries and understand why they’re slow.
  5. Add indexes—On filter and join columns. Not everywhere. Where the plan shows they’ll help.
  6. Check the connection pool—HikariCP metrics. Rule out starvation.
  7. Profile application logic—If the database is fast, the CPU or I/O is somewhere else. Find it.

Start at the top. Work your way down. Most of the time, you’ll find the answer in steps 2 or 3. When you do, fix that first. Then measure again. Verify the fix before moving on.

The discipline that has served me best over the years is this: never optimize without measuring. Never assume. And when you think you’ve fixed it, measure one more time. Our intuition about performance is often wrong. The data is not.