What Is Latency? How It Differs from Downtime
What latency means, how it differs from downtime, what causes high latency, and how to measure and reduce it for better website performance.
Your website is up. Nobody is getting an error page. But visitors are waiting 6 seconds for pages to load, and they are leaving before the content appears. The site is technically available, but the experience is terrible.
That is the difference between downtime and latency. Downtime means the site is unreachable. Latency means the site is reachable but slow. Both cost you visitors, revenue, and search rankings, but they have different causes and different solutions.
Latency Defined
Latency is the time delay between a request and a response. When a visitor clicks a link on your website, latency is the time between their click and the moment the first byte of data arrives in their browser.
Latency is measured in milliseconds (ms). A latency of 50ms feels instant. A latency of 200ms is noticeable. A latency of 1,000ms (one second) feels slow. Once the delay exceeds 3 seconds, a meaningful percentage of visitors will leave before the page finishes loading.
Latency is not the same as total page load time. Page load time includes latency plus the time it takes to download all assets (images, CSS, JavaScript) and render the page. Latency is just the initial delay: the time spent waiting before anything starts happening.
Types of Latency
Latency is not a single thing. The total delay a user experiences is the sum of several different types of latency.
Network latency. The time it takes for data to travel across the network between the user's device and your server. This is primarily a function of physical distance and the number of network hops in between. A user in New York connecting to a server in New York might experience 5ms of network latency. The same user connecting to a server in Sydney might experience 200ms.
Server processing latency. The time your server spends processing the request after it arrives. This includes running application code, querying databases, and assembling the response. A well-optimized server might process a request in 10ms. A slow database query could push this to 2,000ms or more.
DNS resolution latency. Before a browser can connect to your server, it needs to resolve your domain name to an IP address. DNS lookups typically add 20 to 100ms, though caching reduces this for repeat visitors.
TLS/SSL handshake latency. Establishing an encrypted HTTPS connection requires a handshake between the client and server. This adds one to two round trips of network latency (one with TLS 1.3, two with TLS 1.2). On a fast connection to a nearby server, this might add 10ms. On a slow connection to a distant server, it could add 200ms or more.
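To see how these components add up, here is a rough back-of-the-envelope sketch in Python. The function name, its parameters, and the simplified model (one round trip for the TCP handshake, one or two for TLS, one for the request itself) are assumptions made for illustration, and the example figures are treated as round-trip times.

```python
def estimate_first_byte_delay_ms(
    rtt_ms: float,
    server_ms: float,
    dns_ms: float = 0.0,
    tls_round_trips: int = 1,
    new_connection: bool = True,
) -> float:
    """Rough estimate of the delay before the first byte arrives.

    Simplified model (an assumption, not a measurement):
      DNS lookup
      + TCP handshake (1 round trip) and TLS handshake (1 or 2 round trips),
        only when a new connection has to be opened
      + 1 round trip for the request and the start of the response
      + server processing time
    """
    setup_round_trips = (1 + tls_round_trips) if new_connection else 0
    return dns_ms + (setup_round_trips + 1) * rtt_ms + server_ms


# Nearby server: 5 ms round trip, 10 ms of processing, 20 ms DNS lookup.
nearby = estimate_first_byte_delay_ms(rtt_ms=5, server_ms=10, dns_ms=20)

# Distant server: 200 ms round trip, same processing and DNS cost.
distant = estimate_first_byte_delay_ms(rtt_ms=200, server_ms=10, dns_ms=20)

print(f"nearby: {nearby:.0f} ms, distant: {distant:.0f} ms")
```

In this model the network round trip is multiplied several times over on a new connection, which is why distance tends to dominate the total for far-away users even when the server itself is fast.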
Latency vs Downtime
Latency and downtime are both availability problems, but they are fundamentally different.
| | Downtime | High Latency |
|---|---|---|
| Definition | Site does not respond at all | Site responds, but slowly |
| User experience | Error page or timeout | Spinning loader, slow rendering |
| Detection | Binary: up or down | Gradual: fast, acceptable, slow, unusable |
| Monitoring approach | Uptime checks (pass/fail) | Response time tracking (continuous) |
| Impact threshold | Immediate (any downtime is felt) | Progressive (small increases go unnoticed) |
Downtime is a binary state. Your site is either up or down. Latency is a spectrum. Response times can degrade gradually over hours or weeks before anyone notices.
This makes latency harder to catch. Your uptime monitoring tool reports 100% uptime, but response times have crept from 200ms to 800ms over the past month. The site is technically "up" every time it is checked, but the user experience has worsened significantly.
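One way to surface this kind of slow creep is to compare recent response times against an older baseline. The sketch below is a minimal illustration; the function name, the window size, and the synthetic sample data are all assumptions, not the behavior of any particular monitoring tool.

```python
from statistics import median


def response_time_drift(samples_ms, window):
    """Ratio of the recent median response time to the baseline median.

    The baseline is the first `window` samples and the recent period is
    the last `window` samples. A ratio of 2.0 means response times doubled.
    """
    if window <= 0 or len(samples_ms) < 2 * window:
        raise ValueError("need at least 2 * window samples")
    baseline = median(samples_ms[:window])
    recent = median(samples_ms[-window:])
    if baseline <= 0:
        raise ValueError("baseline response time must be positive")
    return recent / baseline


# Synthetic data: one check per day, every check "up",
# but response time creeps from 200 ms to 800 ms over a month.
daily_checks_ms = [200 + 20 * day for day in range(31)]

drift = response_time_drift(daily_checks_ms, window=7)
print(f"recent week is {drift:.1f}x slower than the first week")
```

Medians are used here rather than averages so that a single slow check does not distort either window.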
Good monitoring tools track response time alongside availability. If yours only reports up/down status, you are missing half the picture. See the uptime monitoring guide for guidance on setting up comprehensive monitoring.
What Causes High Latency?
Geographic Distance
The speed of light is fast, but it is finite. Data traveling through fiber optic cables crosses the continental United States in about 20ms one way. A round trip (request and response) doubles that to about 40ms. Add network routing overhead and the delay is typically 60 to 80ms for transcontinental requests.
For international traffic, latency is even higher. A user in London accessing a server in Singapore can expect 150 to 250ms of network latency, just for the data to travel there and back.
The solution is serving content from locations closer to your users, either through a CDN or by deploying application servers in multiple regions. See CDN and uptime for how content delivery networks help.
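The physical lower bound on network latency can be estimated from distance alone. The sketch below assumes light travels through fiber at roughly two-thirds of its speed in a vacuum and uses approximate straight-line distances; real cable routes are longer and add routing overhead, so measured latency will be higher than these figures.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
# Assumption: light in optical fiber travels at roughly two-thirds of c.
FIBER_VELOCITY_FACTOR = 2 / 3


def fiber_round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip time over fiber for a given one-way distance."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_PER_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_s * 1000


# Approximate straight-line distances (assumptions for illustration).
routes_km = {
    "New York to San Francisco": 4_100,
    "London to Singapore": 10_850,
}

for route, distance_km in routes_km.items():
    print(f"{route}: at least {fiber_round_trip_ms(distance_km):.0f} ms round trip")
```

No amount of server tuning can get below this floor, which is why moving content closer to users is the only fix for distance-related latency.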
Slow Database Queries
Slow database queries are the most common cause of server-side latency. A query that scans a million rows to find 10 results, a missing index on a frequently queried column, or a complex join across large tables can add seconds to your response time.
Database latency often gets worse over time as your data grows. A query that ran in 5ms when the table had 10,000 rows might take 500ms when it has 10 million rows. This is the kind of gradual degradation that flies under the radar until it becomes a serious problem.
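The effect of a missing index can be seen in a query plan. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `orders` table, its columns, and the index name are hypothetical, and other databases expose query plans through their own `EXPLAIN` variants.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    ((i % 1000, i * 0.5) for i in range(10_000)),
)

lookup = "SELECT id, total FROM orders WHERE customer_id = ?"

# Without an index, SQLite has to scan every row to find the matches.
plan_before = query_plan(conn, lookup, (42,))

# With an index on the filtered column, it can jump straight to them.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The full scan gets slower as the table grows, while the indexed lookup stays close to constant, which matches the gradual degradation described above.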
Unoptimized Application Code
Inefficient algorithms, unnecessary computation, or synchronous calls that could be asynchronous all add processing time. A function that makes three sequential API calls of similar duration, when it could make them in parallel, roughly triples the latency for that operation.
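The difference between sequential and parallel calls can be sketched with `asyncio`. The `fake_api_call` coroutine below is a stand-in that simply sleeps, and the endpoint names are made up; a real service would use an async HTTP client in its place.

```python
import asyncio
import time

ENDPOINTS = ("users", "orders", "prices")  # hypothetical API names


async def fake_api_call(name: str, delay_s: float) -> str:
    """Stand-in for a network call: waits, then returns the endpoint name."""
    await asyncio.sleep(delay_s)
    return name


async def sequential(delay_s: float) -> list:
    # Each call starts only after the previous one has finished.
    return [await fake_api_call(name, delay_s) for name in ENDPOINTS]


async def concurrent(delay_s: float) -> list:
    # All calls are in flight at once; total time is about the slowest call.
    return await asyncio.gather(*(fake_api_call(name, delay_s) for name in ENDPOINTS))


def timed(coroutine_fn, delay_s: float):
    """Run a coroutine function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coroutine_fn(delay_s))
    return result, time.perf_counter() - start


sequential_result, sequential_s = timed(sequential, 0.05)
concurrent_result, concurrent_s = timed(concurrent, 0.05)
print(f"sequential: {sequential_s * 1000:.0f} ms, concurrent: {concurrent_s * 1000:.0f} ms")
```

This only helps when the calls are independent of each other. If the second call needs the result of the first, they have to stay sequential.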
Too Many HTTP Requests
Every resource on your page (images, scripts, stylesheets, fonts) requires a separate HTTP request. Each request incurs latency. A page that loads 80 resources has 80 opportunities for delay. Reducing the number of requests through bundling, sprites, and lazy loading directly reduces total latency.
Third-Party Scripts
Analytics tags, advertising scripts, social media widgets, and chat widgets all load from external servers. You have no control over the latency of these third-party requests. A slow analytics service can delay your entire page load, especially if the script blocks rendering.
Server Resource Contention
When your server is running at high CPU or memory utilization, it takes longer to process each request. Multiple applications competing for the same server resources slow each other down. This is why performance degrades during traffic spikes, even before the server is completely overwhelmed.
DNS Issues
A slow or unreliable DNS provider adds latency to the first request from each visitor. If DNS resolution takes 300ms instead of 30ms, every new visitor's first page load is 270ms slower.
How to Measure Latency
Time to First Byte (TTFB)
TTFB measures the time from the start of the request to the first byte of the response arriving. It captures network latency, server processing time, and the initial response delivery. TTFB is the single best metric for server-side latency.
A good TTFB for a typical website is under 200ms. Under 100ms is excellent. Over 600ms indicates a problem worth investigating.
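A rough TTFB measurement can be taken with Python's standard `http.client` module. The sketch below is an approximation: it stops the clock once the status line and headers have been read, and the function names and the label for the 200 to 600ms band are assumptions, since the thresholds above do not name that range.

```python
import http.client
import time


def measure_ttfb_ms(host, path="/", port=None, use_tls=True, timeout=10.0):
    """Approximate time to first byte for a single GET request, in ms.

    The timer covers DNS resolution, connection setup, sending the request,
    and waiting until the response status line and headers have been read.
    """
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)  # connects automatically if needed
        conn.getresponse()  # returns once the headers have arrived
        return (time.perf_counter() - start) * 1000
    finally:
        conn.close()


def classify_ttfb(ttfb_ms: float) -> str:
    """Map a TTFB value onto the thresholds described above."""
    if ttfb_ms < 100:
        return "excellent"
    if ttfb_ms < 200:
        return "good"
    if ttfb_ms <= 600:
        return "acceptable"  # assumption: this band is not named in the text
    return "investigate"
```

A call such as `classify_ttfb(measure_ttfb_ms("example.com"))` would report on a single request. One sample is noisy, so repeated measurements from several locations give a more reliable picture.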
Ping and Round-Trip Time
A ping measures the round-trip time for a small packet to travel to a server and back. It isolates network latency from server processing time. If your ping to the server is 150ms but your TTFB is 800ms, roughly 650ms is being spent on something other than a single network round trip: mostly server-side processing, plus connection setup (DNS, TCP, and TLS) if the connection is new. See what is a ping for more on how ping monitoring works.
Response Time in Monitoring Tools
Most uptime monitoring tools record the full response time for each check. This includes DNS resolution, TCP connection, TLS handshake, and the complete response download. Tracking this metric over time reveals latency trends.
Set up alerts not just for downtime but for response time thresholds. If your average response time exceeds 1 second, you want to know about it before it gets worse.
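A response time alert of this kind can be sketched as a rolling average over the most recent checks. The class name, the window size, and the rule of waiting for a full window before alerting are assumptions for illustration; monitoring tools implement their own alerting logic.

```python
from collections import deque


class ResponseTimeAlert:
    """Flags when the rolling average response time exceeds a threshold."""

    def __init__(self, threshold_ms: float = 1000.0, window: int = 5):
        self.threshold_ms = threshold_ms
        # A bounded deque drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def record(self, response_ms: float) -> bool:
        """Add one check result; return True if an alert should fire."""
        self.samples.append(response_ms)
        window_full = len(self.samples) == self.samples.maxlen
        average = sum(self.samples) / len(self.samples)
        # Wait for a full window so one slow check does not trigger an alert.
        return window_full and average > self.threshold_ms


alert = ResponseTimeAlert(threshold_ms=1000, window=3)
for response_ms in (400, 900, 1200, 1500, 1600):
    if alert.record(response_ms):
        print(f"alert: rolling average above 1000 ms after a {response_ms} ms check")
```

Averaging over a window trades a little detection speed for fewer false alarms from one-off slow responses.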
High latency is a leading indicator of downtime. When response times climb, the underlying cause (resource exhaustion, database overload, network saturation) often progresses to a full outage if left unchecked. Catching latency increases early gives you time to fix the problem before it takes the site down.
How to Reduce Latency
Use a CDN
A CDN caches your content on servers around the world, reducing the physical distance between your users and your content. For static assets, a CDN can cut response times by 50% to 90% for geographically distant users. For more on how CDNs interact with availability, see CDN and uptime.
Optimize Database Queries
Add indexes to frequently queried columns. Rewrite slow queries to be more efficient. Use query caching for results that do not change on every request. Monitor slow query logs to identify the worst offenders.
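Query caching can be as simple as keeping a result in memory for a fixed time. The sketch below is a minimal in-process cache; the class name, the time-to-live approach, and the `clock` parameter are illustrative assumptions, and a production setup would more likely use a shared cache so that all application servers benefit.

```python
import time


class TTLCache:
    """Keeps computed values for a fixed number of seconds."""

    def __init__(self, ttl_s: float, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}  # key -> (time stored, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, or call `compute()` and store it."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_s:
            return entry[1]  # still fresh: skip the expensive query
        value = compute()
        self._entries[key] = (now, value)
        return value
```

A caller would wrap the slow query in a function and pass it in, for example `cache.get_or_compute("top_products", run_top_products_query)`, where `run_top_products_query` is a hypothetical function that hits the database. The time-to-live should be short enough that stale results are acceptable for that page.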
Enable Compression
Gzip or Brotli compression reduces the size of HTML, CSS, and JavaScript responses by 60% to 80%. Smaller responses transfer faster, reducing latency. Most web servers support compression with a simple configuration change.
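The size reduction is easy to check with Python's built-in `gzip` module. The sample markup below is made up and far more repetitive than a real page, so it compresses better than typical content would; the function name is also an assumption.

```python
import gzip


def compression_savings(payload: bytes, level: int = 6) -> float:
    """Fraction of bytes saved by gzip-compressing the payload."""
    compressed = gzip.compress(payload, compresslevel=level)
    return 1 - len(compressed) / len(payload)


# Synthetic, highly repetitive HTML (an assumption for illustration).
html = ('<li class="product"><a href="/item">Item</a></li>\n' * 200).encode("utf-8")

savings = compression_savings(html)
print(f"original: {len(html)} bytes, saved: {savings:.0%}")
```

Running the same function over a saved copy of a real page gives a more honest estimate of what compression would save in practice.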
Minimize HTTP Requests
Combine CSS and JavaScript files. Use CSS sprites for small images. Lazy-load images that are below the fold. Each eliminated request is one fewer request and response exchange for the browser to wait on.
Upgrade Your Server
Sometimes the simplest fix is more resources. A faster CPU processes requests more quickly. More RAM means less disk I/O. An SSD instead of a spinning disk dramatically reduces data access time. Moving from shared hosting to a VPS or dedicated server eliminates resource contention with other tenants.
Use a Faster DNS Provider
If DNS resolution is slow, switch to a faster DNS provider. Cloudflare, Google Cloud DNS, and Amazon Route 53 all offer fast, globally distributed DNS.
Keep Connections Alive
HTTP keep-alive (persistent connections) reuses TCP connections for multiple requests instead of opening a new connection for each one. This eliminates the TCP handshake and TLS negotiation for subsequent requests.
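On the client side, connection reuse looks like the sketch below, which sends several requests over one `http.client` connection. The function name and its parameters are assumptions; whether the connection actually stays open also depends on the server allowing persistent connections.

```python
import http.client


def fetch_many(host, paths, port=None, use_tls=True, timeout=10.0):
    """Fetch several paths over one persistent connection.

    Returns a list of (status code, body length) tuples, one per path.
    """
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read fully before the connection can be reused.
            body = response.read()
            results.append((response.status, len(body)))
    finally:
        conn.close()
    return results
```

Only the first request pays for the TCP and TLS handshakes. If the server closes the connection between requests, `http.client` reconnects on the next request, so the code still works but loses the saving.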
Key Takeaways
- Latency is the time delay between a request and a response. It is measured in milliseconds.
- Latency and downtime are different problems. Downtime is binary (up or down). Latency is a spectrum (fast to slow).
- High latency is a leading indicator of downtime. Catching it early prevents outages.
- The most common causes are geographic distance, slow database queries, and server resource contention.
- TTFB (Time to First Byte) is the best single metric for measuring server-side latency.
- Monitor both availability and response time. Uptime monitoring that only checks up/down status misses latency problems.