What Is an SLA? Service Level Agreements Explained

What a service level agreement (SLA) is, how SLAs work, what they include, and why they matter for uptime and website reliability.

A service level agreement is a contract between a service provider and a customer that defines what level of service the customer should expect. It puts specific numbers on things like uptime, response time, and support availability, then spells out what happens when those numbers are not met.

SLAs are everywhere in the technology world. Your hosting provider has one. Your cloud vendor has one. Your CDN, your email service, and your payment processor probably all have one. Understanding what SLAs actually say, and what they leave out, is essential for anyone running a website or online business. For a deeper look at uptime commitments specifically, see the uptime and SLA availability guide.

What Does an SLA Include?

An SLA is a formal document, but the core of it comes down to a few key components. Every SLA is structured differently, but you will find the same basic elements in nearly all of them.

Service Description

The SLA defines exactly what service is covered. This sounds obvious, but the details matter. A hosting provider's SLA might cover server hardware uptime but exclude network connectivity. A cloud platform's SLA might cover the compute service but not the management console. If the thing that broke is not listed in the SLA, the provider is not obligated to compensate you.

Read the service description carefully. Providers are precise about what is and is not included, and that precision is intentional.

Performance Metrics

The SLA specifies measurable targets for the service. The most common metric is uptime percentage, but SLAs can also include:

  • Availability: The percentage of time the service is operational. This is the headline number, like 99.9% or 99.99%.
  • Response time: How quickly the service responds to requests. This might be expressed as a latency target (e.g., p99 latency under 200ms).
  • Throughput: The capacity the service can handle, such as requests per second or bandwidth.
  • Support response time: How quickly the provider responds to support tickets, often tiered by severity.

These metrics are the teeth of an SLA. Without specific, measurable numbers, the agreement is just marketing language. For a detailed breakdown of how uptime percentages translate to real downtime, see uptime nines explained.
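To make the headline numbers concrete, here is a minimal Python sketch that converts an uptime percentage into a downtime allowance. The function name and the 30-day month are assumptions chosen for illustration; a real SLA defines its own calculation period.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Return the downtime (in minutes) an uptime target permits over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# Assumption: a 30-day month as the calculation period.
for target in (99.0, 99.9, 99.99):
    budget = allowed_downtime_minutes(target, 30)
    print(f"{target}% over 30 days allows about {budget:.1f} minutes of downtime")
```

Under that 30-day assumption, 99.9% works out to about 43 minutes of permitted downtime per month, and 99.99% to a little over 4 minutes.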

Measurement Method

Good SLAs specify how the metrics are measured. This includes:

  • Monitoring frequency: How often the service is checked (every minute, every five minutes).
  • Monitoring locations: Where checks originate. A service might be up in one region and down in another.
  • Calculation period: Whether uptime is calculated monthly, quarterly, or annually. Monthly calculations are more customer-friendly because a bad month is not averaged out over good months.
  • Exclusions: Time periods that do not count against the SLA, such as scheduled maintenance windows.

If the SLA does not specify how measurement works, you are relying on the provider's internal reporting. That is a weaker position for the customer.
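As a rough illustration of how these measurement choices change the result, the sketch below computes availability from a series of probe results and optionally ignores checks that fall inside excluded windows. The `Check` record, the function name, and the sample data are all hypothetical; each provider uses its own definitions.

```python
from dataclasses import dataclass


@dataclass
class Check:
    minute: int  # minutes since the start of the calculation period
    ok: bool     # True if the probe succeeded


def measured_availability(checks: list[Check],
                          excluded_windows: list[tuple[int, int]]) -> float:
    """Percentage of successful checks, ignoring checks inside excluded windows.

    Each window is a (start_minute, end_minute) pair with an exclusive end.
    """
    counted = [
        c for c in checks
        if not any(start <= c.minute < end for start, end in excluded_windows)
    ]
    if not counted:
        raise ValueError("no checks fall outside the excluded windows")
    return 100 * sum(c.ok for c in counted) / len(counted)


# Made-up data: one check per minute, failures during minutes 10-14 and 50-59.
checks = [Check(minute=m, ok=not (10 <= m < 15 or 50 <= m < 60)) for m in range(100)]

print(measured_availability(checks, []))                      # no exclusions
print(round(measured_availability(checks, [(50, 60)]), 2))    # maintenance excluded
```

In this made-up data, the same 100 checks yield 85% with no exclusions and about 94.4% once the ten-minute window is excluded, which is why the exclusion rules deserve as much attention as the target itself.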

Remedies and Credits

When the provider misses an SLA target, the SLA defines what happens next. The most common remedy is a service credit, which is a percentage discount on your next bill.

Typical credit structures look like this:

| Uptime Achieved | Service Credit |
|---|---|
| 99.0% - 99.9% | 10% |
| 95.0% - 99.0% | 25% |
| Below 95.0% | 50% |

Notice that service credits are almost never full refunds. A provider that is down for an entire day might owe you a 25% credit on your monthly bill. That credit does not cover the revenue you lost, the customers who left, or the emergency engineering work your team had to do. SLA credits are a gesture, not compensation. See the cost of website downtime for more on the real financial impact.
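The credit table above can be sketched as a simple lookup. This assumes a 99.9% target (so anything at or above it earns no credit) and treats each tier's lower bound as inclusive, since the table does not say which tier a boundary value such as exactly 99.0% falls into. The function name is hypothetical.

```python
def service_credit_percent(uptime_percent: float) -> int:
    """Map achieved monthly uptime to a credit tier from the example table.

    Assumption: lower bounds are inclusive and 99.9% or better earns no credit.
    """
    if uptime_percent >= 99.9:
        return 0
    if uptime_percent >= 99.0:
        return 10
    if uptime_percent >= 95.0:
        return 25
    return 50


# A full day of downtime in a 30-day month.
uptime_after_full_day_outage = 100 * (30 - 1) / 30
print(f"{uptime_after_full_day_outage:.2f}% uptime -> "
      f"{service_credit_percent(uptime_after_full_day_outage)}% credit")
```

A full day down in a 30-day month is roughly 96.67% uptime, which lands in the 25% tier under these assumptions.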

Responsibilities

SLAs typically outline what both parties are responsible for. The provider commits to maintaining the service at the specified level. The customer commits to using the service within its documented limits, keeping their own systems updated, and following the provider's operational guidelines.

If you overload a server beyond its rated capacity and it goes down, the provider will point to this section of the SLA. Customer responsibilities are often the escape clause that providers use to deny credit claims.

How SLAs Work in Practice

Reading an SLA and living with an SLA are different experiences. Here is how SLAs actually play out in the real world.

You Have to File a Claim

Most providers do not automatically issue credits when they miss an SLA. You have to notice the outage, document it, and file a claim within a specific window (often 30 days). If you miss the window, you get nothing. This is why uptime monitoring is not optional. You need an independent record of outages to back up your claims. Relying on the provider to tell you when they were down is like asking the fox to count the chickens.

Exclusions Are Broad

SLA exclusions can be surprisingly wide. Common exclusions include:

  • Scheduled maintenance (even if it happens during business hours)
  • Force majeure events (natural disasters, government actions)
  • Issues caused by the customer's own code or configuration
  • Beta or preview features
  • Free-tier services
  • DDoS attacks (the provider may exclude outages caused by attacks)

These exclusions mean that the effective SLA is weaker than the headline number suggests. A provider advertising 99.99% uptime might exclude enough scenarios that the practical guarantee is closer to 99.9%.

The Credit Is Capped

Service credits almost always have a cap, typically 30% to 50% of the monthly bill for the affected service. Even in a catastrophic outage lasting days, the maximum you can recover through the SLA process is a fraction of one month's fee. For serious financial losses, you would need to pursue other legal remedies, and most SLAs include liability limitation clauses that make that difficult too.

An SLA credit is not designed to make you whole after an outage. It is designed to demonstrate that the provider takes its commitments seriously. For real financial protection, you need your own monitoring, your own redundancy, and your own incident response plan.

SLAs Across Different Service Types

Hosting and Infrastructure SLAs

Hosting providers and cloud platforms typically offer SLAs between 99.9% and 99.99% uptime. These are among the most mature SLAs in the industry because uptime is directly measurable and the providers have long track records.

AWS, Google Cloud, and Azure all publish detailed SLAs for each service they offer. The SLA for virtual machines is different from the SLA for managed databases, which is different from the SLA for serverless functions. Each service has its own uptime target, its own credit schedule, and its own exclusions.

SaaS Application SLAs

SaaS providers offer SLAs that cover application availability. These are trickier because "available" can mean different things. A project management tool might load its homepage but fail to save changes. Is that "up" or "down"? The SLA should define this, but many SaaS SLAs are vague about partial failures.

Enterprise SaaS agreements often include negotiated SLAs with higher uptime guarantees and faster support response times than the standard offering. If your business depends heavily on a SaaS tool, negotiating the SLA is worth the effort.

API and Integration SLAs

APIs that your application depends on often have their own SLAs. Payment processors, email delivery services, and data providers all publish SLAs for their API endpoints. These matter because an API outage can cascade into your own application's downtime, even though your servers are running perfectly.

When evaluating your overall availability, you need to consider the SLAs of every external dependency. If you depend on three services that each promise 99.9% uptime, your theoretical maximum availability is lower than 99.9% (roughly 99.7% if their failures are independent) because any one of them going down affects you. For more on tracking reliability metrics, see incident response metrics.
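The dependency arithmetic can be sketched in a few lines of Python. The sketch assumes that every dependency is required for your application to work and that failures are statistically independent, which real outages often are not. The function name is hypothetical.

```python
import math


def composite_availability(dependency_percents: list[float]) -> float:
    """Combined availability (percent) when every dependency must be up.

    Assumption: failures are independent, so the probabilities multiply.
    """
    return 100 * math.prod(p / 100 for p in dependency_percents)


three_services = [99.9, 99.9, 99.9]
print(f"{composite_availability(three_services):.4f}%")
```

Correlated failures, such as several vendors sharing one cloud region, change the result, so treat this as a planning estimate rather than a guarantee.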

How to Evaluate an SLA

Not all SLAs are created equal. Here is what to look for when comparing providers or evaluating an SLA you are already subject to.

Check the Uptime Definition

Does the SLA define "available" clearly? Some SLAs only count a total outage as downtime. Degraded performance, where the service is slow but technically responding, might not count. Look for SLAs that include latency or performance thresholds in their availability definition.
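One way to picture the difference is a check classifier that treats a slow response as downtime. This is a sketch only: the 2,000 ms threshold, the function name, and the sample probes are assumptions, not values from any particular SLA.

```python
def is_available(status_code: int, latency_ms: float,
                 max_latency_ms: float = 2000.0) -> bool:
    """Count a probe as 'up' only if it succeeded and answered fast enough."""
    responded_ok = 200 <= status_code < 400
    return responded_ok and latency_ms <= max_latency_ms


# Hypothetical probes: (HTTP status, latency in milliseconds).
probes = [(200, 120.0), (200, 4500.0), (503, 80.0), (200, 300.0)]

# Outage-only definition: any successful response counts as up.
outage_only = sum(200 <= status < 400 for status, _ in probes)
# Stricter definition: slow responses count as down too.
with_latency = sum(is_available(status, ms) for status, ms in probes)

print(f"up by outage-only definition: {outage_only}/{len(probes)}")
print(f"up with latency threshold:    {with_latency}/{len(probes)}")
```

The 4.5-second response counts as "up" under the outage-only definition and as "down" once a latency threshold is part of the definition.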

Look at the Measurement Window

Monthly measurement is better for customers than annual. A 99.9% target measured annually allows about 8.8 hours of downtime per year, so a provider could have one terrible month with hours of downtime and still meet the annual target because the other 11 months were clean. Monthly measurement holds providers accountable on a shorter cycle.
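A short sketch shows the effect of the measurement window. It assumes a 99.9% target, a 30-day month, a 365-day year, and a hypothetical six-hour outage in a single month with no other downtime that year.

```python
def uptime_percent(downtime_minutes: float, period_days: float) -> float:
    """Uptime percentage for a given amount of downtime over a period."""
    total_minutes = period_days * 24 * 60
    return 100 * (total_minutes - downtime_minutes) / total_minutes


TARGET = 99.9                  # assumed SLA target
bad_month_downtime = 6 * 60    # hypothetical: six hours down in one month

monthly = uptime_percent(bad_month_downtime, 30)
annual = uptime_percent(bad_month_downtime, 365)  # other 11 months were clean

print(f"monthly view: {monthly:.3f}% -> target met: {monthly >= TARGET}")
print(f"annual view:  {annual:.3f}% -> target met: {annual >= TARGET}")
```

The same outage breaches the target when measured monthly but passes when spread over the year.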

Read the Exclusions

This is the most important section of any SLA. The exclusions tell you what the provider does not guarantee. Compare exclusions across competitors. Fewer and narrower exclusions mean a stronger SLA.

Verify Independent Monitoring

Do not rely solely on the provider's own uptime reporting. Use independent monitoring to track the actual availability of services you depend on. If there is a dispute about whether the SLA was met, your independent monitoring data is your evidence. Learn how to set up your own monitoring in the uptime monitoring guide.

Consider the Credit Value

Calculate what the maximum SLA credit would actually be worth in dollars. If you pay $100 per month for a service and the maximum credit is 30%, the most you will ever get back is $30, regardless of how bad the outage was or how much revenue you lost. If the financial exposure from an outage is large, you need additional risk mitigation beyond the SLA.
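The calculation is simple enough to sketch. The $100 fee and 30% cap come from the example above; the $5,000 outage cost is an invented figure used only to show the gap, and both function names are hypothetical.

```python
def max_credit_dollars(monthly_fee: float, credit_cap_percent: float) -> float:
    """Largest credit the SLA can pay out for one month."""
    return monthly_fee * credit_cap_percent / 100


def uncovered_loss(estimated_outage_cost: float, monthly_fee: float,
                   credit_cap_percent: float) -> float:
    """Portion of an outage's cost that the maximum credit does not cover."""
    credit = max_credit_dollars(monthly_fee, credit_cap_percent)
    return max(0.0, estimated_outage_cost - credit)


print(max_credit_dollars(100, 30))       # fee and cap from the example above
print(uncovered_loss(5000, 100, 30))     # 5000 is a made-up outage cost
```

Running the same numbers for your own fees and a realistic outage cost estimate gives a quick sense of how much risk stays with you.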

SLAs vs SLOs vs SLIs

These three terms are related but distinct. Understanding the differences helps you have more precise conversations about reliability.

SLI (Service Level Indicator) is a metric that measures some aspect of service quality. Examples: request latency, error rate, availability. SLIs are the raw measurements.

SLO (Service Level Objective) is an internal target for an SLI. Example: "99.95% of requests should complete in under 300ms." SLOs are goals the engineering team works toward. They are typically stricter than the external SLA.

SLA (Service Level Agreement) is a contract with customers that includes consequences for missing targets. The SLA is the external commitment, backed by credits or penalties.
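The relationship between an SLI and an SLO can be sketched in code. The 300 ms threshold and 99.95% objective come from the example above; the function names and the synthetic latency samples are assumptions for illustration.

```python
def latency_sli(latencies_ms: list[float], threshold_ms: float = 300.0) -> float:
    """SLI: percentage of requests that completed in under the threshold."""
    if not latencies_ms:
        raise ValueError("need at least one request to compute an SLI")
    fast = sum(1 for ms in latencies_ms if ms < threshold_ms)
    return 100 * fast / len(latencies_ms)


def meets_slo(sli_percent: float, slo_percent: float = 99.95) -> bool:
    """SLO: the internal target the SLI is compared against."""
    return sli_percent >= slo_percent


# Synthetic samples: 10,000 requests, a handful of them slow.
four_slow = [120.0] * 9996 + [900.0] * 4
six_slow = [120.0] * 9994 + [900.0] * 6

print(latency_sli(four_slow), meets_slo(latency_sli(four_slow)))
print(latency_sli(six_slow), meets_slo(latency_sli(six_slow)))
```

The SLA sits on top of this: it is the looser, contractual version of the same kind of target, with credits attached when it is missed.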

For a detailed comparison of these three concepts, see SLI vs SLO vs SLA.

Writing Your Own SLA

If you provide a service to customers, publishing an SLA demonstrates confidence in your product and builds trust with buyers. Here is how to approach it.

Start with your actual performance data. Review your uptime over the past 6 to 12 months. Set your SLA target below your actual performance, leaving a buffer. If you have been running at 99.97% uptime, an SLA of 99.9% gives you room for the occasional bad day without triggering credits.
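One way to sanity-check a candidate target is to replay it against your history and count how many months would have triggered credits. The 12 monthly figures below are invented for illustration, and the function name is hypothetical.

```python
def breaching_months(monthly_uptimes: list[float], candidate_sla: float) -> list[float]:
    """Return the months whose uptime fell below the candidate SLA target."""
    return [u for u in monthly_uptimes if u < candidate_sla]


# Invented 12-month uptime history (percent), for illustration only.
history = [99.99, 99.98, 99.97, 100.0, 99.93, 99.99,
           99.97, 99.96, 100.0, 99.98, 99.92, 99.99]

for candidate in (99.9, 99.95, 99.99):
    missed = breaching_months(history, candidate)
    print(f"SLA {candidate}%: {len(missed)} of {len(history)} months "
          f"would have triggered credits")
```

A target that your own history would have breached several times leaves no buffer; one that it never breached gives you room for a bad day.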

Define your terms precisely. Specify how you measure availability, what counts as downtime, what your maintenance windows are, and how customers can file claims. Vague language creates disputes.

Set credit tiers that are meaningful but sustainable. Credits should be large enough that customers feel acknowledged, but structured so that a single bad month does not bankrupt your business.

Review your SLA quarterly. As your infrastructure matures, you might tighten your commitments. As you add new features or integrations, you might need to adjust exclusions.

Key Takeaways

  • An SLA is a contract that defines measurable service quality targets and the consequences for missing them.
  • The most common SLA metric is uptime percentage, but SLAs can also cover latency, throughput, and support response time.
  • SLA credits compensate customers when targets are missed, but they rarely cover the full cost of an outage.
  • Always read the exclusions. They define what the provider does not guarantee, and they are often broader than you expect.
  • Use independent monitoring to verify SLA compliance. Do not rely on the provider's own reporting.
  • SLAs are one piece of the reliability puzzle. Pair them with your own monitoring, redundancy, and incident response planning.
