Commit 4c823e0

Refine SLO
1 parent 13f9c9e commit 4c823e0

1 file changed, 9 additions and 10 deletions

docs/sla-policy.md renamed to docs/slo-policy.md

Lines changed: 9 additions & 10 deletions
@@ -4,7 +4,7 @@ toc_max_heading_level: 2
 
 # Service Level Objective (SLO) Policy
 
-We are committed to providing reliable, high-quality services to our customers. This Service Level Objective (SLO) outlines our availability commitments, incident response procedures, and the transparency measures we employ to keep you informed about the health of our services.
+We are committed to providing reliable, high-quality services to our customers.
 
 ## Incident Classification & Response Times
 
@@ -15,14 +15,13 @@ We are committed to providing reliable, high-quality services to our customers.
 | **Severity 3 (Medium)** | Minor service degradation or non-critical functionality unavailable | 4 hours | 24 hours | As needed |
 | **Severity 4 (Low)** | Cosmetic issues or minor bugs with workarounds available | 1 business day | Best effort | As needed |
 
-## How We Catch Problems Fast
+## Incident Detection Procedures
 
-We've set up several systems to catch issues before you even notice them:
+We've set up several systems to identify incidents:
 
-- **Real-time Monitoring**: Our systems watch critical endpoints 24/7 and alert us the moment something goes wrong
-- **Automated Testing**: We regularly test authentication and pipeline workflows to catch issues before they affect you
-- **Error Tracking**: We use tools like Sentry to get instant notifications when errors occur
-- **Support Monitoring**: Our team watches support channels during business hours to catch issues you report
+- **Real-time Monitoring**: We have observability and monitoring on our core infrastructure.
+- **Error Tracking**: We use tools like Sentry to aggregate and produce notifications of errors.
+- **Support Monitoring**: Our team watches support channels during business hours to catch issues you report.
 
 ## Communication & Transparency
 
@@ -34,7 +33,7 @@ Here's what you can expect from us during an incident:
 - We've found the problem and are working on it
 - How bad it is and who's affected
 - When you'll hear from us next
-2. **Regular Updates** (every 2 hours for critical issues)
+2. **Regular Updates**
 - What's happening right now
 - What we're doing to fix it
 - Updated timeline if things change
@@ -45,7 +44,7 @@ Here's what you can expect from us during an incident:
 
 ### After We Fix It
 
-For serious incidents (Severity 1 & 2), we'll publish a full report within 5 business days that includes:
+For serious incidents (Severity 1 & 2), we'll create a Root Cause Analysis that, upon request, will be shared with customers, including:
 
 - **What Happened**: Step-by-step timeline of the incident
 - **Who Was Affected**: How many customers and what services were impacted
@@ -58,7 +57,7 @@ For serious incidents (Severity 1 & 2), we'll publish a full report within 5 bus
 This SLO doesn't apply to:
 
 - Beta or preview features (they're still experimental)
-- Scheduled maintenance (we'll give you 72 hours notice)
+- Scheduled maintenance
 - Issues outside our control (internet outages, AWS problems, etc.)
 - Problems you caused (wrong configuration, hitting rate limits, etc.)
 - Third-party service failures
