Added first draft for section on understanding and managing false positives
Context
The platform team has been investigating a series of networking issues that impact our customers' check performance to varying degrees (unusually long TTFB, ERR_NETWORK_CHANGED failures, etc.). While we work on minimizing these kinds of issues, it is also important to acknowledge that we're usually not dealing with solvable bugs here but with "sometimes stuff goes wrong on the internet" issues. To quote Ben:
In the last 30 days our daily max failure rate for slow TTFB checks is around ~0.01% and never above ~0.0216%. I actually agree with Yves from AWS here. No one can guarantee 0% packet failure rate when public internet is involved. The rates we see are totally acceptable™. Said that, we cannot fix this "with tech" but with education. Instead, we should write docs on how internet works, what is an accepted failure rate at Checkly, when, why and how to use retries [...]
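
To make the quoted rates a bit more concrete for the docs, here is a rough back-of-the-envelope sketch of what they could mean for a single check and how a retry changes the picture. This is not how Checkly computes anything internally: the function names, the one-run-per-minute schedule, and above all the assumption that attempts fail independently are mine, and correlated network incidents will break that assumption in practice.

```python
# Back-of-the-envelope sketch; all names and the schedule are hypothetical.

def expected_failures(failure_rate: float, runs: int) -> float:
    """Expected number of failed runs for a given per-run failure rate."""
    return failure_rate * runs


def failure_rate_with_retries(failure_rate: float, retries: int) -> float:
    """Chance that the first attempt AND every retry fail.

    Assumes attempts fail independently of each other, which is optimistic
    when a network incident lasts longer than the retry interval.
    """
    return failure_rate ** (retries + 1)


if __name__ == "__main__":
    daily_max_rate = 0.0216 / 100   # ~0.0216 %, the daily max quoted above
    runs_per_day = 24 * 60          # assumption: one check run per minute

    print("expected failed runs per day:",
          expected_failures(daily_max_rate, runs_per_day))
    print("failure rate with one retry:",
          failure_rate_with_retries(daily_max_rate, 1))
```

Under these assumptions a single retry pushes the effective failure rate down by several orders of magnitude, which is the main argument for recommending retries over chasing individual network blips.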
This docs draft is a first attempt at a knowledge base article that can be shared with our customers and users (e.g. in support cases on this topic). I'd love to get some feedback on whether this is going in the right direction, as I'm pretty new to the topic myself. Open questions I have:
- Does it make sense to mention private locations here as well, as a way to stay in charge of the infrastructure (and any server/networking issues)?
- The more technical depth we can add, the better. Is there anything I am missing, or anything I could go into more detail on?
- Adding a more technical troubleshooting / "what to look out for" guide such as https://www.pingdom.com/blog/pingdom-says-my-site-is-down-but-it-is-not/ would be quite helpful. If we decide to do this, I’d probably move the “What Is an Accepted Failure Rate at Checkly?” section into it. Curious to hear what you think.
Reference
https://www.notion.so/checkly/Product-Analysis-Networking-Issues-5014d3e76685418ca1699bd1c0db320b
| Name | Status | Preview | Comments | Updated (UTC) |
|---|---|---|---|---|
| checklyhq-com | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Oct 16, 2024 0:22am |
Preview available at: https://checklyhq-com-git-false-positives-checkly.vercel.app/docs/monitoring/false-positives/