Invalid No Data Alerts
Incident Report for Scout
Postmortem

At 12:18pm Eastern today, Level3 experienced a router issue within their network that disrupted connections passing through this upstream.

Traffic was re-routed through other available providers, and no other providers were impacted by this outage. The Level3 upstream connection is restored and stable at this time, and the provider is investigating the issue with their own vendors.

Network issues like this can be difficult for us to detect because they don't impact everyone. For example, this outage didn't impact our outside monitoring systems, so we weren't notified of the issue.

We've added a trigger on our network interfaces that alerts us to sustained drops in network traffic, which should help us identify this kind of problem more quickly in the future.
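As an illustration only, here is a minimal Python sketch of how a "sustained drop" check on an interface's traffic rate could work. It is not the actual trigger configuration: the class name, the rolling-baseline approach, and the thresholds (`baseline_size`, `drop_ratio`, `sustain`) are all assumptions chosen for the example.

```python
from collections import deque
from statistics import mean


class SustainedDropDetector:
    """Hypothetical detector: flags a sustained drop versus a rolling baseline."""

    def __init__(self, baseline_size=30, drop_ratio=0.5, sustain=5):
        self.baseline = deque(maxlen=baseline_size)  # recent "normal" samples
        self.drop_ratio = drop_ratio  # below this fraction of baseline is "low"
        self.sustain = sustain        # consecutive low samples needed to alert
        self.low_streak = 0

    def observe(self, rate):
        """Record one sample (e.g. bytes/sec); return True if an alert should fire."""
        if len(self.baseline) < self.baseline.maxlen:
            # Still warming up: collect baseline samples and never alert.
            self.baseline.append(rate)
            return False
        if rate < self.drop_ratio * mean(self.baseline):
            # Low sample: count it, but keep it out of the baseline so the
            # baseline does not drift downward during an outage.
            self.low_streak += 1
        else:
            # Normal sample: reset the streak and refresh the baseline.
            self.low_streak = 0
            self.baseline.append(rate)
        return self.low_streak >= self.sustain
```

Requiring several consecutive low samples keeps a single brief dip from firing an alert, while a partial outage that cuts traffic for a sustained period will.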

Posted Apr 28, 2014 - 14:40 MDT

Resolved
This incident has been resolved.
Posted Apr 28, 2014 - 13:35 MDT
Update
We're talking with our hosting provider to get more background on the network outage.
Posted Apr 28, 2014 - 11:06 MDT
Update
It appears that all servers are reporting again. The issue appears to have impacted 2.3% of servers reporting to Scout.

It looks like a network issue - all of our Pingdom checks are passing and there are no stability issues on our servers. We've had multiple reports of issues coming from the EC2 us-west-2 region and Terremark Cloud in Virginia.
Posted Apr 28, 2014 - 10:47 MDT
Investigating
We're hearing reports of invalid "No Data" Alerts being sent from scoutapp.com. We're investigating the issue.
Posted Apr 28, 2014 - 10:25 MDT