From our hosting provider:
"On further investigation, this issue was due to an additionally damaged PDU from Friday's surge. It has been replaced with a new PDU and service has been restored. We are still awaiting an RCA from Zayo, and will be reaching out with more information as it becomes available."
Zayo has just released an RCA regarding the original incident on Nov 12th, which triggered this followup incident. The RCA mentions a PDU inspection, but it appears that one PDU passed that inspection and later failed.
Scout's Next Steps:
This incident reinforces our efforts to build better redundancy against networking issues that impact our datacenter. The majority of incidents server monitoring has experienced in 2016 were network-related rather than application issues. Improving that redundancy is our priority for server monitoring entering 2017.