The Gandi Community

[fixed] Incident on Hosting Filer

We have detected an anomaly on a hosting filer impacting several customer servers.  Our teams are currently working to resolve the issue as soon as possible.  We will update this notice as more information becomes available.

 

14:20 (GMT) / 10:20 (EST): We are still looking for the root cause of the problem before restarting your servers.

 

15:45 (GMT) / 11:45 (EST): Unfortunately, at this stage we have no additional information to relay.  Our entire team is mobilised to identify the cause of the problem and re-establish service as soon as possible.

 

17:00 (GMT) / 12:00 (EST): The attempt to transfer to the backup storage controller did not yield a satisfactory result.

 

18:30 (GMT) / 13:30 (EST): We have identified two or three potential sources of the problem, and our teams are attempting to apply the appropriate kernel patches.  The problem is centered around disk-write operations.  The “bug” appears to be known to Sun, but so far no solution is.

 

20:30 (GMT) / 15:30 (EST): Still working on the issue.  Some disks now function, but not all of them.  Unfortunately we do not yet have an ETA to communicate, but we know that it will take several more hours. 🙁

 

20:50 (GMT) / 15:50 (EST): A new kernel is currently being compiled, and we will reboot the filer once it is installed.  (watch this space…)

 

23:00 (GMT) / 18:00 (EST): The new kernel has been compiled and is currently being tested on a staging filer. Once it has been tested, we will apply it to the broken storage unit.

 

00:00 (GMT) / 19:00 (EST): Victory! The filer seems to be back up and running properly. We will restart all servers and monitor them to make sure everything is OK. A detailed report will be sent tomorrow to all affected clients. Thank you for your patience.

 

We apologise for the inconvenience.