Delta Force-d to Reboot

With Delta re-establishing its normal routine in the wake of the power outage that took its computer systems offline, there has been much discussion of what the airline could and should have done to prevent such an event. While Delta is the most high-profile airline to encounter such a problem, it is certainly not the only one: Quartz has counted 24 major tech problems for airlines since 2015. In reference to Delta’s problems, John Parkinson, affiliate partner at Waterstone Management Group, said, "Without knowing more about what really happened, you don't know whether it's a black swan event or whether this was a piece of carelessness or cost cutting or poor redundancy design, but that the root indicates there's a bit of system architecture issue going on here."

Companies looking to avoid similar issues can take several steps:

  • Be careful about centralizing too many functions, as doing so can create single points of failure. In such situations, Parkinson notes, "As soon as one piece fails, you tend to get cascade failures."
  • Plan for redundancy, with backup systems in place to prevent outages when primary systems are lost, and also ensure those backup systems are fully operational.
  • Confirming the operational status of backups is useful, but it’s also necessary to test backup systems regularly. Zubin Irani, founder and CEO of cPrime, observes that companies “might test [backups] when they put it in, but I guarantee you, I rarely see companies do annual testing on redundancy.”
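
To make the last two points concrete, here is a minimal sketch of a failover drill in Python. It is only an illustration under stated assumptions, not a description of Delta's systems or of any particular product: the System class, the pick_active and simulate_primary_loss functions, and the names "primary-dc" and "backup-dc" are all hypothetical, and the health probes are stand-in lambdas where a real drill would run actual checks.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class System:
    """Hypothetical system with a health probe (an assumption for this sketch)."""
    name: str
    is_healthy: Callable[[], bool]


def pick_active(systems: List[System]) -> Optional[System]:
    """Return the first healthy system in priority order, or None if all are down."""
    for system in systems:
        if system.is_healthy():
            return system
    return None


def simulate_primary_loss(systems: List[System]) -> Optional[System]:
    """Pretend the first (primary) system is gone and see which backup takes over."""
    return pick_active(systems[1:])


# Stand-in probes: the primary is up, but the backup has silently gone stale.
primary = System("primary-dc", lambda: True)
stale_backup = System("backup-dc", lambda: False)

print(pick_active([primary, stale_backup]).name)       # primary-dc
print(simulate_primary_loss([primary, stale_backup]))  # None
```

Day to day, everything looks fine because the primary answers. Only the drill, which deliberately ignores the primary, reveals that nothing would take over. Running such a drill on a schedule, rather than once at installation, is the kind of regular testing Irani describes.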

Source:

http://www.ciodive.com/news/what-cios-can-learn-from-the-delta-outage/424105/