Certainly we can never be fully prepared for something as rare as Sandy, but there are lessons we can pull from the last couple of days, and I am sure there will be many more in the next few weeks. I read an interesting article in the NY Times this morning that relates to planning, and a few items in particular stuck in my head.
I have talked about critical infrastructure as a "system of systems", tightly interwoven in some cases and loosely connected in others. One sentence from the article relates to this:
"As more of life moves online, damage to critical Internet systems affect more of the economy, and disasters like Hurricane Sandy reveal vulnerabilities from the sometimes ad hoc organization of computer networks."
Much like the interconnected systems of gas, electricity, transportation, finance, telecommunications and others, the Internet arose from the interconnection of very different systems that were built for very different reasons. As Internet services grew, so did the companies that provide them, and this in turn led to geographic dispersal of capabilities and further interconnectedness through telecom and power systems. This growth naturally means greater opportunity for interruption, simply because the target space is larger. In theory it also means greater opportunity for high availability and reliability, but that only works when the specific service is built with that in mind. The moral here is that you need to ensure that the services you pick at least meet the reliability needs of the service you offer.
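To make that moral concrete, here is a rough sketch of the arithmetic. It assumes the service needs every one of its dependencies to be up at the same time and that the dependencies fail independently, which an event like Sandy shows is an optimistic assumption. The dependency names and availability figures are made up for illustration.

```python
from math import prod

def serial_availability(availabilities):
    """Combined availability when every dependency must be up at once.

    Assumes failures are independent; correlated failures (one storm taking
    out power and telecom together) make the real figure worse.
    """
    return prod(availabilities)

def meets_target(availabilities, target):
    """True if the combined availability is at least the target."""
    return serial_availability(availabilities) >= target

# Hypothetical figures, not taken from any real provider.
dependencies = {"hosting": 0.999, "power": 0.9995, "telecom": 0.999}
combined = serial_availability(dependencies.values())
ok = meets_target(dependencies.values(), target=0.999)
```

With these numbers the combined figure comes out lower than any single dependency, so a service promising 99.9% cannot get there on three providers that each only just meet that mark.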
Another item that jumped out at me was raised in relation to the power situation.
"Power is the primary worry, since an abrupt network shutdown can destroy data, but problems can also stem from something as simple as not keeping a crisis plan updated."
So when should a crisis plan be updated? Certainly it should be looked at annually to ensure that the plan itself is in line with business needs, but awareness of the environment you are operating in should also prompt you to consider whether the current situation will have an impact on the business. Is a hurricane, or some other naturally occurring but foreseeable event, bearing down on facilities that you rely on, whether they are your own or those of service providers? Has the geopolitical climate changed such that the threat of cyber or physical terrorism against a facility has become a more significant risk? These are just some examples of situations that should have you pulling out your crisis plan to check whether it needs to be updated or altered.
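The two kinds of trigger described above, the calendar and the environment, can be sketched as a simple check. This is only an illustration: the annual interval comes from the paragraph above, while the function name and the free-text trigger descriptions are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "annually" is treated as 365 days between reviews.
REVIEW_INTERVAL = timedelta(days=365)

def plan_review_reasons(last_reviewed, today, active_triggers):
    """Return the reasons the crisis plan should be pulled out today.

    An empty list means no review is currently called for.
    """
    reasons = []
    if today - last_reviewed > REVIEW_INTERVAL:
        reasons.append("annual review overdue")
    # Any foreseeable event or change in threat level is a reason on its own.
    reasons.extend(f"trigger: {t}" for t in active_triggers)
    return reasons

reasons = plan_review_reasons(
    last_reviewed=date(2012, 1, 15),
    today=date(2012, 10, 28),
    active_triggers=["hurricane forecast for a facility we rely on"],
)
```

Here the plan was reviewed within the year, yet the forecast alone is enough to put it back on the table.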
Finally there was one element in this article that demonstrates the need for planning.
"Another downtown building ... had one generator in the basement, which was damaged by water. There is another generator, but it is on a higher floor. ... “We’ve got a truck full of diesel pulled up to the building, and now we’re trying to figure out how to get fuel up to the 19th floor.”"
It was great that they had planned for two generators, but a 19th floor backup without a plan for getting the fuel to where it needs to be? When thinking about your plan, do not overlook the little things. It is great to have redundancy, but if the redundancy is reliant on other systems then make sure you are aware of that and have plans to address any potential gaps.
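One way to catch gaps like this is to write down what each redundant asset needs and compare that against what the plan actually covers. The sketch below is loosely modeled on the generator story; the asset and dependency names are hypothetical, and which dependencies count as "planned" is an assumption for the example.

```python
# Hypothetical inventory: each redundant asset and what it needs to keep running.
depends_on = {
    "basement_generator": {"diesel_supply", "dry_basement"},
    "generator_19th_floor": {"diesel_supply", "fuel_lift_to_19th_floor"},
}

# Assumption: the only dependency with a documented, tested plan.
planned = {"diesel_supply"}

def unplanned_dependencies(depends_on, planned):
    """Map each asset to the dependencies that no plan covers."""
    gaps = {}
    for asset, needs in depends_on.items():
        missing = needs - planned
        if missing:
            gaps[asset] = sorted(missing)
    return gaps

def shared_dependencies(depends_on):
    """Dependencies common to every asset; losing one defeats the redundancy."""
    if not depends_on:
        return set()
    return set.intersection(*(set(needs) for needs in depends_on.values()))

gaps = unplanned_dependencies(depends_on, planned)
shared = shared_dependencies(depends_on)
```

The first check surfaces the missing fuel lift for the upper generator. The second is a reminder that both generators still lean on the same diesel supply, so that one dependency deserves a plan of its own.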
All of these ideas were prompted by a very rare and dramatic event, but the underlying principles are the same whether it is physical infrastructure or cyber infrastructure:
- Understand the business needs for operations in regular and emergency circumstances
- Understand the assets that you rely on and classify them into those you control and those that are outsourced
- Create a Crisis Plan and test it to ensure it meets the business needs and is executable
- Review the plan on a regular basis and, when significant events occur, consider their impact on the plan
Know what you have, know what you need, monitor to ensure steady state and be prepared for events that disrupt the steady state.