In “Different Seasons,” Stephen King wrote, “ … there’s no harm in hoping for the best as long as you’re prepared for the worst.” Disruptions occur and disasters strike. Sadly, on a personal level and in business, we often find ourselves acting in unplanned and uncoordinated ways when an incident occurs.
Preparation, on the other hand, allows us to act in pre-planned ways. Preparedness enables us to avoid chaos, or at least confusion. Officers in the U.S. military talk about the seven Ps: Proper Planning Practice Prevents Poor Performance. I know, I left one “P” out, but my mother told me to never use such a word.
Business Continuity and Disaster Recovery (BC/DR) planning is the process of developing the plans, processes and procedures to respond to a range of incidents. We start by identifying the essential functions of an organization, a step called Business Impact Analysis (BIA). In life, we set the same priorities: protection of family and friends, shelter, food and water and other life-giving essentials.
In his 7-Habits series, Dr. Stephen Covey asks “What matters most?” Once we’ve determined this, then we can understand the challenges to those things we hold most dear in life or most important at work.
The second step in the process is to understand the threats to the continuity of the business or our well-being. Performing risk analysis helps us prioritize those things that are essential. We can also choose the responses to threats. We can reduce them through mitigations, or we can avoid them by simply not engaging in a behavior or business; this is usually called “avoiding the risk.” We may also choose to transfer the pain through outsourcing or insurance. Lastly, we can simply choose an approach of “Que Sera, Sera” and accept that we’ll have to handle the response to a problem when it occurs.
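As a minimal sketch of how a risk analysis can drive prioritization, the Python example below scores each threat as likelihood times impact and sorts the results. The threat names, the 1-to-5 scales and the chosen responses are all hypothetical illustrations, not values taken from any standard.

```python
from dataclasses import dataclass


# Hypothetical 1-5 scales; real analyses often use monetary estimates instead.
@dataclass
class Threat:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain)
    impact: int      # 1 (negligible) to 5 (catastrophic)
    response: str    # "mitigate", "avoid", "transfer" or "accept"

    @property
    def score(self) -> int:
        # Simple qualitative risk score: likelihood multiplied by impact.
        return self.likelihood * self.impact


def prioritize(threats):
    """Return threats ordered from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


threats = [
    Threat("Power failure", likelihood=4, impact=2, response="mitigate"),
    Threat("Data center fire", likelihood=1, impact=5, response="transfer"),
    Threat("Malware outbreak", likelihood=3, impact=4, response="mitigate"),
]

for t in prioritize(threats):
    print(f"{t.name}: score {t.score}, planned response: {t.response}")
```

A ranking like this is only a starting point for the budgeting conversation; the scores are judgment calls and should be revisited as circumstances change.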
Having performed the BIA, identified the critical assets to support our business or our lifestyles and prioritized and budgeted, it’s time to get to work protecting those things. We also need to understand that we’ll need plans to respond to incidents and that we’ll need to execute them at some point.
Methods to Protect Your Assets
When I was a young padawan, one of my security masters emphasized to me that there is no security without physical security. Locked doors, windows and controlled physical access to computers and networking devices are essential asset protection. This extends to the security of the actual location: Heating/Ventilation/Air-Conditioning (HVAC), water and sewage must be protected as well. At home, you probably protect your belongings by keeping the doors locked and, potentially, your windows secured — at least on the ground floor. You depend on the security of the utility providers to ensure that the resources you use are safe and healthy. For businesses, someone with the title or job of “security manager” helps ensure these physical aspects.
A second way to protect your assets and information, whether personal or business-related, is through backups. You could choose to back up just your data or your entire set of systems. You could also store your backups locally, in a remote location or on the Internet. As I discuss in my blog about backup, each has its drawbacks and advantages. With local backups, you run the risk that your saved data would be destroyed in a disaster that affects your business. On the other hand, Internet-based solutions tie you to the product vendor and restoring your information may be slow or cumbersome.
Technically speaking, making backups is a form of using redundancy. That is, you’ll have multiple copies of your data and system information in the event of a disaster. There are other forms of redundancy, however.
You can keep multiple copies of your important information on duplicate hard drives. We call that technology Redundant Arrays of Inexpensive Disks, or RAID. You might even have duplicate copies of entire systems in, for example, a disaster recovery site or hosted on the Internet.
Much of your protection strategy comes from your recovery strategy. You will need to evaluate whether the risks to the business as a whole are greater than the risks to individual systems or the data stored there. If your recovery strategy focuses on making sure that you restore complete systems in case of emergency, you may choose a backup strategy centered on whole-system and bare-metal recovery. The emphasis here is to resume essential functions as part of disaster recovery. On the other hand, if individual files, folders, or databases are more important (as in the case of business continuity), you may choose a backup-and-restore strategy protecting your data instead. To learn more about these differences, I encourage you to visit my articles titled “Backing Up Your Data vs. Backing Up Your PC” and “The Benefits of Backing Up Your Data Online” on Global Knowledge’s blog site.
Incident Management and Business Continuity
Another of my security masters told me, as a young padawan, that there were two kinds of motorcycle riders. There are those who “have put the bike down” (as in fallen) and “those who will.” Extrapolating, there are only two kinds of networks: those that have been hacked and those that will be hacked. There are only two kinds of computers: those that have crashed and those that will crash. Depending on your point of view, you could see this dichotomy as fatalistically realistic or just morbid.
Based on your risk analysis and business strategy, you should develop Incident Management plans. At home, we would hope that you have fire-evacuation plans, for example. In business, you can build incident response processes that are completely detailed and step-by-step, so you can automatically implement your recovery processes. Airlines are the best example of having thorough, detailed incident response processes (more on this later).
Another lesson that can be learned from the airline industry is the importance of practice and rehearsal. Airlines spend huge amounts on airplane simulators for that purpose. Likewise, the amount of time that captains and first officers spend simulating normal activities and their responses to disastrous conditions makes flying far safer than just about any other activity we could undertake. One of my favorite videos about this relates to the US Airways flight (Cactus 1549) that landed in the Hudson River after losing both engines to “bird strikes.” This video is a great testament to the skills of the pilots, but also speaks volumes about being prepared.
One way to measure the effectiveness of a security policy, incident management and BC/DR is to calculate the Security Return on Investment (S/ROI). This is often difficult, but techniques exist. For example, if you accept that incidents will occur regardless of all the protections in place, then lowering the costs of the responses raises the S/ROI.
Please bear in mind that incidents can range from a minor disruption, where the simplest business continuity process could be implemented, to a complete recovery after a catastrophic disaster. The range might include, for example, everything from repairing a single virus-infected computer to responding to a rampant malware infestation on large sets of computers.
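One common simplification treats the return as the loss avoided minus the cost of the protection, divided by that cost. The sketch below is an assumption about how S/ROI might be calculated, not a formula prescribed here; the function name `security_roi` and all the figures are hypothetical.

```python
def security_roi(loss_before: float, loss_after: float, control_cost: float) -> float:
    """Return (avoided loss - control cost) / control cost.

    loss_before:  expected annual incident cost without the control
    loss_after:   expected annual incident cost with the control in place
    control_cost: annual cost of the control (planning, tools, rehearsal)
    """
    if control_cost <= 0:
        raise ValueError("control_cost must be positive")
    avoided = loss_before - loss_after
    return (avoided - control_cost) / control_cost


# Hypothetical figures: rehearsed response plans cut yearly incident costs
# from 100,000 to 40,000 and cost 20,000 a year to maintain.
baseline = security_roi(100_000, 40_000, 20_000)

# Lowering the cost of responses further (to 30,000) raises the result.
improved = security_roi(100_000, 30_000, 20_000)

print(f"baseline S/ROI: {baseline:.2f}")
print(f"improved S/ROI: {improved:.2f}")
```

The hard part in practice is estimating the two loss figures, which is why the calculation is often described as difficult.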
When Disaster Strikes
In many ways, BC/DR covers a continuous range of situations. Looking at business continuity, an organization might consider how to serve its customers during inclement weather such as a blizzard. In that same blizzard, parents might worry about feeding their family in the absence of electrical power.
Meeting the needs of an organization’s customers or a family’s responsibilities starts with the risk analysis and the defined response processes when something goes wrong. Understanding these threats leads to pre-planned activities.
Part of the incident response process is the implementation of checklists so that events can be handled in a consistent and repeatable way. Again, the airline industry spends a great deal of time training its employees and also building checklists for every imaginable type of crisis.
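To make the idea of a repeatable checklist concrete, here is a minimal Python sketch that walks through steps in a fixed order and records who completed each one and when. The `Checklist` class, the step wording and the "on-call technician" role are hypothetical; a real organization would write its own steps from its risk analysis.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Checklist:
    """An ordered list of steps plus a timestamped log of completions."""
    name: str
    steps: list
    log: list = field(default_factory=list)

    def is_done(self) -> bool:
        return len(self.log) == len(self.steps)

    def complete_next(self, who: str) -> str:
        """Mark the next pending step as done and record who did it."""
        if self.is_done():
            raise RuntimeError("all steps are already complete")
        step = self.steps[len(self.log)]
        self.log.append((datetime.now(timezone.utc), who, step))
        return step


# Hypothetical steps for a single virus-infected computer.
checklist = Checklist(
    name="Infected workstation",
    steps=[
        "Disconnect the machine from the network",
        "Notify the incident response team",
        "Restore from the last known-good backup",
    ],
)

while not checklist.is_done():
    print("done:", checklist.complete_next(who="on-call technician"))
```

Because steps can only be completed in order, the response runs the same way every time, and the log doubles as the record of what was actually done.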
Practice and rehearsal are critical. The simplest example of this is initiating regular fire drills.
For those involved in incident response, perhaps the hardest task is to declare the emergency. In J.K. Rowling’s “Harry Potter” series, Ron asks “Can we panic now?”
The steps are straightforward but rarely simple: Decide that there is a problem. Then, figure out what the problem may be. Declare the emergency. Implement the recovery processes.
It sounds simple, at least in theory. Heraclitus of Ephesus said, “Expect the unexpected” sometime between 535 and 475 B.C. While improvisation is often viewed negatively because it deviates from proven techniques, situations occur that are outside the proverbial rulebook. Then, experience and cool heads often solve the problem.
When thinking about communication in the BC/DR context, there are really three elements: communicating information about the incident itself, interaction between team members, and discussions with outside entities such as law enforcement or press and media representatives.
The first step in incident response and invoking the BC/DR processes is recognizing that there’s a problem. After achieving that realization, the communication process starts with notifying the appropriate parties.
Coordinated communication among the incident response team is essential as well, whether the team is providing essential services in a business continuity situation or restoring critical functions after a disaster.
When interacting with the press and law enforcement, individuals need to be properly trained and prepared. If not, then the incident could become a PR disaster as well. The only people who should communicate outside of the organization are those who are empowered to do so, and with the proper messages.
BC/DR and “The Emergency”
Formally or informally, someone will declare the emergency. On a personal level, this might take the form of something regular and expected such as a power failure. Sadly, people and communities experience more traumatic events as well. For businesses, the problem might range from run-of-the-mill to headline-generating.
When the incident occurs, having the pre-defined response processes and automated checklists will help maintain what’s important.
The next two steps are recovering and restoring life to a state of normalcy and then evaluating the situation in hindsight.
A phrase in the trade describes this: “The emergency is over when the ambulances go home.” A common tendency is to remain in the state of emergency long past the actual critical time. Part of the goal of disaster recovery is to return to normal operation. This may require restoration functions. If one is recovering from a natural disaster, for example, once the cleanup is completed, then one can return to the original location and operation.
My faculty advisor in college taught me that I should “Take good notes, that way your mistakes are repeatable.” Keeping records serves several purposes. First, the implementations of your plans become dependable and predictable.
Next, there will be changes in how the response works, whether because of time or changes in other circumstances. The things that you learn can help improve responses in the future.
Lastly, the development of BC/DR processes is iterative — they are done many times. Taking the lessons from the last time can help prepare for the next time.
In the first part of this article (on Global Knowledge’s Blog website), we looked at Business Impact Analysis and Risk Analysis. These are also major components of Global Knowledge’s “Cybersecurity Foundations” and “Managing Risk in Information Systems” classes. All told, the BC/DR process is one that is continually updated as the world, our businesses, our lives and the threats to them change.
Heraclitus of Ephesus also said, “No man ever steps in the same river twice, for it’s not the same river and he’s not the same man.” The process of designing a BC/DR strategy involves looking forward and behind.
On the other hand, the more work put into preparedness and rehearsal, the more consistently the processes can be implemented. In the Cybersecurity Foundations classes that I teach, we spend significant time discussing risk management, incident response, and BC/DR. Using airlines and air travel safety as an example, contingency planning, redundancy, backup and rehearsal are critical. There are lessons for all of us to learn as well.
Summing up, please remember the motto of the Boy Scouts: “Be prepared.”