Creating Disaster Recovery and Incident Response Plans

Creating Plans

Whole books have been written about both DR and IR planning. It’s impossible to do either topic justice on a single webpage, much less both topics. In keeping with the subject of this site, this page will focus on how ransomware should figure into your IR and DR plans. Ransomware attacks have been so rampant over the last several years that they’ve prompted organizations that never had IR and DR plans to suddenly develop them, and those plans are almost entirely focused on ransomware.
Of course, IR and DR plans shouldn’t focus just on ransomware; there are a lot of other threats out there from both nation-state and cybercriminal groups. And when it comes to ransomware, these plans have to take into account not just the ransomware itself but all phases of the ransomware attack:
• Initial Access
• Reconnaissance
• Exfiltration
That being said, it’s understandable that the ransomware threat would prompt many organizations to start preparing for attacks. Recovering from a successful ransomware attack can take months or years and cost millions of dollars—if your organization doesn’t have to close its doors first. The possibility of getting hit with a ransomware attack scares everyone, rightfully, and being unprepared for that attack is even scarier.
Let’s see how organizations can better prepare themselves for ransomware attacks and, if not stop the attacks, then at least be able to quickly and somewhat painlessly recover. As Tony Stark famously said to Loki, “If we can’t protect the Earth, you can be damned well sure we’ll avenge it.”

OK, maybe it’s not that dramatic, but still …

What’s the Difference Between DR and IR?

Most of the time, when we talk about ransomware attacks, we talk about detection because the initial goal is always to stop the attack before it takes over the entire network. Unfortunately, many organizations don’t stop a ransomware attack in time and will be forced to activate their IR and DR plans.
A DR plan is a living document that contains detailed instructions on how to respond to acts of nature, catastrophic errors, or—increasingly—cybercriminal attacks.
An IR plan should be part of a DR plan. But in most organizations, DR and IR plans are distinct documents maintained by two different groups. That’s because, historically, DR plans were managed by the risk management groups within an organization, whereas IR plans were managed by IT or security teams. IT and security teams haven’t traditionally reported to the same leadership as risk teams. So, while DR plans often had a high-level overview of how to handle IT systems, it was usually in terms of how to manage these systems in the event of a natural disaster.
This mindset has started to change (albeit slowly), and it absolutely must. IT and security teams don’t usually speak the same language as risk management and compliance teams do, but they need to be able to adapt to the risk management world to create better IR and DR plans for dealing with cyberattacks. That is why DR and IR planning are part of the same section on this site: they need to be tied together, even if they aren’t in many organizations today.

Points to Consider for Your DR Plan

AS A REMINDER

Again, the goal of this section is not to act as a guide on how to build a DR plan from scratch. Instead, the goal is to advise organizations on ways they can incorporate ransomware recovery into a DR plan. Some of the ransomware DR plan will include the ransomware IR plan discussed in the next section, but DR is really focused on the long, slow—often mundane, and sometimes painful—part of ransomware recovery: getting the organization back up and fully operational.

OBLIGATIONS

Depending on the size of an organization, or the outsourced IR team, ransomware DR may be going on simultaneously with IR. Organizations have an obligation to get up and running as quickly as possible. Their constituents—patients, customers, students, and so on—will have expectations that at least some services will be back online quickly. Others could be brought back more gradually.

COORDINATION

Of course, the IR and DR teams must coordinate their work. The ransomware attack must be truly contained before systems are brought online or there’s a good chance of reinfection. The DR team has to restore servers in isolation, making sure they’re restored from a point before the ransomware, or any other tools the ransomware actor used during the earlier phases of the attack, were installed. Otherwise, the ransomware can be reintroduced into the network.
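
The restore-point rule above can be sketched in a few lines of Python. This is a minimal illustration, not a recovery tool: the function name, the backup timestamps, and the "earliest compromise" time are all hypothetical, and the sketch assumes the IR team has already established when attacker tooling first appeared on the system.

```python
from datetime import datetime

def latest_clean_backup(backup_times, earliest_compromise):
    """Return the newest backup taken strictly before the earliest known
    attacker activity, or None if every backup postdates the compromise."""
    clean = [t for t in backup_times if t < earliest_compromise]
    return max(clean) if clean else None

# Hypothetical weekly backups for one server.
backups = [
    datetime(2024, 3, 1, 2, 0),
    datetime(2024, 3, 8, 2, 0),
    datetime(2024, 3, 15, 2, 0),
]

# Hypothetical IR finding: attacker tooling first seen on March 10.
earliest_compromise = datetime(2024, 3, 10, 14, 30)

restore_point = latest_clean_backup(backups, earliest_compromise)
print(restore_point)
```

Note that the most recent backup (March 15) is skipped because it was taken after the attacker's tools were installed. A result of None would mean no backup predates the compromise, which is a signal to rebuild rather than restore.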

Need Some Help Getting Started?

If your organization needs to create a DR plan from scratch, there are a lot of great resources that can guide you. Ready.gov has a document that describes how to build out an IT DR plan. For a more comprehensive look at DR, take a look at the book, "Modern Data Protection: Ensuring Recoverability of All Modern Workloads," by W. Curtis Preston.

Setting DR Goals

DR goals are usually measured as Recovery Point Objective (RPO) versus Recovery Point Actual (RPA) and Recovery Time Objective (RTO) versus Recovery Time Actual (RTA). RPO is defined as the amount of data acceptable to lose in a disaster. For example, if an organization is conducting hourly backups, RPO for a ransomware attack should be one hour. RPA for a ransomware attack is the amount of data lost in an attack. RPA can be affected by backup data that was encrypted by the ransomware group (discussed on the "Ransomware Backup Strategy" page) and the need to use an earlier image because you can’t clean the ransomware actor’s tools off a backup image. RTO is the amount of time between incident detection and the point when service is fully restored. RTA, as expected, is the actual time it takes to restore a service.

DR from a ransomware attack often experiences a big discrepancy between RTO and RTA.

Why is that? Most DR plans are written around having to restore a single server or cluster of servers. One scenario might be that a Microsoft Exchange server crashes and is unrecoverable. The DR plan says to take the most recent backup and restore from that point. Recovery from backup takes three hours and the last backup was completed 30 minutes before the server crashed, so the RTA is three hours and the RPA is 30 minutes. An example of RPO and RTO is shown in the accompanying diagram.
This diagram shows the differences between RPO, RPA, RTO, and RTA and how they're affected by a ransomware attack
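
To make the four terms concrete, here is a small Python sketch that computes the actuals from three timestamps and compares them with the objectives. It follows the definitions given above (RPA as the span of data lost, RTA as the time from detection to full restoration). The function names, the objectives, and the timestamps are illustrative assumptions for a hypothetical single-server outage.

```python
from datetime import datetime, timedelta

def recovery_actuals(last_good_backup, incident_detected, service_restored):
    """Return (RPA, RTA) as timedeltas."""
    rpa = incident_detected - last_good_backup   # data written after the last usable backup is lost
    rta = service_restored - incident_detected   # detection until service is fully restored
    return rpa, rta

def meets_objectives(rpa, rta, rpo, rto):
    """Return two booleans: whether the RPO was met and whether the RTO was met."""
    return rpa <= rpo, rta <= rto

# Hypothetical objectives: hourly backups and a four-hour restore target.
rpo = timedelta(hours=1)
rto = timedelta(hours=4)

rpa, rta = recovery_actuals(
    last_good_backup=datetime(2024, 5, 6, 9, 0),
    incident_detected=datetime(2024, 5, 6, 9, 40),
    service_restored=datetime(2024, 5, 6, 14, 40),
)
rpo_met, rto_met = meets_objectives(rpa, rta, rpo, rto)
print(f"RPA={rpa}, RPO met: {rpo_met}; RTA={rta}, RTO met: {rto_met}")
```

In a ransomware recovery, the "last good backup" may be an older image than the most recent one if newer images contain the attacker's tools, which pushes the RPA up even when backups ran on schedule.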
The problem with a ransomware attack is that there are often hundreds or thousands of servers that need to be restored. If the RTO for restoring a server is four hours, and there are now 2,500 servers that need to be restored, they potentially require 10,000 hours to restore (roughly 417 days with teams working around the clock). Of course, it won’t be a single team restoring servers, but even with multiple DR teams working simultaneously, there's only so much bandwidth (literally and figuratively). It’s easy to see why it often takes so long to fully recover from a ransomware attack, and recovery time keeps getting longer. In 2016, the average recovery time from a ransomware attack was 33 hours. By the first quarter of 2019, ransomware recovery time had jumped to 7.3 days. In the second quarter of 2021, average ransomware recovery time was at 21 days, and that’s just the average: some organizations take months, while others never recover.

RPO and RTO goals in the DR plan should be adjusted to account for the likelihood of a total network shutdown during a ransomware attack.
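
The arithmetic behind the 2,500-server example can be checked with a short Python sketch. It rests on simplifying assumptions that are not in the original text: every restore takes the same time, restores run in equal-sized waves, and bandwidth and staffing never become a bottleneck. Real recoveries will be slower than this model suggests.

```python
import math

def estimated_recovery_days(servers, hours_per_server, parallel_restores, hours_per_day=24):
    """Estimate calendar days to restore all servers when a fixed number
    of restores can run at the same time (an idealized model)."""
    waves = math.ceil(servers / parallel_restores)
    total_hours = waves * hours_per_server
    return total_hours / hours_per_day

one_at_a_time = estimated_recovery_days(2500, 4, parallel_restores=1)
ten_at_a_time = estimated_recovery_days(2500, 4, parallel_restores=10)

print(f"One restore at a time: {one_at_a_time:.1f} days")
print(f"Ten restores at a time: {ten_at_a_time:.1f} days")
```

Even with ten restores running in parallel around the clock, this idealized estimate comes to well over a month, which is why per-server RTOs cannot simply be reused as the RTO for a network-wide event.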
Deep dive

Homer Simpson: They Have Ransomware on ESXi Now?

There’s a quote from Homer Simpson that’s often overused in IT security circles, “Oh, they have the Internet on computers now.” The quote wonderfully captures how surprised people are by things that seem like a natural progression to people who understand a topic deeply. In this case, more and more ransomware groups are creating versions of their ransomware specifically designed to encrypt VMware ESXi systems.

Why? Because if a ransomware actor can encrypt an ESXi server, they can instantly remove dozens or hundreds of machines from the network, creating significantly more chaos. Being able to knock an ESXi server offline allows the attacker to do a lot of damage in a shorter period of time, not just because of the number of systems, but also because of the type of data stored on ESXi servers. ESXi systems usually store backups, file storage, code repositories, databases, and other critical files, making their encryption a serious business disruption.

But there’s another advantage: Many organizations have virtualized their DR environments. Whether it’s a hosted environment or a Disaster Recovery as a Service (DRaaS), organizations can save a lot of money by going virtual and can restore servers very quickly after a ransomware attack. However, if the DR site is reachable from the network, the ransomware attacker can use that connectivity to access and encrypt the DR servers. This isn't a hypothetical scenario. Unfortunately, it has happened to several ransomware victims.

Organizations relying on virtual servers for DR should ensure those servers are fully segmented from the live network, to avoid encryption by a ransomware group. In addition, these systems should have the same security systems installed and monitoring that are applied to live servers. DR servers are critical to ransomware recovery and should be monitored as such.

Prioritization

Given the scenario laid out in the previous section, a ransomware DR plan has to focus on prioritizing which servers will need to be recovered in what order. It’s critical to define which systems are core to the success of the organization and how quickly those can be restored to try to get some operations back to normal.

The reason this needs to be documented in the DR plan is that the decision requires leadership input, and the time to ask this question is not after a catastrophic ransomware event. There will be a lot of different groups making demands of the DR team in the aftermath of a ransomware attack, and every group will think their systems are top priority. Having a clear, prioritized list of systems that need to be restored and in what order allows the DR team to get to work without having to deal with the natural chaos that’s part of any recovery from a ransomware attack.
Documenting the priority of system recovery is important, but so is some level of flexibility. There may be scenarios that weren’t considered during the DR planning, so the DR team needs to be able to make adjustments as advised by leadership. For example, if the ransomware attack happens at the end of a quarter, there may be some sales systems that need to be prioritized over other systems that would normally take precedence. Ideally, all of these scenarios would have been considered and there will be plans in place, but even the best DR plans often have holes. Sadly, too many of those holes are discovered during an actual disaster. This is why the tabletop exercises are so important—they help discover these holes.
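
One way to capture a prioritized restore list, along with the flexibility described above, is as structured data rather than prose. The Python sketch below is an illustration only: the system names, tiers, and dependencies are hypothetical, and the ordering rule (lowest tier first, but never before the systems it depends on) is an assumption about how a DR team might want to sequence work.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class System:
    name: str
    tier: int                 # 1 = restore first, as agreed with leadership
    depends_on: tuple = ()    # systems that must be restored beforehand

def restore_order(systems):
    """Order systems by tier, never placing one ahead of its dependencies."""
    by_name = {s.name: s for s in systems}
    restored, order = set(), []
    while len(order) < len(by_name):
        ready = [s for s in by_name.values()
                 if s.name not in restored
                 and all(d in restored for d in s.depends_on)]
        if not ready:
            raise ValueError("circular or missing dependency in the restore list")
        nxt = min(ready, key=lambda s: (s.tier, s.name))
        order.append(nxt.name)
        restored.add(nxt.name)
    return order

# Hypothetical priority list for a healthcare organization.
plan = [
    System("active-directory", tier=1),
    System("ehr-database", tier=1, depends_on=("active-directory",)),
    System("email", tier=2, depends_on=("active-directory",)),
    System("sales-crm", tier=3, depends_on=("active-directory",)),
    System("intranet-wiki", tier=3),
]
normal_order = restore_order(plan)

# End-of-quarter adjustment approved by leadership: promote the sales system.
adjusted = [replace(s, tier=1) if s.name == "sales-crm" else s for s in plan]
quarter_end_order = restore_order(adjusted)

print(normal_order)
print(quarter_end_order)
```

Keeping the list as data makes an adjustment like the end-of-quarter promotion a one-line change that leadership can review, instead of a debate held in the middle of a recovery.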

Outside Help

After a ransomware attack, there’s a good chance that an organization will need to bring in outside help for both IR and DR. On the DR side, it’s important to document the steps for recovery so well that even someone from outside the organization can easily understand what needs to be done and carry out the necessary tasks. This is a basic tenet of good DR planning, but it’s not always practiced for IT recovery.
One common problem that outside organizations run into is outdated network diagrams or an opaque environment. Network diagrams, asset inventory, and software installations can change rapidly. If updating the DR plan isn’t part of the change control process when these changes happen, the plan can quickly become outdated. This is a slightly different stance than the discussion around IR diagrams. The reason for the difference is that there is a little more leeway for error when an internal team is looking at a DR plan than when an external company is looking at an IR plan. The internal team has some institutional knowledge and can, hopefully, deal better with mistakes. External IR teams don’t have that institutional knowledge to fall back on. Outdated documentation can significantly slow down the recovery process or force an organization to rebuild the network from scratch.

Paying the Ransom

No one likes to talk about paying the ransom.

"The Real Cost of Ransomware" page goes into more detail on this topic, but knowing when it’s time to pay the ransom is an important decision that should be settled before a ransomware attack happens. Documenting what conditions would force a ransom payment ahead of time allows an organization to avoid a panic decision.

Documenting both when and how the ransom will be paid is critical. 

If a ransom payment is covered by cyber insurance, that should be noted in the DR plan and should be checked annually. There are several ways that a ransom can be paid. There are ransomware negotiators who will handle interaction with ransomware groups and often pay the ransom on the victim’s behalf (for a fee, of course).

Having a few hundred Bitcoin or other cryptocurrency on hand to pay the ransom used to be common practice. 

The location and procedure for accessing that wallet would be included in any ransomware DR plan. Because the price of Bitcoin has increased so much, and ransom demands are now regularly in the millions of dollars, this is only a practical solution for the largest of organizations.

Some aspects of ransomware DR need to be considered as part of a larger DR plan.

An effective DR plan for ransomware and its aftermath takes into consideration the unique nature of a ransomware attack, as well as the challenges involved in having most or all of an organization’s systems encrypted and having to recover everything.
In summary, a good DR plan for ransomware should include:

• Clearly defined goals for recovery
• Realistic RPOs and RTOs
• A plan to test the goals, and make adjustments to the plan based on the results
• Knowing when it’s time to get outside help
• An understanding of when it will be necessary to pay the ransom

GOOD TO KNOW

Watch Out!

Cyber insurance companies are getting more selective about who they cover and whether they pay a ransom in the event of a ransomware attack. Most policies renew annually. Part of the cyber insurance policy renewal process should involve updating the DR plan to confirm that cyber insurance will still pay the ransom in the event of a ransomware attack.

Points to Consider for Your IR Plan

USED TO BE

There was a time when IR plans were static documents that were primarily written up for compliance purposes. IR plans were stored in binders that were pulled off the shelf and dusted off once a year to demonstrate that an IR plan existed, then were put back on the shelf until they were needed for the next audit. As one would expect, these plans bore very little resemblance to reality and were often not used at all when there was an emergency.

TIMES CHANGE

Those kinds of plans still exist, but more meaningful IR plans are thankfully becoming more common. Ransomware has altered the IR landscape and made IR planning a critical business function. IR has gone from an obscure activity to claiming the attention of senior leadership and often even the board.
Wait! If organizations are taking IR more seriously than they used to, why are ransomware attacks still increasing? Shouldn’t the focus on IR mean that more ransomware attacks are stopped, or at least, are more quickly contained?

INTERESTINGLY

Interestingly, most ransomware attacks are stopped. It doesn’t seem like it, given that dozens of attacks are made public every week, often against very large companies, but many other attacks are quietly blocked. Still, most organizations do a relatively poor job of IR planning, especially when it comes to ransomware. That’s why, despite the focus on IR, ransomware attacks are still occurring at a breakneck pace.

Why Is Ransomware a Unique Problem in IR?

In a lot of ways, ransomware is no different from other threats. Ransomware actors rely on the same delivery mechanisms and use the same tools as a lot of cybercriminal and state-sponsored groups. The way they move around the network is the same as other threat actors: They still have to gain administrative access, they target Active Directory servers in the same way most other sophisticated actors do, and they even steal files in the same manner as other threat actors.

What separates ransomware from almost every other type of attack is the payload.

When ransomware actors successfully strike a victim network, many of the questions that incident responders usually try to answer are immediately settled. An incident responder walking into a ransomware attack might know which group infiltrated the organization and what it wants, but what the incident responder might not know is:
• The strain of ransomware that infected the organization
• What the initial access vector was
• How long the ransomware group was in the network
• What files were stolen
Because ransomware turns a lot of traditional IR on its head, many organizations have had to rethink their IR plans to address ransomware.
GOOD TO KNOW

Pay Attention

Some ransomware groups are better at branding than others. That sounds like a silly statement, but it’s true. Although most of the time, incident responders can look at a ransom note and know which ransomware group encrypted a network, that’s not always the case. Some ransomware groups simply steal the text of ransom notes from other groups and don’t include a name or anything else that would help the incident responder identify which ransomware was used in the attack. Fortunately, there are services such as ID Ransomware and No More Ransom that allow victims to upload a ransom note or encrypted file to determine which ransomware was used in the attack. Keep those sites bookmarked!

Gotta Have a Plan

As with the DR section, the purpose of this section is not to help an organization build an IR plan from scratch. That’s too much to cover in a single section or chapter of a book. Instead, the purpose of this section is to help organizations think about how to properly tie ransomware response into their IR plan.

A ransomware plan can't be tied into a non-existent IR plan, and an IR plan should deal with more than ransomware.

There are a lot of basics that need to be defined in an IR plan, starting with: What is considered an incident? Obviously, a ransomware attack is an incident. In fact, a modern ransomware attack is likely made up of at least three separate incidents (depending on how an organization defines an incident):
• Initial Access: How the ransomware actor gained access (or the Initial Access Broker)
• Exfiltrated Data: What data was stolen from where
• Ransomware Deployment: How and when the ransomware is executed

There may be more incidents involved in a ransomware attack.

For instance, many organizations would consider gaining access to an Active Directory server an incident in and of itself. The point is, the threshold for what types of events or collection of events qualifies as an incident should be well-defined within an IR plan, as well as what the response should be.
An IR plan needs to:
• Include a contact tree, both through normal and outside channels
• Specify who needs to know about an incident, when they’ll need to know, and what their role is
• Document who will be performing forensic analysis
• Define how forensic evidence will be preserved
• Outline any regulatory frameworks that need to be followed
Finally, the IR plan for a ransomware incident has to include instructions for when and how systems and network segments can be handed over to the DR team so that the team can start restoring services. For smaller organizations, the same team may be doing both IR and DR, but the IR plan still needs to document when and how IR stops and DR starts for each affected system or department.
SCHOOLHOUSE

Let’s Switch This Conversation to Signal

Nation-state groups often monitor email communication for indications that an organization is on to them. Often, they’ll specifically track an email thread that might reveal their presence and look for comments like “let’s take this conversation offline” or “let’s switch over to Signal” to indicate it’s time to back out (or destroy everything, depending on the group and their goals).

Cybercriminals have picked up on this tactic as well, quite by accident. It turns out there’s a lot of juicy and embarrassing information sent via email, so stealing email communication for extortion purposes makes sense. But it’s also a great way to track whether your ransomware attack has been noticed during the reconnaissance phase. Most organizations don’t use email as their primary form of communication during an IR, preferring ticketing systems, but for critical incidents that may involve communication outside of the core security team, there should be an out-of-band communication plan, which should be in place before an incident is detected.

Outside Help

Undoubtedly, any ransomware IR plan is going to involve outside organizations.

Even large companies with talented IR teams will need outside help. At the very least, a ransomware attack is going to trigger a call to an organization’s cyber insurance provider, but the matter can go a lot further. Often, organizations engage outside legal counsel during a ransomware attack and, of course, it’s not uncommon to bring in outside IR teams.

The ransomware IR plan should document who engages (and when) with the different outside organizations.

Information about cyber insurance policies and legal or IR retainers should be included in the ransomware IR plan, especially because a lot of those documents may have been encrypted in the ransomware attack. It bears repeating that the time to sign a retainer with an outside IR organization is not after a ransomware attack. This information should be decided ahead of time. It will save the organization time and money in the long run, even if there’s more of an upfront cost.

A clear understanding of the victim’s environment and access to the tools needed to conduct IR will be required.

This information should also be included in the IR plan. Understand that even in the most well-run organizations, network diagrams and asset inventory are usually incomplete. IR firms know that. The documentation included in the IR plan is at least a place for them to start. 

Accurate information, even if incomplete, is better than outdated information.

The onsite incident response teams will undoubtedly find services, assets, and sometimes even network segments that weren’t properly documented. That problem is unfortunate but expected. Keep network diagrams and asset inventory as up-to-date as possible. 

The same preparation rules apply to logs. 

IR teams are going to need access to logs from a number of different sources within the organization. The IR plan should document how to get this information to the team as easily as possible. Some (but not all) of the information that the IR team will likely need access to includes:
• Most recent internal and external vulnerability scans
• Web proxy logs
• Mail server logs
• DNS logs
• Logs from endpoint software (AV/EDR/asset management)
• Firewall logs
• Windows event logging
• VPN logs
• Logs from any remote access system (RDP/Citrix/TeamViewer)
• Active Directory logs
• PowerShell logs

There may be other sources for logs that the IR team needs, depending on the type of ransomware attack. Not every organization collects all these logs, but the IR plan should document which systems or servers the logs are being collected from, how long those logs are stored, and how to provide third parties with access to those logs. It’s worth noting that some IR companies will want the raw logs sent to them for analysis because they have their own tools for managing logs. The IR plan needs to allow a large amount of log data from a variety of sources to be extracted, transferred to a portable drive, and delivered to the IR team for analysis. The transfer process and the log format needed should be discussed with the IR company when the retainer is signed.
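
The log documentation described above (where logs are collected from, how long they are kept, and how to hand them over) lends itself to a simple inventory that can be checked for gaps before an incident. The Python sketch below is illustrative: the host names, retention periods, access notes, and the 60-day minimum are hypothetical values, not recommendations from this page.

```python
# Log sources an outside IR team is likely to request (from the list above).
REQUESTED_SOURCES = [
    "vulnerability scans", "web proxy", "mail server", "DNS",
    "endpoint (AV/EDR/asset management)", "firewall", "Windows event",
    "VPN", "remote access (RDP/Citrix/TeamViewer)", "Active Directory",
    "PowerShell",
]

# Hypothetical inventory as it might be recorded in the IR plan.
inventory = {
    "DNS": {"collected_from": "dns01, dns02", "retention_days": 30,
            "access": "export from SIEM"},
    "firewall": {"collected_from": "edge-fw01", "retention_days": 90,
                 "access": "export from SIEM"},
    "VPN": {"collected_from": "vpn-gw01", "retention_days": 14,
            "access": "appliance admin console"},
    "Active Directory": {"collected_from": "dc01, dc02", "retention_days": 180,
                         "access": "export from SIEM"},
}

def inventory_gaps(requested, inventory, min_retention_days):
    """Return (sources not documented at all, sources kept for too short a time)."""
    missing = [s for s in requested if s not in inventory]
    short = [s for s in requested
             if s in inventory
             and inventory[s]["retention_days"] < min_retention_days]
    return missing, short

missing, short = inventory_gaps(REQUESTED_SOURCES, inventory, min_retention_days=60)
print("Not documented:", missing)
print("Retention below 60 days:", short)
```

Running a check like this during a tabletop exercise gives the organization a concrete list to discuss with the IR firm when the retainer is signed.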
In summary, a good IR plan for ransomware will include:
• A larger IR plan for all types of attacks
• Well-documented and up-to-date network maps and asset inventory
• Guidance on which log sources are available and how they can be analyzed
• An understanding of who needs to be involved and when they need to be notified
• A clear outline of legal, regulatory, and reporting requirements
• A handoff plan for when systems can be turned over to the DR team
• Scope of the retainer with an outside IR firm
• Guidance on when to call that outside IR firm
• A plan to feed everyone involved in IR

Storing and Updating the DR and IR Plans

Here’s a “fun” story: An IR team is called in to help an organization that has been devastated by a ransomware attack. They walk in to find the organization’s IR team in disarray, running around trying to contain the attack, figuring out who needs to be notified, and who’s going to run everything. The problem? Their IR and DR plans were both stored on the fileserver hosted on an ESXi cluster that was encrypted in the ransomware attack. While the organization’s incident response team knew how to handle localized attacks affecting a single server or part of the network, without the IR and DR plans they were essentially operating blindly.

Unfortunately, this scene occurs time and time again in ransomware cases.

Keep an offline version of both the IR and DR plans.

That used to mean printing everything out and keeping the plans in a set of binders. But printing complex plans is surprisingly difficult. Given how often networks change, new plans have to be printed monthly (if not several times a month) which, on top of everything else, is bad for the environment. 

Offline copies should be in digital format.

For some organizations, that means simply storing the plans on a flash drive, as long as anyone who may need access knows where that flash drive is, and the drive is properly secured and regularly tested to ensure it hasn’t failed (unfortunately, that does happen). Another solution is to store copies of IR and DR plans in a cloud environment. As with backups and other parts of the cloud network, the cloud environment where the IR and DR plans are stored should not be reachable from the network; otherwise, both the original and backup copies of the IR and DR plans could wind up being encrypted in a ransomware attack.

Update both IR and DR plans in all locations. 

The plans should have numbering systems in the file name or somewhere easy to find so that teams always know they’re dealing with the most current version. Ransomware IR is, by its very nature, hectic. You’ll have trouble recovering if some teams are working from one version of the DR or IR plan and other teams are working from a different version. To that end, ideally no one should have “their own” copy of the plan, as their version could quickly become outdated.
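
Version numbers in file names help, but they rely on people remembering to update them. As a complementary check, copies of a plan can be compared by content. The Python sketch below uses SHA-256 hashes for this. It is one possible approach rather than a practice described on this page, and the file names and contents are hypothetical stand-ins created in a temporary directory.

```python
import hashlib
import tempfile
from pathlib import Path

def plan_fingerprint(path):
    """SHA-256 of the file contents; identical copies give identical fingerprints."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def stale_copies(master, copies):
    """Return the copies whose contents differ from the master plan."""
    reference = plan_fingerprint(master)
    return [Path(c) for c in copies if plan_fingerprint(c) != reference]

# Demonstration with stand-in files; real copies would live on the
# flash drive and in the isolated cloud location.
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "ir-plan-v12.txt"
    usb_copy = Path(tmp) / "usb-ir-plan.txt"
    cloud_copy = Path(tmp) / "cloud-ir-plan.txt"
    master.write_text("IR plan, version 12")
    usb_copy.write_text("IR plan, version 12")
    cloud_copy.write_text("IR plan, version 11")
    stale_names = [p.name for p in stale_copies(master, [usb_copy, cloud_copy])]

print(stale_names)
```

A check like this could run whenever the plan changes, so that an out-of-date offline copy is found during routine maintenance instead of in the middle of an incident.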

The focus of this section has been on preparing for a ransomware attack. Consider reading more about how ransomware attacks work, how ransomware groups gain initial access, and how they move around networks, steal files, and finally encrypt victims. Understanding how attacks work will better enable organizations to protect themselves.
