The road to recovery
This is part two of a two-part series on data security measures. For part one, please see the February 2011 issue of Long-Term Living (“Hail your new digital gatekeeper,” p. 36).
Imagine the panic that ripples through a nursing home during a large-scale flood. “Have residents been moved to safety?” “How badly is the structure damaged?” Those are vitally important questions to be sure, but not the only ones that should be addressed. Now imagine the sense of dread overcoming those in charge when they realize their communication network, electronic database, and all other essential systems have been rendered inoperable. The organization may have suffered data loss of incalculable proportions, leaving administrators to wonder, “How will we recover?”
Having no clear or detailed plan in place to recover data in the face of disasters, computer thefts, or hard disk failures jeopardizes a long-term care provider's sustainability. Without one to rely on, a provider is essentially playing with fire. Such is the message of Joyce Miller-Evans, CIO of Ohio Presbyterian Retirement Services (OPRS), a nonprofit faith-based provider with 11 retirement communities and other home- and community-based services throughout Ohio. Long-Term Living Editor Kevin Kolus spoke with Miller-Evans about what makes a good disaster recovery protocol and how OPRS went through the process of revamping its own plan.
What is the essential role of a disaster recovery plan?
Miller-Evans: Anything that we are doing with our residents we need to be able to replicate during an emergency, based on priorities and such. One is, of course, care and another is to be able to provide the system in the midst of a disaster. So the disaster recovery plan is really the plan that guides you to recover your systems. It addresses what you do during that downtime.
OPRS was operating with a three-page disaster recovery plan that you came to find was inadequate when you were hired in 2004. What were its shortfalls?
Miller-Evans: The three-page document was what I found when I arrived, and I think you'd find that commonly in smaller organizations because so much is considered understood by all who are responsible. I come from larger organizations with 200 IT staff and I know you need to be able to communicate with a concrete plan in every direction in the time of a disaster.
The three things this existing plan identified were the recovery location, call chain of who would need to be notified, and the applications that would need to be restored at the recovery site. And while that was good, it just wasn't complete. Having been in acute-care environments (one was University Hospitals in Cleveland), I understood that a full assessment of the needs of the users during a downtime had to be available.
Please explain the components that went into your redeveloped plan at OPRS.
Miller-Evans: First we must be able to do a damage assessment when a disaster occurs. The next piece is the strategy and the review process for declaring a disaster. You have to identify who is authorized because at any one point, you may not be able to contact either the CIO or whoever else is in an executive position within the organization. So we created a chain of who would be authorized to declare a disaster.
Next are activation procedures, and that really means “activating the team.” I have 12 people in IT including myself, and that probably sounds big for long-term care facilities that are single nursing homes, but OPRS is a health system. If you look at it, we have 11 CCRCs as well as full hospice and adult day care services. We're located around the whole state and we need to be able to activate appropriate personnel during a disaster. My team includes the technology director, the senior systems engineer, and/or consulting vendors that would be notified that we are using them as disaster resources. And I think that is more like what others in the long-term care environment would do. They're not going to go six engineers deep in their organization like hospitals would, so you would more than likely want vendor backup.
Then we would have to take a look at our procedures, what it is we need to purchase, what it is we need to take from whatever site to begin the recovery process. This process comes after we've done our damage assessment, when people need their systems returned. We have a table as to how we're going to activate and get things back online. For example, with e-mail and BlackBerry use so prevalent, and being able to communicate throughout a network so important, restoring e-mail quickly is one of the biggest needs we have on our list.
In redeveloping the plan, we talked with our clinical divisions both in the long-term care and skilled nursing environments asking what applications are absolutely critical to their operation, so clinical is priority one in our recovery plan. Priority two is other essential applications. And then priority three is delayed applications. When we look at priority one, those systems are what have to be back up within 72 hours. Priority two can go from two to five days. Priority three can be sustained for 10 days or longer without coming back up.
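For readers who keep such a priority table in electronic form, the three priority levels described above could be sketched roughly as follows. This is only an illustration, not the OPRS plan itself: the application names are hypothetical, the upper bounds are one reading of the windows quoted in the interview, and priority three is treated as having no fixed deadline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecoveryTier:
    priority: int
    description: str
    # Assumed upper bound in days; None means no fixed deadline.
    restore_by_days: Optional[float]


# Windows follow the interview; the wording of each description is a paraphrase.
TIERS = {
    1: RecoveryTier(1, "clinical applications", 3.0),         # within 72 hours
    2: RecoveryTier(2, "other essential applications", 5.0),  # two to five days
    3: RecoveryTier(3, "delayed applications", None),         # 10 days or longer
}

# Hypothetical application names and tier assignments, for illustration only.
APPLICATIONS = {
    "clinical_charting": 1,
    "payroll": 2,
    "archive_reports": 3,
}


def restore_order(applications):
    """Return application names sorted by priority tier, then by name."""
    return sorted(applications, key=lambda name: (applications[name], name))


def overdue(applications, days_down):
    """Return applications whose tier window has passed after days_down days."""
    late = []
    for name in restore_order(applications):
        limit = TIERS[applications[name]].restore_by_days
        if limit is not None and days_down > limit:
            late.append(name)
    return late
```

Calling `restore_order(APPLICATIONS)` lists the clinical system first, and `overdue(APPLICATIONS, 4)` flags only the priority-one application, since four days is past the 72-hour window but still inside the five-day window assumed for priority two.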
How did you come to assign these designations in a specific order of importance?
Miller-Evans: I worked with Mac McMillan of CynergisTek prior to coming here. I'm not a security expert even though I'm the security officer, so I needed CynergisTek's assistance in constructing the recovery plan. Now we didn't want them to build it because we're so used to doing things on our own here, but we needed its project management and expertise, its template, and we used the company to keep us on target. We would do calls with them to make sure we were hitting the right points, getting the right information, and we worked together to create those designations. Is our network out? Is our Internet access out? These things are of great importance to the organization, so they're going to play into how we bring systems back up and in what order.
You mentioned earlier that systems would be restored at a “recovery site.” Where is that located?
Miller-Evans: It's actually at one of our facilities. We created our own data center for backup there. Something we are currently looking into is the appropriateness of having the recovery site at this location given our size. We always have to assess this because it takes a lot of work to keep up the duplication of the things that we need, so we are looking at the possibility of doing cloud computing in the future for our disaster recovery site. It would make it a lot easier.
What routines do you perform in adhering to the plan?
Miller-Evans: On a daily basis we have to attend to our backup, making sure we do data backups (complete backups) and that they're getting off site into fireproof storage. That's critical and is something we test every six months or so. After that, I think just probably taking the time to put the plan together on the recovery and making sure you have all the pieces: making sure you have the technical resources and the people to do it right. I think another vital habit is keeping your plan up to date. We've taken some new hardware to the recovery site, we have to update our procedures, and we have to go through the whole process again. Lastly is doing tabletop test drills where you sit around a table and say, “This is the disaster. What's your department going to do, what's your department going to do….”
Are there any other practice drills that you routinely do?
Miller-Evans: We're due for one! I would say we do them annually, and we could do them more often. That would not hurt. We just added a staff person who is our technical support coordinator and I think we're going to be more attentive to running these drills.
Would you say most long-term care organizations aren't as prepared as they should be?
Miller-Evans: It would be an assumption I would make knowing that many people report up through CFOs, and unless those CFOs are really attentive to the IT world, some of this may not be getting addressed. But if you are taking HIPAA seriously, disaster recovery is a part of that. And if you are meeting regulations within the HIPAA requirements for security, you would be doing disaster recovery. Reading the HIPAA regulations is really the key here. Overall, I think preparedness is relative to how people are meeting those regulations, but I know the push back I can get here is, “Other people aren't doing this, why do we have to?”
What is the doomsday scenario you explain to employees who ask those questions?
Miller-Evans: Data loss is the biggest thing. Today we have all this rich data, we can support the residents on the computers, we communicate electronically with physicians. If the systems went down all of that would stop. You'd go back to paper and you wouldn't have the strong communications and the quality of the records.
How do you ensure the plan is always current?
Miller-Evans: The plan shows where we were prior to HIPAA security regulations being required, and in the end we're meeting those, exceeding those, and doing due diligence. The whole thing is constant attention and continuous improvement to make things top of mind to keep it all going in the best direction. It's a culture you create and not just a snapshot. Data recovery is something dynamic we constantly have to work through.
Long-Term Living 2011 March;60(3):52-55