Paper vs. Practice: Can you deliver what's documented?
By Sonny Discini
November 28, 2005
Many organizations go to great lengths to provide documentation of critical processes to satisfy business requirements, legal mandates and audit findings. Some even spend hundreds of thousands of dollars to become NIST certified and achieve that warm and fuzzy feeling that they're covered in the event there's a problem.
While documented procedures look impressive to shareholders, auditors and senior management, should the situation arise, can you be certain that everything expressed in these documents is a hard deliverable?
Let's take a look at a document that most organizations have: the Computer Security Incident Response (CSIR) document. Many of these documents are written at a high level so that they can be used to address a wide array of incidents and situations. While this certainly gives you blanket coverage, what happens when you can't deliver the specifics needed?
On a crisp fall evening, while working for a previous employer, I received a phone call from the Network Operations Center (NOC). Active Directory accounts were being locked out at an alarming rate. Clearly something was wrong. Before the event could be deemed an incident per the definition set in the CSIR doc, some basic information gathering had to be done.
First, I had to be sure that the activity was malicious. This step was easy: the AD admin reported that a single host was responsible for the activity. At that point, I was cruising through the IR document. Then I realized something horrible. Remediation was going to be an issue.
Because IT roles were segmented, I had to approach the Active Directory administrator who had contacted the NOC and ask him for very specific log information. While this is simple enough on paper, the information I asked for was not readily or easily obtainable. Because I needed the IP address of the offending host, the admin had to scramble through his toolbox and parse out the data using a skunk works parser application.
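To make the log request above concrete, here is a minimal sketch of the kind of tally I was asking for. It assumes the lockout events have already been exported to a CSV file; the column names (timestamp, account, source_ip) and the sample rows are hypothetical and do not reflect the format of any particular tool.

```python
import csv
import io
from collections import Counter


def lockouts_by_source(csv_text):
    """Count lockout events per source address in a CSV export.

    Assumes a header row with a 'source_ip' column (hypothetical format).
    Returns (address, count) pairs, busiest source first.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        source = (row.get("source_ip") or "").strip()
        if source:  # skip rows with no recorded source
            counts[source] += 1
    return counts.most_common()


# Made-up sample data, for illustration only.
SAMPLE = """timestamp,account,source_ip
2005-10-14T19:02:11,jdoe,10.20.30.40
2005-10-14T19:02:12,asmith,10.20.30.40
2005-10-14T19:02:13,bjones,10.20.30.40
2005-10-14T19:05:40,jdoe,10.20.30.41
"""

if __name__ == "__main__":
    for address, count in lockouts_by_source(SAMPLE):
        print(address, count)
```

The point is not the script itself but that someone on the team can produce this answer on demand, before an incident forces the question.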
Now, many might ask why we didn't check our IDS/IPS or other security devices for this information. Well, we did, but the segment the traffic was coming from was not monitored by any of our solutions (first amendment issues), so our security infrastructure was blind to the event. To make matters worse, the attack was made possible by a firewall misconfiguration introduced during a routine maintenance upgrade.
Much like the RMS Titanic, no single event was my downfall, but rather the accumulation of several conditions. I now had to race the clock in order to comply with the response time in my IR document or I would be buried in explanation paperwork for weeks. On top of that, management was aware of the issue and was pressuring the IT staff for answers. Worse still, the Active Directory admin was unfamiliar with the only tool that could extract the information I needed for evidence requirements, including statistics on the lockouts.
It looked like I was going to flounder until I received a lucky phone call from another field office. A local admin in the Dallas office noticed their domain controller had account lockouts in progress and had the foresight to fire up NETMON to get a packet capture of the traffic. I was saved! I was then able to instantly identify which city the traffic was coming from and which hub site needed a visit.
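Going from a captured source address to a physical site is simple if the subnet-to-site mapping is written down ahead of time. The sketch below shows the idea using Python's standard ipaddress module; the site names and address ranges are placeholders, not the network from this incident.

```python
import ipaddress

# Hypothetical subnet-to-site table; a real one would come from
# the organization's own network documentation.
SITE_SUBNETS = {
    "site-a": ipaddress.ip_network("10.20.0.0/16"),
    "site-b": ipaddress.ip_network("10.30.0.0/16"),
}


def site_for(address):
    """Return the site whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for site, network in SITE_SUBNETS.items():
        if ip in network:
            return site
    return None


if __name__ == "__main__":
    print(site_for("10.30.4.17"))
```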
In the end I was able to complete my IR documentation and submit it. However, the experience got me thinking. How can I better cover situations that documented processes appear to cover but in reality do not?
Many organizations say that they test their documented processes on a fixed schedule, but they do so myopically: they fail to test unfamiliar scenarios or those outside of their core responsibilities.
When performing scenario testing, have another division, department or even a consultant craft mock scenarios. Many times you will shake loose competency, process and coverage gaps that were never identified in the past. In my case, I recognized that we had no processes in place to produce, in a timely manner, data that could be used for prosecution purposes. I also identified that our AD admin needed more experience in producing detailed information from AD. Going further, we identified that the change management document did not properly provide for testing after an install or upgrade. If it had, we wouldn't have had the incident to begin with.
Reworking the Documents
It's not uncommon for people to push back and say that there is no way to test for every issue, and this is certainly true. However, you don't have to account for every issue in the universe, only those that may impact your business operations.
The key here is to understand that what you define as business operations is much different than what Finance or Human Resources considers business operations. That said, after running through scenario testing, try your best to identify the tests that qualify as those which may impact the business from the perspective of all departments.
Also, don't be afraid to point to detailed business/IT processes from your CSIR in order to handle situations where these types of processes are necessary. If your document structure is set up in such a way that all detailed documents and procedures are hung on a frame of general practices, the modification of your documents should not interfere with their cohesion and flow.
An example of flow interference: once an incident has been declared, the network team should not clip a switch as part of its response while the security team is attempting to grab packet captures from hosts on the subnet behind that same switch. Each team, while operating independently, must do so in a way that won't impede the other teams involved. More than likely, this will be one of the most common issues identified during scenario testing.
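One way to catch this kind of collision during scenario testing is to write down, for each planned response step, which assets it takes offline and which assets it needs to stay up, and then compare the lists across teams. The following is only a sketch of that idea; the Action structure, team names and asset names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One planned response step (hypothetical structure)."""
    team: str
    description: str
    disables: set = field(default_factory=set)  # assets taken offline
    requires: set = field(default_factory=set)  # assets that must stay up


def find_conflicts(actions):
    """Return (disabling_team, affected_team, assets) for each clash
    where one team disables an asset another team still requires."""
    conflicts = []
    for a in actions:
        for b in actions:
            if a.team != b.team:
                shared = a.disables & b.requires
                if shared:
                    conflicts.append((a.team, b.team, sorted(shared)))
    return conflicts


# Illustrative plan mirroring the switch example above.
plan = [
    Action("network", "clip the switch", disables={"switch-7"}),
    Action("security", "capture packets from hosts behind the switch",
           requires={"switch-7"}),
]

if __name__ == "__main__":
    for disabling_team, affected_team, assets in find_conflicts(plan):
        print(disabling_team, "blocks", affected_team, "on", assets)
```

Even done on paper rather than in code, the same comparison shows where one team's procedure needs to reference another's.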
An excellent way to spot cohesion and flow issues is to lay out a flow chart of your response document. For a look at an example chart, take a peek at the USAID site (PDF, 118 KB): http://www.usaid.gov/policy/ads/500/545mad.pdf
Specifically, look at section C, Incident Reporting Process Chart.
If your processes and procedures are in disarray, then it would be advisable to fix this issue first, as it has many looming legal implications and your organizational issues will overshadow your incident response handling.
Last but not least, it is important to understand that processes have to be custom fitted to each organization, but there are elements common to all organizations that must first be addressed and forged into the backbone of written processes. This is not a simple or overnight fix, and it can take years to polish and perfect your documented processes.
But in the end, when you truly need them, perhaps you won't find yourself relying on blind luck during an incident as I once did.