
blackfriar

Discere Dev Update

Discere is approaching a usable alpha release. I just need to add support for zip/jar/tar/etc. archives, incorporate my prior PST handler from Black Friar, and then add rudimentary document tagging. Spreadsheets are still a bane: I don’t really know what the point is, since they never come out in a readable format without human intervention. I’m wondering about rendering them from an HTML table extraction.

I thought I would just post the above dev log update since I have so little time to blog right now. As a quick explanation, Discere is a part of my dissertation work, and is a subset of my Black Friar project. Discere is intended to be an eDiscovery / document review system with robust index/search capabilities, while Black Friar is the overarching project linking digital forensic acquisition / preservation data into a reviewable / producible format. All of it relies on a substantial number of open source projects, combining existing, robust code into a tool usable by anyone. It is also cross-platform to address the increasing number of Mac and Linux systems found in the legal environment (trust me, half of my law school classmates are on Macs right now, and I am noticing an increase in usage among the clients I do expert work for).
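As a rough illustration of the archive support mentioned in the dev log, here is a minimal sketch in Python that lists the files inside a zip/jar or tar archive so each one could be handed to an indexer. It is not Discere's actual code: the function name `list_archive_members` is hypothetical, and it uses only the standard library `zipfile` and `tarfile` modules.

```python
import tarfile
import zipfile


def list_archive_members(path):
    """Return the names of regular files inside a zip/jar or tar archive.

    Hypothetical helper for illustration only; directories are skipped.
    """
    # A jar is a zip with a manifest, so the zip check covers both.
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return [info.filename for info in archive.infolist()
                    if not info.is_dir()]
    # tarfile.open detects gzip/bzip2/xz compression transparently.
    if tarfile.is_tarfile(path):
        with tarfile.open(path) as archive:
            return [member.name for member in archive.getmembers()
                    if member.isfile()]
    raise ValueError(f"not a recognized zip/jar/tar archive: {path}")
```

A real handler would also have to deal with nested archives, password-protected files, and very large members, none of which this sketch attempts.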

AAFS 2010 – Changes in Approach to Scalability in Digital Forensic Analysis

This year I attended the American Academy of Forensic Sciences (“AAFS”) conference in Seattle and presented in the digital and multimedia section. The following post is a summary of the oral presentation along with my slide set.

For those who do not know, I hold an M.S. in computer science with a concentration in Information Assurance. I am presently a Ph.D. student in Engineering and Applied Sciences at the University of New Orleans. I expect to be ABD by fall of this year when I start law school as a J.D. student. Professionally, I work for a litigation support vendor in New Orleans dealing primarily with the civil side of digital forensics, eDiscovery, and other related areas. Having one foot in academia and the other in industry gives me a somewhat unique perspective on the field.

One cannot begin a discussion of future trends and the need for new approaches without first examining the current state. At present the field has three main phases of practice: acquisition, analysis, and reporting. Acquisition originated in dead acquisition, where the system is powered off and the data storage medium, such as a hard drive, is imaged byte-for-byte to produce an exact duplicate. The duplicate is hashed for later verification after analysis is complete. In a more modern twist, live acquisition involves acquiring data from a system while it is still running. Live acquisition allows for preserving more ephemeral data such as memory dumps, active network connections, logged-on users, running programs, etc., which would otherwise be lost in powering the system down for dead acquisition. Live acquisition risks triggering anti-forensics tools, malicious commands from still-logged-in users, and damage to the system state.
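To make the hash-then-verify step concrete, here is a minimal sketch in Python. It assumes the image is an ordinary file on disk and uses SHA-256; the function names `hash_image` and `verify_image` are hypothetical, and which hash algorithm a given lab or tool uses is not something the talk specifies.

```python
import hashlib


def hash_image(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash an image file in fixed-size chunks so it never has to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path, expected_hex, algorithm="sha256"):
    """Re-hash the image and compare against the digest recorded at acquisition."""
    return hash_image(path, algorithm) == expected_hex.lower()
```

The digest would be recorded at acquisition time and checked again once analysis is finished; a mismatch means the working copy no longer matches what was acquired.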

DC3 2009

The results for the 2009 challenge are due in 6 days. This year there were 1153 entries with 44 submissions, a slightly lower rate of return than last year. The challenge format was different this year. Last year’s format was a set of discrete problems at various levels of difficulty with some of the higher difficulty problems being more complex forms of the lower problems. This year the challenge was a simulation. We received a case file with information from the investigators and a type of work order for what we were to investigate. The challenge data was a single hard drive image from a system used by the suspect.

Evidence was located in a variety of places, from simple chat logs to the Windows registry. There were some red herrings along the way, including files from previous years, but all in all it was a decent challenge. Some of the documents felt rushed, such as the case file still having track changes enabled, but given the difficulty in constructing believable simulations I cannot call the DoD to task overly much.

Below the fold is our primary report for the challenge we submitted earlier in the month. The full report including the registry report, the evidence files, and so forth will likely be released when the results are announced as they were last year. If DC3 does not release them, I will post a copy for download if anyone is interested.


DC3 – Digital Forensics Challenge 2009

Team NSSAL met Tuesday to digest the DC3 challenge packet for 2009. The 2008 challenge was a more structured series of well-defined tasks split up into categories and difficulty levels; the 2009 challenge is set up to mimic a real investigation. We were provided with some documentation regarding seized evidence, and an affidavit submitted to obtain the warrant. The scenario centers around an individual purporting to be highly skilled at hiding his data, so we are preparing to encounter all the techniques from last year’s challenge in the current one, but in a realistic application scenario. Thankfully we have a belt full of tools we used (and some we built) last year to meet the challenge head on.

I will not be blogging about any specifics of our findings while the challenge is going on, but do expect a full write-up at the end. I did want to mention I will be using the early alpha builds of Black Friar in the course of the challenge. I have indexed the drive image we were provided with using Black Friar, and from some initial triage it looks to be working quite well. Expect specific details on how it performed in the challenge when the time comes.

What is Black Friar?

It has been a while since I posted, but I’ve made some headway with my pet project that I wanted to post about. Most modern forensics tools follow a single-system, sequential paradigm when it comes to analysis. These tools, while very powerful, were not designed to scale. There is another market dealing with similar issues of scale which forensics can learn from.

Since 2006 there has been enormous growth in the Electronic Discovery or eDiscovery market. In a nutshell, eDiscovery is the legal concept and practice of discovery but with electronic documents instead of paper ones. A lot of the industry and its practices are smoke and mirrors in my opinion. There are a lot of acronyms, jargon, and puffed-up resumes complicating a fairly simple process: documents are (sometimes) indexed, (usually) converted to a PDF or TIFF format for printing, and reviewed to determine if they are relevant and if they have to be turned over. The conversion to PDF or TIFF is actually done via a print driver, which is horrendously time-consuming since it spools to the hard drive, resulting in massive I/O bottlenecks. Some of the EDD (electronic data discovery) software tries to get around this by using primitive clustering setups, mostly involving network shares, to speed up the process.
