First 2012 Discere Dev Update
My work on Discere continues. PST files have been supported for a while now, but Lotus Notes is the other half of the corporate email coin. Unfortunately, there are no cross-platform open source libraries that support Lotus Notes NSF files in pure Java. Instead, I am using the IBM Domino API, which requires Domino Designer to be installed and therefore makes this a Windows-only solution. This is not optimal, as I’ve been aiming for complete platform independence, but it may be a necessary evil for now.
As with my test metrics for PST files, the speed of converting directly from the native format to PDF, instead of going through PDF/TIFF printer drivers and the like, is very promising. The test file I am developing against converted ~3.5k emails to PDF in less than 2 minutes. Attachments are not handled yet, but that will be fixed: they will be handled in their native format, just as Discere already handles attachments in PST files.
Discere Dev Update
Discere is approaching a usable alpha release. I just need to add support for zip/jar/tar/etc. archives, incorporate my prior PST handler from Black Friar, and then add rudimentary document tagging. Spreadsheets are still a bane. I don’t really know what the point is, since they never come out in a readable format without human intervention; I’m wondering about rendering them from an HTML table extraction.
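To give a rough idea of what the archive support mentioned above could look like, here is a minimal sketch using only the standard java.util.zip classes. It is not Discere's actual code: the class name ArchiveWalker, the method listEntries, and the "!/" separator for nested paths are all made up for illustration. It only covers zip and jar files (a jar is a zip underneath); tar is not in the Java standard library and would need a third-party library such as Apache Commons Compress.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper (not part of Discere): lists the files inside a zip/jar stream. */
final class ArchiveWalker {

    private ArchiveWalker() {
    }

    /**
     * Returns the names of all non-directory entries, in archive order.
     * Entries that are themselves .zip or .jar files are opened recursively,
     * and their contents are reported as "outer.zip!/inner-name".
     */
    static List<String> listEntries(InputStream in, String prefix) throws IOException {
        List<String> names = new ArrayList<>();
        // Deliberately not closed here: closing this wrapper would also close
        // the caller's stream, which breaks the recursion for nested archives.
        ZipInputStream zip = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            if (entry.isDirectory()) {
                continue; // directories carry no document content
            }
            String path = prefix + entry.getName();
            String lower = entry.getName().toLowerCase(Locale.ROOT);
            if (lower.endsWith(".zip") || lower.endsWith(".jar")) {
                // While positioned on an entry, the stream yields that entry's
                // bytes, so it can be read as a nested archive directly.
                names.addAll(listEntries(zip, path + "!/"));
            } else {
                names.add(path);
            }
        }
        return names;
    }
}
```

Calling ArchiveWalker.listEntries(stream, "") on an open archive stream returns a flat list of paths, which a real handler would presumably replace with a callback that hands each entry's bytes to the matching document converter. The caller stays responsible for closing the stream it opened.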
I thought I would just post the above dev log update since I have so little time to blog right now. As a quick explanation, Discere is a part of my dissertation work, and is a subset of my Black Friar project. Discere is intended to be an eDiscovery / document review system with robust index/search capabilities, while Black Friar is the overarching project that turns digital forensic acquisition / preservation data into a reviewable / producible format. All of it builds on a substantial number of existing, robust open source projects, combining them into a tool usable by anyone. It is also cross-platform, to address the increasing number of Mac and Linux systems found in the legal environment (trust me, half of my law school classmates are on Macs right now, and I am noticing an increase in usage among the clients I do expert work for).