Preserving email integrity in eDiscovery

I was recently asked: can eDiscovery systems prove the integrity of the email that they discover? If so, how?

Email differs from loose files in that emails are stored within a database such as Microsoft Exchange or IBM Lotus Notes.  Loose files are typically hashed using a mathematical algorithm, which can show that the data on a system and the data captured are the same, but a message in a mailbox does not directly correlate to a single file that could be hashed.  In some cases a mailbox is contained in a single file, such as a PST produced by an export or a Lotus Notes mailbox database when mailbox segmentation is implemented, but most often one or more databases contain the messages for many different users.  These databases are binary files that cannot be read in a text editor.  Furthermore, many systems deduplicate their databases so that multiple copies of an email are stored as one copy, with the other instances held as pointers to it.
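To make the loose-file case concrete, here is a minimal sketch of the hashing comparison in Python. The choice of SHA-256 and the function names are my own assumptions for illustration; a real collection tool may use a different algorithm or workflow.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path, collected_path):
    """True when the source file and the collected copy hash identically."""
    return sha256_of_file(source_path) == sha256_of_file(collected_path)
```

If the two digests match, the collected copy is byte-for-byte the same as the source. This works for a loose file or an exported PST, but not for a single message sitting inside a shared mail database.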

So, to answer the question, systems that harvest email must interface with the email system database.  They preserve the integrity of email by querying the database for mailbox metadata such as the byte count and message count.  This data is then compared to what is retrieved from the database to ensure that all relevant messages were retrieved and that the byte counts match.  Additionally, each message contains a header that identifies the source, destination, timestamp, sender, recipient, and the path the email followed from source to destination.  This information is also helpful in validating the email, and it can be compared to other copies of the message to confirm the content.  For example, a recipient's message could be compared to the version on the sender’s machine, or a message sent to multiple recipients could be compared to the messages retrieved from each recipient's mailbox.
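The two checks described above can be sketched in Python as follows. This is only an illustration under my own assumptions: the expected counts are taken to come from the mail server's mailbox metadata, the messages are assumed to be available as raw RFC 5322 bytes, and the function names and the set of headers compared are hypothetical choices, not how any particular eDiscovery product works.

```python
from email import message_from_bytes
from email.policy import default

# Headers assumed to stay the same between the sender's and recipients' copies.
COMPARED_HEADERS = ("Message-ID", "From", "To", "Date", "Subject")


def verify_collection(expected_count, expected_bytes, raw_messages):
    """Compare mailbox metadata against what was actually retrieved.

    Returns a list of human-readable discrepancies (empty when all is well).
    """
    problems = []
    actual_count = len(raw_messages)
    actual_bytes = sum(len(raw) for raw in raw_messages)
    if actual_count != expected_count:
        problems.append(f"message count: expected {expected_count}, got {actual_count}")
    if actual_bytes != expected_bytes:
        problems.append(f"byte count: expected {expected_bytes}, got {actual_bytes}")
    return problems


def header_fingerprint(raw_message):
    """Extract the headers used to match copies of the same message."""
    message = message_from_bytes(raw_message, policy=default)
    return tuple(str(message.get(name, "")).strip() for name in COMPARED_HEADERS)


def same_message(raw_a, raw_b):
    """True when two raw messages agree on every compared header."""
    return header_fingerprint(raw_a) == header_fingerprint(raw_b)
```

A recipient's copy usually carries extra Received lines added along the delivery path, so the sketch compares a fixed set of headers rather than the whole header block.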

Lastly, errors can occur in collecting emails so it is important for any system to include a log of errors and irregularities and any changes that are made as emails are processed through the system.  For example, some characters may not be processed or video and music files may be excluded in a review tool.  These items are noted in the log and sometimes placed as notes attached to messages so that there is no confusion in the review process.
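As a rough sketch of such a log, the Python below records each irregularity against a message identifier and can return the notes for one message. The class name, field names, and the example issue text are all hypothetical and only meant to show the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProcessingLog:
    """Collects errors and irregularities noticed while processing email."""

    entries: list = field(default_factory=list)

    def record(self, message_id, issue):
        """Add a timestamped entry describing an issue with one message."""
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "message_id": message_id,
            "issue": issue,
        }
        self.entries.append(entry)
        return entry

    def notes_for(self, message_id):
        """Return the issues logged for a message, e.g. to attach as notes."""
        return [e["issue"] for e in self.entries if e["message_id"] == message_id]
```

A reviewer could then call notes_for on a message identifier to see, for instance, that an attachment was excluded from the review set.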

Gmail Improvements

In recent months I have submitted a variety of suggestions for
improvement to the Gmail system. I figured I would blog about those
suggestions to see if others would benefit from them as well.

1. I find myself using Gmail for everything. The Gmail search
features are wonderful, but I can only search back to the point when I
first opened the account a few years ago. I have many archived
messages that I have received over the years at other accounts that I
would love to import into Gmail so that they could be searched as
well. Mine are stored as Outlook .pst files. I suggested that Gmail
support a variety of message import methods so that others could truly
migrate over to Gmail without ever having to rely on another current
or previous email account.

2. Security is more important than ever and it is a fascination of
mine. It has been possible for quite some time for users to digitally
sign the emails that they send. I suggest that Google set up a
certificate server and distribute digital certificates to Gmail
account holders who present information to prove their identity. The
process would be optional. These digital certificates could be used
to sign emails and Google Talk chat sessions so that the other user is
assured that they are talking with the person they think they are
talking with. Filters could also be set up in Gmail to allow
digitally signed messages or treat unsigned messages with heavier
security measures. Since Google is so large and they have many Gmail
users, it would be very feasible for them to set up their certificate

3. Gmail allows the use of groups in sending messages. Many other
providers have offered this in the past, but I was happy to see the
feature in one of the last few updates from Gmail. One thing annoys
me about it though. I would like to filter messages and apply labels
based on group membership but I cannot. I am forced to create filters
individually for each email account. This is time-consuming and it
also requires me to edit my filters if membership in a group changes. I
suggested that Gmail allow groups to be used in filters as well as

Down with the SPAM King

Alan Ralsky, the "SPAM King" and one of the largest spammers in the world, was jailed by the Department of Justice last month. The Detroit News said the following:
"Warrants unsealed last week revealed that agents
in September seized computers, laptops, financial records and disks
from the 8,000-square-foot home of Alan M. Ralsky. The $750,000 West
Bloomfield mini-mansion was built off profits from the 100 million
electronic offers for everything from Botox to mortgages that Ralsky
sends every day."  

It is rumored that he will rat out many other spammers and hackers as well. 80% of the SPAM you receive is sent from large spammers such as Alan Ralsky. Alan Ralsky has been spamming since 1997. Ralsky sent out millions of messages per year and also hosted many other spammers. He began using dial-up accounts and then moved to setting up dummy ISPs. He bought his own IP space from ARIN and spammed using an address from his address space until many complaints would cause him to switch to a new address in the space. Later, he moved his operations to China to attempt to avoid US authorities. He hosted websites on the same dial-up connections he used to send SPAM. He then used an auto-updating DNS server to point to a new IP address whenever one of the dial-up (DUN) providers cut him off. Ralsky also hosted quite a bit of the spammed website content on servers in the US. He used a VPN connection to route the traffic from the Chinese IP addresses back to his systems in the US.

Here is an interview about it with the hacker "Memehacker".

Posting blogs to MSN Spaces via email

I am happy about the new features added to MSN Spaces, but I am unhappy that I had to reset my profile information. So far none of my book pictures show up from Amazon. The latest feature I am taking advantage of is the email posting to my blog. Now I can just type entries in my email and have them posted immediately. This is so much easier for me. Maybe you will see more posts from now on and maybe not.

Google Gmail advertising, privacy and other considerations

I was using Gmail the other day and I noticed that the ads kicked in. I did not have ads on my account for a long time. They are not annoying or anything. They sit on the right side of the screen and they are related to the content of your emails. This caused some controversy when Google first said they would target ads based on email content because people did not want their messages scanned.

It does not bother me. I certainly want to protect my privacy, but email is all sent in plain text unless you use PGP or some other encryption. The main thing is to remember what to really keep in email. Email is not a storage system; it is a communication method. As long as they do not profile me and then sell that profile, I am fine with it. I actually clicked on one of the advertising links too. It was not an accident. I noticed the ads and then one of them was for information on the Shinkansen so I went there. I was just thinking of looking for more information on the trains and this time I did not even have to search for it. I doubt I will be “using” the ads often but at least they are relevant and not disturbing. Yahoo’s ads are terrible and Hotmail’s are flashy. These are just text.