Personal computer archaeology

Posted on 2009-10-23 19:10

Reviewing the massive amount of data I've accumulated and thinking about organizing it a bit.

I gave a little thought to how I might proceed with my journal printing project, and I've spent a little time poking through the mess that is my digital life. Piecing together a log of what you did and thought about ten years ago from disparate sources like emails, notes, documents and photos - even when you can be sure of the date on the files - is a lot like doing archaeology. I've started peeling away the dirt, finding clues in snippets of text files and bookmark backups, things I wrote and things written to me. The information abounds, but it's all disjointed and disconnected, located in folders here and there, and no simple algorithm exists to pull it all together and easily create a cohesive story. The job seems monumental even if I only aim to create a simple linear log of entries.

I am starting to form an idea of what I might like out of the project - a yearbook and scrapbook of sorts. Journal-like entries in chronological order, illustrated with photos and sidebars of interesting information. Just looking at the stuff on my main WordPress blog turned up a potential 1,300 pages (in a 6"x9" format), and this blog is only one source of writings! Faced with this huge amount of data, my first thought was to split it up by year, so I started creating directories named by year where I could store anything I found.

Old emails

I started going back through old emails, beginning with the year 2000, and saving anything from that year that was even remotely like a journal entry into that year's folder. There are lots of these in email, especially since by that time I was the only one of my immediate family still living here, and we corresponded regularly in this way. Going through the emails from my parents gave me the idea of eventually putting together a similar book of those letters. I'm pretty sure I want to keep this project journal-like and not include the writings of others, but as we'll see, that's not easy. Emails are where I'm finding the most personal journal-like pieces, but they aren't like writing for myself as I used to do in my old handwritten journals - obviously they are written to someone else. It's really been amazing reading some of that stuff again, and it's only given me more interest in going ahead with this silly project.
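A first pass like this could probably be scripted. Here is a minimal sketch in Python, assuming the old mail can be exported to an mbox file (that is an assumption, as is everything named here: `sent_in_year` and `old-mail.mbox` are made up for illustration). It only narrows the pile down by year; deciding what reads like a journal entry still takes a human.

```python
import mailbox
from email.message import Message
from email.utils import parsedate_to_datetime
from typing import Iterable, Iterator


def sent_in_year(messages: Iterable[Message], year: int) -> Iterator[Message]:
    """Yield the messages whose Date header falls in the given year."""
    for msg in messages:
        raw_date = msg.get("Date")
        if raw_date is None:
            continue  # undated mail has to be sorted by hand
        try:
            sent = parsedate_to_datetime(raw_date)
        except (TypeError, ValueError):
            continue  # unparseable Date header: skip rather than guess
        if sent.year == year:
            yield msg


# Hypothetical usage, assuming an mbox export exists at this path:
#
#     box = mailbox.mbox("old-mail.mbox", create=False)
#     for msg in sent_in_year(box, 2000):
#         print(msg.get("Date"), "|", msg.get("Subject"))
```

The year is taken from the Date header as the sender wrote it, so a message sent late on December 31 in another time zone lands in whatever year the sender's clock said.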

Websites I've run

I've run several different websites since that time, so I started looking through the old backups of these sites (Blosxom, Yabb, Geeklog) for anything relevant as well. A lot of the posts on the Geeklog site made it into the current site, but only a few of those on the Yabb forum did (I'm guessing that was because the other stuff wasn't worth saving, but I feel compelled to look anyway). While the topics on the blog are not overly personal, reader feedback brought out responses from me which wouldn't otherwise have made it into a journal - things I didn't think about when I wrote the original post, or defenses of my position. This stuff can be interesting sometimes, but I'm torn about how to deal with it - I don't really want to have readers' comments in a journal, but I can't see how to avoid it if I want to include my own responses to them. How to present them is another question (i.e., the style of the text - what should a commenter's text look like?).

Documents and files

I started doing general searches on the hard drive for any files created in that year - this gets complicated because of my move from Linux to a Mac in 2002. While not a fault of the Mac, when I moved over I wasn't careful to preserve file attributes, and as a result most of the older files carry the date of the move - October 2002. Emails are dated and don't have this kind of problem, but documents (unless I bothered to date them in the document itself) are going to be a problem. Then I realized I had backups of most documents from before the move, but even though most of my backups from that era fit on a couple of CDs, it's still a vast amount of stuff to go through. Searching by date narrows the stuff down, but I still have to eyeball things to find out what's worth printing.
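The date search itself is easy to script. A rough sketch in Python, assuming the modification time is the only date the filesystem offers and treating anything stamped October 2002 as suspect, since that is the date of the move rather than the date of writing (`group_by_year` and `SUSPECT_MONTH` are names invented here):

```python
import os
from collections import defaultdict
from datetime import datetime
from pathlib import Path

# Files stamped with the month of the move probably lost their real dates.
SUSPECT_MONTH = (2002, 10)


def group_by_year(root):
    """Walk root and sort files by the year of their modification time.

    Returns (by_year, suspect): by_year maps a year to a list of paths,
    and suspect lists the files dated in SUSPECT_MONTH.
    """
    by_year = defaultdict(list)
    suspect = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                mtime = path.stat().st_mtime
            except OSError:
                continue  # broken link or unreadable file: leave it out
            modified = datetime.fromtimestamp(mtime)
            if (modified.year, modified.month) == SUSPECT_MONTH:
                suspect.append(path)
            else:
                by_year[modified.year].append(path)
    return dict(by_year), suspect
```

Run against the hard drive, the suspect list would be the pile that needs dating from its contents or from the pre-move backups; run against the backup CDs, the years should be closer to the truth, provided the backups kept their timestamps.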

Projects and photos

I've always been creating things on the computer, be it tape case covers or business cards, Christmas cards or invitations, programming or scripting projects, plans for websites, 3D or CAD work, stuff done for games I was into at the time, lists of songs I liked, songs I was writing or learning, etc. It would be cool to include references to some of this stuff in a journal - but of course there would have to be some kind of caption or explanation of what it is and why I included it. The question also arises as to how to present some of these kinds of things. Doing that with only contemporary docs (things I wrote then) might prove difficult if not impossible - dropping in an image alongside an email or a journal entry that discusses that project, for instance. Photos are another class of item entirely. Printing the journal would be very expensive if I tried to do a full color piece, though I'd hate to have a black and white print end up as the only extant copy of an image that is really important to me one day. Also, if I go with a black and white print (the most likely case), the images I choose would have to look good in black and white. It seems stupid to not include some photos, but including them will make this so much more work.