A happy place for all the sarcastic displaced Kiwis of the world.
Tuesday, November 27, 2007
Onward and Upward with the Arts: Digitization and its discontents
I have a personal dream: to digitize my life. Words, images, audio and video of everything surrounding me and my family. At first glance it seems doable: I have copies of almost every letter I've written and received. Over 28 years of writing (fanzines, articles, books, pithy little letters of comment to other publications) is all filed away in a semi-archival state. Photos and videos I've taken, and those of my parents, are faithfully stored away and are gradually being scanned; I think I crossed the 50% mark recently. I have also kept cassette and video tapes of nearly every appearance I have made on the radio or on TV. Lots of embarrassing moments preserved for all posterity. Finding film and video remnants of the rest of my family is proving more difficult. Apparently I'm the only person (so far) interested in preserving this stuff.
A few years back I started cataloguing and sorting all my father's and late mother's possessions prior to his move into a retirement home. This provided a clear insight into where I had acquired my habits. My mother and father had kept over 50 years' worth of family correspondence, including their letters of courtship and my father's professional letters of reference from the mid-1930s. Combined with the letters from our family that my late grandparents had saved, I had both sides of a 25-year conversation that began after my parents emigrated from London to distant New Zealand in the mid-1960s.
This veritable treasure trove of the smallest details of my family's life is invaluable in supplementing my patchy (and growing patchier) memory. And there's the rub of this digital desire: is it better to recollect the past through direct memories, or to replace them and layer on top the actual as-it-happened detail from these easily accessed digital records? As the onset of my late father's Alzheimer's made painfully clear, a memory backup is always useful.
In an exhaustive review of the history of libraries and the rise of aids for quickly locating the items stored within them, the New Yorker weighs the joys of handling (and smelling) original books against what some would call the sterile environment of an always-reachable digital library containing everything ever published.
Putting aside the obvious benefits of bringing literacy and knowledge to the poorest parts of the world, which lack access to libraries, there's some merit to the argument for rethinking parts of, say, Google's desire to scan and digitize the world's knowledge. However, on balance I say scan, scan, scan, and sort out the aesthetics of interacting with digital materials later. There's always scratch-and-sniff technology still to be embedded into printed-on-demand 500-year-old books, right?
Thursday, May 24, 2007
Great Idea, Fight Spam and Digitize Books
Every now and then a great idea is announced that makes you slap your head and exclaim "why didn't I come up with that!" If you have ever run Optical Character Recognition (OCR) on a scanned document and had to fix the words that the OCR software couldn't decipher, then you know firsthand the limitations of document scanning. Now combine this thought with the number of times each day you identify some weird distorted combination of letters and numbers to prove to a Web application that you are a real person. That test is known as a CAPTCHA.
A group of Carnegie Mellon students has developed a clever and elegant solution to the OCR problem: a simple Web service called reCAPTCHA that replaces the standard CAPTCHA with an "unknown" word from a real-life OCR scanning project. In this case the unknown words come from the book scanning project run by the Internet Archive.
They have also made it very easy for anyone to incorporate this new tool into their own Web projects, including obfuscating e-mail addresses on Websites, automating free sign-up processes and other means of authenticating users.
It's been estimated that 60 million CAPTCHAs are solved each day. If only a fraction of those were replaced with reCAPTCHAs, that's a lot of OCR mistakes being fixed with very little effort.
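For the curious, here is a rough Python sketch of how the idea could work. It is a toy model under my own assumptions, not the actual reCAPTCHA service or its API: I'm assuming each challenge pairs the unknown word with a "control" word whose answer is already known, and that an unknown word counts as solved once a few people agree on it. The class name, method names and the vote threshold are all made up for illustration.

```python
from collections import Counter, defaultdict


class WordPairChallenge:
    """Toy model of the paired-word idea; not the real reCAPTCHA service."""

    def __init__(self, votes_needed=3):
        # Hypothetical threshold: how many matching answers settle a word.
        self.votes_needed = votes_needed
        # Maps an unknown word's id to a tally of what people typed for it.
        self.votes = defaultdict(Counter)

    def submit(self, control_answer, typed_control, unknown_id, typed_unknown):
        # The control word has a known answer, so it alone decides
        # whether the user passes the "are you human" test.
        if typed_control.strip().lower() != control_answer.lower():
            return False
        # A passing user's reading of the unknown word counts as one vote.
        self.votes[unknown_id][typed_unknown.strip().lower()] += 1
        return True

    def resolved(self, unknown_id):
        # Return the agreed transcription once enough users agree, else None.
        tally = self.votes.get(unknown_id)
        if not tally:
            return None
        word, count = tally.most_common(1)[0]
        return word if count >= self.votes_needed else None
```

A site would call submit() for each attempt and check resolved() from time to time to harvest the words the OCR software gave up on. Answers from people who fail the control word are thrown away, which is what keeps the spammers from polluting the transcriptions.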
Sunday, November 13, 2005
My Dream Machine
"Please, please, I've been "so" good this past year." So what if it only costs $120,000? It is the archivist's wet dream. When we win the lottery this is one of the first things I'm buying.
The APT BookScan 1200™ automagically scans up to 1200 pages per hour of bound books and other documents. This is twelve times faster than a room full of Chinese prison labourers.
Feel free to order one on my behalf ...
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.