Musings
A happy place for all the sarcastic displaced Kiwis of the world.
Thursday, May 24, 2007
Great Idea, Fight Spam and Digitize Books
Every now and then a great idea is announced that makes you slap your head and exclaim, "Why didn't I come up with that?" If you have ever run Optical Character Recognition (OCR) on a scanned document and had to fix the words that the OCR software couldn't decipher, then you know firsthand the limitations of document scanning. Now combine this thought with the number of times each day you identify some weird, distorted combination of letters and numbers to prove to a Web application that you are a real person. That test is known as a CAPTCHA.

A group of Carnegie Mellon students has developed a clever and elegant solution to the OCR problem: a simple Web service called reCAPTCHA, which replaces the standard CAPTCHA with an "unknown" word from a real-life OCR scanning project. In this case the unknown words come from the book scanning project run by the Internet Archive. They have also made it very easy for anyone to incorporate this new tool into their own Web projects, including obfuscating e-mail addresses on Web sites, protecting free sign-up processes from automation, and other means of authenticating users.

It's been estimated that 60 million CAPTCHAs are solved each day. If only a fraction of those were replaced with reCAPTCHAs, that would be a lot of OCR mistakes fixed with very little effort.

Labels: authentication, scanning, Web Services
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.