reCAPTCHA
From Wikipedia, the free encyclopedia
reCAPTCHA is a system that uses CAPTCHAs to help digitize the text of books. It takes scanned words that optical character recognition (OCR) software has been unable to read and presents them to humans to decipher as CAPTCHA words.
Operation
To verify that humans have deciphered these previously unreadable words correctly, two words are displayed: one that OCR software has been unable to read, and one that several other human users have already identified. If the user correctly types the already-identified word, it is assumed that they were also correct about the new word.[1][2]
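The two-word check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's actual implementation: the class names, the `unknown_id` identifier, and the `min_votes` threshold are all hypothetical, and the real service may normalize and weigh answers differently.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Optional


def normalize(text: str) -> str:
    """Compare answers case-insensitively and ignore surrounding spaces (assumed policy)."""
    return text.strip().lower()


@dataclass
class Challenge:
    """One known word plus one word that OCR could not read."""
    control_answer: str  # word already identified by earlier users
    unknown_id: str      # hypothetical identifier of the unread scanned word


@dataclass
class WordStore:
    """Collects human guesses for unread words."""
    votes: Dict[str, Counter] = field(default_factory=dict)

    def grade(self, challenge: Challenge, control_guess: str, unknown_guess: str) -> bool:
        # The user passes only if the known word is typed correctly.
        if normalize(control_guess) != normalize(challenge.control_answer):
            return False
        # A correct known word is taken as evidence that the guess for
        # the unread word is also correct, so it is recorded as a vote.
        counter = self.votes.setdefault(challenge.unknown_id, Counter())
        counter[normalize(unknown_guess)] += 1
        return True

    def consensus(self, unknown_id: str, min_votes: int = 3) -> Optional[str]:
        # Accept a reading once enough users agree on it (threshold is an assumption).
        counts = self.votes.get(unknown_id)
        if not counts:
            return None
        word, n = counts.most_common(1)[0]
        return word if n >= min_votes else None
```

Once a reading reaches the assumed threshold, the formerly unread word could itself serve as the known word in later challenges, which matches the description of known words as ones that several users have already identified.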
reCAPTCHA tests are served from the central site of the reCAPTCHA project,[3] which supplies the unread words. The test is displayed through a JavaScript API, and the website's server makes a callback to reCAPTCHA after the user's request has been submitted. The reCAPTCHA project provides libraries for various programming languages and applications to make this process easier. reCAPTCHA is a free service, except for users who would require a prohibitive amount of bandwidth.[4]
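The server-side callback might look roughly like the following Python sketch. The form field names (`privatekey`, `remoteip`, `challenge`, `response`), the line-based `true`/`false` reply format, and the caller-supplied `verify_url` are assumptions made for illustration; the project's own libraries and documentation are the authority on the real interface.

```python
from typing import Callable, Tuple
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# A transport takes (url, form-encoded body) and returns the reply text.
Transport = Callable[[str, bytes], str]


def http_post(url: str, body: bytes) -> str:
    """Default transport: a plain HTTP POST using the standard library."""
    with urlopen(Request(url, data=body), timeout=10) as resp:
        return resp.read().decode("utf-8")


def verify_submission(
    transport: Transport,
    verify_url: str,   # supplied by the caller; not hard-coded here
    private_key: str,  # the site's secret key (assumed to be required)
    remote_ip: str,    # address of the user who solved the test
    challenge: str,    # token identifying which words were shown
    response: str,     # what the user typed
) -> Tuple[bool, str]:
    """Ask the verification service whether the user's answer was accepted."""
    fields = {
        "privatekey": private_key,
        "remoteip": remote_ip,
        "challenge": challenge,
        "response": response,
    }
    reply = transport(verify_url, urlencode(fields).encode("ascii"))
    lines = reply.splitlines()
    # Assumed reply format: first line "true" or "false", second line an error code.
    ok = bool(lines) and lines[0].strip() == "true"
    error = lines[1].strip() if (not ok and len(lines) > 1) else ""
    return ok, error
```

Passing the transport as a parameter keeps the sketch testable without network access; a site would pass `http_post` together with the verification URL given in the project's documentation.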
reCAPTCHA has the same goal as Distributed Proofreaders (DP), although DP uses conventional proofreaders.
Notes
- [1] "May 24: Carnegie Mellon Project Boosts Book Digitization Efforts", Carnegie Mellon University. Retrieved on 2007-06-23.
- [2] "Spam weapon helps preserve books", BBC news report by Paul Rubens, 2007-10-02.
- [3] The reCAPTCHA project, part of the Carnegie Mellon School of Computer Science at Carnegie Mellon University.
- [4] http://recaptcha.net/faq.html
External links
- The reCAPTCHA project

