Transcript of Arabic reCAPTCHA
Prof. Dr. Slim Abdennadher
M.Sc. TA Mohamed Khamis
What is CAPTCHA & RECAPTCHA?
RECAPTCHAs are used to digitize books
* 100 million CAPTCHAs are typed every day
* Each CAPTCHA takes 10 seconds to fill in
Can we use this effort for the good of humanity?
How can we trust the user?
A RECAPTCHA shows two words: one is already known to the system, and the other is unknown
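The two-word idea can be sketched as follows: the known word decides whether the user is trusted, and only then is the user's answer for the unknown word counted as a vote. This is a minimal sketch, not the project's actual code; the names `check_answer`, `accepted_transcription`, and the threshold of 3 agreeing answers are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store: unknown-word image id -> votes per transcription.
votes = defaultdict(Counter)

# Assumed number of agreeing answers needed before a transcription is accepted.
AGREEMENT_THRESHOLD = 3


def check_answer(known_expected, known_typed, unknown_id, unknown_typed):
    """Return True if the user typed the known word correctly.

    Only in that case is the answer for the unknown word recorded as a vote.
    """
    if known_typed.strip() != known_expected:
        return False
    votes[unknown_id][unknown_typed.strip()] += 1
    return True


def accepted_transcription(unknown_id):
    """Return the most voted transcription once it reaches the threshold, else None."""
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= AGREEMENT_THRESHOLD else None
```

A user who fails the known word is rejected and contributes nothing; users who pass it gradually build agreement on the unknown word.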
Example from a Book
Top Arabic OCR engines
3. APTI Arabic Words
My project (Arabic RECAPTCHA)
Recent events caused the loss of many Egyptian books
Arabic RECAPTCHA is introduced for digitizing Arabic books
4. Word Spotting
1. Sakhr Automatic Reader
ABBYY vs. Tesseract
Tesseract, which is developed by Google, was chosen for
3 main reasons:
1. Its word-level recognition is more suitable for the application.
2. It has no limit on the amount digitized per request, since it is free.
3. It recognizes images with multiple columns and gives better results.
Client side code
* Adding degradations in the form of random dots and lines
* Following the same standards as the English RECAPTCHA
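The degradation step can be illustrated with a small sketch. It assumes a word image is represented as a list of rows of 0 (white) and 1 (black) pixels; a real implementation would more likely use an imaging library. The function name `degrade` and its parameters are hypothetical.

```python
import random


def degrade(pixels, n_dots=20, n_lines=2, seed=None):
    """Return a copy of a binary image with random dots and lines drawn on it.

    `pixels` is a list of equal-length rows; 0 is white and 1 is black.
    """
    rng = random.Random(seed)
    height, width = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]  # leave the original image untouched

    # Random dots: blacken individual pixels.
    for _ in range(n_dots):
        out[rng.randrange(height)][rng.randrange(width)] = 1

    # Random lines: join a random point on the left edge to one on the right edge.
    for _ in range(n_lines):
        y0, y1 = rng.randrange(height), rng.randrange(height)
        for x in range(width):
            y = round(y0 + (y1 - y0) * x / max(width - 1, 1))
            out[y][x] = 1
    return out
```

Passing a `seed` makes the noise reproducible, which helps when testing; leaving it as `None` gives a different pattern for each challenge.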
The Arabic RECAPTCHA web service receives two HTTP requests:
1. The first request retrieves a new AreCAPTCHA.
2. The second validates the input of the user.
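The two requests can be sketched as two handlers behind a simple dispatcher. This is an illustrative outline only: the paths `/new` and `/validate`, the parameter names, the image file names, and the in-memory stores are all assumptions, and the sketch covers only the known (control) word.

```python
import secrets

# Hypothetical pool of control words: image file -> known transcription.
CONTROL_WORDS = {"word_001.png": "كتاب", "word_002.png": "قلم"}

# Challenges that were issued but not yet answered: challenge id -> expected answer.
_pending = {}


def new_challenge():
    """First request: issue a new challenge and remember its expected answer."""
    image, answer = secrets.choice(list(CONTROL_WORDS.items()))
    challenge_id = secrets.token_hex(8)
    _pending[challenge_id] = answer
    return {"id": challenge_id, "image": image}


def validate(challenge_id, typed):
    """Second request: check the user's input; each challenge can be used once."""
    expected = _pending.pop(challenge_id, None)
    return expected is not None and typed.strip() == expected


def handle(path, params):
    """Dispatch a request path and its parameters to the matching handler."""
    if path == "/new":
        return new_challenge()
    if path == "/validate":
        return {"valid": validate(params.get("id", ""), params.get("answer", ""))}
    return {"error": "unknown path"}
```

Removing a challenge from `_pending` on validation means a captured id cannot be replayed with a second guess.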
The code is divided into 3 components
Sends HTML code
Receives HTML code
1. Preserve the diacritics (formation signs) of Arabic text after digitization.
2. Use an ICR or OCR engine that gives better detection for Arabic.
3. Build a game whose purpose is to classify the words in place of the admin, to save the admin's time.
Word classification process
Different CSS styling colors
Testing Phase output