Copy of open-book-prezi-template
by yasser omer on 1 September 2014

Transcript of Copy of open-book-prezi-template

Abstract
English Optical Character Recognition (EOCR) is a software application that recognizes English characters by applying artificial intelligence, specifically neural network algorithms.

In the past, most companies and organizations relied on paper transactions, which led to a growing number of papers in their archives.

Accessing the desired information takes time and effort.

OCR converts a scanned image into an editable text file.

Off-line English OCR is used for both printed and handwritten text.

It consists of segmenting and recognizing English text in an image.

Introduction
Writing is a way for humans to communicate.
It is used to take notes and record knowledge.
Written text can be represented in two ways:
- Printed text, used in books, newspapers and magazines.
- Handwritten text, found in notes and personal letters.
The overall accuracy of English OCR has become very high, but recognizing individual characters is still difficult because some characters have several forms.
EOCR and ETXR are still under development.
Project idea
Make scanned text easy to use.

Decrease the time and effort of digital transmission.

OCR makes updating and manipulating the text more efficient for the user.

English Characters
The English alphabet consists of 26 characters (5 vowels, 21 consonants).
It is written from left to right.
English script is divided into three fictitious zones.
Each character has a height and a width.
Character height varies, which makes it an important feature for font recognition.
Off-line character Recognition
Intelligent Character Recognition (ICR).
ICR is an advanced form of OCR that can recognize handwriting.
It is performed after the writing or printing is completed.
Recognition is done on the bitmap of machine-printed or handwritten text.
The writing is usually captured optically by a scanner.
It has problems such as the variability in people's handwriting.
On-line character Recognition
The computer recognizes the symbols as they are drawn.
The two-dimensional coordinates of successive points are represented as a function of time, and the order of the strokes made by the writer is also available.
Objectives
Design a system for recognizing off-line text words.

Analyze the currently available systems.

Evaluate the system's accuracy and performance.

Motivation
To preserve digitized documents.

To make documents fully accessible, searchable and processable in digital form.

Knowledge contained in paper is more valuable.

Automatic sorting of postal mail.

Convenient editing of printed documents.

Official communication.

Scope
Our system is "off-line English text recognition".

The dataset supports the English language.

The system is implemented in the Java programming language.

Character recognition is done by an artificial neural network (ANN).

The entire document is captured by a scanner.

Used to recognize text without imaging catography.

The recognized text can be edited in a dedicated interface.

All output is saved to Word text files.

Problem definition
The optical character recognition process translates a scanned image into a machine-editable electronic format. It can be used to convert books, articles and other documents into editable form.

The techniques are computationally demanding and require a good amount of training time to produce good results.

An analytical approach is used to extract the features of English characters.

Image acquisition
The paper document is scanned.

Scanning translates the paper document into an electronic format.

The system can also be given an image that is already in digital form.

The image is stored on the computer in digital form.

The input documents can be typed text.

Pre-processing
Pre-processing aims to produce data on which the character recognition system can operate accurately.

A Gaussian filter is normally used to reduce noise in an image.


- It reduces noise, distortion and skewness of the image.
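
The Gaussian noise-reduction step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the image is already a grayscale array with values 0..255, and the class name GaussianBlur, the 3x3 integer kernel and the border clamping are choices made for this example.

```java
/** Minimal 3x3 Gaussian smoothing for a grayscale image (values 0..255). */
final class GaussianBlur {
    // Integer approximation of a Gaussian kernel (assumed for this sketch); weights sum to 16.
    private static final int[][] KERNEL = {
        {1, 2, 1},
        {2, 4, 2},
        {1, 2, 1}
    };

    /** Returns a smoothed copy of the image; the input is not modified. */
    static int[][] blur(int[][] gray) {
        int h = gray.length;
        int w = gray[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sum = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        // Clamp coordinates so border pixels reuse their nearest neighbour.
                        int yy = Math.min(h - 1, Math.max(0, y + dy));
                        int xx = Math.min(w - 1, Math.max(0, x + dx));
                        sum += KERNEL[dy + 1][dx + 1] * gray[yy][xx];
                    }
                }
                // Divide by the kernel sum; adding 8 first rounds to the nearest integer.
                out[y][x] = (sum + 8) / 16;
            }
        }
        return out;
    }
}
```

A uniform region passes through unchanged, while an isolated bright or dark pixel is spread over its neighbours, which is what suppresses scanner speckle before segmentation.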

Segmentation
This stage segments the document into its subcomponents.


- It separates the different logical parts, such as the lines of a paragraph and the characters of a word.

Types of segmentation:

a) External segmentation.
b) Internal segmentation.

Tracking the difference between black and white areas is a good technique for segmentation.
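
One common way to track the difference between black and white areas is a projection profile: scan the binarized image row by row and treat each run of rows that contain ink as one text line. The sketch below assumes the image has already been binarized into a boolean array (true = black pixel); the class name LineSegmenter is hypothetical and the slides do not say which segmentation algorithm the project used.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a binarized page into text lines using blank rows as separators. */
final class LineSegmenter {
    /** Returns {firstRow, lastRow} (inclusive) for each run of rows that contain ink. */
    static List<int[]> findLines(boolean[][] ink) {
        List<int[]> lines = new ArrayList<>();
        int start = -1; // first row of the line being tracked, or -1 when between lines
        for (int y = 0; y < ink.length; y++) {
            boolean hasInk = false;
            for (boolean pixel : ink[y]) {
                if (pixel) {
                    hasInk = true;
                    break;
                }
            }
            if (hasInk && start < 0) {
                start = y;                          // white-to-black transition: a line begins
            } else if (!hasInk && start >= 0) {
                lines.add(new int[] {start, y - 1}); // black-to-white transition: the line ends
                start = -1;
            }
        }
        if (start >= 0) {
            lines.add(new int[] {start, ink.length - 1}); // line touching the bottom edge
        }
        return lines;
    }
}
```

Running the same scan over the columns of a single line would separate its characters, provided neighbouring characters do not touch.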

Feature Extraction

Character recognition involves analyzing a segmented part of the image and comparing its features against a set of rules, stored in the OCR engine, that distinguish each character from the others.
The selection of a stable and representative set of features is the main part of pattern recognition system design.
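
As an illustration of what a feature set can look like, the sketch below computes zoning features: the character image is divided into a grid and the fraction of black pixels in each cell becomes one feature. Zoning is an assumption made for this example; the slides do not name the features the project actually extracted, and the class name ZoningFeatures is hypothetical.

```java
/** Zoning features: ink density of each cell in a zones x zones grid over the glyph. */
final class ZoningFeatures {
    /** Returns zones*zones values in [0, 1], ordered row by row. */
    static double[] extract(boolean[][] glyph, int zones) {
        int h = glyph.length;
        int w = glyph[0].length;
        double[] features = new double[zones * zones];
        int[] pixelCounts = new int[zones * zones];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Map the pixel to its grid cell by integer scaling.
                int cell = (y * zones / h) * zones + (x * zones / w);
                pixelCounts[cell]++;
                if (glyph[y][x]) {
                    features[cell]++;
                }
            }
        }
        for (int i = 0; i < features.length; i++) {
            if (pixelCounts[i] > 0) {
                features[i] /= pixelCounts[i]; // black pixels divided by all pixels in the cell
            }
        }
        return features;
    }
}
```

Because each value is a ratio, the feature vector has the same length for every glyph regardless of its pixel size, which is convenient as input to a fixed-size neural network.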

Recognition

It is the main stage of an OCR system.
It uses the features extracted in the previous stage to identify the text segment according to preset rules.
It uses neural network algorithms to train on and validate the English letters and symbols.

The algorithms we use are:

1) Perceptron neural network.

2) Auto-associative neural network.

The perceptron is helpful for testing letters that are found (recognized).
The auto-associative network is helpful for testing letters that are not found (not recognized).
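
A single perceptron trained with the standard perceptron learning rule can be sketched as below. This is a generic textbook version, not the project's implementation: the class name, the step activation, and the idea of training one such unit per letter (answering "is this letter X or not") are assumptions for illustration.

```java
/** A single binary perceptron with a step activation. */
final class Perceptron {
    private final double[] weights;
    private double bias;

    Perceptron(int inputs) {
        weights = new double[inputs]; // all weights start at zero
    }

    /** Returns 1 if the weighted sum is positive, otherwise 0. */
    int predict(double[] x) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * x[i];
        }
        return sum > 0 ? 1 : 0;
    }

    /** Trains until an epoch has no errors or maxEpochs is reached; returns the epochs run. */
    int train(double[][] samples, int[] labels, double rate, int maxEpochs) {
        for (int epoch = 1; epoch <= maxEpochs; epoch++) {
            int errors = 0;
            for (int n = 0; n < samples.length; n++) {
                int delta = labels[n] - predict(samples[n]); // +1, 0 or -1
                if (delta != 0) {
                    errors++;
                    // Move the decision boundary towards the misclassified sample.
                    for (int i = 0; i < weights.length; i++) {
                        weights[i] += rate * delta * samples[n][i];
                    }
                    bias += rate * delta;
                }
            }
            if (errors == 0) {
                return epoch;
            }
        }
        return maxEpochs;
    }
}
```

For example, given two flattened 3x3 glyphs, a vertical bar labelled 1 and an "L" shape labelled 0, calling train with a rate of 1.0 separates them, after which predict returns the matching label for each. A single perceptron can only learn linearly separable classes.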



Post-processing
It improves recognition by refining the decisions taken by the previous stages, and it recognizes words using context.

It is responsible for outputting the best solution.

Implemented as a set of techniques that rely on character.
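
One simple way to use context at the word level is to replace each recognized word with the closest entry in a dictionary, measured by edit distance. The slides do not describe the project's actual post-processing technique, so the sketch below is only an assumed example; the class name WordCorrector and the word list passed to it are hypothetical.

```java
import java.util.List;

/** Dictionary-based correction of recognized words using Levenshtein distance. */
final class WordCorrector {
    /** Minimum number of single-character insertions, deletions and substitutions. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning an empty prefix into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full table
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the dictionary word closest to the input; the first one wins on ties. */
    static String correct(String recognized, List<String> dictionary) {
        String best = recognized; // returned unchanged if the dictionary is empty
        int bestDistance = Integer.MAX_VALUE;
        for (String word : dictionary) {
            int d = distance(recognized, word);
            if (d < bestDistance) {
                bestDistance = d;
                best = word;
            }
        }
        return best;
    }
}
```

With a dictionary containing "hello", "help" and "world", the misread word "he1lo" is one substitution away from "hello" and is corrected to it. A real system would likely also cap the allowed distance so that unknown words are not forced onto an unrelated dictionary entry.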



OCR "Optical Character Recognition" types.
UML

System Workflow
Workflow diagram of English text recognition
Activity diagram
Tests and Results
Effect of the number of letters on training time:
Results of the perceptron and auto-associative algorithms in the train and test phases for 22 characters.

System Function
Performance diagram of all processes in EOCR.
Plot of the relation between the size of an image and its result in the feature extraction stage:
Performance
Future work
There are several ways in which the work done in this project can be extended.

Ability to save the text as it appears in the image.

Ability to recognize printed or handwritten Arabic.

Support more script styles, such as Monotype cursive, Arial, etc.

Deploy the system as an Android application.

Conclusion
The proposed system can efficiently recognize English characters, whether printed or handwritten. It also efficiently applies image processing to the loaded image to prepare it for the segmentation phase, which splits the image into separate characters.

