Definition
Optical Character Recognition:
- The process of using imaging systems and computers to analyze and recognize printed text.
Optical Character Recognition
Matthew Muncy, Nick Galewski, & Stephen Smith
Our Goal
To explore the following topics:
Mobile Development
Image Processing
Optical Character Recognition
Heterogeneous Computation
Tesseract
RenderScript
History:
- Development began at HP in 1985
- Open source since 2005
Heterogeneous Computation
Architecture:
Analogous to CUDA and OpenCL
Dr. SlowCode: or how I stopped waiting and learned to love the GPU
Example
- Text Recognition
- Training Data
Conversion to grayscale in RenderScript
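// called once for each pixel of the input allocation
void root(const uchar4 *in, uchar4 *out) {
    // unpack the 8-bit RGBA input pixel into normalized floats
    float4 f4 = rsUnpackColor8888(*in);
    // calculate gray as a weighted sum of the red, green, and blue channels
    float gray = (0.21f * f4.r) + (0.71f * f4.g) + (0.072f * f4.b);
    float3 pix = { gray, gray, gray };
    // pack the gray value back into an 8-bit pixel
    *out = rsPackColorTo8888(pix);
}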
Consider grayscale conversion for 4 MP image:
Cropping an Image
Image Manipulation Techniques
Sharpening
- Improve the legibility of letters
Example
Before
After
- 4 million function calls to get the pixel from the image
- 16 million function calls to get each color from the pixel
- 4 million calculations of the gray color
- 4 million function calls to place the pixel in the image
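The counts for the 4 MP grayscale example can be tallied with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are hypothetical, and it assumes four channel reads per pixel, as the 16 million figure implies.

```java
// Hypothetical helper that tallies the per-pixel work of a serial grayscale pass.
final class GrayscaleCost {
    // Assumption: one pixel read, four channel reads, one gray calculation,
    // and one pixel write for every pixel in the image.
    static long totalOperations(long pixels) {
        long pixelReads = pixels;
        long channelReads = 4 * pixels;
        long grayCalculations = pixels;
        long pixelWrites = pixels;
        return pixelReads + channelReads + grayCalculations + pixelWrites;
    }
}
```

For a 4 MP image, `GrayscaleCost.totalOperations(4_000_000L)` comes to 28 million sequential steps, which is the work a per-pixel kernel can instead run in parallel.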
Contrast Manipulation
- Increases the difference between background and foreground
// calculate factor for manipulating contrast
// (floating-point literals keep the division from truncating if contrast is an int)
double factor = (259.0 * (contrast + 255)) / (255.0 * (259 - contrast));
// perform the following steps for all parts of all pixels
// grab the value of the color
int red = Color.red(pix);
// calculate the new value
red = (int) (factor * (red - 128) + 128);
// ensure that the value is valid (a color channel must lie in 0..255)
red = preventInvalid(red);
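The contrast step can be tried outside Android with plain bit operations on a packed ARGB pixel. The sketch below is illustrative only: `ContrastSketch` and its methods are hypothetical names, `clamp` is an assumed stand-in for `preventInvalid`, and the contrast value is assumed to lie in [-255, 255].

```java
// Sketch of the contrast step on a packed ARGB pixel, without Android classes.
final class ContrastSketch {
    // Hypothetical stand-in for preventInvalid: clamp a channel to 0..255.
    static int clamp(int value) {
        return Math.max(0, Math.min(255, value));
    }

    // Assumption: contrast lies in [-255, 255]; a value of 259 would make the divisor zero.
    static double factor(int contrast) {
        return (259.0 * (contrast + 255)) / (255.0 * (259 - contrast));
    }

    // Scales the distance of one channel from the midpoint 128, then clamps.
    static int adjustChannel(int value, double factor) {
        return clamp((int) (factor * (value - 128) + 128));
    }

    // Applies the same factor to red, green, and blue; alpha is left untouched.
    static int adjustPixel(int argb, int contrast) {
        double f = factor(contrast);
        int a = (argb >>> 24) & 0xFF;
        int r = adjustChannel((argb >>> 16) & 0xFF, f);
        int g = adjustChannel((argb >>> 8) & 0xFF, f);
        int b = adjustChannel(argb & 0xFF, f);
        return (a << 24) | (r << 16) | (g << 8) | b;
    }
}
```

A contrast of 0 gives a factor of 1 and leaves the pixel unchanged. Positive values push channels away from the midpoint 128, and the clamp keeps the results in range.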
Example
Before
After
Grayscale
- Conversion of the image from color to grayscale
- An intermediary step between color and binary
// calculate grayscale pixel color
gray = (int)(0.21 * red + 0.71 * green + 0.072 * blue);
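To make the weighted sum concrete, here is a minimal sketch that applies it to packed ARGB pixels in plain Java. The class and method names are hypothetical; the weights are the ones shown on the slide.

```java
// Sketch of the grayscale step on packed ARGB pixels, without Android classes.
final class GrayscaleSketch {
    // Weights taken from the slide: 0.21 (red), 0.71 (green), 0.072 (blue).
    static int toGray(int red, int green, int blue) {
        return (int) (0.21 * red + 0.71 * green + 0.072 * blue);
    }

    // Converts packed ARGB pixels to an array of gray levels.
    static int[] toGray(int[] argbPixels) {
        int[] gray = new int[argbPixels.length];
        for (int i = 0; i < argbPixels.length; i++) {
            int p = argbPixels[i];
            gray[i] = toGray((p >>> 16) & 0xFF, (p >>> 8) & 0xFF, p & 0xFF);
        }
        return gray;
    }
}
```

Because these three weights sum to 0.992 rather than 1, pure white maps to a gray level of 252, not 255.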
Binarization
- The process of converting the image from grayscale to binary
- Uses Otsu's method to select the threshold automatically from the image histogram
// determine the threshold by creating a histogram for the grayscale image
// use the threshold to set each pixel in the output image
if ( gray > threshold) {
newPixel = 255;
} else {
newPixel = 0;
}
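The two commented steps above can be sketched as follows. This is a textbook formulation of Otsu's method (pick the threshold that maximizes the between-class variance of the histogram), not necessarily the project's own code. The class and method names are hypothetical, and the input is assumed to be an array of gray levels in 0..255.

```java
// Sketch of histogram-based thresholding with Otsu's method.
final class OtsuSketch {
    // Returns the gray level that best separates dark and light pixels.
    static int otsuThreshold(int[] gray) {
        int[] hist = new int[256];
        for (int g : gray) {
            hist[g]++;
        }
        long total = gray.length;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * hist[i];
        }
        double sumDark = 0;       // sum of gray levels at or below t
        long countDark = 0;       // number of pixels at or below t
        double bestVariance = -1;
        int threshold = 0;
        for (int t = 0; t < 256; t++) {
            countDark += hist[t];
            sumDark += (double) t * hist[t];
            if (countDark == 0) {
                continue;         // no dark pixels yet
            }
            long countLight = total - countDark;
            if (countLight == 0) {
                break;            // every pixel is already in the dark class
            }
            double meanDark = sumDark / countDark;
            double meanLight = (sumAll - sumDark) / countLight;
            double diff = meanDark - meanLight;
            // between-class variance, up to a constant factor
            double variance = (double) countDark * countLight * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                threshold = t;
            }
        }
        return threshold;
    }

    // Sets each pixel to 255 if it is brighter than the threshold, else to 0.
    static int[] binarize(int[] gray, int threshold) {
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = gray[i] > threshold ? 255 : 0;
        }
        return out;
    }
}
```

Calling `binarize(gray, otsuThreshold(gray))` on an image with dark text on a light background should leave the text at 0 and the background at 255.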
Example
Before
After
Results
Issues
After
Dropbox Functionality
PDF Generating
Utilization of iTextG Library
Before
- Ability to save PDF documents to the cloud and pull images from it
- No ability to correct for perspective distortions
93.2%
- Dynamic Generation from Parent Method
WriteTextAbsPX(Document doc, PdfWriter writer, String text, float x, float y)
- IDE Configuration Difficulties
Y Coordinate Correction
- Initial Location: Bottom Left
After
…
// Flip the Y axis; add 28 px because doc.top() is not the absolute top of the page
y = doc.top() - y + 28;
content.moveText(x, y);
…
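The Y coordinate correction is a simple change of origin from top-left to bottom-left. The sketch below isolates that arithmetic without the iTextG classes. `YFlipSketch` and `flipY` are hypothetical names, `pageTop` stands in for the value returned by `doc.top()`, and `offset` stands in for the 28 px correction.

```java
// Sketch of converting a top-down Y coordinate to a bottom-up one.
final class YFlipSketch {
    // yFromTop: distance measured downward from the top of the drawing area.
    // pageTop:  stand-in for doc.top(); offset: stand-in for the 28 px correction.
    static float flipY(float yFromTop, float pageTop, float offset) {
        return pageTop - yFromTop + offset;
    }
}
```

With an illustrative `pageTop` of 806, a point 100 px below the top maps to `flipY(100f, 806f, 28f)`, which is 734 in bottom-left coordinates.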
Before
79.47%