Introducing 

Prezi AI.

Your new presentation assistant.

Refine, enhance, and tailor your content, source relevant images, and edit visuals quicker than ever before.

Loading…
Transcript

Definition

Optical Character Recognition:

  • The process of using imaging systems and computers to analyze and recognize printed text.

Optical Character Recognition

Matthew Muncy, Nick Galewski, & Stephen Smith

Our Goal

To explore the following topics:

Mobile Development

Image Processing

Optical Character Recognition

Heterogeneous Computation

Tesseract

RenderScript

History:

Android Specific API

  • Developed by HP in 1985
  • Open Source

Heterogeneous Computation

C99 derived syntax

Architecture:

Accessed via Java API

=

Analogous to CUDA and OpenCL

Dr. SlowCode: or how I stopped waiting and learned to love the GPU

Example

  • Text Recognition
  • Training Data

Conversion to grayscale in RenderScript

// called by for each module

void root(const uchar4 *in, uchar4 *out) {

// get input pixel

float4 f4 = rsUnpackColor8888(*in);

// calculate gray

float gray = ((0.21 * f4.r) + (0.71 * f4.g) + (0.072 *b));

float3 pix = { gray, gray, gray };

out* = rsPackColorTo8888(pix);

}

Consider grayscale conversion for 4 MP image:

Cropping an Image

Image Manipulation Techniques

Sharpening

  • Improve the legibility of letters

Example

Before

After

  • 4 million function calls to get the pixel from the image
  • 16 million function calls to get each color from the pixel
  • 4 million calculation of the gray color
  • 4 million function calls to place the pixel in the image
  • Image Sharpening

Contrast Manipulation

Increase difference between background and foreground

// calculate factor for manipulating contrast

double factor = (259 * (contrast + 255)) / (255 * (259 - contrast));

// perform the following step for all parts of all pixels

// grab the value of the color

Red = Color.red(pix);

// calculate the new value

Red = (int) (factor * (R - 128) + 128);

// ensure that the value is valid

Red = preventInvalid(R);

Example

Before

After

Slowness

Grayscale

  • Conversion of the image from color to grayscale
  • An intermediary step between color and binary

// calculate grayscale pixel color

gray = (int)(0.21 * red + 0.71 * green + 0.072 * blue);

  • Contrast Manipulation

=

Binarization

  • The process of converting the image from grayscale to binary
  • Uses Otsu's adaptive thresholding method

// determine the threshold by creating a histogram for the grayscale image

// use the threshold to set each pixel in the output image

if ( gray > threshold) {

newPixel = 255;

} else {

newPixel = 0;

}

Before

Example

After

  • Conversion to Grayscale
  • Binarization

Results

Issues

After

Dropbox Functionality

PDF Generating

Utilization of iTextG Library

Before

  • Ability to save PDF Documents to and pull images from the cloud
  • Powerful Library
  • No ability to correct for perspective distortions

93.2%

  • Text positioning
  • Dynamic Generation from Parent Method

+

  • Universally available

WriteTextAbsPX(Document doc, PdfWriter writer, String text, float x, float y)

  • Ease of use for user
  • IDE Configuration Difficulties

Y Coordinate Correction

  • Initial Location: Bottom Left
  • 28 Pixel offset
  • Version Controls

After

// Flip Y Axis also add 28px as the top() isn't the absolute top

y = doc.top() - y + 28;

content.moveText(x, y);

Before

79.47%

  • Discard Unnecessary Information
  • Reduction in Error Rate
  • Reduction in Computation Requirements

Android

  • Open Operating System
  • Test Devices Available
  • Java Familiarity
  • Simple Tools

Tools

  • Android
  • Tesseract
  • Android Development Kit
  • Java
  • Renderscript API

Convolution Matrix applied to each pixel in the bitmap

0 0 0 0 0

0 -1 -1 -1 0

0 -1 9 -1 0

0 -1 -1 -1 0

0 0 0 0 0

Learn more about creating dynamic, engaging presentations with Prezi