Embedded Image Processing using ARM based FPGA
Danielle Sullivan and Eliza Bailey
May 6th, 2014
FPGAs and Image Processing
Altera Cyclone V SoCKit FPGA
Major Benefits of FPGAs
Time to Market
System on Chip (SoC) FPGA Benefits
Improve system performance through high-bandwidth interconnect between Hard Processor System (HPS) and FPGA
Interface between software and hardware
Design process simplified
One chip compared to two
Embedded ARM based FPGA
ARM based SoC FPGA combines the following in a single SoC:
multiport memory controller
dual-core ARM processor (CPU/microprocessor)
System Development Flow
System Development Flow Cont.
JTAG - FPGA fabric programming
Embedded Linux - precompiled image saved on SD card
Serial over USB - serial communication with embedded Linux
Enables communication with embedded Linux file system
Cross Compiler - a compiler capable of creating executables for a system other than the one on which the compiler is running
How This applies to us:
We separate our build environment from the target environment by pre-compiling our vip_mpeg file
Embedded Linux is our target environment
Altera Golden Reference Design
Tested and confirmed functional HPS configuration
Setup peripheral functions
Configured Qsys system
integration between the modules
Enable SW/HW handoff
GHRD is a very specific design that controls very specific customizable peripherals
Configuring this setup gave us experience in:
Design flow process
System integration within SoC designs
Demo 1: Blinking LED
Demo 2: VIP Reference Design
Comparing Hardware and Software
Hardware vs. Software run-time comparison of bouncing picture-in-picture application
Processing of application run on the Embedded ARM FPGA architecture versus the pure software implementation on CPU via OpenCV's Python platform
The OpenCV implementation had a run-time of 3 min 33 sec, compared to an SoC run-time of 15 min 46 sec
Using hardware for the image processing move function was faster than using OpenCV
memcpy was long and misleading
copying data from DDR3 on HPS to DDR3 on FPGA
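As a quick check on the overall run-times quoted above, the following minimal sketch converts both figures to seconds and computes their ratio. The helper name `to_seconds` is illustrative only and not part of the original project.

```python
def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) run-time into total seconds."""
    return minutes * 60 + seconds


# Run-times quoted on the slide.
opencv_s = to_seconds(3, 33)
soc_s = to_seconds(15, 46)

# How many times longer the SoC version took end to end.
slowdown = soc_s / opencv_s

print(f"OpenCV: {opencv_s} s, SoC: {soc_s} s, SoC/OpenCV ratio: {slowdown:.2f}")
```

End to end, the SoC version took roughly 4.4 times as long as the OpenCV version, even though the move function alone was faster in hardware.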
Thank you for listening.
Specifically we would like to thank:
Professor Chen-Huan Chiang
PuTTY - Cross Compilation and Debugging
SoCKit Board Development
Qsys - Make appropriate changes to VIP mixer demo parameters
MATLAB - Generate new logo
Memory Initialization File (.mif)
Modify C Code to meet new design needs
Change sizing parameters
Modify moving function to our new needs
Developed our program via OpenCV's Python extension
optimized for fast image reading, display, and processing
One output image, three different portions of image
smaller second image
Position values change by a set delta x and delta y value
If the smaller picture hits the boundary of the window, the delta x or y values are negated
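The bouncing rule described above can be sketched in a few lines of plain Python. This is an illustration, not the authors' code: the function name `step`, its parameters, and the clamping of the position to the window are assumptions added for the example.

```python
def step(x, y, dx, dy, win_w, win_h, img_w, img_h):
    """Advance the small image by one frame and bounce it off the window edges."""
    x += dx
    y += dy

    # Negate delta x when the image reaches the left or right boundary.
    if x <= 0 or x + img_w >= win_w:
        dx = -dx
        # Assumption: clamp so the image never leaves the window.
        x = max(0, min(x, win_w - img_w))

    # Negate delta y when the image reaches the top or bottom boundary.
    if y <= 0 or y + img_h >= win_h:
        dy = -dy
        y = max(0, min(y, win_h - img_h))

    return x, y, dx, dy


# Simulate a few frames of a 20x20 picture inside a 100x100 window.
state = (75, 10, 10, 5)
for _ in range(3):
    state = step(*state, win_w=100, win_h=100, img_w=20, img_h=20)
    print(state)
```

Keeping the state as a plain tuple means the same update rule could drive either the software drawing loop or the register writes in the SW/HW version.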
Function Run-time Analysis
reading, decoding, moving
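One way to time the three stages separately on the software side is sketched below. It assumes Python's standard `time.perf_counter`. The functions `read_frame`, `decode_frame`, and `move_frame` are hypothetical stand-ins, not the project's actual read, decode, or move routines.

```python
import time


def time_stage(func, *args, repeats=5):
    """Return (average seconds per call, last result) for func(*args)."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = func(*args)
    elapsed = time.perf_counter() - start
    return elapsed / repeats, result


# Hypothetical stand-ins for the real stages.
def read_frame(copies):
    return bytes(range(256)) * copies      # pretend to read raw bytes


def decode_frame(raw):
    return list(raw)                       # pretend to decode into pixels


def move_frame(pixels, offset):
    return pixels[offset:] + pixels[:offset]  # pretend to shift the image


read_t, raw = time_stage(read_frame, 16)
decode_t, pixels = time_stage(decode_frame, raw)
move_t, moved = time_stage(move_frame, pixels, 1)

for name, seconds in (("reading", read_t), ("decoding", decode_t), ("moving", move_t)):
    print(f"{name}: {seconds * 1e6:.1f} us per call")
```

Averaging over several calls reduces noise from the operating system scheduler, which matters when individual stages take only microseconds.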
FPGA - purely hardware approach (no processor involvement)
Hardware used the Golden Reference Design
Provide link from one pin of CPU to LED output
GPIO running in the embedded Linux
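The last point, toggling an LED from a GPIO under embedded Linux, could look roughly like the sketch below. It assumes the kernel's sysfs GPIO interface (a `direction` file and a `value` file per exported line); the slides do not say which interface was used. The function name `blink` is illustrative. The dry run writes into a temporary directory so the sketch can be tried off the board.

```python
import os
import tempfile
import time


def blink(gpio_dir, times=3, delay=0.5):
    """Toggle a sysfs-style GPIO line `times` times, leaving it low afterwards."""
    # Configure the line as an output.
    with open(os.path.join(gpio_dir, "direction"), "w") as f:
        f.write("out")

    for _ in range(times):
        for level in ("1", "0"):           # LED on, then LED off
            with open(os.path.join(gpio_dir, "value"), "w") as f:
                f.write(level)
            time.sleep(delay)


# Dry run against a temporary directory instead of real hardware.
demo_dir = tempfile.mkdtemp()
blink(demo_dir, times=2, delay=0)

with open(os.path.join(demo_dir, "value")) as f:
    final_level = f.read()
print("final level:", final_level)
```

On the board, `gpio_dir` would point at the exported line's directory under `/sys/class/gpio`. The line number is board-specific and is not given in the slides.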
Demo: OpenCV Implementation
Results - Moving Function
The final application shows the resulting lag in the background frame of our SW/HW version
Partially attributed to the slow reading and decoding times
OpenCV reading time of the background image is on average 74.4 times quicker than the libmpeg2 equivalent
libmpeg2 reading of layer 1 is also slower
So small that it results in no delay visible to the naked eye
OpenCV's reading function incorporates both reading and decoding
The move function was approximately 3,000 times faster on the FPGA than in OpenCV
Real time processing of images is computationally expensive
As image technology progresses, pixel number and quality are growing quickly
Image processing requires an increasingly large amount of processing power
The performance of these computationally intensive tasks could be greatly improved by the parallelism offered by an FPGA
Compare run-times of software and hardware image processing
Gain familiarity with Embedded Design System for SoC FPGA
Learn IP Core-based SoC design methodology
ex: Altera Video Image Processing (VIP) Design Suite with Qsys
A solution combining OpenCV with the SoC platform would take advantage of each system's benefits
Comparing SoC and OpenCV implementations for more complex image processing techniques could result in run-time gains
Board does not support real time video input
Results - Moving Function
Timing in the software part of the SoC only incorporates the time needed to write the data to the registers
Hardware moving is done in parallel with software
Hardware cannot be timed in software so we have to rely on hardware latencies to approximate the time
Approximations are found by multiplying the clock period by the latencies: (1 / 300 MHz) * (<1 + <1 + <1 + 3)
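The approximation above can be worked through numerically. Because three of the latencies are given only as "<1" cycle, this sketch treats each as somewhere between 0 and 1 cycle. That bounding is an assumption made for the example, and it yields a range rather than a single figure.

```python
CLOCK_HZ = 300e6                 # 300 MHz clock from the slide
period_s = 1 / CLOCK_HZ          # one clock period, about 3.33 ns

# Latencies in clock cycles. Assumption: each "<1" entry lies between 0 and 1.
latency_cycles_lower = [0, 0, 0, 3]
latency_cycles_upper = [1, 1, 1, 3]

lower_bound_s = period_s * sum(latency_cycles_lower)
upper_bound_s = period_s * sum(latency_cycles_upper)

print(f"clock period: {period_s * 1e9:.2f} ns")
print(f"hardware move time: between {lower_bound_s * 1e9:.1f} ns "
      f"and {upper_bound_s * 1e9:.1f} ns")
```

Under these assumptions the hardware move takes between 3 and 6 clock cycles, which is on the order of 10 to 20 ns.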
3,000x Speed-Up
Demo: SoC Implementation