This title appears in the Scientific Report: 2013

Parallelisation potential of image segmentation in hierarchical island structures on hardware-accelerated platforms in real-time applications.
Suslov, Sergey (Corresponding author)
Zentralinstitut für Elektronik; ZEA-2
Jülich Forschungszentrum Jülich GmbH Zentralbibliothek, Verlag 2013
211 pp.
Universität Mannheim, Diss., 2012
978-3-89336-914-0
Book
Dissertation / PhD Thesis
No topic
Schriften des Forschungszentrums Jülich. Reihe Information / information 30
OpenAccess
Please use the identifier: http://hdl.handle.net/2128/18552 in citations.
Application development on parallel computing platforms is a highly topical issue today. With the rapid increase in the integration level of silicon devices, which enables parallel computing technologies on a single chip, the power of supercomputers has become available to computing systems of the compact class. As a result, more sophisticated and calculation-intensive computing methods have become broadly available to the scientific community and industry. However, the inevitable drawback of this computing boost is often a fundamental change in the application design and development approach.
The present thesis addresses the two types of compact HPC platforms found to be most successful at present: FPGA-based expansion cards and graphics processing unit (GPU) coprocessing boards. To define their place in the parallel platform domain, a thorough overview of modern digital single-chip systems is presented. The characteristic features of FPGA and GPU architectures are discussed to identify the major aspects of application design for these platforms.
Special attention is paid to the methodological aspects of both application design concepts. A thorough literature study has been carried out to systematise the complete application development cycle, starting with the comprehensive system-level analysis and finishing with the workflow of the implementation on the target compute architectures. Several new ideas and interpretations are introduced in this work. The application in focus is a fast automated image segmentation method based on hierarchical island structures, the so-called GSC (Grey Value Structure Code) developed by Vogelbruch. This complex segmentation method is suitable for different application areas and provides high-quality segmentation results by combining local precision with global connectivity.
Guided by the proposed methodological guideline, a comprehensive analysis of the parallelisation potential of the applied method is carried out in this work. Relying on many statistical measurements and on the results of versatile system models, the GSC algorithm is specifically reworked for implementation on the two massively parallel computation platforms, in order to achieve the high segmentation performance needed for real-time application set-ups. Special attention has been paid to the question of an effective organisation of the computation for the target platforms. The details of the implementations on both platforms are discussed thoroughly. Finally, the two implementations are compared to highlight their relative merits and downsides for this complex and computation-intensive application.
The results of the work show that, despite a considerably longer development cycle, the FPGA-based solution on the Xilinx Virtex II Pro architecture can compete with the implementation on the specialised nVidia GT200 GPGPU card of the next technological generation, and can even notably outperform it for image resolutions below $1024^{2}$, while the nVidia G80 GPU of the same technological evolution cycle cannot be considered a competitor. Compared to the processing speed on a single CPU (Opteron 2.6 GHz), the FPGA accelerates the application, depending on the image resolution, by a factor of about 23, while the GPU achieves speed-up factors of 13 to 20.