GPU acceleration (using Hadoop/Map Reduce (Haggis - HadoopGIS extension,…

- - - - PixelBox: an algorithm for GPU-based spatial join within a hybrid GPU+CPU framework. The objective is to replace compute-intensive floating-point geometry computations with pixel comparisons on the GPU, using a rasterization method. It has shown a speedup of 18X over a parallel SDBMS (PostGIS was used as the SDBMS; the geometric libraries GEOS and CGAL were tested, which use a plane-sweep algorithm for polygon intersection and union that is not suitable for parallelization).
        
        It uses a ray tracing algorithm to determine whether each pixel lies inside one polygon, inside both, or outside both. The number of pixels in each category gives the areas of intersection and union of a polygon pair, which are used to compute Jaccard similarity. Because pixel positions can be checked independently, the work is parallelized on the GPU.
        
        The GPU computes partial intersection and union areas per thread; the reduction to totals is done afterwards by the CPU, because this step is too cheap to run efficiently on the GPU. Pixelization treats each polygon as a set of pixels. A pixelization threshold controls the sampling-box approach, which processes region-wise chunks of pixels instead of single pixels.
        
        Compute intensity is controlled by checking overlap region by region instead of pixel by pixel, exploiting locality of reference. Only regions near polygon boundaries need to be subdivided and checked carefully; elsewhere, region-wise checks speed up the computation.
        
        Programming: NVIDIA CUDA 4.0, Intel Threading Building Blocks, Pthreads.
        Hardware: NVIDIA GeForce GTX 580 GPU. The other platform is an Amazon EC2 instance (for the parallel SDBMS) with two Intel Xeon X5570 2.93 GHz CPUs (8 cores and 16 threads in total) and two NVIDIA Tesla M2050 GPUs.
        
        Performance: only compute time is considered; disk loading and other overheads are ignored. As an algorithmic choice, PixelBox was compared against GEOS on a standalone system and showed a 2X speedup (CPU used). Various parameters of the algorithm were tested and their effects noted. PixelBox with all optimizations shows a speedup of about 18X or more.
      - The pathology dataset consists of rectilinear polygons: vertex coordinates are integer-valued and edges are either horizontal or vertical.
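The pixel-counting idea in the PixelBox notes above can be sketched on the CPU as follows. This is a minimal illustration, not the PixelBox implementation: the function names `point_in_polygon` and `pixel_jaccard` are hypothetical, polygons are assumed to be rectilinear with integer vertices (as in the pathology dataset), each pixel is tested at its center with an even-odd ray test, and the sampling-box optimization is left out. On a GPU, each pixel test would be an independent thread and the final sums would be reduced afterwards.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray test: cast a ray to the right of (x, y) and count edge crossings."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def pixel_jaccard(poly_a, poly_b):
    """Count unit pixels inside both polygons (intersection) and inside either (union)."""
    xs = [x for x, _ in poly_a + poly_b]
    ys = [y for _, y in poly_a + poly_b]
    inter = union = 0
    # Every pixel is independent of the others, which is what makes this GPU-friendly.
    for px in range(min(xs), max(xs)):
        for py in range(min(ys), max(ys)):
            cx, cy = px + 0.5, py + 0.5  # test the pixel center (assumption)
            in_a = point_in_polygon(cx, cy, poly_a)
            in_b = point_in_polygon(cx, cy, poly_b)
            inter += in_a and in_b
            union += in_a or in_b
    jaccard = inter / union if union else 0.0
    return inter, union, jaccard
```

For two 4x4 squares offset by 2 in each direction, `pixel_jaccard([(0, 0), (4, 0), (4, 4), (0, 4)], [(2, 2), (6, 2), (6, 6), (2, 6)])` counts 4 intersection pixels and 28 union pixels, giving a Jaccard similarity of 1/7.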
    - - GPU used for query engine acceleration; the MR layer is not modified. (The compute intensity of polygon intersection and intersection-area computation is shown empirically using a standalone implementation with PixelBox.)
        
        Hadoop-GIS: a spatial query is submitted as a set of operators that are later translated into MapReduce jobs. Spatial operations are implemented as part of the query engine RESQUE. To accelerate these computations, RESQUE is extended with GPU operators. This is Haggis: computations run on either the CPU or the GPU based on a predictive cost model.
        
        The performance gain of the hybrid system over CPU alone is not satisfactory: it ranges from 0.9X to 1.3X, with a maximum of 3X, as the reducer count varies. As the number of CPU cores increased, this speedup decreased further.
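The notes do not say what form the Haggis cost model takes, so the following is only a sketch of how a predictive cost model could route a task to the CPU or the GPU. The linear model, the `DeviceCost` class, the `choose_device` function and all numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceCost:
    """Hypothetical linear cost model: fixed overhead plus a per-pair cost."""
    fixed_overhead: float  # e.g. kernel launch and data transfer (assumed)
    per_pair: float        # predicted time per polygon pair (assumed)

    def predict(self, n_pairs):
        return self.fixed_overhead + self.per_pair * n_pairs


def choose_device(n_pairs, cpu, gpu, gpu_available=True):
    """Pick the device with the lower predicted cost; fall back to CPU if no GPU is free."""
    if not gpu_available:
        return "cpu"
    return "gpu" if gpu.predict(n_pairs) < cpu.predict(n_pairs) else "cpu"
```

With `cpu = DeviceCost(0.0, 1.0)` and `gpu = DeviceCost(100.0, 0.1)`, a task of 10 pairs goes to the CPU (predicted 10 vs 101) and a task of 1000 pairs goes to the GPU (predicted 1000 vs 200). A fixed GPU overhead of this kind is one plausible reason small tasks gain little from the hybrid setup.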
  - - - A word count example was tried with file sizes of 16, 32 and 64M. Overall performance was analyzed in terms of the effort of writing, translating and testing the code, and the time taken.
        
        JCUDA showed better performance but is very complex to write and test. Hadoop Pipes shows similar performance, with much greater ease of writing and testing the code.
- - - - Multicore, standalone GPU, GPU cluster and spatial Spark cluster compared. A 4-node cluster of NVIDIA Tegra K1 SoC boards with Kingston SSDs (120 GB) was used. ARM cores were programmed in C/C++ with OpenMP; the GPU was programmed with CUDA.
        
        The GPU cluster shows up to 3X speedup over multicore; spatial Spark was slower. Analysis of this is left for future work.
      - Cloudera Impala (a relational SQL engine based on Hive) was extended for in-memory spatial processing. Its front end (the Abstract Syntax Tree, AST) was extended for the spatial join operation: the left side is partitioned into tiles and the right side is broadcast. GPU programming was done in CUDA. A ray tracing algorithm is used for testing containment (point-in-polygon test); it has O(V) complexity for a polygon with V vertices.
        
        The GPU cluster is up to 1.5X faster than multicore (MC) for smaller data (New York City taxi trips) and up to 3X faster for larger data (global ecological data).
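The tiled-left, broadcast-right join described above can be sketched in plain Python. This is an illustration under assumptions, not the Impala extension: `tile_partition`, `broadcast_join` and the `tile_size` parameter are hypothetical names, and the loop over tiles stands in for independent tasks that would each receive a full copy of the polygon set. The containment test is the O(V) even-odd ray test.

```python
from collections import defaultdict


def point_in_polygon(x, y, polygon):
    """O(V) even-odd ray test for a polygon given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def tile_partition(points, tile_size):
    """Left side: group (id, (x, y)) points by the grid tile they fall in."""
    tiles = defaultdict(list)
    for pid, (x, y) in points:
        tiles[(int(x // tile_size), int(y // tile_size))].append((pid, (x, y)))
    return tiles


def broadcast_join(points, polygons, tile_size):
    """Join each tile of points against the full (broadcast) polygon list."""
    results = []
    for tile_points in tile_partition(points, tile_size).values():
        # Each tile is an independent unit of work with its own copy of `polygons`.
        for pid, (x, y) in tile_points:
            for gid, poly in polygons:
                if point_in_polygon(x, y, poly):
                    results.append((pid, gid))
    return sorted(results)
```

Broadcasting suits the case where the right side (polygons) is small enough to replicate, so no shuffle of the large left side (points) is needed.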
  - - - Suggests that spatial join and overlay are not easily parallelizable because they require complex tree searches, and so are not suitable for GPU implementation.
      - A cost-based model selects the device for computation (CPU or GPU), also depending on availability. The multi-key index search operation is suitable for the GPU. An array-based B+ tree index was used. The search operation showed a 6X to 13X speedup.
        
        The system had an NVIDIA GeForce GTX 470 with 1.25 GB memory and 448 cores, programmed with CUDA.
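To show why a multi-key search over an array-based index parallelizes well, here is a simplified sketch. It is an assumption-laden stand-in, not the paper's structure: flat sorted arrays replace the array-based B+ tree, and `build_index`, `search_one` and `multi_key_search` are hypothetical names. The point is that each query key is looked up independently, so a GPU could assign one key per thread.

```python
from bisect import bisect_left


def build_index(records):
    """Store (key, value) records as two parallel sorted arrays (pointer-free layout)."""
    ordered = sorted(records, key=lambda r: r[0])
    keys = [k for k, _ in ordered]
    values = [v for _, v in ordered]
    return keys, values


def search_one(keys, values, key):
    """Binary search for one key; returns None when the key is absent."""
    i = bisect_left(keys, key)
    return values[i] if i < len(keys) and keys[i] == key else None


def multi_key_search(keys, values, queries):
    """Look up a batch of keys; every lookup is independent of the others."""
    return [search_one(keys, values, q) for q in queries]
```

For an index built from `[(30, "c"), (10, "a"), (20, "b")]`, searching for `[20, 5, 30]` returns `["b", None, "c"]`.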
  - - - Quadtree indexing is used for the spatio-temporal index (temporal slicing plus spatial indexing); the structure is stored as an array to suit the GPU (phase 3 of index building).
        
        The query is processed in filter and refine phases on the GPU: index table lookup, with all steps performed in parallel by GPU cores. Speedup is not discussed.
        
        The system architecture is discussed along with its component modules, but the hardware configuration is not given (CUDA or non-CUDA is not specified).
        
        The spatial join operation is implemented using the GPU.
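The notes do not describe how the quadtree is laid out as an array, so the sketch below shows one common option only as an assumption: a linear quadtree in which each leaf cell is identified by a Z-order (Morton) key and entries are kept in a flat array sorted by time slice and key. The names `morton_key`, `build_slice_index` and `lookup`, the square `extent`, and the fixed `depth` are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def morton_key(col, row, depth):
    """Interleave the bits of (col, row) into a Z-order key for a leaf at `depth`."""
    key = 0
    for bit in range(depth):
        key |= ((col >> bit) & 1) << (2 * bit)
        key |= ((row >> bit) & 1) << (2 * bit + 1)
    return key


def build_slice_index(events, depth, extent):
    """events: (time_slice, x, y, payload). Returns a flat array sorted by (slice, key)."""
    cells = 1 << depth  # cells per side at the leaf level
    flat = []
    for t, x, y, payload in events:
        col = min(int(x / extent * cells), cells - 1)
        row = min(int(y / extent * cells), cells - 1)
        flat.append((t, morton_key(col, row, depth), payload))
    flat.sort(key=lambda e: (e[0], e[1]))
    return flat


def lookup(flat, t, col, row, depth):
    """Filter step: payloads stored in one leaf cell of one time slice."""
    target = (t, morton_key(col, row, depth))
    heads = [(e[0], e[1]) for e in flat]
    lo, hi = bisect_left(heads, target), bisect_right(heads, target)
    return [e[2] for e in flat[lo:hi]]
```

A flat, sorted layout like this avoids pointer chasing, so many lookups can run in parallel; the refine phase (exact geometry tests on the candidates returned by `lookup`) is not shown.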