GPU acceleration (using Hadoop/Map Reduce (Haggis - HadoopGIS extension,…
GPU acceleration
using Hadoop/Map Reduce
Haggis - Hadoop-GIS extension, Aji, Ablimit, SIGSPATIAL 2014
PixelBox - VLDB 2012 (spatial cross referencing on GPU, but not Map Reduce)
PixelBox - algorithm for a GPU implementation of spatial join within a hybrid GPU+CPU framework. Objective is to replace compute-intensive floating-point computations with pixel comparisons on the GPU, using a rasterization method. Has shown a speedup of 18X over a parallel SDBMS (PostGIS is used as the SDBMS; the geometric libraries GEOS and CGAL were tested, which use a plane-sweep algorithm for polygon intersection and union that is not suitable for parallelization).
Uses a ray tracing algorithm to determine whether each pixel lies in one polygon, in both, or completely outside. The number of pixels in each category gives the areas of intersection and union of a polygon pair, which are used in the Jaccard similarity computation. As pixel positions can be checked independently, the work is parallelized on the GPU.
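The pixel-counting idea can be sketched on the CPU as below. This is a minimal illustration, not PixelBox's GPU kernel: the function names are hypothetical, pixels are assumed to be unit squares sampled at their centres, and the containment test is the usual even-odd ray-crossing test (what these notes call ray tracing).

```python
def point_in_polygon(x, y, poly):
    """Even-odd test: cast a ray toward +x and toggle on every edge crossing."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Edge straddles the horizontal line through y (half-open rule).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def jaccard_by_pixels(poly_a, poly_b):
    """Jaccard similarity = |A and B| / |A or B|, both estimated by counting pixels."""
    xs = [x for x, _ in poly_a + poly_b]
    ys = [y for _, y in poly_a + poly_b]
    inter = union = 0
    # Every pixel is tested independently; on a GPU each test could be one thread.
    for px in range(min(xs), max(xs)):
        for py in range(min(ys), max(ys)):
            cx, cy = px + 0.5, py + 0.5  # pixel centre
            in_a = point_in_polygon(cx, cy, poly_a)
            in_b = point_in_polygon(cx, cy, poly_b)
            inter += in_a and in_b
            union += in_a or in_b
    return inter / union if union else 0.0


# Two 4x4 squares overlapping in a 2x4 strip.
square_a = [(0, 0), (4, 0), (4, 4), (0, 4)]
square_b = [(2, 0), (6, 0), (6, 4), (2, 4)]
similarity = jaccard_by_pixels(square_a, square_b)
```

For integer-valued rectilinear polygons the pixel counts equal the exact areas, so here the intersection is 8 pixels and the union 24.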
GPU computes partial intersection and union areas per thread; reducing them to the totals is done later by the CPU, as this step is too cheap to be efficient on the GPU. Pixelization treats polygons as sets of pixels. The pixelization threshold controls the sample box approach, which processes region-wise chunks of pixels instead of single pixels.
Compute intensity is controlled by checking overlap region by region instead of pixel by pixel, exploiting locality of reference. Only regions near the polygon boundaries need to be subdivided and checked carefully; elsewhere, region-wise checks speed up the computation.
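A simplified CPU sketch of the region-wise check follows. It assumes rectilinear polygons with integer vertices, as in the pathology data, and it subdivides a box only when a polygon edge runs through it. All names are hypothetical, and the pixelization threshold of the real algorithm is left out: this version simply subdivides until no edge crosses the box.

```python
def edge_crosses_box(poly, x0, y0, x1, y1):
    """True if any edge of a rectilinear polygon passes through the box interior."""
    n = len(poly)
    for i in range(n):
        (ax, ay), (bx, by) = poly[i], poly[(i + 1) % n]
        if ax == bx:  # vertical edge
            if x0 < ax < x1 and min(ay, by) < y1 and max(ay, by) > y0:
                return True
        else:  # horizontal edge (rectilinear assumption)
            if y0 < ay < y1 and min(ax, bx) < x1 and max(ax, bx) > x0:
                return True
    return False


def sample_inside(poly, px, py):
    """Even-odd test for rectilinear polygons: count vertical edges right of the point."""
    crossings = 0
    n = len(poly)
    for i in range(n):
        (ax, ay), (bx, by) = poly[i], poly[(i + 1) % n]
        if ax == bx and ax > px and min(ay, by) < py < max(ay, by):
            crossings += 1
    return crossings % 2 == 1


def box_counts(a, b, x0, y0, x1, y1):
    """Return (intersection_pixels, union_pixels) of polygons a and b inside the box."""
    if x0 >= x1 or y0 >= y1:
        return 0, 0
    if not edge_crosses_box(a, x0, y0, x1, y1) and not edge_crosses_box(b, x0, y0, x1, y1):
        # No boundary inside: the whole box is settled by one sample pixel centre.
        area = (x1 - x0) * (y1 - y0)
        in_a = sample_inside(a, x0 + 0.5, y0 + 0.5)
        in_b = sample_inside(b, x0 + 0.5, y0 + 0.5)
        return (area if in_a and in_b else 0), (area if in_a or in_b else 0)
    # A boundary runs through the box: split it into up to four sub-boxes.
    mx, my = (x0 + x1) // 2, (y0 + y1) // 2
    inter = union = 0
    for sub in ((x0, y0, mx, my), (mx, y0, x1, my), (x0, my, mx, y1), (mx, my, x1, y1)):
        i, u = box_counts(a, b, *sub)
        inter += i
        union += u
    return inter, union


square_a = [(0, 0), (4, 0), (4, 4), (0, 4)]
square_b = [(2, 0), (6, 0), (6, 4), (2, 4)]
counts = box_counts(square_a, square_b, 0, 0, 6, 4)
```

Boxes far from any boundary are resolved with a single containment test, so the number of tests grows with the boundary length rather than with the polygon area.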
Programming - NVIDIA CUDA 4.0.
Intel Threading Building Blocks, Pthreads. H/W - NVIDIA GeForce GTX 580 GPU. The other platform is an Amazon EC2 instance (for the parallel SDBMS) with two Intel Xeon X5570 2.93GHz CPUs (8 cores and 16 threads in total) and two NVIDIA Tesla M2050 GPUs.
Performance - considers only compute time, ignoring disk load and other overheads. As an algorithmic choice, PixelBox was compared against GEOS on a standalone system and showed a 2X speedup (on CPU). Various parameters of the algorithm were tested and their effects noted. PixelBox with all optimizations shows a speedup of about 18X or more.
The pathology dataset consists of rectilinear polygons - vertex coordinates are integer valued and edges are either horizontal or vertical.
Spatial join (Jaccard Similarity), Spatial indexing used.
GPU used for query engine acceleration, MR layer not modified. (Empirically shows the compute intensity of polygon intersection and intersection-area computation, using a standalone implementation with PixelBox.)
Hadoop-GIS: a spatial query is submitted as a set of operators that are later converted to Map Reduce jobs. Spatial operations are implemented in the query engine RESQUE. To accelerate these computations, RESQUE is extended with GPU operators. This is Haggis: computations run on either the CPU or the GPU, based on a predictive cost model.
Performance gain of the hybrid system over CPU alone is not satisfactory - it ranges from 0.9X to 1.3X, with a maximum of 3X, as the reducer count varies. As the number of CPU cores increased, the speedup decreased further.
-
HadoopCL - IEEE PhD forum, 2013
-
Embedding GPU computations in Hadoop, Journal, 2014
GPU integration tested with 4 approaches - JCUDA, JNI, Hadoop Pipes, Hadoop Streaming
Word count example was tried with file sizes of 16, 32 and 64M. Overall performance was analyzed in terms of the effort in writing, translating and testing the code, and the time taken.
JCUDA showed better performance but is very complex to write and test. Hadoop Pipes shows similar performance, with code that is much easier to write and test.
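For orientation, the word count benchmark follows the classic map/sort/reduce shape, sketched below in the tab-separated line format that Hadoop Streaming uses by default. This is a plain CPU sketch, not the paper's GPU code; the function names are hypothetical and the shuffle phase is simulated with `sorted`.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Reduce step: sum the counts of consecutive records that share a key."""
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    return list(reducer(sorted(mapper(lines))))


word_counts = run_local(["gpu cpu gpu", "cpu hadoop"])
```

In an actual Streaming job the mapper and reducer would be separate scripts that read records from standard input and print to standard output, with the framework doing the sort between them.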
for Spatial operation
Zhang, Simin You, conf, 2015
Point in Polygon, Point to polyline joins
Multicore, standalone GPU, GPU cluster and SpatialSpark cluster compared. 4-node cluster of NVIDIA Tegra K1 SoC boards with Kingston SSDs (120 GB) used. ARM cores programmed with C/C++ and OpenMP; GPU programmed with CUDA.
GPU cluster shows up to 3X speedup over multicore; SpatialSpark was slower. Analysis of this is left for future work.
Cloudera Impala (relational SQL engine built around the Hive metastore) extended for in-memory spatial processing. Its front end (Abstract Syntax Tree, AST) is extended for the spatial join operation - the left side is split into tiled partitions, the right side is broadcast. GPU programming done in CUDA. Ray tracing algorithm used for testing containment (point-in-polygon test). This has O(V) complexity for a polygon with V vertices.
GPU cluster is up to 1.5X faster than multicore (MC) for the smaller dataset (NYC taxi trips) and up to 3X faster for the larger one (global ecological data).
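The partition-plus-broadcast join shape can be illustrated as below. This is a single-process sketch under stated assumptions, not the Impala extension itself: partitions are plain lists, the "broadcast" is just a shared polygon list, the bounding-box prefilter is an added convenience, and all names are hypothetical.

```python
def contains(poly, x, y):
    """O(V) even-odd crossing test over the V edges of the polygon."""
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def bounding_box(poly):
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


def broadcast_join(point_partitions, polygons):
    """Join every point partition (left side) against the full polygon set (right side)."""
    # "Broadcast": each partition sees the complete polygon list with its boxes.
    indexed = [(pid, poly, bounding_box(poly)) for pid, poly in polygons.items()]
    matches = []
    for partition in point_partitions:  # each partition could run on its own node
        for point_id, (x, y) in partition:
            for pid, poly, (minx, miny, maxx, maxy) in indexed:
                # Cheap box filter first, exact O(V) test only for candidates.
                if minx <= x <= maxx and miny <= y <= maxy and contains(poly, x, y):
                    matches.append((point_id, pid))
    return matches


polygons = {"sq": [(0, 0), (4, 0), (4, 4), (0, 4)], "tri": [(5, 0), (9, 0), (5, 4)]}
partitions = [[("p1", (1, 1)), ("p2", (6, 1))], [("p3", (10, 10))]]
pairs = broadcast_join(partitions, polygons)
```

Broadcasting the right side avoids shuffling the large point set, which fits workloads where the polygons are few and the points are many.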
Chavan Harshada, Workshop(ICDEW), conf, 2016
GPU acceleration for spatial operations - spatial filtering showed an 8X speedup. Other parameters, such as index size and degree of parallelism, were checked for their effect on speedup. Challenges in GPU usage - choosing appropriate data structures (arrays vs. tree structures), load balancing, memory access patterns and data skew.
Suggests that spatial join and overlay are not easily parallelizable, as they require complex tree searches, and so are not suitable for GPU implementation.
Cost-based model selects the device for computation (CPU or GPU), also depending on availability. Multi-key index search is suitable for the GPU. An array-based B+ tree index was used. The search operation showed a 6X to 13X speedup.
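Why multi-key search maps well onto a GPU can be seen in the sketch below. It swaps the paper's array-based B+ tree for a simpler stand-in, a flat sorted key array searched with binary search, and all names are hypothetical. The point it illustrates is that each probe key is resolved independently of the others.

```python
from bisect import bisect_left


def build_index(records):
    """Store (key, value) records as two parallel arrays sorted by key."""
    pairs = sorted(records)
    return [k for k, _ in pairs], [v for _, v in pairs]


def search_one(keys, values, probe):
    """Binary search for a single key; returns None when the key is absent."""
    i = bisect_left(keys, probe)
    return values[i] if i < len(keys) and keys[i] == probe else None


def multi_key_search(keys, values, probes):
    """Look up a batch of keys; no probe depends on the result of another."""
    # On a GPU each probe could be handled by its own thread.
    return [search_one(keys, values, p) for p in probes]


keys, values = build_index([(30, "c"), (10, "a"), (20, "b")])
found = multi_key_search(keys, values, [20, 15, 10])
```

Flat arrays suit the GPU better than pointer-linked tree nodes because neighbouring threads read neighbouring memory, which relates to the data-structure and memory-access challenges noted for this line of work.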
-
Chavan Harshada, ACM SIGMOD, 2017 : Demo paper
-