FusionInsight Spark 2x
Core concepts
RDD
Definition
read-only
partitioned
elastic (resilient)
Spark lineage
operators
transformation
action
Characteristics
Dependencies
stages
major roles
client
Resource manager
application master
Node manager
Driver
Executor
modes
YARN client
failure
app master
testing
YARN cluster
production
failure resistance
architecture
Spark SQL
Dataset
Dataframe
Differences from RDD
comparison with Hive
speed
bucket
Engine
dependencies
compatibility
custom functions
metadata
Streaming
Spark Structured Streaming
unbounded table
Spark Streaming
RDD and MapReduce
RDD lineage mechanism
Spark engine
FusionInsight
Processes
JDBC server
Job history
Components
optional
Kafka
HBase
mandatory
Hive
ZooKeeper
YARN
HDFS
definition
comparison with MapReduce
data volume
processing speed
application scenarios
interactive analysis
stream processing
machine learning
ETL