SKR3202 - Coggle Diagram
Chapter 4
communication
Factors to consider
Cost of communications
communications frequently require synchronization between tasks, which can result in tasks spending time "waiting" instead of doing work
synchronization
barrier
when the last task reaches the barrier, all tasks are synchronized
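The barrier idea above can be sketched with Python's standard `threading.Barrier` (a minimal illustration, not tied to any particular parallel library):

```python
import threading

NUM_TASKS = 4
barrier = threading.Barrier(NUM_TASKS)
results = []
lock = threading.Lock()

def task(task_id: int) -> None:
    # Each task does its own work at its own pace...
    partial = task_id * task_id
    # ...then blocks here until the last task arrives.
    barrier.wait()
    # Past this point, all tasks are synchronized.
    with lock:
        results.append(partial)

threads = [threading.Thread(target=task, args=(i,)) for i in range(NUM_TASKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [0, 1, 4, 9]
```

Tasks that reach the barrier early spend time "waiting" instead of doing work, which is exactly the communication cost noted above.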
synchronous
when a task performs a communication operation, some form of coordination is required with the other task(s)
I/O
bad news
read operations will be affected by the fileserver's ability to handle multiple read requests at the same time
performance
debugging, monitoring and analyzing parallel program execution is significantly more of a challenge than for serial programs
Chapter 2
Processing Elements
SIMD
Data parallel, vector computing
MISD
Few built, but commercially not available
MIMD
very general, multiple approaches
unlike SISD and MISD, MIMD computers work asynchronously
Shared memory
easy to build; SISD programs can easily be ported
Distributed Memory
the network can be configured as a tree, mesh, or cube
Operating system models
a framework that unifies features, services, and tasks performed
simplicity, flexibility and high performance are crucial for an OS
Parallel Terminologies
synchronization - the coordination of parallel tasks in real time, very often associated with communications
Chapter 1
Why parallel computing
Limitations of serial computing
both physical and practical reasons pose significant constraints on simply building ever-faster serial computers
Concepts and terminology
Von Neumann Architecture
Basic design
a CPU gets instructions from memory, decodes them, and performs them sequentially
Chapter 7
Performance metrics
Speedup
ratio of the time taken to solve a problem on a single processor to the time taken to solve the same problem on a parallel computer with p identical processors
Efficiency
in an ideal parallel system, speedup is equal to p and efficiency is equal to 1
in practice, speedup is less than p and efficiency is between zero and one, depending on how effectively the processors are utilized
Cost
the cost of solving a problem on a single processor is the execution time of the fastest known sequential algorithm
a parallel system is said to be cost-optimal if the cost of solving a problem on the parallel computer is proportional to the execution time of the fastest known sequential algorithm on a single processor
since efficiency is the ratio of Ts to the parallel cost p·Tp, a cost-optimal parallel system has an efficiency of Θ(1)
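The relationships among these metrics can be sketched numerically (a minimal illustration; the symbols Ts, Tp and p follow the definitions above):

```python
def speedup(ts: float, tp: float) -> float:
    """Speedup S = Ts / Tp: serial time over parallel time."""
    return ts / tp

def efficiency(ts: float, tp: float, p: int) -> float:
    """Efficiency E = S / p: fraction of ideal speedup achieved."""
    return speedup(ts, tp) / p

def cost(tp: float, p: int) -> float:
    """Cost = p * Tp: total processor-time spent by the parallel system."""
    return p * tp

# Example: a problem taking 100 s serially and 30 s on 4 processors.
ts, tp, p = 100.0, 30.0, 4
print(speedup(ts, tp))        # ~3.33 (less than p, as in practice)
print(efficiency(ts, tp, p))  # ~0.83 (between zero and one)
print(cost(tp, p))            # 120.0 (greater than Ts, so not cost-optimal)
```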
Scalability
scalability of a parallel system is a measure of its capacity to increase speedup in proportion to the number of processors
a scalable parallel system can always be made cost-optimal if the number of processors and the size of the computation are chosen appropriately
ability to maintain efficiency at a fixed value by simultaneously increasing the number of processors and the size of the problem
Chapter 3
Memory architecture
shared memory
disadvantages
the programmer is responsible for synchronization constructs that ensure correct access to global memory
distributed memory
when a processor needs data from another processor, it is usually the task of the programmer to define when and how data is communicated
advantages
as the number of processors increases, the size of memory increases proportionately
Programming models
shared memory model
tasks share a common address space, which they read and write asynchronously
advantages - no need to specify explicitly the communication of data between tasks; program development can often be simplified
implementations
on shared memory platforms - native compilers translate user program variables into actual memory addresses, which are global
no common distributed memory platform implementation exists; the KSR ALLCACHE approach provided a shared memory view of data even though the physical memory of the machine was distributed
threads model
a single process can have multiple, concurrent execution paths
implementations
in both cases, the programmer is responsible for determining all parallelism
these implementations differed substantially from each other, making it difficult for programmers to develop portable threaded applications
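A minimal sketch of the threads model using Python's standard `threading` module (an analogy only; the notes above are about implementations such as POSIX Threads, and the shared variable and lock here are illustrative names):

```python
import threading

shared_total = 0            # lives in the single process's common address space
lock = threading.Lock()     # programmer-supplied synchronization construct

def worker(values):
    """One concurrent execution path inside the single process."""
    global shared_total
    local_sum = sum(values)          # independent work
    with lock:                       # ensure correct access to shared state
        shared_total += local_sum    # every thread sees the same variable

data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in data]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared_total)  # 45
```

Note that determining the parallelism (how data is split, where the lock is needed) is entirely the programmer's job, as the note above says.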
message passing model
MPI was formed with the primary goal of establishing a standard interface for message passing implementations
for shared memory architectures, MPI implementations usually don't use a network for task communications; instead they use shared memory (memory copies) for performance reasons
data parallel model
a set of tasks works collectively on the same data structure; however, each task works on a different partition of that data structure
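The data parallel idea can be sketched as follows (a toy illustration in plain Python; real data parallel systems perform the partitioning and any communication for you):

```python
def partition(data, num_tasks):
    """Split one data structure into num_tasks contiguous partitions."""
    chunk = (len(data) + num_tasks - 1) // num_tasks
    return [data[i * chunk:(i + 1) * chunk] for i in range(num_tasks)]

def task(my_partition):
    """The same operation, run by every task on its own partition."""
    return [x * 2 for x in my_partition]

data = list(range(8))                 # the shared data structure
parts = partition(data, num_tasks=4)  # [[0, 1], [2, 3], [4, 5], [6, 7]]
result = [y for part in parts for y in task(part)]
print(result)  # [0, 2, 4, 6, 8, 10, 12, 14]
```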
hybrid
this hybrid model lends itself well to the increasingly common hardware environment of networked SMP machines
data parallel implementations on distributed memory architectures actually use message passing to transmit data between tasks, transparently to the programmer
SPMD
at any moment in time, tasks can be executing the same or different instructions
SPMD programs have the necessary logic programmed into them to allow different tasks to branch or conditionally execute only those parts of the program they are meant to execute
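The SPMD branching logic can be sketched like this (plain Python; the rank-style `task_id` parameter is an assumed convention, echoing how MPI assigns each task a rank):

```python
def spmd_program(task_id: int, num_tasks: int) -> str:
    """Every task runs this same program; branches select per-task work."""
    if task_id == 0:
        # only task 0 executes this part of the program
        return "task 0: coordinating"
    # all other tasks conditionally execute this part instead
    return f"task {task_id}: computing"

# The one program, launched as 4 tasks:
outputs = [spmd_program(i, 4) for i in range(4)]
print(outputs[0])  # task 0: coordinating
print(outputs[3])  # task 3: computing
```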
MPMD
like SPMD, MPMD can be built with any combination of parallel programming models
while the application is being run in parallel, each task can be executing the same or a different program than the other tasks
Chapter 6
MPI
Background
library standard defined by a committee of vendors, implementers and parallel programmers
100% portable: one standard, many implementations
basic communication
Asynchronous - one call indicates the start of a send or receive, and another call is made to determine if it has finished
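In MPI this start-then-check pattern is the nonblocking `MPI_Isend`/`MPI_Irecv` plus `MPI_Test`/`MPI_Wait` pair. The same two-call shape can be sketched with Python's standard `concurrent.futures` (an analogy only, not MPI itself; `send_message` is a made-up stand-in):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def send_message(payload):
    """Stand-in for a communication operation that takes time to complete."""
    time.sleep(0.1)
    return f"delivered: {payload}"

with ThreadPoolExecutor(max_workers=1) as pool:
    # First call only *starts* the operation (like MPI_Isend)...
    handle = pool.submit(send_message, "hello")
    # ...so the task is free to do other useful work meanwhile.
    other_work = sum(range(1000))
    # A second call waits for / checks completion (like MPI_Wait / MPI_Test).
    print(handle.result())  # delivered: hello
```

Overlapping communication with computation like this is the main reason to prefer the asynchronous calls.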
Chapter 9
Big data
large datasets, complex and changeable
cannot be stored, managed, and processed adequately by traditional data management software and tools
Solutions - relying on technology that can handle, process and analyze large quantities of data
challenges
diversity of data
structured, semi-structured, multi-structured and unstructured
Stored data processing
batch-based processing
batch results are produced after the data is collected, entered and processed
separate techniques or programs for input, processing and output
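The batch pattern above (collect the whole dataset first, then process, then output, with the three stages kept separate) can be sketched as follows (a toy illustration; the stage names are mine):

```python
def collect(raw_records):
    """Input stage: gather and clean the whole batch before processing starts."""
    return [r.strip() for r in raw_records if r.strip()]

def process(batch):
    """Processing stage: runs only after the batch is complete."""
    return {record: len(record) for record in batch}

def output(results):
    """Output stage: results appear only after processing has finished."""
    return sorted(results.items())

raw = ["alpha\n", "  ", "beta\n", "gamma\n"]
report = output(process(collect(raw)))
print(report)  # [('alpha', 5), ('beta', 4), ('gamma', 5)]
```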