Advanced Programming in the UNIX Environment
File
File Type
p128
Regular
text/binary data
Directory
only Kernel can write directly to a directory file
FIFO (Named Pipe)
communication between local processes
Socket
network communication between processes
Symbolic Link
indirect pointer to another file
can link across file systems
any user can create a symbolic link
hard link (for contrast)
points directly to the i-node of a file
requires the link and the file to reside in the same file system
only the superuser can create a hard link to a directory
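The difference between a hard link and a symbolic link can be checked with stat and lstat. The following is a minimal sketch, not taken from the book: the function name `link_demo` and the `/tmp/apue_link_*` scratch paths are assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700   /* ask for the POSIX.1-2008 interfaces */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 if the hard link shares the target's i-node and the
 * symbolic link is a separate file of type "link"; -1 otherwise.
 * All paths are hypothetical scratch files. */
int link_demo(void)
{
    char target[] = "/tmp/apue_link_XXXXXX";
    char hard[64], soft[64];
    struct stat st_target, st_hard, st_soft;
    int rc = -1;
    int fd = mkstemp(target);           /* create the regular file */

    if (fd < 0)
        return -1;
    close(fd);
    snprintf(hard, sizeof hard, "%s.hard", target);
    snprintf(soft, sizeof soft, "%s.soft", target);

    if (link(target, hard) == 0 && symlink(target, soft) == 0 &&
        stat(target, &st_target) == 0 &&
        stat(hard, &st_hard) == 0 &&
        lstat(soft, &st_soft) == 0) {    /* lstat does not follow the link */
        if (st_hard.st_ino == st_target.st_ino &&   /* same i-node */
            st_target.st_nlink == 2 &&              /* two directory entries */
            S_ISLNK(st_soft.st_mode) &&             /* its own file type */
            st_soft.st_ino != st_target.st_ino)     /* and its own i-node */
            rc = 0;
    }
    unlink(soft);
    unlink(hard);
    unlink(target);
    return rc;
}
```

Calling `link_demo()` is expected to return 0 on a file system that supports both kinds of link.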
Block Special
buffered IO access in fixed-size units to devices
Character Special
unbuffered IO access in variable-sized units to devices
File System
p146
IO
Standard IO
p177
Stream
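when a file is opened or created through standard IO, a stream is associated with the file
when opening a stream, fopen returns a pointer to a FILE object
a new stream has no orientation; the first IO function used on it makes it wide-oriented (wide-character function) or byte-oriented (byte function)
freopen
clears the stream's orientation
fwide
sets the orientation to byte or wide, only if the stream is not already oriented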
Open a Stream
fopen
freopen
fdopen
Read/Write
Char-at-a-time IO
Line-at-a-time IO
p187
fgets, fputs
implemented using memccpy(3), which is often written in assembly for efficiency
much faster than char-at-a-time IO
Direct IO
p191
fread
fwrite
ONLY work for data read back on the same system, because:
the offset of a member within a struct can differ between compilers and systems
the binary formats used to store multibyte integers and floating-point values differ among machine architectures
Positioning
p191
ftell, fseek
assume the file position fits in a long integer
ftello, fseeko
use off_t, so the position can be wider than a long integer
fgetpos, fsetpos
use the abstract data type fpos_t to record the position
use these when porting an app to non-UNIX systems
Memory Stream
p205
no underlying file, but still accessed through a FILE pointer
all IO is done by transferring bytes to/from buffers in main memory
fmemopen
takes a buffer to be used for the memory stream; if the buffer argument is NULL, fmemopen allocates one and frees it automatically when the stream is closed
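As a rough illustration of a memory stream, the sketch below writes through a FILE pointer into a caller-owned buffer and then inspects the buffer directly. The function name `memory_stream_demo` and the buffer size are assumptions chosen for the example.

```c
#define _XOPEN_SOURCE 700   /* fmemopen is a POSIX.1-2008 interface */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 if text written with fprintf ends up in our own buffer. */
int memory_stream_demo(void)
{
    char buf[32];
    FILE *fp;

    memset(buf, 'x', sizeof buf);            /* make stale contents visible */
    fp = fmemopen(buf, sizeof buf, "w+");    /* no file on disk is involved */
    if (fp == NULL)
        return -1;
    if (fprintf(fp, "hello, %s", "memory") < 0) {
        fclose(fp);
        return -1;
    }
    /* Closing flushes the stream; the bytes and a terminating null
     * byte are stored in buf because there is room for them. */
    if (fclose(fp) != 0)
        return -1;
    return strcmp(buf, "hello, memory") == 0 ? 0 : -1;
}
```

Passing NULL instead of `buf` would let fmemopen allocate and later free the buffer itself, at the cost of not being able to look at the bytes after the stream is closed.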
FILE object
structure containing fd of actual IO, pointer to buffer, buffer size, current buffer offset, error flag etc.
fflush(FILE *fp)
pass unwritten data for the stream *fp to kernel
if fp is NULL, flush all output streams
Buffering
p179
Fully buffered
actual IO occurs when the buffer is filled
usually used for files on disk
the buffer is usually obtained by calling malloc on the first IO on the stream
Line buffered
actual IO occurs when a newline character is encountered on input or output
usually used for streams that refer to a terminal, e.g. stdin and stdout
buffer size is fixed, so IO may happen before a newline if the buffer fills up
whenever input is requested through the standard IO library from an unbuffered or line-buffered stream, all line-buffered output streams are flushed
Unbuffered
no buffering
the stderr stream is normally unbuffered
Formatted IO
p193
Advanced IO
p514
Nonblocking IO
p514
system calls related to disk IO are NOT considered slow
turn on the O_NONBLOCK file status flag, either when opening the descriptor (open) or on an already-open descriptor (fcntl)
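A minimal sketch of the second approach, turning on O_NONBLOCK for a descriptor that is already open. The helper names `set_nonblocking` and `read_empty_pipe_errno` are made up for this example; an empty pipe stands in for any slow device.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Turn on O_NONBLOCK for an already-open descriptor; 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);      /* fetch current status flags */
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Reads from an empty nonblocking pipe and returns the errno that
 * read reported, or -1 if the setup failed or read did not fail. */
int read_empty_pipe_errno(void)
{
    int fds[2];
    char c;
    int err = -1;

    if (pipe(fds) < 0)
        return -1;
    if (set_nonblocking(fds[0]) == 0) {
        /* A blocking read would hang here; this one returns at once. */
        if (read(fds[0], &c, 1) < 0)
            err = errno;                    /* save before close() runs */
    }
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On an empty pipe the read is expected to fail with EAGAIN instead of blocking the caller.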
Record Locking
p518
prevent other processes from modifying a region of a file while the first process is reading or modifying that portion of the file
lockf / fcntl
allow the caller to lock arbitrary byte ranges in a file
Implied Inheritance and Release of Locks
p526
locks are associated with a process-file pair
when a process terminates, all its locks are released
when a file descriptor is closed, any locks the process holds on the corresponding file are released
locks are never inherited by the child across fork
locks are normally inherited by the new program across exec; if close-on-exec is set on the descriptor, the locks are released when exec closes it
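The rule that a child does not inherit its parent's locks can be observed with fcntl. In this sketch the parent write-locks a scratch file and the child, using the inherited descriptor, asks for the same lock. The names `write_lock_whole_file` and `child_lock_is_refused` and the `/tmp/apue_lock_*` path are assumptions for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to write-lock the whole file without blocking (F_SETLK). */
static int write_lock_whole_file(int fd)
{
    struct flock fl;

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/* Returns 1 if the child is refused the lock its parent holds,
 * 0 if the child obtained it, -1 on setup failure. */
int child_lock_is_refused(void)
{
    char path[] = "/tmp/apue_lock_XXXXXX";  /* hypothetical scratch file */
    int fd = mkstemp(path);
    int status, result = -1;
    pid_t pid;

    if (fd < 0)
        return -1;
    if (write_lock_whole_file(fd) == 0 && (pid = fork()) >= 0) {
        if (pid == 0) {
            /* Child: it inherited the descriptor but not the lock,
             * so it competes with the parent like any other process. */
            int rc = write_lock_whole_file(fd);
            _exit(rc < 0 && (errno == EAGAIN || errno == EACCES) ? 1 : 0);
        }
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }
    close(fd);                  /* closing also drops the parent's lock */
    unlink(path);
    return result;
}
```

The same whole-file write lock is the building block for the single-instance daemon technique, where the lock file has a fixed name.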
IO Multiplexing
p534
build a list of file descriptors we are interested in,
call a function that does not return until one of the descriptors is ready for IO
select
p536
send a list of file descriptors to kernel, with conditions we listen on each descriptor.
kernel returns total count of descriptors ready, and which conditions they are ready for
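A small sketch of select on a single descriptor, assuming we only care about readability. The helper names `wait_readable` and `select_pipe_demo` and the 50 ms timeout are choices made for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if ready, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);                 /* the list we hand to the kernel */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    /* n is the count of ready descriptors; the set says which ones. */
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Returns 0 if an empty pipe times out and a pipe with data is ready. */
int select_pipe_demo(void)
{
    int fds[2], before, after = -1;

    if (pipe(fds) < 0)
        return -1;
    before = wait_readable(fds[0], 50);   /* nothing written yet */
    if (write(fds[1], "x", 1) == 1)
        after = wait_readable(fds[0], 50);
    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1) ? 0 : -1;
}
```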
Asynchronous IO
p542
Memory-Mapped IO
p558
map a file on disk into a buffer in memory
fetching bytes from the buffer reads the corresponding bytes of the file
storing bytes into the buffer automatically writes them to the file on disk
mmap
tell the kernel to map a given file to a region in memory
the protection requested for the mapping cannot allow more access than the open mode of the file
cannot append to a file with mmap; bytes stored in the mapping beyond the end of the file are not reflected in the file on disk, so grow the file first
relevant signals
SIGSEGV
tried to access memory not available
SIGBUS
access to a portion of the mapped region that does not make sense at the time of access, e.g. a part of the file that has since been truncated
a memory-mapped region is inherited by the child across a fork, since it is part of the parent's address space
changes to a shared mapped region are not written back to the file immediately; kernel daemons decide when to write dirty pages
msync
flush changes on mapped memory to the file on disk
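The sketch below puts these points together: the file is sized first because mmap cannot extend it, bytes are stored through the mapping, msync flushes them, and pread confirms that they reached the file. The function name `mmap_write_demo` and the `/tmp/apue_mmap_*` path are assumptions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if bytes stored through the mapping are read back
 * from the file itself; -1 otherwise. */
int mmap_write_demo(void)
{
    char path[] = "/tmp/apue_mmap_XXXXXX";  /* hypothetical scratch file */
    const char msg[] = "mapped";
    size_t len = sizeof msg - 1;
    char back[16] = {0};
    int rc = -1;
    int fd = mkstemp(path);   /* opened read/write, so both PROT bits are allowed */

    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) == 0) {   /* grow the file before mapping */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            memcpy(p, msg, len);                    /* store into memory */
            if (msync(p, len, MS_SYNC) == 0 &&      /* force the write-back */
                pread(fd, back, len, 0) == (ssize_t)len &&
                memcmp(back, msg, len) == 0)
                rc = 0;
            munmap(p, len);
        }
    }
    close(fd);
    unlink(path);
    return rc;
}
```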
Process Environment
p231
main
special start-up routine
called before main is called; its address is specified as the starting address of the program (set up by the link editor when it is invoked by the C compiler)
takes values from the kernel (command-line arguments and environment variables)
if main returns, the start-up routine calls the exit function
exit
atexit(void (*func)(void))
p235
register a function to be executed by exit
registered functions execute in reverse order of registration, like a stack
exit then performs the standard IO cleanup: fclose is called for all open streams of the process
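To observe the reverse-order rule without terminating the calling program, this sketch registers two handlers in a child process and reports their execution order back through a pipe. All function names here are illustrative assumptions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static int out_fd = -1;   /* write end of the pipe, used by the handlers */

static void emit(char c)
{
    if (write(out_fd, &c, 1) != 1)
        _exit(2);
}
static void registered_first(void)  { emit('1'); }
static void registered_second(void) { emit('2'); }

/* Returns 1 if the handlers ran in reverse order of registration. */
int atexit_runs_in_reverse(void)
{
    int fds[2], status;
    char order[3] = {0};
    size_t total = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                       /* child registers, then exits */
        close(fds[0]);
        out_fd = fds[1];
        if (atexit(registered_first) != 0 || atexit(registered_second) != 0)
            _exit(2);
        exit(0);                          /* exit, not _exit, runs the handlers */
    }
    close(fds[1]);
    while (total < 2 && (n = read(fds[0], order + total, 2 - total)) > 0)
        total += (size_t)n;
    close(fds[0]);
    waitpid(pid, &status, 0);
    return strcmp(order, "21") == 0 ? 1 : 0;   /* second handler ran first */
}
```

Because the child calls exit, it also flushes its copy of any standard IO buffers inherited from the parent, so the caller should have nothing pending on stdout when it calls this function.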
Environment List
p237
each program is passed an environment list
extern char **environ points to an array of character pointers, each containing the address of a null-terminated C string
logic control
setjmp / longjmp
p247
setjmp(jmp_buf env) - called from the location we want to jump back to
longjmp(jmp_buf env, int val) - jump to the location recorded in env; setjmp then returns val
a goto for cross-function jumps
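A minimal sketch of a cross-function jump: a nested helper calls longjmp and control reappears at the setjmp call several stack frames up. The names `recover_point`, `deep_helper` and `jump_demo` and the value 42 are arbitrary choices for this example.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf recover_point;

static void deep_helper(int depth)
{
    if (depth == 0)
        longjmp(recover_point, 42);   /* unwind straight back to setjmp */
    deep_helper(depth - 1);
}

/* Returns the value delivered by longjmp, or -1 if the jump
 * did not happen as expected. */
int jump_demo(void)
{
    switch (setjmp(recover_point)) {
    case 0:                 /* direct call: setjmp returns 0 */
        deep_helper(3);
        return -1;          /* not reached, deep_helper never returns */
    case 42:                /* second return, caused by longjmp */
        return 42;
    default:
        return -1;
    }
}
```

setjmp is used directly as the switch expression because the C standard only permits it in a few contexts such as the controlling expression of if or switch.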
Memory Model
p239
Memory Layout of C Program
p239
Text Segment
sharable, only 1 copy in memory is enough
read-only
machine instructions that CPU executes
Initialized Data Segment
variables (outside any function) explicitly initialized/assigned in the program
Uninitialized Data Segment
variables (outside any function) declared but not assigned, initialized to default 0/null by kernel before program executes
Stack
stores automatic and temporary variables
stores the information saved each time a function is called (return address, caller's environment/context)
Heap
dynamic memory allocation happens here
Memory Allocation
p241
malloc
allocates a specified number of bytes; the initial value of the memory is indeterminate
calloc
allocates space for a number of objects of a given size; all bits are initialized to 0
realloc
resizes a previously allocated area
these allocation routines are usually implemented with the sbrk system call
free
deallocates the space pointed to by the passed-in pointer
freed memory is usually put into a memory pool for later allocation
Resource Limits
p255
Process Control
p261
system process - pid 0
scheduler process (swapper)
part of kernel, no program on disk for this process
init - pid 1
invoked by kernel at the end of bootstrap procedure
never dies
normal user process, but runs with superuser privileges
init becomes the parent of any process whose parent has terminated
whenever one of init's children terminates, init calls one of the wait functions to fetch the termination status
fork
p263
create a new child process
called once, returns twice (once in the parent and once in the child)
both parent and child continue executing with the instruction after fork
child process gets a copy of the parent's data space, heap and stack; these are NOT shared, but the text segment is shared
parent and child share a file table entry for each open file descriptor, hence share the same file offset (file locks are NOT inherited)
child process becomes a zombie if it terminates before the parent has waited on it
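Both halves of this rule, private copies of data but a shared file offset, can be seen in one small experiment. The sketch below is an illustration only; the function name `fork_sharing_demo` and the `/tmp/apue_fork_*` path are assumptions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the parent keeps its own copy of `value` while it
 * shares the file offset with the child; -1 otherwise. */
int fork_sharing_demo(void)
{
    char path[] = "/tmp/apue_fork_XXXXXX";  /* hypothetical scratch file */
    int value = 1;
    int status, rc = -1;
    pid_t pid;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    pid = fork();
    if (pid == 0) {
        value = 100;        /* changes only the child's copy of the stack */
        /* This write advances the offset in the shared file table entry. */
        _exit(write(fd, "child", 5) == 5 ? 0 : 1);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        off_t off = lseek(fd, 0, SEEK_CUR);   /* query the current offset */
        if (value == 1 && off == 5)
            rc = 0;
    }
    close(fd);
    unlink(path);
    return rc;
}
```

The waitpid call also keeps the child from lingering as a zombie.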
vfork
does NOT copy the parent's address space into the child
guarantees the child runs first, until the child calls exec or exit
wait / waitpid
wait blocks until one of the child processes terminates
waitpid controls which process it waits for
process termination
when a process terminates, the kernel sends SIGCHLD signal to the parent process
the default action for SIGCHLD is to ignore it, but the parent can install a signal handler function instead
exec
when a process calls exec, it is completely replaced by the new program
the process ID remains the same; exec merely replaces the text, data, heap, and stack segments with those of the new program from disk
Process Group
p326
collection of one or more processes, usually associated with the same job
processes in a group can receive signals from the same terminal
Process Group Leader
its process ID equals the process group ID
can create a process group, create processes in the group, and then terminate
the process group still exists as long as any process in it is alive, NOT necessarily the leader
shell pipeline
can create process group from processes, e.g. proc1 | proc2 | proc3
Process Relationship
p319
login
init reads /etc/ttys and, for every terminal device that allows login, does a fork followed by an exec of getty
the login program calls getpwnam to fetch the password file entry,
then calls crypt to encrypt the password typed by the user and compares the result with pw_passwd from the shadow password file entry
Signals
p347
no signal has number 0 (the null signal); kill uses signal number 0 as a special case
Source of signals
terminal-generated: the user presses certain terminal keys, e.g. Ctrl-C -> SIGINT
hardware exceptions: the hardware detects the condition and notifies the kernel, which then generates the signal for the process that was running at the time
kill function: allows a process to send any signal to another process, provided we are the owner of the target process or the superuser
software conditions: generate signals such as SIGURG, SIGPIPE, SIGALRM
action upon signal
ignore the signal (SIGKILL and SIGSTOP can never be ignored or caught)
catch the signal by running a provided function when the signal occurs
apply the default action
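A short sketch of the "catch" action using sigaction. It also shows that the kernel refuses to install a handler for SIGKILL. The handler `on_usr1` and the function `signal_actions_demo` are names invented for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t got_signal = 0;

static void on_usr1(int signo)
{
    got_signal = signo;     /* only async-signal-safe work in a handler */
}

/* Returns 0 if SIGUSR1 was caught and SIGKILL could not be. */
int signal_actions_demo(void)
{
    struct sigaction sa, old;
    int kill_refused;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    /* SIGKILL can be neither caught nor ignored: sigaction rejects it. */
    kill_refused = (sigaction(SIGKILL, &sa, NULL) == -1 && errno == EINVAL);

    if (sigaction(SIGUSR1, &sa, &old) < 0)
        return -1;
    got_signal = 0;
    raise(SIGUSR1);                   /* handler runs before raise returns */
    sigaction(SIGUSR1, &old, NULL);   /* restore the previous disposition */

    return (kill_refused && got_signal == SIGUSR1) ? 0 : -1;
}
```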
slow system calls
reads that can block the caller forever if data is not present
writes that can block the caller forever if the data cannot be accepted immediately
opens on certain file types that block until some condition occurs
the pause and wait functions
some ioctl operations
some interprocess communication functions
exception: disk IO, which returns and unblocks the caller quite quickly
a signal is generated for a process when the event occurs; the kernel usually sets a flag of some form in the process table
Threads
p417
a process can have multiple threads; everything within the process is shareable among them
Components (per thread)
thread ID - identifies the thread within a process
register values
stack
scheduling priority and policy
signal mask
errno variable
Thread Synchronization
p431
a modification to memory can take more than one memory cycle
need a lock that allows only one thread to access the variable at a time
Mutex
p433
pthread_mutex_t
MUST initialize before use by
a) setting to the constant PTHREAD_MUTEX_INITIALIZER (for statically allocated mutexes only)
b) calling pthread_mutex_init
MUST call pthread_mutex_destroy before freeing the memory (for dynamically allocated mutexes)
pthread_mutex_lock - lock a mutex; blocks the calling thread if it is already locked
pthread_mutex_trylock - returns EBUSY without blocking if it is already locked
pthread_mutex_timedlock - blocks until the timeout expires if it is already locked, then returns ETIMEDOUT
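A minimal sketch of a statically initialized mutex protecting a shared counter. The thread count, iteration count, and the names `worker` and `locked_count` are assumptions for illustration; the program needs to be linked with the pthread library on systems where it is separate.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4
#define NITERS   100000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  /* static init */
static long counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        pthread_mutex_lock(&lock);     /* only one thread past this point */
        counter++;                     /* read-modify-write is now safe */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Runs NTHREADS workers and returns the final counter value. */
long locked_count(void)
{
    pthread_t tids[NTHREADS];
    int started = 0;

    counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

With the lock in place the result is the number of started threads times NITERS. Removing the lock and unlock calls would make the increments race and the total unpredictable.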
Condition Variables
p447
guarded by a mutex; must first lock the mutex to change or evaluate the condition state
must be initialized first by pthread_cond_init, or set to PTHREAD_COND_INITIALIZER (for statically allocated)
pthread_cond_wait - wait for a condition to be true; atomically releases the mutex while waiting and reacquires it before returning
Spin Locks
p451
instead of blocking a thread by sleeping, the thread is blocked by busy-waiting (spinning) until the lock can be acquired
used when locks are held for short periods of time and threads don't want to incur the cost of being descheduled, which means the cost of keeping the CPU busy must be lower than the cost of sleeping the thread
often used as low-level primitives to implement other types of locks
useful in a nonpreemptive kernel, where they also block interrupts so an interrupt handler cannot deadlock the system by trying to acquire a spin lock that is already locked
Barriers
p453
allows each thread to wait until all cooperating threads have reached the same point, and then continue executing from there
Thread Control
p459
fork
when a thread calls fork, a copy of the entire process address space is made for the child
the child inherits the state of every mutex/lock from the parent process
calling exec immediately after fork discards the old address space, so the inherited lock state no longer matters
pthread_atfork
p490
install up to 3 functions to help clean up locks, so that it looks as if
the parent acquired all its locks,
the child acquired all its locks,
the parent released its locks,
the child released all its locks
to ensure thread-safety during forking
prepare - called in the parent before fork creates the child process, to acquire all the locks defined by the parent
parent - called in the context of the parent after fork has created the child, in order to unlock all the locks acquired by prepare
child - called in the context of the child before returning from fork, in order to release all the locks acquired by prepare
pread
seeks to the given offset AND reads the data as one atomic operation
Daemon Process
p496
live for a long time; often started from system initialization scripts (/etc/rc* or /etc/init.d/*) and terminate when the system shuts down
no controlling terminal, run in background
Coding Rules
p500
call umask to set the file mode creation mask to a known value, usually 0
call fork and have the parent exit
call setsid to create a new session
change the current working directory to the root directory
close unneeded file descriptors
attach file descriptors 0, 1, and 2 to /dev/null to mute console input/output
Error Handling
p502
kernel routines call the log function to generate log messages, which any user process can read from /dev/klog
user processes (daemons) call the syslog function to generate log messages, which are sent to the UNIX domain datagram socket /dev/log
other hosts connected by a TCP/IP network can send log messages to UDP port 514
* syslog never generates these UDP datagrams; they require explicit network programming by the calling process
syslogd
daemon process that reads all three forms of log messages; on start-up it reads its configuration, usually /etc/syslog.conf, to determine where different types of messages are to be sent
Singleton Daemon
p506
each daemon creates a file with a fixed name (usually /var/run/daemon-name.pid) and places a write lock on the entire file; only one such write lock can exist, ensuring only one instance of the daemon is running
the write lock is automatically removed if the daemon exits
the configuration of a daemon is usually in /etc/daemon-name.conf
to auto-restart a daemon, arrange for init to restart it by including a respawn entry for the daemon in /etc/inittab
some daemons catch SIGHUP to re-read configuration changes without restarting
Client-Server mode
p512
set the close-on-exec flag for all file descriptors that the executed program won't need
Interprocess Communication
p566
Pipe
p568
half-duplex
can be used only between processes that have a common ancestor, usually between a parent and a child after a fork
when you type a sequence of commands as a pipeline in a shell, the shell creates a process for each command and links the standard output of one to the standard input of the next
a read from a pipe whose write end has been closed returns 0, indicating end of file, once all data has been read
a write to a pipe whose read end has been closed generates SIGPIPE
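The parent/child usage and the end-of-file rule can be sketched as follows. The function name `pipe_demo` and the message "ping" are assumptions chosen for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child writes "ping" into the pipe and exits; parent reads until
 * read returns 0. Returns 0 if the data arrived, followed by EOF. */
int pipe_demo(void)
{
    int fds[2], status;
    char buf[8];
    size_t total = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: writer */
        close(fds[0]);
        /* Exiting closes the child's copy of the write end. */
        _exit(write(fds[1], "ping", 4) == 4 ? 0 : 1);
    }
    /* Parent must close its own write end, or read never sees EOF. */
    close(fds[1]);
    while ((n = read(fds[0], buf + total, sizeof buf - total)) > 0)
        total += (size_t)n;
    close(fds[0]);
    waitpid(pid, &status, 0);
    /* n == 0 here means every write end is closed: end of file. */
    return (n == 0 && total == 4 && memcmp(buf, "ping", 4) == 0) ? 0 : -1;
}
```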
FIFO (named pipe)
p586
any processes can use it, not only those with a common ancestor