Please enable JavaScript.
Coggle requires JavaScript to display documents.
COMP15212: Operating Systems - Coggle Diagram
COMP15212: Operating Systems
Memory
Simple Memory Model
Each location has an address and stores some data
Each process should have different sets of addresses
Processes use an idealised set of virtual addresses that are mapped to physical addresses so they don't have to worry about where they are in memory
Memory Management
Done by the memory management uniy
Takes a virtual address and outputs the physical address
Sizes
Processor architecture determines the size of the virtual address space
Usually locations are a byte
2^n addresses usually (where n is the length of data)
Simple Approach
To map virtual addresses to physical ones with base (start) and limit (size) registers
Variable Partition
Considers speed of placing a process and its location
First-fit: finding a big enough space
Best-fit: smallest hole big enough
Worst-fit: find the biggest hole
Fragmentation
Free memory gets broken into lots of pieces
External: free memory but not in a contiguous piece
Internal: unused memory in partitions
Memory
A large array of equal sized storage spaces
Each location has its own identifier expressed as a hex number
Inside the store is a value which is the data which can represent anything
A software variable name is simply an alias for an address
This address space may not be fully populated with actual memory
Most of the memory will be RAM, implying that the memory is writable
Memory Management
There is limited memory resource in a computer.
Any process will require one or more logical memory segments which are contiguous parts of the address space. Memory can be allocated when the process starts or dynamically so the exact memory needs cannot be determined in advance
Memory management can be done by both the application and the OS. An application will request 'large' blocks of memory and the allocation from them, thus saving some system calls
The OS can keep records of the memory and physical memory in use and for what
The OS may need to allocate spaces when a process is loaded and if the stack or heap overflows, manage access permissions and keep track of process; use and recover the resource when the process terminates
Memory Sizes
The size of the virtual address space of a particular computer processor is fixed by the architecture of that processor
The physical address space follows the trend of the size being 2^n. The memory size is governed by technology and economics. Price per bit falls as tech progresses.
Also as time goes on, memory becomes more affordable and the address space looks smaller, but we can map an address space that is larger than the virtual space visible to any single process.
Memory Mapping
Computer memory is a physical storage medium
A running process uses an address to specify its location to load and store data
Each process has its own independent set of addresses it wants to use
Observations
Few processes use the whole of their virtual address space
It may be hard to predict which virtual addresses a process chooses to use
There is significant locality of access
In a multiprocessing system it is important that processes do not interfere with each other
Address translation
Addresses are translated from a virtual addresses which the memory hardware deals with. The OS is included because it sets up the translation
Memory segmentation
The Von Neumann computing architecture has a single memory address space which is shared by code and data in all their forms
The memory is divided logically into code and data
A particular segment will have a set of attributes and can be mapped to physical memory using dedicated hardware or hardware pages
Advantages of paging hardware
Using page tables is more expensive than having a single mapping entry for the segment
Memory Protection
Memory needs to be protected as it is needed by everything that executes. Each process wants its private space, so this prevents contamination from one process to another
It is important to protect the OS space from being altered by the user, including OS code, data and memory mapped peripheral devices.
Protection can be given by memory mapping as a process cannot see memory which isn't mapped in. The OS has to be able to do the mapping and interact with the applications.
Kinds of permissions
No access. The page is not present due to the process not being granted access to the address space
Supervisor read-only and Read-only: Instruction code is fixed and should not be written to.
Supervisor read/write: The operating systems data spaces are obviously sensitive areas where alterations need to be prevented. A system call provides trusted access.
Read/write by any privilege level: permissions which a user's data pages would have.
Virtual Memory
OS provides an abstracted virtual environment as applications have an idealised view of the machine but don't consider the address space.
The OS has a private map which translates the virtual address into non-conflicting physical addresses.
Benefits
Virtual address space has to mirror Moore's law, like memory
Virtual memory has to appear in the places it has worked
Late in a processor generation there may be more physical memory than fits in the address pace and VM allows the physical memory to be used sensibly
Swapping
If more storage is needed, the solution is secondary storage
If the demand on the physical store becomes too great, the pages that are deemed not needed in the near future are copied onto the disk
Swapping keeps everything running but each swap is time consuming
The MMU is the hardware that implements keeping the application code supplied with memory and securing it. Paging supports swapping to disk
Virtual Address Space
The virtual address space is determined by the processor architecture. The physical space is determined by the hardware. Mapping can put physical memory where it is required.
Locality
Spatial locality
If one address is used, it is likely that addresses around it are also used
Temporal locality
If something was used recently, it will probably be used again
Paging
Rather than mapping individual addresses, we match a chunk of addresses to a similar location
Compromises between mapping everything as a chunk and mapping individual locations
Problems of allocation and fragmentation
Addresses organised into pages
Pages then matched as a chunk
Page number given by most significant bits of virtual address, offset's the rest
Page Tables
A hierarchical page table uses a tree structure to improve the efficiency of storage, however a fully expanded hierarchical scheme will require more space
This creates levels of table instead of one massive table
Hashing
Hash function maps data to a fixed size collection of values
Hash tables structures for storing key/value pairs. The hash function identifies a slot or bucket for the item
Paging
Page Lookup: MMU takes a virtual address and finds the appropriate page
If the page is not present, raise a page fault
OS checks that virtual address and operation are legitimate
Paging: Find a frame where we can put it, but we could evict a page to fit it
Caching
Cache reduces the distance and computation cost
If the cache gets full or stale we need a policy for replacing things and flushing the cache
Translation Look-aside Buffer (TLB)
Holds recently used set of translations (like a cache)
Policies and mechanisms
A policy describes the activities that we're going to perform
A mechanism is about how those policies are performed
Multi-level caching
Each processor might have two-levels of cache and then all the processors would share a level three cache
Standard swapping
The whole process is moved out to backing store
OS maintains metadata, but it is time costly
Swapping
You can also swap out individual pages, which is more common and is more commonly referred to as paging
Paging
An inverted page table maps physical frames to logical pages rather than vice versa.
Strategies
Cache eviction strategies include
Random
Least Recently Used
Cyclic
We prefer to evict clean pages (those that haven't been edited), as we don't need to worry about writing them back to secondary storage
Page monitoring
Mark pages as read only when first read
On first write, raise a page fault and change permissions
If the page is writable, it must have been changed, else we just overwrite it
Mobile
A mobile OS will tend to avoid page swapping
Flash memory used for secondary storage has a limited number of writes
Shared memory
Different processes could be using the same piece of physical memory
Inter-process Communication: processes can exchange data through some shared memory
Shared Libraries: a single copy of frequently used routines shared with any process that needs them
Libraries
Collections of code that can help reduce development time and provide standardisation of implementation
Can expose OS functionality or could be higher level
Allow code to be agnostic to the underlying system
Applications only need to know the appropriate system calls or API
Libraries should be
Re-entrant: Local variables. No externally visible side effects
Relocatable: No absolute addresses so can be executed at any address
Static libraries
The code is included into the application at compile time
Disadvantages
Larger object code
Duplication
Advanatges
Self-contained
No OS involvement
Dynamic libraries
Library loaded when process is running
The application invokes a linker at startup/execution
Advantages
Maintenance
Library may be a single shared copy
Disadvantages
More management required
Direct Memory Access
The processor can work from cached data while the DMA controller performs disk transfers
Structures in memory
C
A high-level language with low-level features
Access to and management of memory
Imperative: sequences of statements that are executed in turn
Types: variable declared with a type
Control structures: if, for/while loops etc.
Memory managements: direct manipulation of pointer structures
Data structures
C compiles map data structures into memory
Pointers
We can perform operations on pointers
This allows us to access memory locations
Arrays
Collections of items
Items are the same size
Items are stored sequentially in memory
Arrays and pointers are essentially interchangeable in C
Structures
Traditionally known as records
Similar to the data parts of an object in Java or Python
A struct has elements which are accessed via their names
Elements can be different types
Differ from arrays as they produce a container where the things contained could be different sizes
Structs in memory
Elements are allocated to memory in the order that they are declared
We read in chunks out of memory, so we prefer to have contiguous data grouped together
Processes
How is the CPU able to do multiple things at once?
Does a little bit of each job and then swaps to another
This is done fast enough to feel like they are done simultaneously
Terminology
Application
Job
Process
a container that holds all the information needed to run a program
Program
OS state model
Only the running state is executed on the CPU as part of the 'Application'
The rest of the model is due to the 'Operating System'. It is also executing on the CPU
Batch Processing
FIFO - does not need to process the jobs before running them
Shortest Job Next - requires sorting by time, gets the most jobs done faster
Priority scheduling - most important job first
Interactive Scheduling
mechanism allows you to interrupt what a job is doing, or a process is doing and schedule it out and bring something else back in
Processes
Context
Process Control Block (PCB)
Process scheduling
Scheduler
Process States
Child Processes
A child process does not know its parent, but a parent knows it's child
A fork creates two identical processes, the child often replaces itself with another piece of executable code
If a process crashes or is unexpectedly killed, the init daemon will adapt any orphan processes
When a process dies it briefly becomes a zombie. The wait command is called so that the child can return values to the parent
Threads
Multi-threading
Inter-process Communication
Deadlock
Criteria
Mutual exclusion
Hold and wait
No pre-emption
Circular wait
Ostrich algorithm - ignoring deadlock as it is infrequent or not caring
You can:
ignore it
prevent it
avoid it
detect (and fix) it
Deadlock prevention:
If you make one of the criteria impossible, there cannot be deadlock, but this is very difficult
Deadlock Avoidance - having a check in place to make sure the criteria can't happen simultaneously
Deadlock Detection - having a monitoring process to break the deadlock
File Systems
Data Storage Devices
CPU -> RAM -> Hard Disk/SSD -> Network/Cloud
Solid State Memory
Flashcard/SD Card
Solid State Disks
left to right has a bigger capacity and slower
What is a file?
Some data: a sequence of bytes that represent something
Some metadata: information about the file and its contents
Disk Terminology
Track - concentric rings around a disk
Sector - divisions of a disk
Track-sector - their intersection
A collection of contiguous track sectors is a cluster
File Allocation Table
Simple to implement
A section of the device is called a directory to remember everything about the file
Trade off between directory space and data space
Filesystems
A directory is just a file containing filenames and some way of identifying where the data for that file lives
Files can have multiple different names and paths
When you move a file to somewhere on the same filesystem, you don't always need to move the data, just change the directory
Filing System
File Systems
Files
File Attributes
File Types
File Permissions
Options for disk space
Split it up as much as needed to use up any gap
Move the used space
When there is not enough room for writing
live with it
Make write a more atomic action that checks for space in advance
HDDs
You would put the file index on the outside as that is the fastest part to read
The problem with the file index idea for a 'cheap' flash memory is that it is constantly being read
Don't defrag on an SDD as there would be unnecessary read/writes
Operations for files
open, close, read, write, seek, rename/move, delete
Everything is a File
A file is not a file when it is a peripheral i.e. keyboard, printer, harddisk
printer is a write-only file
keyboard is a read only file
Virtualisation and Malware
All problems in computer science can be solved by another level of abstraction. Except for the problem of too many level of abstraction.
We do this for:
Software emulation for exploration
Digitising material
Running legacy software
Hypervisor
Type 1 interacts with the hardware and hosts several types of OS
Type 2 interacts with the Host OS and hosts several guest operating systems
Virtual Networks
Logical structure is separated from physical structure
Can restrict a network
Internal virtual networks e.g. with guest OS in type two hypervisor
Types of malware analysis
Static analysis
Reviewing file metadata and format
Open source research and hashes
Imports and libraries
Embedded resources and strings
Code Analysis
Specific target/information needed
Load into disassembler or debugger
Find area of interest
Execute, explore and modify step by step
Behavioural Analysis
Analyse in a safe environment such as a virtual machine@
Load and activate monitoring tools
Executing malware and allow to run
Review tool output for evidence of activity
Malware is often compiled for smaller addresses width to increase its backwards compatibility and does not run in a debugger because it checks to see if it is being run
What is an operating system?
System calls are how the software asks the hardware to do something on its behalf
Interrupt is the hardware's way of asking the software to do something on its behalf
Operating system definitions
acts as an intermediary between the user of a computer and the computer hardware. It provides an environment in which a user can execute programs in a convenient and efficient manner. It is also a software that manages the computer hardware
it provides the user programs with a better, simpler, cleaner model of the computer and to handle managing resources
system software that manages computer hardware, software resources and provides common services for computer programs