Topic 3: File Management, John Tsai, Syazwan, ARIF - Coggle Diagram
Topic 3: File Management
4. Describe the directory structure:
Directory Structure: Introduction
The file systems of computers can be extensive.
Some systems store millions of files on terabytes of disk.
To manage all these data, we need to organize them.
This organization involves the use of directories.
Directory Structure/ Storage Structure
A collection of nodes containing information about all files
Both the directory structure and the files reside on disk
Backups of these two structures are kept on tapes
A disk (or any storage device that is large enough) can be used in its entirety for a file system.
It is also desirable to place multiple file systems on a disk or to use parts of a disk for a file system and other parts for other things, such as swap space or unformatted (raw) disk space.
These parts are known variously as partitions, slices, or minidisks (IBM).
These parts can also be combined to form larger structures known as volumes, and file systems can be created on these as well.
We refer to a chunk of storage that holds a file system as a volume.
Volumes can also store multiple operating systems, allowing a system to boot and run more than one.
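Each volume that contains a file system must also hold information about the files in it. This information is kept in entries in a device directory or volume table of contents.
The device directory (more commonly known simply as the directory) records information (such as name, location, size, and type) for all files on that volume.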
A Typical File-system Organization
Directory Overview
The directory can be viewed as a symbol table that translates file names into their directory entries.
Operations Performed on Directory:
Search for a file
Create a file
Delete a file
List a directory
Rename a file
Traverse the file system (to access every directory and every file within a directory structure)
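The symbol-table view and the operations listed above can be sketched in a few lines. This is a minimal in-memory illustration, not how any real file system stores its directory; the `Directory` class and its entry layout are assumptions made for the example, and traversal is omitted because the sketch models a single directory only.

```python
# Hypothetical in-memory model: a directory maps file names to entries.
class Directory:
    def __init__(self):
        self.entries = {}  # file name -> directory entry (a dict of attributes)

    def create(self, name, size=0):
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = {"size": size}

    def search(self, name):
        return self.entries.get(name)  # None when the file is absent

    def delete(self, name):
        del self.entries[name]

    def rename(self, old, new):
        if new in self.entries:
            raise FileExistsError(new)
        self.entries[new] = self.entries.pop(old)

    def list(self):
        return sorted(self.entries)


d = Directory()
d.create("notes.txt", size=120)
d.create("a.out", size=4096)
d.rename("a.out", "prog")
d.delete("notes.txt")
print(d.list())
```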
Single-Level Directory
The simplest directory structure is the single-level directory.
All files are contained in the same directory, which is easy to support and understand.
A single-level directory has significant limitations, however, when the number of files increases or when the system has more than one user.
Even a single user on a single-level directory may find it difficult to remember the names of all the files as the number of files increases.
It is not uncommon for a user to have hundreds of files on one computer system and an equal number of additional files on another system.
Keeping track of so many files is a daunting task.
Two-Level Directory
A single-level directory often leads to confusion of file names among different users. The standard solution is to create a separate directory for each user.
Each user has their own user file directory (UFD).
The UFDs have similar structures, but each lists only the files of a single user.
When a user job starts or a user logs in, the system's master file directory (MFD) is searched.
The MFD is indexed by user name or account number, and each entry points to the UFD for that user.
Separate directory for each user
~ Path name
~ Different users can have files with the same name
~ Efficient searching
~ No grouping capability
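The MFD-to-UFD lookup can be sketched with nested dictionaries. The user names, file names, and the `user/file` path format below are assumptions chosen for illustration; the point is only that two users can hold a file with the same name because each name is resolved inside its own UFD.

```python
# Hypothetical model: the MFD maps a user name to that user's UFD,
# and each UFD maps file names to entries.
mfd = {
    "alice": {"report.txt": {"size": 10}},
    "bob": {"report.txt": {"size": 99}},
}

def lookup(path):
    """Resolve a path name of the form 'user/file' through the MFD, then the UFD."""
    user, name = path.split("/", 1)
    ufd = mfd[user]   # first level: find the user's directory
    return ufd[name]  # second level: find the file in that directory

print(lookup("alice/report.txt")["size"])
print(lookup("bob/report.txt")["size"])
```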
1. Concept of the file:
A file is a
sequence of bits, bytes, lines, or records, the meaning of which is defined by the file's creator and user.
named collection of related information that is recorded on secondary storage.
Files represent programs (both source and object forms) and data.
Data files may be: numeric, alphabetic, alphanumeric, or binary.
Files may be free-form (e.g., text files) or rigidly formatted.
Many different types of information may be stored in a file: source programs, object programs, executable programs, numeric data, text, payroll records, graphic images, sound recordings, and so on.
Type:
A text file is a sequence of characters organized into lines (and possibly pages).
A source file is a sequence of subroutines and functions, each of which is further organized as declarations followed by executable statements.
An object file is a sequence of bytes organized into blocks understandable by the system's linker.
An executable file is a series of code sections that the loader can bring into memory and execute.
File Attributes
A file is named, for the convenience of its human users, and is referred to by its name.
A name is usually a string of characters, such as example.c.
Some systems differentiate between uppercase and lowercase characters in names, whereas other systems do not.
A file's attributes typically consist of:
Name – only information kept in human-readable form
Identifier – unique tag (number) that identifies the file within the file system
Type – needed for systems that support different types
Location – pointer to file location on device
Size – current file size
Protection – controls who can do reading, writing, executing
Time, date, and user identification – data for protection, security, and usage monitoring
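Several of these attributes can be inspected from a program. The sketch below uses Python's `os.stat` on a temporary file created just for the example; taking the type from the extension is an assumption of the sketch, and the location attribute is not shown because `os.stat` does not expose it.

```python
import os
import stat
import tempfile
import time

# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".c", delete=False) as f:
    f.write("int main(void) { return 0; }\n")
    path = f.name

info = os.stat(path)
attributes = {
    "name": os.path.basename(path),
    "identifier": info.st_ino,                   # inode number on POSIX systems
    "type": os.path.splitext(path)[1],           # here taken from the extension
    "size": info.st_size,                        # current size in bytes
    "protection": stat.filemode(info.st_mode),   # permission string
    "modified": time.ctime(info.st_mtime),       # last modification time
    "owner_uid": info.st_uid,                    # numeric user identification
}
for key, value in attributes.items():
    print(f"{key}: {value}")

os.remove(path)  # clean up the temporary file
```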
File operation
The operating system provides system calls to
create,
write,
read,
reposition,
delete, and
truncate files.
Open(Fi) – search the directory structure on disk for entry Fi, and move the content of the entry to memory
Close(Fi) – move the content of entry Fi in memory to the directory structure on disk
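As a rough illustration, the operations above map onto the low-level calls in Python's `os` module, which wrap the operating system's own system calls. The file name `demo.dat` and its contents are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")

    fd = os.open(path, os.O_CREAT | os.O_RDWR)  # create and open
    os.write(fd, b"hello world")                # write
    os.lseek(fd, 6, os.SEEK_SET)                # reposition to byte offset 6
    tail = os.read(fd, 5)                       # read 5 bytes from there
    os.ftruncate(fd, 5)                         # truncate to the first 5 bytes
    size_after = os.fstat(fd).st_size
    os.close(fd)                                # close
    os.remove(path)                             # delete
    deleted = not os.path.exists(path)

print(tail, size_after, deleted)
```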
File types
If an operating system recognizes the type of a file, it can then operate on the file in reasonable ways.
A common technique for implementing file types is to include the type as part of the file name. The name is split into two parts—a name and an extension, usually separated by a period character. The user and the operating system can tell from the name alone what the type of a file is.
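Splitting a name into its two parts and looking up the type might look like the following. The `KNOWN_TYPES` table is a hypothetical mapping made up for this sketch; real systems keep their own, much larger, associations.

```python
import os

# Hypothetical extension-to-type table, for illustration only.
KNOWN_TYPES = {".c": "C source", ".o": "object", ".txt": "text", ".exe": "executable"}

def file_type(filename):
    """Infer a file's type from the extension part of its name."""
    name, ext = os.path.splitext(filename)  # "example.c" -> ("example", ".c")
    return KNOWN_TYPES.get(ext.lower(), "unknown")

print(file_type("example.c"))
print(file_type("README"))
```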
File Structure
Source and object files have structures that match the expectations of the programs that read them.
Certain files must conform to a required structure that is understood by the operating system.
For example, the operating system requires that an executable file have a specific structure so that it can determine where in memory to load the file and what the location of the first instruction is.
Some operating systems impose (and support) a minimal number of file structures. This approach has been adopted in UNIX, MS-DOS, and others.
The Macintosh operating system also supports a minimal number of file structures. It expects files to contain two parts: a resource fork and a data fork.
The resource fork contains information of interest to the user.
The data fork contains program code or data—the traditional file contents.
2. Describe the following file organization techniques:
File Organization Techniques
Files store information.
When it is used, this information must be accessed and read into computer memory.
The information in the file can be accessed in several ways.
Some systems provide only one access method for files, while others (such as those of IBM) support many access methods.
Sequential Access
Information in the file is processed in order, one record after the other.
This is the most common access method.
For example, editors and compilers usually access files in this fashion.
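A small sketch of sequential access: each read returns the next record and advances the current position, and the only way back is to rewind. An in-memory `io.StringIO` object stands in for a file on disk, and the record contents are made up.

```python
import io

# An in-memory file stands in for a file on disk.
f = io.StringIO("rec1\nrec2\nrec3\n")

records = []
for line in f:  # each read returns the next record and advances the position
    records.append(line.rstrip("\n"))

f.seek(0)  # "rewind" to the beginning of the file
first_again = f.readline().rstrip("\n")
print(records, first_again)
```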
Direct Access
Also known as relative access.
A file is made up of fixed length logical records that allow programs to read and write records rapidly in no particular order.
The direct-access method is based on a disk model of a file, since disks allow random access to any file block.
For direct access, the file is viewed as a numbered sequence of blocks or records. Thus, we may read block 14, then read block 53, and then write block 7.
There are no restrictions on the order of reading or writing for a direct-access file.
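With fixed-length records, the position of record n is simply n times the record length, so a program can seek straight to it. The sketch below assumes a record length of 16 bytes and uses an in-memory `io.BytesIO` file; both choices, and the helper names, are illustrative only.

```python
import io

RECORD_SIZE = 16  # assumed fixed logical record length in bytes

def write_record(f, n, data):
    f.seek(n * RECORD_SIZE)  # jump straight to record n
    f.write(data.ljust(RECORD_SIZE, b" ")[:RECORD_SIZE])

def read_record(f, n):
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b" ")

f = io.BytesIO(b" " * RECORD_SIZE * 64)  # a file of 64 blank records
write_record(f, 14, b"fourteen")
write_record(f, 53, b"fifty-three")
write_record(f, 7, b"seven")
print(read_record(f, 53), read_record(f, 7), read_record(f, 14))
```

The records are written and read in no particular order, and no earlier record has to be touched to reach a later one.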
Index
The index, like an index in the back of a book, contains pointers to the various blocks.
To find a record in the file, we first search the index and then use the pointer to access the file directly and to find the desired record.
In large files, the index file itself may become too large to be kept in memory.
One solution is to create an index for the index file.
The primary index file would contain pointers to secondary index files, which would point to the actual data items.
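The two-level arrangement can be sketched as follows. The names, the block contents, and the split of keys into "A-M" and "N-Z" ranges are assumptions made for the example; a real index would be stored on disk rather than in dictionaries.

```python
# Hypothetical data: records live in numbered blocks.
blocks = {0: ("Adams", "HR"), 1: ("Baker", "IT"), 2: ("Smith", "Sales"), 3: ("Young", "IT")}

# Secondary index files: each covers a range of keys and points to data blocks.
secondary = {
    "A-M": {"Adams": 0, "Baker": 1},
    "N-Z": {"Smith": 2, "Young": 3},
}

def primary_index(key):
    """Primary index: choose which secondary index file covers this key."""
    return "A-M" if key[0].upper() <= "M" else "N-Z"

def find(key):
    block_no = secondary[primary_index(key)][key]  # search the indexes first
    return blocks[block_no]                        # then access the block directly

print(find("Smith"))
```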
Identify which file organization technique is appropriate for a specific device
File organization techniques
Sequential
Easiest to implement because records are stored and retrieved serially, one after the other.
To speed up the process, some optimization features may be built into the system.
E.g., select a key field from the record and then sort the records by that field before storing them.
This aids the search process.
It complicates the maintenance algorithms because the original order must be preserved every time records are added or deleted.
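The trade-off described above can be sketched with a list of records kept sorted by a key field. The record values and function names are made up for illustration: the sorted order lets a sequential scan stop early, but every insertion has to find the right place to keep that order intact.

```python
import bisect

# Records are (key, data) pairs kept sorted by the key field.
records = [(101, "Ann"), (205, "Raj"), (310, "Mei")]

def add_record(key, data):
    bisect.insort(records, (key, data))  # insert while preserving key order

def find_record(key):
    for k, data in records:  # sequential scan from the beginning
        if k == key:
            return data
        if k > key:          # sorted order lets the scan stop early
            break
    return None

add_record(150, "Tom")
print([k for k, _ in records], find_record(205), find_record(999))
```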
direct
Uses direct access files, which can be implemented only on direct access storage devices.
Gives users the flexibility of accessing any record in any order without having to begin the search from the beginning of the file.
Records are identified by their relative addresses (their addresses relative to the beginning of the file).
Logical addresses are computed when records are stored and again when records are retrieved.
Hashing algorithms are used to compute these addresses.
Direct organization is generally the best technique when records must be accessed quickly and in any order.
Advantages
Fast access to records
Can be updated more quickly than sequential files because records are quickly rewritten to their original addresses after modification
Disadvantages
Several records with unique keys may generate the same logical address (collision)
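A collision is easy to reproduce with a toy hashing algorithm. The sketch below assumes a file with 7 record slots and uses "key modulo number of slots" as the hash; both are illustrative choices, and the sketch reports a collision instead of resolving it.

```python
TABLE_SIZE = 7  # assumed number of record slots in the file

def relative_address(key):
    """Hypothetical hashing algorithm: key modulo the number of slots."""
    return key % TABLE_SIZE

slots = [None] * TABLE_SIZE

def store(key, data):
    addr = relative_address(key)  # address computed when the record is stored
    if slots[addr] is not None and slots[addr][0] != key:
        raise ValueError(f"collision: keys {slots[addr][0]} and {key} both map to {addr}")
    slots[addr] = (key, data)
    return addr

def retrieve(key):
    entry = slots[relative_address(key)]  # address recomputed on retrieval
    return entry[1] if entry and entry[0] == key else None

store(10, "ten")     # 10 % 7 == 3
store(12, "twelve")  # 12 % 7 == 5
collision = None
try:
    store(17, "seventeen")  # 17 % 7 == 3, the same slot as key 10
except ValueError as e:
    collision = str(e)
print(retrieve(12), collision)
```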
indexed sequential
Advantages: fast access to records
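Disadvantages: the index takes extra storage space and must be maintained whenever records are added or deleted. (Collisions are a problem of hashed direct files; an indexed sequential file locates records through its index rather than through a computed address.)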
Similar to the sequential method, except that an index is used to enable the computer to locate individual records on the storage medium. For example, on a magnetic drum, records are stored sequentially on the tracks. However, each record is assigned an index that can be used to access it directly.
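One way to picture the combination of an index with sequentially stored records is sketched below. As an assumption of this example, the index is sparse: it holds only the first key of each group of two records, so a lookup searches the index and then scans one short group sequentially. The source describes an index entry per record, which would replace the short scan with a single direct access.

```python
import bisect

# Records stored sequentially, sorted by key (hypothetical data).
records = [(5, "e"), (12, "l"), (20, "t"), (33, "x"), (41, "z"), (58, "q")]

# Sparse index: the first key of every group of GROUP records and its position.
GROUP = 2
index_keys = [records[i][0] for i in range(0, len(records), GROUP)]
index_pos = list(range(0, len(records), GROUP))

def find(key):
    g = bisect.bisect_right(index_keys, key) - 1  # search the index first
    if g < 0:
        return None                               # key is below the first record
    start = index_pos[g]
    for k, data in records[start:start + GROUP]:  # short sequential scan
        if k == key:
            return data
    return None

print(find(33), find(34))
```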
Serial file organization
Records in a file are stored and accessed one after another.
The records are not sorted in any particular order on the storage medium. This type of organization is mainly used on magnetic tapes.
Advantages
It is simple
It is cheap
Disadvantages
It is cumbersome to access because all preceding records must be read before the one being searched for can be retrieved.
Space on the medium is wasted in the form of inter-record gaps.
It cannot support modern high speed requirements for quick record access.
John Tsai
Syazwan
ARIF