Disk Structure
Whether or not the O_DIRECT flag is used, it is always a good idea to make sure your reads and writes are block aligned. Crossing a block boundary will cause multiple sectors to be loaded from (or written back to) disk. Using the block size, or a value that fits neatly inside a block, guarantees block-aligned I/O requests and prevents extraneous work inside the kernel.
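As a rough sketch, a block-aligned Direct IO read on Linux might look like the following. The 4096-byte block size and the file name are assumptions for illustration; real code should query the actual block size of the device or filesystem first.

```c
/* Minimal sketch of a block-aligned read with O_DIRECT.
 * BLOCK_SIZE and "data.bin" are placeholders, not values from the article. */
#define _GNU_SOURCE        /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE 4096    /* assumed; query the device/filesystem in real code */

int main(void) {
    void *buf;
    /* The buffer address must be aligned to the block size for O_DIRECT. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return 1;

    int fd = open("data.bin", O_RDONLY | O_DIRECT);   /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }

    /* Offset and length are multiples of the block size, so the request
     * maps onto whole sectors and no read-modify-write is needed. */
    ssize_t n = pread(fd, buf, BLOCK_SIZE, 0);
    if (n < 0) perror("pread");

    close(fd);
    free(buf);
    return 0;
}
```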
In summary, Virtual Memory pages map to Filesystem blocks, which map to Block Device sectors.
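A small sketch of how the sizes at each of these layers can be queried on Linux. The device path is a placeholder, and opening it typically requires elevated permissions.

```c
/* Minimal sketch: print the virtual memory page size, the filesystem block
 * size, and the block device's logical sector size. "/dev/sda" is a
 * placeholder device path. */
#include <fcntl.h>
#include <linux/fs.h>      /* BLKSSZGET */
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/statvfs.h>
#include <unistd.h>

int main(void) {
    long page_size = sysconf(_SC_PAGESIZE);
    printf("virtual memory page: %ld bytes\n", page_size);

    struct statvfs vfs;
    if (statvfs("/", &vfs) == 0)
        printf("filesystem block:    %lu bytes\n", (unsigned long)vfs.f_bsize);

    int fd = open("/dev/sda", O_RDONLY);   /* placeholder device */
    if (fd >= 0) {
        int sector_size = 0;
        if (ioctl(fd, BLKSSZGET, &sector_size) == 0)
            printf("device sector:       %d bytes\n", sector_size);
        close(fd);
    }
    return 0;
}
```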
Standard IO
Standard IO uses the read() and write() syscalls to perform IO operations. When data is read, the Page Cache is addressed first. If the data is absent, a page fault is triggered and the contents are paged in. This means that reads against a currently unmapped area will take longer, since the caching layer is transparent to the user.
During writes, buffer contents are first written to the Page Cache. This means that data does not reach the disk right away; the actual hardware write happens when the kernel decides it is time to perform a writeback of the dirty page.
When a read operation is performed, the Page Cache is consulted first. If the data is already loaded in the Page Cache, it is simply copied out to the user: no disk access is performed and the read is served entirely from memory.
Otherwise, the file contents are loaded into the Page Cache and then returned to the user. If the Page Cache is full, the least recently used pages are flushed to disk and evicted from the cache to free space for new pages.
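A minimal sketch that illustrates this behavior: reading the same region twice, where the first read may go to disk and the second is normally served from the Page Cache. The file name is a placeholder.

```c
/* Minimal sketch: read the same region twice and time both reads.
 * The first read may page the data in from disk; the second is typically
 * served from the Page Cache. "data.bin" is a placeholder file. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

static double elapsed_ms(struct timespec a, struct timespec b) {
    return (b.tv_sec - a.tv_sec) * 1e3 + (b.tv_nsec - a.tv_nsec) / 1e6;
}

int main(void) {
    int fd = open("data.bin", O_RDONLY);          /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }

    enum { BUF_SIZE = 1 << 20 };                  /* 1 MiB */
    char *buf = malloc(BUF_SIZE);
    struct timespec t0, t1, t2;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    pread(fd, buf, BUF_SIZE, 0);                  /* may hit the disk */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    pread(fd, buf, BUF_SIZE, 0);                  /* usually served from the Page Cache */
    clock_gettime(CLOCK_MONOTONIC, &t2);

    printf("first read:  %.3f ms\n", elapsed_ms(t0, t1));
    printf("second read: %.3f ms\n", elapsed_ms(t1, t2));

    free(buf);
    close(fd);
    return 0;
}
```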
A write() call simply copies the user-space buffer into the kernel Page Cache, marking the written page as dirty. Later, the kernel writes the modifications to disk in a process called flush or writeback; the actual IO normally does not happen immediately. In the meantime, read() will supply data from the Page Cache instead of reading the (now outdated) disk contents. As you can see, the Page Cache is populated both on reads and writes.
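A minimal sketch of this write path, assuming a placeholder file name: write() returns once the data is in the Page Cache, and fsync() forces the writeback instead of waiting for the kernel to flush the dirty page on its own schedule.

```c
/* Minimal sketch: write() copies the buffer into the Page Cache and marks
 * the page dirty; fsync() asks the kernel to perform the writeback before
 * returning. "data.bin" is a placeholder file. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_WRONLY | O_CREAT, 0644);  /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }

    const char msg[] = "hello, page cache";
    /* Returns as soon as the data is in the Page Cache; the page is now dirty. */
    if (write(fd, msg, strlen(msg)) < 0) perror("write");

    /* Force the writeback instead of waiting for the kernel flusher threads. */
    if (fsync(fd) < 0) perror("fsync");

    close(fd);
    return 0;
}
```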
The logic behind the Page Cache is explained by the temporal locality principle, which states that recently accessed pages will be accessed again at some point in the near future.
Another principle, spatial locality, implies that elements physically located near recently accessed ones have a good chance of being accessed soon as well. This principle is used in a process called prefetch, which loads file contents ahead of time in anticipation of their access, amortizing some of the IO costs.
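One way to trigger prefetching explicitly on Linux is posix_fadvise() with POSIX_FADV_WILLNEED. A minimal sketch, with the file name and range chosen arbitrarily for illustration:

```c
/* Minimal sketch of explicit prefetching: POSIX_FADV_WILLNEED hints that a
 * range will be needed soon, so the kernel can page it in ahead of the
 * actual read. "data.bin" and the range are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY);          /* placeholder file */
    if (fd < 0) { perror("open"); return 1; }

    enum { CHUNK = 1 << 20 };                     /* 1 MiB */

    /* Ask the kernel to start reading this range into the Page Cache. */
    posix_fadvise(fd, 0, CHUNK, POSIX_FADV_WILLNEED);

    char *buf = malloc(CHUNK);
    ssize_t n = pread(fd, buf, CHUNK, 0);         /* likely already cached */
    if (n < 0) perror("pread");

    free(buf);
    close(fd);
    return 0;
}
```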
It is discouraged to open the same file with Direct IO and buffered (Page Cache) IO simultaneously, since direct operations are performed against the disk device even when the data is present in the Page Cache, which may lead to undesirable results.