A few weeks ago I was talking with a colleague about Linux file systems. The Linux operating system supports several file systems, some of which were created for it (e.g., Ext2, Ext3, Ext4) and others that came from different vendors (e.g., FAT, FAT32, HFS, MINIX, NTFS, System V, BSD).
Every time I run into MINIX it brings back good old memories. It was used at school to teach operating systems. I purchased a book that came with a copy of the OS. It could boot on a regular PC.
MINIX was written by Andrew S. Tanenbaum in the 1980s as a Unix-like operating system whose source code could be used freely in education (e.g., University of Minnesota). The MINIX file system was designed for use with MINIX; it copies the basic structure of the Unix File System but avoids any complex features in the interest of keeping the source code clean, clear and simple, to meet the overall goal of MINIX to be a useful teaching aid. The Extended file system (ext; April 1992) was developed to replace MINIX’s, but it was only with the second version of this, Ext2, that Linux obtained a commercial-grade file system.
I am going to take a top-down approach to get from a Linux disk device down to an i-node.
In Linux operating systems, a device file or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. Device files allow software to interact with a device driver using standard input / output system calls (e.g., open(), write(), read(), close(), ioctl() and mmap()), which simplifies many tasks and unifies user-space I/O mechanisms.
Device files often provide simple interfaces to peripheral devices such as printers and serial ports, but they can also be used to access specific resources on those devices, such as disk partitions. Finally, device files are useful for accessing system resources that have no connection with any actual device such as data sinks and random number generators. There are two general kinds of device files in Unix-like operating systems, known as character special files and block special files. The difference between them lies in how data written to them and read from them is processed by the operating system and hardware.
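To make the idea concrete, here is a minimal Python sketch that tells character and block special files apart and talks to two device files through ordinary system calls. It assumes a Linux machine where /dev/null and /dev/urandom exist; the helper names (classify, read_random_bytes, discard) are mine and only serve the illustration.

```python
import os
import stat


def classify(path: str) -> str:
    """Return the kind of device file found at `path`, based on its mode bits."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "character special file"
    if stat.S_ISBLK(mode):
        return "block special file"
    return "not a device file"


def read_random_bytes(count: int) -> bytes:
    """Read `count` bytes from /dev/urandom with plain open/read/close calls."""
    fd = os.open("/dev/urandom", os.O_RDONLY)
    try:
        return os.read(fd, count)
    finally:
        os.close(fd)


def discard(data: bytes) -> int:
    """Write `data` to the /dev/null data sink; return the number of bytes accepted."""
    fd = os.open("/dev/null", os.O_WRONLY)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)


if __name__ == "__main__":
    print("/dev/null   :", classify("/dev/null"))
    print("/dev/urandom:", classify("/dev/urandom"))
    print("random bytes:", read_random_bytes(8).hex())
    print("discarded   :", discard(b"hello"), "bytes")
```

Neither of these two devices is backed by real hardware, yet the program uses the same calls it would use on a regular file.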
A hard disk drive (HDD), hard disk, hard drive or fixed disk is a data storage device used for storing and retrieving digital information using one or more rigid rapidly rotating disks (platters) coated with magnetic material. The platters are paired with magnetic heads arranged on a moving actuator arm, which read and write data to the platter surfaces. Data is accessed in a random-access manner, meaning that individual blocks of data can be stored or retrieved in any order and not only sequentially.
An HDD has the following:
|Cylinder||Part of cylinder-head-sector (CHS) addressing, an early method for giving addresses to each physical block of data on a hard disk drive. A cylinder is the set of tracks at the same radial position on all platter surfaces.|
|Track||A circular path on the surface of a disk on which information is magnetically recorded and from which recorded information is read.|
|Sector||A subdivision of a track on an HDD. Each sector stores a fixed amount of user-accessible data, traditionally 512 bytes. Geometrically, the word sector means a portion of a disk between a center, two radii and a corresponding arc, which is shaped like a slice of a pie.|
|Physical Block||A sequence of bytes or bits, usually containing some whole number of records, having a maximum length, a block size.|
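As a side note on how a cylinder, head and sector triple addresses a physical block, the sketch below applies the conventional mapping from such a triple to a linear block number. The function name and the geometry in the usage note (16 heads, 63 sectors per track) are assumptions chosen for illustration, not values taken from a particular drive.

```python
def chs_to_lba(cylinder: int, head: int, sector: int,
               heads_per_cylinder: int, sectors_per_track: int) -> int:
    """Convert a cylinder/head/sector address into a linear block number.

    Cylinders and heads are numbered from 0; sectors are numbered from 1.
    """
    if cylinder < 0:
        raise ValueError("cylinder must not be negative")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    # Skip all full cylinders, then all full tracks, then the sectors before this one.
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


if __name__ == "__main__":
    # Assumed geometry for the example: 16 heads, 63 sectors per track.
    for chs in [(0, 0, 1), (0, 1, 1), (1, 0, 1)]:
        print(chs, "->", chs_to_lba(*chs, heads_per_cylinder=16, sectors_per_track=63))
```

With that assumed geometry the first sector of the disk maps to block 0, the first sector under the second head maps to block 63, and the first sector of the second cylinder maps to block 1008.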
A disk drive is a physical electromechanical device that contains one or more spinning platters, a mechanical arm that is able to move radially over an area of the platter(s), and a set of one or more heads that are able to write and read data. A disk drive is divided / partitioned into non-overlapping disk drive partitions. Each partition is treated as a separate device. The following figure illustrates this:
A file system is an organized collection of regular files and directories. It is used to control how data is stored and retrieved. The following figure illustrates a file system that would control and manage regular files in a partition:
The following table provides brief descriptions for the labels in the previous illustration:
|boot block||First block in a file system. Used to boot the operating system. A computer may host multiple file systems. Only the boot block from the first file system is used to boot the operating system.|
|super block||Contains the following information:
– Size of the i-node table.
– Size of the logical blocks in the file system.
– Size of the file system in logical blocks.|
|i-node table||Also called the i-list; contains a unique entry for each file or folder in the file system.|
|data blocks||Used to form / hold contents of files and folders in the file system.|
The super block is always located at byte offset 1024 from the beginning of the file, block device or partition formatted with Ext2 and later variants (Ext3, Ext4).
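Here is a minimal sketch of how the super block could be read from an Ext2 image or partition. It assumes the standard on-disk Ext2 layout (little-endian fields, with the magic number 0xEF53 at offset 56 inside the super block) and only extracts the three sizes listed in the table above. The function names and the dictionary keys are mine.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # byte offset of the super block from the start of the device
SUPERBLOCK_SIZE = 1024
EXT2_MAGIC = 0xEF53        # value of the s_magic field in Ext2/Ext3/Ext4


def parse_superblock(raw: bytes) -> dict:
    """Decode a few fields from the raw bytes of an Ext2 super block."""
    inodes_count, blocks_count = struct.unpack_from("<II", raw, 0)
    (log_block_size,) = struct.unpack_from("<I", raw, 24)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        raise ValueError(f"not an Ext2 super block (magic=0x{magic:04X})")
    return {
        "inodes_count": inodes_count,            # total number of i-nodes
        "blocks_count": blocks_count,            # file system size in logical blocks
        "block_size": 1024 << log_block_size,    # logical block size in bytes
    }


def read_superblock(path: str) -> dict:
    """Read and decode the super block of the image or device at `path`."""
    with open(path, "rb") as device:
        device.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(device.read(SUPERBLOCK_SIZE))


if __name__ == "__main__":
    import sys
    # Usage: python superblock.py <image-or-device>; the argument is up to you.
    print(read_superblock(sys.argv[1]))
```

Reading a real partition usually requires root privileges, so trying this against an image file first is the safer option.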
The i-node (index node) is a fundamental concept in the Ext2 file system. Each object in the file system is represented by an i-node. The i-node structure contains pointers to the file system blocks which contain the data held in the object and all of the metadata about an object except its name. The metadata about an object includes the permissions, owner, group, flags, size, number of blocks used, access time, change time, modification time, deletion time, number of links, fragments, version (for NFS) and extended attributes (EAs) and/or Access Control Lists (ACLs). All the i-nodes are stored in i-node tables, with one i-node table per block group. The following table provides additional information on most of the fields in an i-node:
|File type||Regular file, directory, symbolic link or character device.|
|Owner||Referred to as the user ID or UID for a file.|
|Group||Referred to as the group ID or GID for a file.|
|Access permissions||For the three categories of user: owner, group and other.|
|Time stamps||Time stamps for:
– Last access to a file.
– Time of last modification to a file.
– Time of last status change to a file.|
|Number of hard links||Number of hard links to a file.|
|File size||Size of a file in bytes.|
|Number of blocks||Number of blocks actually allocated to a file measured in 512-byte blocks.|
|Pointers to data blocks||Pointers to the data blocks of a file.|
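Most of the fields in the table above can be inspected from user space with the stat family of system calls. The following sketch uses Python's os.lstat() to print them for a path; the describe() and file_type() helpers and the labels they return are mine, and the pointers to data blocks are not exposed by this interface.

```python
import os
import stat
import time


def file_type(mode: int) -> str:
    """Translate the file type bits of an i-node mode into a readable name."""
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    return "other"


def describe(path: str) -> dict:
    """Collect the i-node fields that the operating system reports for `path`."""
    st = os.lstat(path)  # lstat() does not follow symbolic links
    return {
        "i-node number": st.st_ino,
        "file type": file_type(st.st_mode),
        "owner (UID)": st.st_uid,
        "group (GID)": st.st_gid,
        "access permissions": stat.filemode(st.st_mode),
        "last access": time.ctime(st.st_atime),
        "last modification": time.ctime(st.st_mtime),
        "last status change": time.ctime(st.st_ctime),
        "hard links": st.st_nlink,
        "file size (bytes)": st.st_size,
        "512-byte blocks": st.st_blocks,
    }


if __name__ == "__main__":
    import sys
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for field, value in describe(target).items():
        print(f"{field:20}: {value}")
```

Note that the file name is not part of the output; as described earlier, the name lives in the directory entry and not in the i-node.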
The following figure illustrates the structure of an i-node and file blocks for a file in the Ext2 file system:
The first 12 entries [0 : 11] of the i-node's block array are direct pointers to individual data blocks for a file. They are allocated in order as needed based on the file size. Each data block has the same size (e.g., 1024, 2048 or 4096 bytes). For sake of discussion, let’s assume that the depicted file system uses 1,024 bytes per block and 4 bytes per block pointer. The maximum number of bytes reachable through each kind of pointer would be:
|i-block||Indirect block pointer||Double indirect block pointer||Triple indirect block pointer||Bytes|
|[0 – 11]||N/A||N/A||N/A||12 * 1,024 = 12,288|
|12||1024 / 4 = 256||N/A||N/A||256 * 1,024 = 262,144|
|13||256 * 256 = 65,536||1024 / 4 = 256||N/A||65,536 * 1,024 = 67,108,864|
|14||256 * 256 * 256 = 16,777,216||256 * 256 = 65,536||1024 / 4 = 256||16,777,216 * 1,024 = 17,179,869,184|
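The arithmetic in the table can be reproduced with a few lines of Python. This is only a sketch of the limit implied by the pointer structure, assuming 12 direct pointers and 4-byte block pointers; a real implementation may impose other limits. The names used are mine.

```python
POINTER_SIZE = 4       # bytes per block pointer (assumed)
DIRECT_POINTERS = 12   # entries [0 : 11] of the i-node block array


def capacity_by_level(block_size: int) -> dict:
    """Return the number of bytes reachable through each kind of pointer."""
    pointers_per_block = block_size // POINTER_SIZE
    return {
        "direct": DIRECT_POINTERS * block_size,
        "single indirect": pointers_per_block * block_size,
        "double indirect": pointers_per_block ** 2 * block_size,
        "triple indirect": pointers_per_block ** 3 * block_size,
    }


def max_file_size(block_size: int) -> int:
    """Sum the capacity of all pointer levels to get the largest addressable file."""
    return sum(capacity_by_level(block_size).values())


if __name__ == "__main__":
    for level, size in capacity_by_level(1024).items():
        print(f"{level:16}: {size:>18,} bytes")
    print(f"{'total':16}: {max_file_size(1024):>18,} bytes")
```

For 1,024-byte blocks the four levels add up to 17,247,252,480 bytes, slightly over 16 GiB. Calling max_file_size() with a larger block size shows how quickly the limit grows, because the block size affects both the number of pointers per block and the size of each data block.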
Several years ago I developed a content-addressable storage system. In one of its database tables I stored fields equivalent to most of the i-node fields.
If you have comments or questions regarding this post or any other in this blog, please feel free to send me a message. I will reply as soon as possible.