Tuesday, August 12, 2014

Step By Step Linux Boot Process [Updated]

In systemd-based systems (RHEL 7/8, SLES 15, etc.)
Power On → BIOS/UEFI → MBR/GPT → GRUB2 (Stage 1 Boot Loader → Stage 2 Boot Loader) → Kernel → systemd → Login

→  Power On

The system firmware, either UEFI (Unified Extensible Firmware Interface) or BIOS (Basic Input/Output System), runs the Power-On Self-Test (POST) when the system is powered on.

→  BIOS/UEFI gets executed

BIOS/UEFI gets executed, checks that the attached devices are properly connected, checks memory availability, and finally locates a boot device.

On a BIOS system, the firmware then loads the MBR from the first sector of the boot device; on a UEFI system, the firmware reads the GPT (GUID Partition Table) of the boot device.

→  MBR/GPT loads into memory

The 'Master Boot Record' (MBR) is 512 bytes in size and is located in the first sector of the boot device. The MBR is loaded into memory at this stage, which makes it possible to load GRUB. The partition table is also stored within the MBR (*the layout of these 512 bytes is explained further down the page).
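The 512-byte layout can be illustrated with a small Python sketch. It assumes the conventional split of 446 bytes of boot code, 64 bytes of partition table and a 2-byte signature (0x55 0xAA); the function name `split_mbr` and the synthetic sector are made up for illustration and no real disk is read.

```python
SECTOR_SIZE = 512            # size of the MBR sector in bytes
BOOT_CODE_SIZE = 446         # boot loader code (Stage 1)
PARTITION_TABLE_SIZE = 64    # 4 entries x 16 bytes
BOOT_SIGNATURE = b"\x55\xaa" # last two bytes of a valid MBR


def split_mbr(sector: bytes) -> dict:
    """Split a 512-byte MBR sector into its three regions."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR sector must be exactly 512 bytes")
    table_end = BOOT_CODE_SIZE + PARTITION_TABLE_SIZE
    return {
        "boot_code": sector[:BOOT_CODE_SIZE],
        "partition_table": sector[BOOT_CODE_SIZE:table_end],
        "signature_ok": sector[table_end:] == BOOT_SIGNATURE,
    }


# Synthetic sector (all zeros plus a valid signature), not a real disk image.
fake_sector = bytes(BOOT_CODE_SIZE) + bytes(PARTITION_TABLE_SIZE) + BOOT_SIGNATURE
parts = split_mbr(fake_sector)
print(len(parts["boot_code"]), len(parts["partition_table"]), parts["signature_ok"])
```

To try this against a real disk, the first 512 bytes of the device would have to be read with root privileges, which is left out here on purpose.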

If the boot device is larger than 2.2 TB (terabytes), the GPT partitioning scheme is used instead of MBR, because MBR stores sector addresses in 32 bits and cannot describe a larger disk with 512-byte sectors. At this stage, the UEFI firmware reads the GPT details into memory.
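The 2.2 TB figure can be checked with a few lines of Python. This is only a worked calculation, assuming 512-byte logical sectors and the 32-bit sector fields of the MBR partition table; the function name `mbr_max_bytes` is hypothetical.

```python
def mbr_max_bytes(sector_size: int = 512) -> int:
    """Largest size addressable with 32-bit sector numbers."""
    max_sectors = 2 ** 32          # MBR stores sector counts in 32 bits
    return max_sectors * sector_size


limit = mbr_max_bytes()
limit_tb = round(limit / 10 ** 12, 2)   # decimal terabytes
limit_tib = limit / 2 ** 40             # binary tebibytes

print(limit)       # 2199023255552
print(limit_tb)    # 2.2
print(limit_tib)   # 2.0
```

The same limit is therefore quoted as "2.2 TB" in decimal units and as "2 TiB" in binary units.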

→  GRUB (GRand Unified Bootloader). This loads in the stages listed below (GRUB2 in case of RHEL 7 and above).

 → First Stage Boot loader

 - The First Stage Boot loader is 446 bytes in size and is part of the MBR. It is also called the “Primary Boot Loader” or “Stage 1 Loader”. It is a small piece of binary code within the 512 bytes of the MBR that is capable of loading the Stage 1.5 or Stage 2 loader.

 → Stage 1.5 boot loader

  - On some hardware platforms this intermediate loader is required. This is sometimes the case when the '/boot' partition lies above cylinder 1024 of the hard drive or when LBA (Logical Block Addressing) mode is used. The Stage 1.5 boot loader is found either on the '/boot' partition or in the sectors immediately following the MBR.

 → Stage 2 boot loader

 - This is also called the “Secondary Boot Loader” or “Stage 2 Boot Loader”. The secondary boot loader displays the GRUB menu and command environment. At this stage, the user can interrupt the boot process, select a specific kernel to boot, and pass additional parameters to the kernel if required. The files which belong to Stage 2 are stored in '/boot/grub', or in '/boot/grub2' in case of RHEL 7 and above. The main purpose of the Stage 2 loader is to load the kernel.

 → Stage 2 boot loader transfers the control to the kernel

  - The secondary boot loader reads the operating system kernel as well as the contents of '/boot/sysroot/' into memory. Control is then transferred to the kernel, which loads the rest of the operating system.

→  Kernel loads into memory

  When the kernel is loaded, it immediately initializes and configures the computer's memory and the various hardware attached to the system, including processors, I/O subsystems, and storage devices. It then looks for the compressed 'initrd' or 'initramfs' image in a predetermined location in memory, decompresses it, mounts it, and loads all necessary drivers. Next, it initializes virtual devices related to the file system, such as LVM (Logical Volume Manager) or software RAID (Redundant Array of Independent/Inexpensive Disks), before unmounting the initrd disk image and freeing all the memory the image once occupied.

 - Root partition gets mounted read-only. The kernel creates a root device, mounts the root partition read-only, and frees any unused memory.

 - The kernel then calls the init or systemd program.

 - init (or systemd, in case of RHEL 7 and above) gets loaded into memory.

  The '/sbin/init' program (also called init) coordinates the rest of the boot process and configures the environment for the user. In case of RHEL 7, '/sbin/init' is a symbolic link to '/usr/lib/systemd/systemd'. Either way, 'init' or 'systemd' is the first process started in user space.
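Which program actually sits behind '/sbin/init' can be checked by following the symbolic link. The sketch below is a minimal illustration; the helper name `resolve_init` is hypothetical, and the demo uses a throwaway symlink in a temporary directory so that it runs on any machine.

```python
import os
import tempfile


def resolve_init(path: str = "/sbin/init") -> str:
    """Return the real file behind `path`, following symbolic links."""
    return os.path.realpath(path)


# Demo with a temporary symlink that mimics the init -> systemd link.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "systemd")
    link = os.path.join(tmp, "init")
    open(target, "w").close()        # empty stand-in for the systemd binary
    os.symlink(target, link)
    demo_result = resolve_init(link)

print(os.path.basename(demo_result))  # systemd
```

On a RHEL 7 machine, calling `resolve_init()` with its default argument would be expected to return '/usr/lib/systemd/systemd', as described above.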

In RHEL6 & earlier versions

 - Init calls '/etc/rc.d/rc.sysinit' script.

 - This sets the environment path, starts swap, checks the file systems, and executes all other steps required for system initialization.

 - Init checks the default runlevel in the '/etc/inittab' file and then calls '/etc/rc.d/init.d/functions', which defines how to start, kill, and determine the PID (Process ID) of a program.

 - The init program processes all Start (S) and Kill (K) scripts for the runlevel determined.

  The init program starts all of the background processes by looking in the appropriate 'rc' directory for the runlevel specified as default in the '/etc/inittab' file. The 'rc' directories are numbered to correspond to the runlevel they represent. When booting to runlevel 5, for example, the init program looks in the '/etc/rc.d/rc5.d' directory to determine which processes to start and stop.
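How the default runlevel leads to an 'rc' directory can be sketched in Python. This assumes the classic SysV 'id:N:initdefault:' line in '/etc/inittab'; the function names and the sample text are hypothetical, and the real file is not read.

```python
import re


def default_runlevel(inittab_text: str):
    """Return the default runlevel from inittab-style text, or None."""
    for line in inittab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                       # skip blanks and comments
        match = re.match(r"^id:(\d):initdefault:", line)
        if match:
            return int(match.group(1))
    return None


def rc_directory(runlevel: int) -> str:
    """Map a runlevel to its rc directory, as described in the text."""
    return f"/etc/rc.d/rc{runlevel}.d"


sample_inittab = """# Default runlevel (sample text for illustration)
id:5:initdefault:
"""
level = default_runlevel(sample_inittab)
print(level, rc_directory(level))  # 5 /etc/rc.d/rc5.d
```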

  All of the files in '/etc/rc.d/rc5.d' are symbolic links pointing to scripts located in the '/etc/rc.d/init.d' directory. Symbolic links are used in each of the 'rc' directories so that the runlevels can be reconfigured by creating, modifying, and deleting the links without affecting the actual scripts they reference. First all “K” scripts get executed and then all “S” scripts.
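The "K first, then S" ordering can be sketched as follows. It assumes that the links are processed in sorted name order within each group, so the two-digit number after the letter decides the sequence; the link names in the example are hypothetical.

```python
def rc_order(link_names):
    """Return (kill, start) lists in the order they would be processed."""
    kill = sorted(name for name in link_names if name.startswith("K"))
    start = sorted(name for name in link_names if name.startswith("S"))
    return kill, start


# Hypothetical contents of an rc5.d directory.
links = ["S55sshd", "K05atd", "S10network", "K89rdisc", "S99local"]
kill, start = rc_order(links)

for name in kill:
    print("stop ", name)
for name in start:
    print("start", name)
```

With this sample, the "K" links come out as K05atd, K89rdisc, followed by the "S" links S10network, S55sshd, S99local.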

 - Init forks the '/sbin/mingetty' processes listed in '/etc/inittab' (in RHEL 6, Upstart is used to fork mingetty).

  Virtual consoles get initiated at this stage, depending on the runlevel defined. The '/sbin/mingetty' process opens communication pathways to the 'tty' devices, sets their modes, prints the login prompt, accepts the user's username and password, and initiates the login process. In runlevel 5, the '/etc/X11/prefdm' script gets executed, and the preferred display manager gets loaded at this stage.

 - Finally, init calls the '/etc/rc.d/rc.local' script and then presents the login screen.

In RHEL 7/8 (systemd-based systems)

 - The systemd process checks and starts the 'default.target', which is either 'graphical.target' or 'multi-user.target'.

 - So, basically, the 'default.target' is a symbolic link to either 'graphical.target' or 'multi-user.target'.

 - Depending on which default target is set, either '/lib/systemd/system/graphical.target' or '/lib/systemd/system/multi-user.target', the units named in its 'Requires' & 'Wants' directives are started as well. The targets named in 'Requires' are hard dependencies, and the 'After' directive decides which units must be started first.

Take a look at the 'graphical.target' file:

Description=Graphical Interface
Requires=multi-user.target
Wants=display-manager.service
Conflicts=rescue.service rescue.target
After=multi-user.target rescue.service rescue.target display-manager.service

 - So, looking at the above excerpt of the 'graphical.target' file, the 'multi-user.target' should be started first, before the graphical target itself. There is also a 'Wants' directive, which says that 'display-manager.service' should be started along with it. Likewise, the 'multi-user.target' file holds its own 'Requires' & 'Wants' directives, which name the dependencies it pulls in.
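The directives discussed above can be pulled out of a unit file with a few lines of Python. This is a rough sketch, not how systemd itself parses units: `UNIT_TEXT` is an abridged copy of the 'graphical.target' content embedded as a string (the 'Requires' and 'Wants' lines are the ones described in the text), and `parse_unit` is a hypothetical helper.

```python
# Abridged unit text for illustration; a real unit file has more lines.
UNIT_TEXT = """\
[Unit]
Description=Graphical Interface
Requires=multi-user.target
Wants=display-manager.service
Conflicts=rescue.service rescue.target
After=multi-user.target rescue.service rescue.target display-manager.service
"""

# Directives whose value is a space-separated list of unit names.
LIST_KEYS = {"Requires", "Wants", "Conflicts", "After"}


def parse_unit(text: str) -> dict:
    """Collect 'Key=Value' lines, splitting list-valued directives."""
    directives = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue                      # skip blanks, comments, section headers
        key, _, value = line.partition("=")
        if key in LIST_KEYS:
            directives.setdefault(key, []).extend(value.split())
        else:
            directives[key] = value
    return directives


unit = parse_unit(UNIT_TEXT)
print(unit["Requires"])   # ['multi-user.target']
print(unit["Wants"])      # ['display-manager.service']
print(len(unit["After"])) # 4
```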

 - The 'multi-user.target' requires 'basic.target', which in turn requires 'sysinit.target', as shown below:

systemd  →  default.target  →  graphical.target  →  multi-user.target  →  basic.target  →  sysinit.target

 - The 'sysinit.target' doesn't contain a 'Requires' directive; however, it has a 'Wants' directive:

  Wants=local-fs.target swap.target

 - This 'sysinit.target' starts system initialization services, such as mounting file systems, enabling swap, enabling logging, and starting udevd to detect hardware. Likewise, an 'After' directive is defined in each target. In 'sysinit.target' it names 'local-fs-pre.target', which is responsible for importing network configuration from the initramfs, running a file system check on root when required, and remounting the root file system.

 - So, although the process is triggered from 'default.target', the units actually start from the far end of the chain, and at the very end either 'graphical.target' or 'multi-user.target' is reached, which presents the final login prompt to the user.
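The "starts from the end" behaviour can be illustrated with a toy dependency walk. This is a simplified sketch: the `REQUIRES` table is a hand-written, acyclic subset of the real dependencies, `start_order` is a hypothetical helper, and real systemd starts many units in parallel rather than strictly one after another.

```python
# Hand-written subset of the dependency chain; real unit files list more units.
REQUIRES = {
    "graphical.target": ["multi-user.target"],
    "multi-user.target": ["basic.target"],
    "basic.target": ["sysinit.target"],
    "sysinit.target": [],
}


def start_order(target, deps, order=None):
    """Depth-first walk: every dependency is listed before its dependent."""
    if order is None:
        order = []
    if target in order:
        return order                  # already scheduled
    for dep in deps.get(target, []):
        start_order(dep, deps, order)
    order.append(target)
    return order


order = start_order("graphical.target", REQUIRES)
print(" -> ".join(order))
# sysinit.target -> basic.target -> multi-user.target -> graphical.target
```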

→  User Login Screen

Skeleton View of Boot Process on x86 BIOS Based System

Power On -- BIOS (Boot Strap)
├── POST (Power On Self Test): initial hardware testing/scanning
├── BIOS code is stored in a ROM/flash chip; its settings are kept in CMOS
├── Locates and loads the MBR (Master Boot Record) from the boot device

MBR →  512 Bytes
├── Stored at Sector 1, Cylinder 0, Head 0 of the first storage device
├── Contains MBR code, partition table & magic number
├── First 446 bytes contain the MBR code, the next 64 bytes hold the
│         partition table & the last 2 bytes hold the magic number (0x55 0xAA)
├── BIOS loads the MBR into memory and runs its code, which is the
│         "Stage 1 Boot Loader"; that is how GRUB starts loading
├── MBR Code
│         └── Size is 446 bytes
│         └── Stage 1 Boot Loader or Primary Boot Loader
│         └── Loads the next stage (Stage 1.5 or Stage 2) boot loader
├── Partition Table
│         └── Size is 64 bytes (16 bytes x 4 partitions)
│         └── Each entry = Boot flag | Start CHS | Partition type | End CHS |
│                  Start LBA | Number of sectors
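A single 16-byte partition table entry can be decoded with Python's `struct` module. The sketch assumes the conventional entry layout (1-byte boot flag, 3-byte start CHS, 1-byte partition type, 3-byte end CHS, 4-byte start LBA, 4-byte sector count, little-endian) and 512-byte sectors; `parse_entry` and the sample values are made up for illustration.

```python
import struct

# boot flag, start CHS, type, end CHS, start LBA, sector count (16 bytes total)
ENTRY_FORMAT = "<B3sB3sII"
SECTOR_SIZE = 512  # assumed logical sector size


def parse_entry(entry: bytes) -> dict:
    """Decode one 16-byte MBR partition table entry."""
    status, _chs_start, ptype, _chs_end, lba_start, sectors = struct.unpack(
        ENTRY_FORMAT, entry
    )
    return {
        "bootable": status == 0x80,      # 0x80 marks the active partition
        "type": ptype,                   # e.g. 0x83 is the Linux type code
        "lba_start": lba_start,
        "sectors": sectors,
        "size_bytes": sectors * SECTOR_SIZE,
    }


# Synthetic entry: active Linux partition starting at sector 2048.
sample = struct.pack(
    ENTRY_FORMAT, 0x80, b"\x20\x21\x00", 0x83, b"\xfe\xff\xff", 2048, 1024000
)
info = parse_entry(sample)
print(info["bootable"], hex(info["type"]), info["lba_start"], info["size_bytes"])
# True 0x83 2048 524288000
```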

GRUB (GRand Unified Bootloader)

├── Stage 1 Boot Loader
│         └── Primary Boot Loader
│         └── Stored within the first 446 bytes of the MBR
│         └── Locates and loads the Stage 1.5 (or Stage 2) loader
├── Stage 1.5 Boot Loader
│         └── Loads Stage 2 Boot Loader from the /boot directory
│         └── Contains a small binary that helps load Stage 2 from the
│                  boot partition
│         └── Understands the file system; required when the boot partition
│                  is above cylinder 1024 or when LBA mode is used
├── Stage 2 Boot Loader or Kernel Loader
│         └── Users can pass arguments to the kernel at this stage
│         └── Main function of the Stage 2 loader is to load the kernel and
│                  initrd (initramfs)
│         └── Unless the user chooses otherwise, the default kernel image and
│                  its initrd image get loaded


Kernel

├── It is a compressed image which gets uncompressed and starts loading
├── initrd or initramfs loads into memory
│         └── Temporary root file system which loads the storage drivers
│                  needed to mount the real root file system
│         └── Once the real root file system is mounted, this image gets
│                  removed from memory
├── Performs various hardware initialization (CPU, memory, I/O, etc.)
│         and mounts the root file system
├── Loads the first user space program "init" into memory


init

├── The init process has PID 1, normally ignores signal 9 (SIGKILL), and is
│         the parent of all other processes
├── /etc/rc.d/rc.sysinit script gets executed (sets environment, starts
│         swap, checks file systems etc., as required for system initialization)
├── /etc/rc.d/init.d/functions gets run (defines how to start/stop/kill
│         and find the PID of a program)
├── /etc/inittab file is consulted to find out the default runlevel
├── Services with "K" scripts are stopped and those with "S" scripts are
│         started from the /etc/rc.d/rc<Runlevel>.d/ directory
├── Finally /etc/rc.d/rc.local gets executed

User Login

*UEFI (Unified Extensible Firmware Interface) based systems normally use the GPT (GUID Partition Table) partitioning scheme instead of MBR; GPT supports larger disks (more than 2 TB).

** This may differ in recent Linux variants and on non-x86 platforms. For example, Itanium-based systems use EFI instead of BIOS.

