How does Docker work?

Photo by Ian Taylor / Unsplash

If you have worked in technology over the last few years, you have probably come across Docker. Docker has become the de facto standard for application deployments across enterprises.

In this series of posts, we explore how Docker works under the hood. In particular, we will focus on how Docker isolates the application environment and how that differs from using a virtual machine.

We will also explore how Docker interacts with the kernel and hardware of the host system. Doing so will help us understand why it is so widely favored and how it helps with our development pipeline.

What is Docker?

Docker is best known for its application isolation and cross-platform support. The Docker documentation describes Docker as a platform for separating applications from infrastructure. In simple terms, it separates the application's runtime environment from the host machine and runs the application in isolation, independent of any libraries or dependencies installed on the host.

Many tend to think of Docker as a lightweight virtual machine. After all, you can configure the OS you need, using something like FROM ubuntu:18.04 within a Dockerfile and install all your packages and application dependencies to get the service up and running, right? Isn't that what a virtual machine does?

VM Architecture

Well, not really. Thinking of Docker as a virtual machine is not only wrong but also defeats the purpose for which Docker was created. A Docker application is better thought of as an isolated process than as a virtual machine. Let's try to understand how.

Docker Architecture

Linux Namespaces

In 2002, the Linux kernel (version 2.4.19) introduced its first namespace, the mount namespace, which gives a group of processes its own isolated view of mount points. This feature is the foundation upon which containers were built. By 2013, with Linux kernel 3.8, namespaces had gained enough functionality to support containers as we know them today.

Linux namespaces were built to isolate processes from their environment so that a process running within one namespace cannot affect a process running within another namespace. The result can feel like a lightweight virtual machine, but without the overhead of running a guest OS kernel on the host machine. Namespaces isolate the crucial parts of the application environment at the kernel level while every process still shares the same host kernel.

A Docker container is essentially a Linux process running within its own set of namespaces. Within these namespaces, the process can run as root, change the hostname, have a different root directory, and execute any commands it needs without affecting the host system. However, it has this control only over the resources assigned to those namespaces.

Because namespaces are a Linux feature, Docker was initially built only for Linux and ran only on Linux distributions. To run Docker on Windows or macOS, a small, lightweight Linux VM is created, within which all containers run. Over the years, Windows has added native support for running Windows containers with Docker. This is something we will look into later in the series.

The Linux namespace controls what a process can see and has access to, thus restricting access to resources that the process is not allowed to tamper with.
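To make this concrete, the namespaces a process belongs to can be inspected through the /proc filesystem. The snippet below is a minimal sketch, assuming a Linux host with /proc mounted; the helper name current_namespaces is made up for this illustration. Each entry under /proc/<pid>/ns is a symlink whose target identifies the namespace, and two processes that show the same identifier share that namespace.

```python
import os


def current_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process, read from /proc.

    Hypothetical helper for illustration. Returns an empty dict when
    /proc/<pid>/ns does not exist (for example on a non-Linux system).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink; its target names the namespace type
            # and the inode number that identifies the namespace instance.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            result[name] = "<unreadable>"
    return result


if __name__ == "__main__":
    for name, ident in current_namespaces().items():
        print(f"{name:20} {ident}")
```

Running this once on the host and once inside a container should show different identifiers for the namespaces the container runtime has isolated.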

As of Linux 5.16.5, released on February 1, 2022, Linux supports eight types of namespaces:

  1. Cgroup namespace: Gives the process its own view of the cgroup hierarchy, rooted at its own cgroup. The cgroups themselves are what control and monitor resources.
  2. IPC namespace: Isolates Inter-Process Communication resources.
  3. Network namespace: Gives the namespace its own network stack, including interfaces, addresses, and routing tables.
  4. Mount namespace: Gives the processes in the namespace their own set of mount points (directories/drives).
  5. PID namespace: Isolates process IDs so that processes in the new namespace are numbered starting from 1.
  6. Time namespace: Allows the namespace to have its own offsets for the monotonic and boot-time clocks, so these can differ from the host's. The wall-clock time is not virtualized.
  7. User namespace: Isolates user and group IDs from the parent system and allows a user to be root within the namespace.
  8. UTS namespace: Isolates the hostname and domain name so they can be changed within the namespace.
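At the system-call level, each namespace type in the list above is requested with a CLONE_NEW* flag passed to clone(2) or unshare(2). The table below is a small sketch that maps the names used under /proc/<pid>/ns to the flag values defined in the kernel's <linux/sched.h>; the dictionary and the clone_flags helper are hypothetical names used only for this illustration.

```python
from functools import reduce
from operator import or_

# CLONE_NEW* flag values as defined in <linux/sched.h>.
NAMESPACE_FLAGS = {
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "ipc":    0x08000000,  # CLONE_NEWIPC
    "net":    0x40000000,  # CLONE_NEWNET
    "mnt":    0x00020000,  # CLONE_NEWNS (the oldest one, hence the generic name)
    "pid":    0x20000000,  # CLONE_NEWPID
    "time":   0x00000080,  # CLONE_NEWTIME
    "user":   0x10000000,  # CLONE_NEWUSER
    "uts":    0x04000000,  # CLONE_NEWUTS
}


def clone_flags(*names):
    """Combine the flags for the given namespace names into one bitmask."""
    return reduce(or_, (NAMESPACE_FLAGS[n] for n in names), 0)


if __name__ == "__main__":
    # Requesting a new UTS and PID namespace in a single call:
    print(hex(clone_flags("uts", "pid")))  # prints 0x24000000
```

Because every flag occupies a distinct bit, any combination of namespaces can be requested in one call by OR-ing the flags together.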

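Creating most of these namespaces requires privileged access (specifically the CAP_SYS_ADMIN capability, which root has). For example, an unprivileged user cannot create a new network namespace directly; the command has to be run with sudo. The exception is the user namespace, which an unprivileged user can create on systems that allow it. You can read more about these namespaces here.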
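The privilege requirement can be observed directly. The following is a rough sketch, assuming Linux, Python 3.9 or newer, and a C library that exposes unshare(2); the function name try_unshare is made up for this example. It forks a short-lived child, asks the kernel for a new namespace there, and reports the resulting error number, so the calling process itself is never moved into a new namespace.

```python
import ctypes
import errno
import os

CLONE_NEWUTS = 0x04000000   # from <linux/sched.h>
CLONE_NEWUSER = 0x10000000  # from <linux/sched.h>


def try_unshare(flags):
    """Call unshare(flags) in a forked child and return its errno (0 = success).

    Hypothetical helper for illustration. Returns None when the C library
    does not expose unshare (for example on a non-Linux system).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if not hasattr(libc, "unshare"):
        return None
    pid = os.fork()
    if pid == 0:
        # Child: any new namespace disappears when this process exits.
        rc = libc.unshare(flags)
        os._exit(0 if rc == 0 else ctypes.get_errno())
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    for label, flag in (("uts", CLONE_NEWUTS), ("user", CLONE_NEWUSER)):
        code = try_unshare(flag)
        if code is None:
            print(f"{label}: unshare is not available on this system")
        elif code == 0:
            print(f"{label}: created")
        else:
            print(f"{label}: failed with {errno.errorcode.get(code, code)}")
```

Run as a regular user, the UTS request would typically be expected to fail with EPERM, while the user namespace request may succeed depending on how the system is configured. Run with sudo, both would normally succeed.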

Conclusion

That's it for now. In the next post, we will break down the components that make up a container and attempt to replicate Docker's functionality using Linux namespaces.

If you would like to follow this series of articles, please consider subscribing to this publication. You can also follow me on Twitter at @lezwon.

References

  1. How Docker Works - Intro to Namespaces. (2020, February 21). [Video]. YouTube. https://www.youtube.com/watch?v=-YnMr1lj4Z8
  2. Containers: cgroups, Linux kernel namespaces, ufs, Docker, and intro to Kubernetes pods. (2017, September 28). [Video]. YouTube. https://www.youtube.com/watch?v=el7768BNUPw
  3. Namespaces in operation, part 1: namespaces overview. (n.d.). Lwn. Retrieved February 6, 2022, from https://lwn.net/Articles/531114/
  4. Ovens, S. (n.d.). The 7 most used Linux namespaces. Redhat. Retrieved February 6, 2022, from https://www.redhat.com/sysadmin/7-linux-namespaces
  5. Heddings, A. (n.d.). What Are Linux Namespaces and What Are They Used for? Cloudsavvyit. Retrieved February 6, 2022, from https://www.cloudsavvyit.com/742/what-are-linux-namespaces-and-what-are-they-used-for/
  6. namespaces(7) - Linux manual page. (n.d.). Linux Manual. Retrieved February 6, 2022, from https://man7.org/linux/man-pages/man7/namespaces.7.html
Lezwon Castelino

Freelancer | Open Source Contributor | Ex- @PyTorchLightnin Core ⚡ | Solutions Hacker | 20+ Hackathons