Here are some questions that I might ask myself, with the answers. Who are you to write this?
- Have you submitted patches to the kernel? Not yet.
- Have you modified and recompiled the kernel? Yes, several times. Most of the changes are very specific to the kind of work I do, so submitting them as patches might not help the community.
- Which Linux do you use? Debian 5.0.
- Have you been paid to modify the kernel? Yes, as part of my job or as a freelancer.
- Do you like doing that? I love doing that.
I go through the kernel in the same order in which Robert Love structured his book (because I personally like that book), with some variations here and there. You will see exact copies of Linux kernel code in many places. Copyright of the code belongs to its developers (check github) under the GPL or one of its variants.
Chapter 1: Introduction - Tasks or Processes
In Linux terminology, a task is a process. Wikipedia defines a process as an instance of a computer program that is being executed. Let's call it a task going forward, because many of the variables in the kernel source are named that way. In Linux, a task can have multiple concurrent threads, can spawn multiple child tasks, and operates on its own virtual memory, most of the time with no regard to other tasks. A task is created by a clone() or fork() system call and terminated by an exit() system call.
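Seen from user space, that life cycle can be sketched as follows. This is a minimal example assuming a POSIX system; spawn_and_reap is a hypothetical helper name, not something from the kernel source.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child task that exits immediately with
 * `code`, wait for it, and return the exit status seen by the parent.
 * Returns -1 if fork() or waitpid() fails. */
int spawn_and_reap(int code)
{
    int status;
    pid_t pid = fork();          /* new task: a copy of the caller */

    if (pid < 0)
        return -1;               /* fork failed */
    if (pid == 0)
        _exit(code);             /* child: terminate right away */

    /* Parent: block until the child has exited, then decode its status. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_reap(7) from a small test program should hand back 7, the value the child passed to _exit().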
In Linux, the information about each task is preserved in a structure: struct task_struct in linux/sched.h. This is a huge structure because of the information it has to store. Some important variables in that structure are:
- volatile long state; - defines the state of the task. Negative means the task is not runnable; zero means the task is runnable and can be scheduled; and positive means the task is stopped or suspended for some reason.
- pid_t pid; - unique process id. pid_t is usually int.
- struct mm_struct* mm; - user address space for the task
- struct files_struct *files; - pointer to the files kept open by this task
- struct thread_struct thread; - this is the CPU specific state of the current task. We will look at this later
- struct held_lock held_locks[MAX_LOCK_DEPTH]; - array of locks held by the task. Typical value of MAX_LOCK_DEPTH is 48
- struct task_struct *parent; - pointer to the parent of the current task (which would wait() for it)
- struct list_head children; - list of children of current task
- struct list_head sibling; - list of siblings of current task
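The children and sibling fields are list nodes embedded in the task itself, which is why the kernel needs a way to get from a node back to the structure that contains it. The following user-space sketch imitates that idea with toy types. Every toy_ name is an assumption made for illustration; the real definitions live in the kernel's list.h and sched.h and are more elaborate.

```c
#include <stddef.h>

/* Toy stand-ins for the kernel types; only the fields used here. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

struct toy_task {
    int pid;
    struct toy_task *parent;
    struct toy_list_head children;  /* head of this task's child list */
    struct toy_list_head sibling;   /* node in the parent's child list */
};

/* Recover the enclosing structure from a pointer to its embedded node. */
#define toy_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

void toy_task_init(struct toy_task *t, int pid)
{
    t->pid = pid;
    t->parent = NULL;
    t->children.next = t->children.prev = &t->children;  /* empty list */
    t->sibling.next = t->sibling.prev = &t->sibling;
}

/* Link child at the tail of parent's children list. */
void toy_add_child(struct toy_task *parent, struct toy_task *child)
{
    struct toy_list_head *head = &parent->children;

    child->parent = parent;
    child->sibling.prev = head->prev;
    child->sibling.next = head;
    head->prev->next = &child->sibling;
    head->prev = &child->sibling;
}

/* Count the children of a task by walking its circular list. */
int toy_count_children(const struct toy_task *t)
{
    const struct toy_list_head *pos;
    int n = 0;

    for (pos = t->children.next; pos != &t->children; pos = pos->next)
        n++;
    return n;
}

/* Return the pid of the first child, or -1 if there is none. */
int toy_first_child_pid(struct toy_task *t)
{
    if (t->children.next == &t->children)
        return -1;
    return toy_list_entry(t->children.next, struct toy_task, sibling)->pid;
}
```

Note that the list head is the parent's children field, while each child is linked in through its own sibling field. Mixing up the two is an easy mistake when iterating.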
The state of any task can be changed by the kernel using the macro: set_task_state(task, state); which is defined as follows:
#define set_task_state(tsk, state_value) \
        set_mb((tsk)->state, (state_value))
Here set_mb is again architecture specific, just like the current macro. If you are interested, you can find all the architecture-specific macros in the folder arch/##architecture##. Specifically, the implementation details of set_mb() can be found in asm/system.h and asm/barrier.h for each architecture.
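To get a feel for what set_task_state() does, here is a user-space approximation. It assumes that set_mb() behaves like a plain store followed by a full memory barrier, which is only one possible form, since each architecture is free to implement it differently. All mini_ names are hypothetical, and the C11 fence stands in for the kernel's own barrier.

```c
#include <stdatomic.h>

/* Toy task: only the state field, with the sign convention from the text. */
struct mini_task {
    volatile long state;   /* <0 not runnable, 0 runnable, >0 stopped */
};

/* Assumed shape of set_mb(): store the value, then issue a full fence. */
#define mini_set_mb(var, value)                        \
    do {                                               \
        (var) = (value);                               \
        atomic_thread_fence(memory_order_seq_cst);     \
    } while (0)

/* Same layering as the kernel macro: a thin wrapper around set_mb. */
#define mini_set_task_state(tsk, state_value) \
    mini_set_mb((tsk)->state, (state_value))

/* Hypothetical helper: classify a state value by its sign. */
const char *mini_describe_state(const struct mini_task *tsk)
{
    long s = tsk->state;

    if (s < 0)
        return "not runnable";
    if (s == 0)
        return "runnable";
    return "stopped";
}
```

The barrier matters because other CPUs may inspect the state right after it changes, so the store must be visible before the task goes on to do anything else.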
Now let us write some simple code using what we learned. The following code traces back the ancestry of the current task and prints each ancestor to the kernel log:
struct task_struct *task;
/* init_task is a struct, not a pointer, so compare against its address */
for (task = current; task != &init_task; task = task->parent)
        printk(KERN_INFO "%s-%d\n", task->comm, task->pid);
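The same walk can be tried out in user space with a toy task tree. This is a sketch under stated assumptions: struct anc_task, anc_init_task and anc_collect are hypothetical names, and the root is modelled as its own parent, with pid 0 and the name "swapper".

```c
/* Toy task with just the fields the walk needs. */
struct anc_task {
    int pid;
    const char *comm;
    struct anc_task *parent;
};

/* Stand-in for the kernel's init_task: the root, its own parent. */
struct anc_task anc_init_task = { 0, "swapper", &anc_init_task };

/* Walk from `start` up to (but excluding) the root and store each pid
 * in `out`. Returns the number of tasks visited, at most `max`. */
int anc_collect(struct anc_task *start, int *out, int max)
{
    struct anc_task *t;
    int n = 0;

    /* The root is an object, so the loop stops at its address. */
    for (t = start; t != &anc_init_task && n < max; t = t->parent)
        out[n++] = t->pid;
    return n;
}
```

With a chain root -> init -> bash -> demo, starting at demo collects the pids of demo, bash and init, in that order, and stops before the root.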
To print all the siblings of the current task in the kernel log:
struct task_struct *sibpos;
/* siblings hang off the parent's children list, linked through their sibling member */
list_for_each_entry(sibpos, &current->parent->children, sibling)
        if (sibpos != current)  /* the list also contains current itself */
                printk(KERN_INFO "%s-%d\n", sibpos->comm, sibpos->pid);
The code listed above is computationally expensive and should be used only for debugging. That's it for now. In the next chapter (expect it on the 5th of September), we will see how a process is created and how threads in the Linux kernel differ from processes (are they different?).