Friday, August 25, 2006

Write Kernel Thread for your Device Driver

Jan. 2006
Often your driver is required to do more work than can be done in interrupt context. For example, it may be reasonable to defer some processing to a thread that runs outside interrupt context. Your driver may also have to perform some processing periodically. One option is to do this extra work in a kernel thread. This note shows you how to do so, but it does not judge whether it is a good idea for your situation.

Suppose you have made up your mind to create a kernel thread to do the job. Although there are a few articles online addressing the topic, I did not find one that answered all my questions when I started to implement kernel threads for my design. Most do not emphasize the synchronization issues involved, and I learned that lesson the hard way. Here is some experience I gained that I want to share with you. Warning: there are many different versions of Linux out there. This code works on my machine, but I cannot guarantee that it runs the same on yours. You are advised to do your own research (including reading this article) and your own experiments, at your own risk.

1. A Dumb driver
A complex driver may need multiple threads, which require a carefully designed communication mechanism between them. For now we focus on simply creating and stopping a thread. For this reason, let’s assume we have a device driver that does nothing but print a message from the kernel at a certain frequency and update a status structure. Let’s forget about what is in the structure and how it will be used in the driver. I also assume that we implement the kernel thread as part of a Linux character driver. You can find all the information on writing a character driver in the Linux Device Drivers book listed in the references section. In fact, the kind of driver is irrelevant for our purpose; it could be a simple kernel module without any specific type of driver involved.

2. Create a Thread
The kernel function to create a kernel thread is
long kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)
Your thread entry point is passed as the function pointer fn. The argument arg lets you pass any parameter to your function. The last argument, flags, indicates which resources the thread is going to share with its parent. When the driver is loaded as a loadable module, the parent is the process that calls insmod. Most likely you want your thread to run like a daemon and share nothing with the parent, so 0 is the value you want. For reference, the applicable options are CLONE_FS, CLONE_FILES and CLONE_SIGHAND; they are needed mainly when creating threads for user-space applications.

The function returns the process ID on success and a negative value on failure. After calling this function, the thread implemented by fn is created and ready to run. But there is some minimum work you have to do in your thread. We’ll see it later in the thread’s implementation.

The following listing shows how to create a kernel thread implemented by function example_thread().

static volatile int example_exit = 0;
static volatile int example_records = 0;
static long example_pid = 0;
static struct completion exampletask_sync; /* start/exit handshake with the thread */

static int example_thread(void *arg); /* body shown in the next listing */

int example_init(void)
{
    init_completion(&exampletask_sync);
    example_pid = kernel_thread(example_thread, (void *)&example_records, 0);
    if (example_pid <= 0) {
        printk(KERN_WARNING "can not create kernel thread\n");
    } else {
        wait_for_completion(&exampletask_sync);
        /* At this point the thread is running; we can go ahead */
    }
    return 0;
}

The structure “struct completion” behaves much like a semaphore. It is used here to synchronize the creating thread and the created thread. In the above example, the creating process starts the kernel thread and then waits for the created thread to release the semaphore. The creating thread can proceed only after the semaphore is released. The handshake is not strictly necessary in this example. In a real implementation, however, the calling thread often cannot continue until the created thread has reached a certain stage.
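For readers who want to experiment with this handshake outside the kernel, here is a minimal user-space sketch of the same idea using POSIX threads. It is only an analogy and not the kernel's implementation: the names toy_completion, toy_init, toy_complete, toy_wait, worker and start_worker_and_wait are all hypothetical, and the real struct completion is built on a spinlock and a wait queue rather than a mutex and a condition variable.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for the kernel's struct completion. */
struct toy_completion {
    pthread_mutex_t lock;
    pthread_cond_t  wait;
    unsigned int    done;   /* completions signalled but not yet consumed */
};

static void toy_init(struct toy_completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wait, NULL);
    c->done = 0;
}

/* Plays the role of complete(): record one completion and wake a waiter. */
static void toy_complete(struct toy_completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

/* Plays the role of wait_for_completion(): block until one is available, then consume it. */
static void toy_wait(struct toy_completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->wait, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}

struct worker_ctx {
    struct toy_completion started;
    int ready;   /* set by the worker before it signals */
};

static void *worker(void *arg)
{
    struct worker_ctx *ctx = arg;

    assert(ctx != NULL);
    ctx->ready = 1;                /* the stage the creator is waiting for */
    toy_complete(&ctx->started);
    return NULL;
}

/* Start the worker, wait for its signal, and return what the creator observed. */
int start_worker_and_wait(void)
{
    struct worker_ctx ctx;
    pthread_t tid;
    int seen;

    toy_init(&ctx.started);
    ctx.ready = 0;
    if (pthread_create(&tid, NULL, worker, &ctx) != 0)
        return -1;
    toy_wait(&ctx.started);        /* returns only after the worker has signalled */
    seen = ctx.ready;
    pthread_join(tid, NULL);
    return seen;
}
```

Because the worker sets ready before it signals, the creator is guaranteed to see the worker's setup once toy_wait() returns. That is the property the driver relies on when it calls wait_for_completion() after kernel_thread().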

3. Body of the Thread
The next listing shows a minimum implementation of a kernel thread. This thread does nothing but print a message periodically. Real driver code would do much more than this.

static int example_thread(void *arg)
{
    struct task_struct *curtask = current;
    int i = 0;
    int result = 0;
    mm_segment_t old;
    struct sched_param param = { sched_priority: EXAMPLE_PRIORITY };
    int *important_struct = arg;

    daemonize();
    strcpy(curtask->comm, "kexample");

    /* setscheduler() expects a user-space pointer; lift the address limit first */
    old = get_fs();
    set_fs(KERNEL_DS);
    /* use current->pid: example_pid may not be assigned yet when the thread first runs */
    result = setscheduler(current->pid, SCHED_FIFO, &param);
    set_fs(old);
    if (0 != result)
    {
        printk(KERN_ERR "%8x setscheduler() : result=%d\n", (unsigned int) jiffies, result);
    }

    /* tell the creating thread that we are up and running */
    complete(&exampletask_sync);

    while (1)
    {
        if (signal_pending(current))
        {
#ifdef DEBUG
            printk(KERN_DEBUG "%8x example thread : get a signal \n", (unsigned int) jiffies);
#endif
            /* consume the signal, otherwise it keeps waking us up */
            flush_signals(current);
            if (example_exit)
            {
#ifdef DEBUG
                printk(KERN_DEBUG "%8x : give sem for exit\n", (unsigned int) jiffies);
#endif
                break;
            }
        }
        else
        {
            /* do the real work here */
            *important_struct = i;
            printk("Do the thing %d\n", i++);
            set_current_state(TASK_INTERRUPTIBLE);
            schedule_timeout(EXAMPLE_LONG_SLEEP);
        }
    }
    printk(KERN_DEBUG "%8x : exiting\n", (unsigned int) jiffies);
    complete_and_exit(&exampletask_sync, 0);
}


The thread has some preparation of its own to do before it performs the job in the main thread body, the while loop.

First, the call to daemonize() cuts the ties to the parent. The kernel thread then sets up its name as a text string and sets its priority. All thread information is stored in a complex structure, struct task_struct. The global current in kernel space points to the currently running thread. The name of the thread is stored in current->comm, so you can find your thread in the list when you run “ps -e”.

The priority is kept in current->prio, but do not touch it directly. The scheduler updates this information and uses it to move threads in and out of its queues. Directly manipulating those fields brings disaster to your system at an unpredictable time. This type of error is very hard to debug, as it can crash a totally unrelated thread much later. Look in the file sched.c in the Linux source tree: there are quite a few functions defined to manipulate a thread’s attributes. A closer look reveals that they are really meant for user application threads and not to be called from a kernel thread. That’s why you have to call set_fs(KERNEL_DS).

The function set_fs() modifies the current address limit to either KERNEL_DS or USER_DS. By calling set_fs(KERNEL_DS) before setscheduler(), we tell the kernel that the call comes from inside the kernel and that the pointer we pass refers to kernel memory. Then we can call setscheduler() safely. Otherwise, setscheduler() treats the pointer as a user-space address and the data is just garbage when it interprets it. This is another chance to harm your system. You can find more discussion of this function in the article by Kroah-Hartman listed in the references.

The line complete(&exampletask_sync) releases the semaphore, so the creating thread can go forward and complete the loading. After this point, the thread runs on its own. In fact, this first synchronization point (the wait_for_completion() and complete() pair) is dispensable in our example. In a real implementation, you probably have a few things to do before each thread can go ahead. That’s why I put it there as the minimum.

The following two function calls, set_current_state(TASK_INTERRUPTIBLE) and schedule_timeout(EXAMPLE_LONG_SLEEP), make the thread sleep and wake up every EXAMPLE_LONG_SLEEP system ticks. You want to set the task interruptible so that a software signal can wake up the thread too; this is handy during debugging. I also use a signal to communicate with the kernel thread from within the driver. We’ll see it in the exit part of the listing.

If you set the task to TASK_INTERRUPTIBLE, you have to deal with signals yourself. First, you have to decide what action to take if the thread is woken by a signal. Second, and importantly, you have to consume the signal as well. Otherwise your thread will be woken by the signal continuously. The signal_pending() call tells you whether the thread was woken by a signal. The signal is consumed by calling flush_signals(). In our case, if the thread is woken by a signal but example_exit is not set, the thread ignores the signal. If example_exit is set, the thread exits.

As the thread is part of my driver, I only want it to exit when the driver is unloaded. If you issue a kill -9 command against the kernel thread, it won’t kill the thread, because example_exit is not set and the signal is simply flushed.

One thing to notice is that signals make the thread behave a little differently from what we anticipated: it does not print the message exactly at the frequency we set. If we send a signal to the thread, it prints the message as well. To make the thread print at the exact frequency, the thread has to remember how much time has elapsed when it is woken by a signal. If it wakes up too early, it should sleep for the remaining, adjusted time interval. We do not go further into this here and just leave the code as it is.
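The adjustment can be sketched as a small bookkeeping function. schedule_timeout() returns the number of ticks that were still left when the thread was woken early, and 0 when the full timeout elapsed, so the loop can feed that value back in as the next timeout. The helper name next_timeout and its parameters are hypothetical; it is written as plain C so that it can be tried outside the kernel, and it is only one possible way to do the bookkeeping.

```c
#include <assert.h>

/*
 * left:    value returned by the last schedule_timeout() call
 *          (0: the full timeout elapsed, >0: ticks remaining after an early wake-up)
 * period:  the nominal interval, e.g. EXAMPLE_LONG_SLEEP
 * do_work: set to 1 when the periodic work is due, 0 otherwise
 *
 * Returns the timeout to pass to the next schedule_timeout() call.
 */
static long next_timeout(long left, long period, int *do_work)
{
    assert(period > 0 && left >= 0);

    if (left == 0) {
        /* the whole period has passed: do the work and start a new period */
        *do_work = 1;
        return period;
    }

    /* woken early (for example by a signal): skip the work, sleep the rest */
    *do_work = 0;
    return left;
}
```

In the thread loop, the fixed EXAMPLE_LONG_SLEEP argument would be replaced by a variable that starts at EXAMPLE_LONG_SLEEP and is updated from next_timeout() after every wake-up, with the message printed only when do_work is set. This sketch does not account for the time spent doing the work itself, so the period can still drift slightly.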

The driver module entry point has just a few lines, like this.

int init_module(void)
{
    int err = 0;

    err = register_chrdev(EXAMPLE_MAJOR, "example", &example_fops);
    if (0 > err)
    {
        return err;
    }

    example_init();
    return 0;
}

The exit point of the driver is given below.

static void example_cleanup(void);

void cleanup_module(void)
{
    int err;

    err = unregister_chrdev(EXAMPLE_MAJOR, "example");
    if (0 > err)
    {
        printk(KERN_ERR "unregister_chrdev err=%d\n", err);
    }
    example_cleanup();
#ifdef DEBUG
    printk(KERN_DEBUG "Bye!\n");
#endif
    return;
}

static void example_cleanup(void)
{
    int result = 0;

    /* set exit instruction to the thread */
    example_exit = 1;
    /* wake the thread up so that it notices example_exit */
    result = kill_proc(example_pid, SIGKILL, 1);
    if (result == 0)
    {
#ifdef DEBUG
        printk(KERN_DEBUG "%8x : sig to exit\n", (unsigned int)jiffies);
#endif
        /* wait until the thread calls complete_and_exit() */
        wait_for_completion(&exampletask_sync);
#ifdef DEBUG
        printk(KERN_DEBUG "%8x : acquired sem to exit\n", (unsigned int)jiffies);
#endif
    }
    else /* something wrong */
    {
        printk(KERN_CRIT "%8x: can not send signal to 0x%x result=%d\n",
               (unsigned int)jiffies, (unsigned int)example_pid, result);
    }
    return;
}


The call to the following function completes the exit portion.
void complete_and_exit(struct completion *comp, long code)
The first parameter is the completion structure we used at driver initialization time. The second parameter is the exit code of the thread. In our case the thread reaches this point as expected, so we give the value 0. After this point, the calling thread, which is waiting in example_cleanup(), is woken up to finish the unload. The module is then safely unloaded.

4. Summary
In this short article we discussed several issues involved in writing a kernel thread for a device driver. The sample code provides a simplified, minimal implementation, but it illustrates a typical synchronization mechanism. You may use it as a starting point for your implementation. I have run this code on PPC and MIPS with MontaVista 3.1/kernel 2.4.x.

In a real case, your thread will be much more complex, and you need to put more thought into synchronization issues. When your kernel threads become more complicated, you have to think again about whether you really want to put them in the kernel at all. In Linux, unlike a typical embedded OS, applications are expected to run in user space, and so many more communication mechanisms are provided there. In the Linux kernel, you do not have such a rich set of communication tools available compared to a typical embedded OS (of course, the user/kernel distinction does not apply to those). For example, if a semaphore is used among multiple threads, beware of the priority inversion problem. You have to be really careful when you have multiple kernel threads in your driver.

Reference

Jonathan Corbet, Alessandro Rubini and Greg Kroah-Hartman. Linux Device Drivers. Sebastopol: O’Reilly Media, 2005.

Greg Kroah-Hartman. “Driving Me Nuts - Things You Never Should Do in the Kernel.” Linux Journal (2005): http://www.linuxjournal.com/article/8110