Saturday, October 8, 2011

Linux Kernel Module Hello World!


2.1. Hello, World (part 1): The Simplest Module

When the first caveman programmer chiseled the first program on the walls of the first cave computer, it was a program to paint the string `Hello, world' in Antelope pictures. Roman programming textbooks began with the `Salut, Mundi' program. I don't know what happens to people who break with this tradition, but I think it's safer not to find out. We'll start with a series of hello world programs that demonstrate the different aspects of the basics of writing a kernel module.
Here's the simplest module possible. Don't compile it yet; we'll cover module compilation in the next section.
Example 2-1. hello-1.c

/*  hello-1.c - The simplest kernel module.
 */
#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_ALERT */


int init_module(void)
{
   printk("<1>Hello world 1.\n");
	
   // A non 0 return means init_module failed; module can't be loaded.
   return 0;
}


void cleanup_module(void)
{
  printk(KERN_ALERT "Goodbye world 1.\n");
}  
Kernel modules must have at least two functions: a "start" (initialization) function called init_module(), which is called when the module is loaded into the kernel with insmod, and an "end" (cleanup) function called cleanup_module(), which is called just before the module is removed with rmmod. Actually, things have changed starting with kernel 2.3.13. You can now use whatever name you like for the start and end functions of a module, and you'll learn how to do this in Section 2.3. In fact, the new method is the preferred method. However, many people still use init_module() and cleanup_module() for their start and end functions.
Typically, init_module() either registers a handler for something with the kernel, or it replaces one of the kernel functions with its own code (usually code to do something and then call the original function). The cleanup_module() function is supposed to undo whatever init_module() did, so the module can be unloaded safely.
Lastly, every kernel module needs to include linux/module.h. We needed to include linux/kernel.h only for the macro expansion for the printk() log level, KERN_ALERT, which you'll learn about in Section 2.1.1.
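The pairing of init_module() and cleanup_module() described above can be pictured without touching the kernel at all. The following user-space sketch is only an analogy: the handler table and the names register_handler(), unregister_handler(), demo_init() and demo_cleanup() are hypothetical, invented for this illustration, and are not kernel APIs.

```c
#include <stddef.h>

/* Hypothetical stand-in for something the kernel lets you register with.
 * None of these names exist in the kernel; they only mimic the pattern. */
#define MAX_HANDLERS 4
typedef void (*handler_fn)(void);
static handler_fn handlers[MAX_HANDLERS];
static int my_slot = -1;

/* Store a handler in the first free slot; return the slot or -1 if full. */
int register_handler(handler_fn fn)
{
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (handlers[i] == NULL) {
            handlers[i] = fn;
            return i;
        }
    }
    return -1;
}

/* Undo a registration made by register_handler(). */
void unregister_handler(int slot)
{
    if (slot >= 0 && slot < MAX_HANDLERS)
        handlers[slot] = NULL;
}

/* Count the slots currently in use. */
int count_handlers(void)
{
    int n = 0;
    for (int i = 0; i < MAX_HANDLERS; i++)
        if (handlers[i] != NULL)
            n++;
    return n;
}

void my_handler(void) { /* the work the "module" would do */ }

/* Plays the role of init_module(): 0 on success, non-zero on failure. */
int demo_init(void)
{
    my_slot = register_handler(my_handler);
    return (my_slot < 0) ? -1 : 0;
}

/* Plays the role of cleanup_module(): undoes exactly what demo_init() did. */
void demo_cleanup(void)
{
    unregister_handler(my_slot);
    my_slot = -1;
}
```

The point of the sketch is the symmetry: whatever demo_init() registers, demo_cleanup() removes, so the table is left the way it was found.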

2.1.1. Introducing printk()

Despite what you might think, printk() was not meant to communicate information to the user, even though we used it for exactly this purpose in hello-1! It happens to be a logging mechanism for the kernel, and is used to log information or give warnings. Therefore, each printk() statement comes with a priority, which is the <1> and KERN_ALERT you see. There are 8 priorities and the kernel has macros for them, so you don't have to use cryptic numbers, and you can view them (and their meanings) in linux/kernel.h. If you don't specify a priority level, the default priority, DEFAULT_MESSAGE_LOGLEVEL, will be used.
Take time to read through the priority macros. The header file also describes what each priority means. In practice, don't use numbers, like <4>. Always use the macro, like KERN_WARNING.
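To see why "<1>" and KERN_ALERT are interchangeable, here is a small user-space sketch. The macro definitions are copied in by hand to mirror what a 2.4-era linux/kernel.h contains (that is an assumption about this kernel generation; newer kernels encode the prefix differently), and alert_msg() and same_format() are hypothetical helper names.

```c
#include <string.h>

/* Hand-copied mirrors of the 2.4-era priority macros, so that this
 * sketch builds in user space. Assumption: 2.4-style "<N>" prefixes. */
#define KERN_EMERG   "<0>"
#define KERN_ALERT   "<1>"
#define KERN_CRIT    "<2>"
#define KERN_ERR     "<3>"
#define KERN_WARNING "<4>"
#define KERN_NOTICE  "<5>"
#define KERN_INFO    "<6>"
#define KERN_DEBUG   "<7>"

/* Adjacent string literals are concatenated by the compiler, so the
 * macro simply becomes the first three characters of the format string. */
const char *alert_msg(void)
{
    return KERN_ALERT "Hello world 1.\n";
}

/* Returns 1 when two format strings are byte-for-byte identical. */
int same_format(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Under that assumption, same_format(alert_msg(), "<1>Hello world 1.\n") is true: the macro form and the numeric form hand printk() exactly the same string, and the macro is simply easier to read.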
If the priority is less than int console_loglevel, the message is printed on your current terminal. If both syslogd and klogd are running, then the message will also get appended to /var/log/messages, whether it got printed to the console or not. We use a high priority, like KERN_ALERT, to make sure the printk() messages get printed to your console rather than just logged to your logfile. When you write real modules, you'll want to use priorities that are meaningful for the situation at hand.
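The console rule above boils down to one comparison. The following user-space sketch models it; message_level() and reaches_console() are hypothetical names, the "<N>" prefix format is the one used in hello-1, and the numeric values you pass for the default level and for console_loglevel are whatever your system uses.

```c
/* Extract the priority from an optional "<N>" prefix (N is 0 to 7).
 * If the message has no prefix, fall back to default_level. */
int message_level(const char *msg, int default_level)
{
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return default_level;
}

/* A message is shown on the console only when its level is numerically
 * lower (that is, more urgent) than console_loglevel. */
int reaches_console(const char *msg, int default_level, int console_loglevel)
{
    return message_level(msg, default_level) < console_loglevel;
}
```

This is only a model of the decision, not the kernel's code, but it shows why a low number like KERN_ALERT gets through almost any console setting while a KERN_DEBUG message usually ends up only in the log.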


2.2. Compiling Kernel Modules

Kernel modules need to be compiled with certain gcc options to make them work. In addition, they also need to be compiled with certain symbols defined. This is because the kernel header files need to behave differently, depending on whether we're compiling a kernel module or an executable. You can define symbols using gcc's -D option, or with the #define preprocessor command. We'll cover what you need to do in order to compile kernel modules in this section.

  • -c: A kernel module is not an independent executable, but an object file which will be linked into the kernel at runtime using insmod. As a result, modules should be compiled with the -c flag.
  • -O2: The kernel makes extensive use of inline functions, so modules must be compiled with the optimization flag turned on. Without optimization, some of the assembler macro calls will be mistaken by the compiler for function calls. This will cause loading the module to fail, since insmod won't find those functions in the kernel.
  • -W -Wall: A programming mistake can take your system down. You should always turn on compiler warnings, and this applies to all your compiling endeavors, not just module compilation.
  • -isystem /lib/modules/`uname -r`/build/include: You must use the kernel headers of the kernel you're compiling against. Using the default /usr/include/linux won't work.
  • -D__KERNEL__: Defining this symbol tells the header files that the code will be run in kernel mode, not as a user process.
  • -DMODULE: This symbol tells the header files to give the appropriate definitions for a kernel module.
We use gcc's -isystem option instead of -I because it tells gcc to suppress some "unused variable" warnings that -W -Wall causes when you include module.h. By using -isystem under gcc-3.0, the kernel header files are treated specially, and the warnings are suppressed. If you instead use -I (or even -isystem under gcc 2.9x), the "unused variable" warnings will be printed. Just ignore them if they appear.
So, let's look at a simple Makefile for compiling a module named hello-1.c:
Example 2-2. Makefile for a basic kernel module

TARGET  := hello-1
WARN    := -W -Wall -Wstrict-prototypes -Wmissing-prototypes
INCLUDE := -isystem /lib/modules/`uname -r`/build/include
CFLAGS  := -O2 -DMODULE -D__KERNEL__ ${WARN} ${INCLUDE}
CC      := gcc-3.0
	
${TARGET}.o: ${TARGET}.c

.PHONY: clean

clean:
	rm -rf ${TARGET}.o
As an exercise to the reader, compile hello-1.c and insert it into the kernel with insmod ./hello-1.o (ignore anything you see about tainted kernels; we'll cover that shortly). Neat, eh? All modules loaded into the kernel are listed in /proc/modules. Go ahead and cat that file to see that your module is really a part of the kernel. Congratulations, you are now the author of Linux kernel code! When the novelty wears off, remove your module from the kernel by using rmmod hello-1. Take a look at /var/log/messages just to see that it got logged to your system logfile.
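If you'd rather check /proc/modules from a program than with cat, a sketch like the following would do. It assumes only that each line of the file begins with the module name followed by whitespace; module_listed() is a hypothetical helper name, and the function takes an already opened stream so it can be tried on any text file.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if some line of the stream starts with `name` followed by
 * whitespace (or end of line), 0 otherwise.
 * Assumption: the module name is the first field on each line. */
int module_listed(FILE *fp, const char *name)
{
    char buf[512];
    size_t len = strlen(name);
    int at_line_start = 1;   /* is buf the beginning of a line? */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (at_line_start && strncmp(buf, name, len) == 0) {
            char next = buf[len];
            if (next == ' ' || next == '\t' || next == '\n' || next == '\0')
                return 1;
        }
        /* A chunk without a newline means the line continues in the
         * next fgets() call, so that chunk is not a line start. */
        at_line_start = (strchr(buf, '\n') != NULL);
    }
    return 0;
}
```

A caller would open the file with fopen("/proc/modules", "r"), pass the stream and the module name to module_listed(), and close the stream afterwards. Matching the whole first field, rather than a prefix, keeps a module called hello from matching hello-1.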
Here's another exercise to the reader. See that comment above the return statement in init_module()? Change the return value to something non-zero, recompile and load the module again. What happens?


2.3. Hello World (part 2)

As of Linux 2.4, you can rename the init and cleanup functions of your modules; they no longer have to be called init_module() and cleanup_module() respectively. This is done with the module_init() and module_exit() macros. These macros are defined in linux/init.h. The only caveat is that your init and cleanup functions must be defined before calling the macros, otherwise you'll get compilation errors. Here's an example of this technique:
Example 2-3. hello-2.c

/*  hello-2.c - Demonstrating the module_init() and module_exit() macros.  This is
 *     preferred over using init_module() and cleanup_module().
 */
#include <linux/module.h>   // Needed by all modules
#include <linux/kernel.h>   // Needed for KERN_ALERT
#include <linux/init.h>     // Needed for the macros


static int hello_2_init(void)
{
   printk(KERN_ALERT "Hello, world 2\n");
   return 0;
}


static void hello_2_exit(void)
{
   printk(KERN_ALERT "Goodbye, world 2\n");
}


module_init(hello_2_init);
module_exit(hello_2_exit);
So now we have two real kernel modules under our belt. With productivity as high as ours, we should have a high powered Makefile. Here's a more advanced Makefile which will compile both our modules at the same time. It's optimized for brevity and scalability. If you don't understand it, I urge you to read the makefile info pages or the GNU Make Manual.
Example 2-4. Makefile for both our modules

WARN    := -W -Wall -Wstrict-prototypes -Wmissing-prototypes
INCLUDE := -isystem /lib/modules/`uname -r`/build/include
CFLAGS  := -O2 -DMODULE -D__KERNEL__ ${WARN} ${INCLUDE}
CC      := gcc-3.0
OBJS    := ${patsubst %.c, %.o, ${wildcard *.c}}

all: ${OBJS}

.PHONY: clean

clean:
	rm -rf *.o
As an exercise to the reader, if we had another module in the same directory, say hello-3.c, how would you modify this Makefile to automatically compile that module?


2.4. Hello World (part 3): The __init and __exit Macros

This demonstrates a feature of kernel 2.2 and later. Notice the change in the definitions of the init and cleanup functions. The __init macro causes the init function to be discarded and its memory freed once the init function finishes for built-in drivers, but not loadable modules. If you think about when the init function is invoked, this makes perfect sense.
There is also an __initdata which works similarly to __init but for init variables rather than functions.
The __exit macro causes the omission of the function when the module is built into the kernel, and like __init, has no effect for loadable modules. Again, if you consider when the cleanup function runs, this makes complete sense; built-in drivers don't need a cleanup function, while loadable modules do.
These macros are defined in linux/init.h and serve to free up kernel memory. When you boot your kernel and see something like Freeing unused kernel memory: 236k freed, this is precisely what the kernel is freeing.
Example 2-5. hello-3.c

/*  hello-3.c - Illustrating the __init, __initdata and __exit macros.
 */
#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_ALERT */
#include <linux/init.h>     /* Needed for the macros */

static int hello3_data __initdata = 3;


static int __init hello_3_init(void)
{
   printk(KERN_ALERT "Hello, world %d\n", hello3_data);
   return 0;
}


static void __exit hello_3_exit(void)
{
   printk(KERN_ALERT "Goodbye, world 3\n");
}


module_init(hello_3_init);
module_exit(hello_3_exit);
By the way, you may see the directive "__initfunction()" in drivers written for Linux 2.2 kernels:

 __initfunction(int init_module(void))
{
   printk(KERN_ALERT "Hi there.\n");
   return 0;
}
This macro served the same purpose as __init, but is now deprecated in favor of __init. I only mention it because you might still see it in modern kernels. As of 2.4.18, there are 38 references to __initfunction(), and as of 2.4.20, there are 37 references. However, don't use it in your own code.


2.5. Hello World (part 4): Licensing and Module Documentation

If you're running kernel 2.4 or later, you might have noticed something like this when you loaded the previous example modules:

# insmod hello-3.o
Warning: loading hello-3.o will taint the kernel: no license
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Hello, world 3
Module hello-3 loaded, with warnings
	
In kernel 2.4 and later, a mechanism was devised to identify code licensed under the GPL (and friends) so people can be warned that the code is non open-source. This is accomplished by the MODULE_LICENSE() macro which is demonstrated in the next piece of code. By setting the license to GPL, you can keep the warning from being printed. This license mechanism is defined and documented in linux/module.h.
Similarly, MODULE_DESCRIPTION() is used to describe what the module does, MODULE_AUTHOR() declares the module's author, and MODULE_SUPPORTED_DEVICE() declares what types of devices the module supports.
These macros are all defined in linux/module.h and aren't used by the kernel itself. They're simply for documentation and can be viewed by a tool like objdump. As an exercise to the reader, try grepping through linux/drivers to see how module authors use these macros to document their modules.
Example 2-6. hello-4.c

/*  hello-4.c - Demonstrates module documentation.
 */
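#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_ALERT */
#include <linux/init.h>     /* Needed for the macros */
#define DRIVER_AUTHOR "Peiter Jay Salzman"
#define DRIVER_DESC   "A sample driver"

static int init_hello_4(void);
static void cleanup_hello_4(void);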


static int init_hello_4(void)
{
   printk(KERN_ALERT "Hello, world 4\n");
   return 0;
}


static void cleanup_hello_4(void)
{
   printk(KERN_ALERT "Goodbye, world 4\n");
}


module_init(init_hello_4);
module_exit(cleanup_hello_4);


/*  You can use strings, like this:
 */
MODULE_LICENSE("GPL");           // Get rid of taint message by declaring code as GPL.

/*  Or with defines, like this:
 */
MODULE_AUTHOR(DRIVER_AUTHOR);    // Who wrote this module?
MODULE_DESCRIPTION(DRIVER_DESC); // What does this module do?

/*  This module uses /dev/testdevice.  The MODULE_SUPPORTED_DEVICE macro might be used in
 *  the future to help automatic configuration of modules, but is currently unused other
 *  than for documentation purposes.
 */
MODULE_SUPPORTED_DEVICE("testdevice");

2.6. Passing Command Line Arguments to a Module

Modules can take command line arguments, but not with the argc/argv you might be used to.
To allow arguments to be passed to your module, declare the variables that will take the values of the command line arguments as global and then use the MODULE_PARM() macro (defined in linux/module.h) to set the mechanism up. At runtime, insmod will fill the variables with any command line arguments that are given. The variable declarations and macros should be placed at the beginning of the module for clarity. The example code should clear up my admittedly lousy explanation.
The MODULE_PARM() macro takes 2 arguments: the name of the variable and its type. The supported variable types are "b": single byte, "h": short int, "i": integer, "l": long int and "s": string. Strings should be declared as "char *" and insmod will allocate memory for them. You should always try to give the variables an initial default value. This is kernel code, and you should program defensively. For example:

    int myint = 3;
    char *mystr;

    MODULE_PARM (myint, "i");
    MODULE_PARM (mystr, "s");
	
Arrays are supported too. An integer value preceding the type in MODULE_PARM will indicate an array of some maximum length. Two numbers separated by a '-' will give the minimum and maximum number of values. For example, an array of shorts with at least 2 and no more than 4 values could be declared as:

    short myshortArray[4];
    MODULE_PARM (myshortArray, "2-4h");
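To make the type string less mysterious, here is a user-space sketch of how a string such as "2-4h" can be taken apart. It is not insmod's actual parser; parse_parm_type() is a hypothetical name, and treating a single leading number as both the minimum and the maximum is an assumption made for the illustration.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Split a type string of the form [min[-max]]{b,h,i,l,s} into its parts.
 * Returns 0 on success and -1 if the string is malformed.
 * Assumptions: with no number, min and max are both 1; with a single
 * number, it is used for both min and max. */
int parse_parm_type(const char *spec, int *min, int *max, char *type)
{
    const char *p = spec;
    char *end;

    *min = 1;
    *max = 1;

    if (isdigit((unsigned char)*p)) {
        *min = (int)strtol(p, &end, 10);
        *max = *min;
        p = end;
        if (*p == '-') {
            p++;
            if (!isdigit((unsigned char)*p))
                return -1;          /* "2-" with no maximum */
            *max = (int)strtol(p, &end, 10);
            p = end;
        }
    }

    /* Exactly one type letter must remain. */
    if (*p == '\0' || strchr("bhils", *p) == NULL || p[1] != '\0')
        return -1;
    if (*min < 1 || *max < *min)
        return -1;

    *type = *p;
    return 0;
}
```

With this sketch, "2-4h" yields a minimum of 2, a maximum of 4 and the type letter h, while a plain "i" yields a single integer.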
	
A good use for this is to have the module variables' default values set, like which IO port or IO memory to use. If the variables contain the default values, then perform autodetection (explained elsewhere). Otherwise, keep the current value. This will be made clear later on. For now, I just want to demonstrate passing arguments to a module.
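The "autodetect only if the default is untouched" idea can be sketched in a few lines of plain C. Everything here is hypothetical: IO_PORT_UNSET, resolve_io_port() and fake_probe() are names made up for the illustration, and the port number returned by the fake probe is arbitrary.

```c
/* Hypothetical sentinel meaning "the user did not pass io_port=...". */
#define IO_PORT_UNSET (-1)

/* Keep a value the user supplied; otherwise fall back to autodetection.
 * `autodetect` stands in for whatever probing a real driver would do. */
int resolve_io_port(int param_value, int (*autodetect)(void))
{
    if (param_value == IO_PORT_UNSET)
        return autodetect();
    return param_value;
}

/* Pretend probe for the sketch; a real driver would query the hardware. */
int fake_probe(void)
{
    return 0x300;
}
```

In a module, the variable handed to MODULE_PARM would start out as the sentinel, and the init function would call something like resolve_io_port() before touching the hardware.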
Example 2-7. hello-5.c

/*  hello-5.c - Demonstrates command line argument passing to a module.
 */
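#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_ALERT */
#include <linux/init.h>     /* Needed for the macros */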
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Peiter Jay Salzman");

static short int myshort = 1;
static int myint = 420;
static long int mylong = 9999;
static char *mystring = "blah";

MODULE_PARM (myshort, "h");
MODULE_PARM (myint, "i");
MODULE_PARM (mylong, "l");
MODULE_PARM (mystring, "s");


static int __init hello_5_init(void)
{
   printk(KERN_ALERT "Hello, world 5\n=============\n");
   printk(KERN_ALERT "myshort is a short integer: %hd\n", myshort);
   printk(KERN_ALERT "myint is an integer: %d\n", myint);
   printk(KERN_ALERT "mylong is a long integer: %ld\n", mylong);
   printk(KERN_ALERT "mystring is a string: %s\n", mystring);
   return 0;
}


static void __exit hello_5_exit(void)
{
   printk(KERN_ALERT "Goodbye, world 5\n");
}


module_init(hello_5_init);
module_exit(hello_5_exit);

2.7. Modules Spanning Multiple Files

Sometimes it makes sense to divide a kernel module between several source files. In this case, you need to:

  1. In all the source files but one, add the line #define __NO_VERSION__. This is important because module.h normally includes the definition of kernel_version, a global variable with the kernel version the module is compiled for. If you need version.h, you need to include it yourself, because module.h won't do it for you with __NO_VERSION__.
  2. Compile all the source files as usual.
  3. Combine all the object files into a single one. Under x86, use ld -m elf_i386 -r -o <module name.o> <1st src file.o> <2nd src file.o>.
Here's an example of such a kernel module.
Example 2-8. start.c

/*  start.c - Illustration of multi filed modules
 */

#include <linux/kernel.h>       /* We're doing kernel work */
#include <linux/module.h>       /* Specifically, a module */

int init_module(void)
{
  printk("Hello, world - this is the kernel speaking\n");
  return 0;
}
The next file:
Example 2-9. stop.c

/*  stop.c - Illustration of multi filed modules
 */

#if defined(CONFIG_MODVERSIONS) && ! defined(MODVERSIONS)
   #include <linux/modversions.h> /* Will be explained later */
   #define MODVERSIONS
#endif
#define __NO_VERSION__     /* It's not THE file of the kernel module;
                              must be defined before module.h is included */
#include <linux/kernel.h>  /* We're doing kernel work */
#include <linux/module.h>  /* Specifically, a module  */
#include <linux/version.h> /* Not included by module.h because of
                              __NO_VERSION__ */

void cleanup_module(void)
{
   printk("<1>Short is the life of a kernel module\n");
}  
And finally, the makefile:
Example 2-10. Makefile for a multi-filed module

CC=gcc
MODCFLAGS := -O -Wall -DMODULE -D__KERNEL__

hello.o:	start.o stop.o
	ld -m elf_i386 -r -o hello.o start.o stop.o

start.o: start.c
	${CC} ${MODCFLAGS} -c start.c

stop.o: stop.c
	${CC} ${MODCFLAGS} -c stop.c


By 
James Thornton


