Kernel Modification Using LKMs
	dalai(dalai@insomnia.org)
	http://www.trauma-inc.com
	A Traumatized Production.


	[ Introduction ]

	This paper explores the mysterious virtue of kernel modification,
with particular regard toward LKMs and their use in the subject.  Kernel
hacking is no easy task, but well worth the trouble of learning it.  If
you're not yet involved in it, maybe this will catch your interest.  If you
are, maybe this will teach you a few things.

	I'm assuming that the reader is an experienced Unix user, is fairly
familiar with kernel principles and semantics, and is a C programmer. 
If that's you, and you've used LKMs in routine administration tasks but
aren't sure how they actually work, I'll begin with a crash course on the
subject.

	An LKM, or Loadable Kernel Module, is a mechanism used by Linux as
well as some other modern operating systems to allow the linking of object
code into a running kernel without interrupting any system traffic.  Most
basically, a module is compiled to a relocatable object (.o) file, loaded
using 'insmod' under Linux, and removed using 'rmmod'.  Linux also supports
demand loading of modules, using 'kerneld'(now kmod).  Don't forget the man
pages.

	Once 'insmod' is called, the module is linked into the running
system kernel and the function init_module() is called.  All modules must
contain this function, as well as cleanup_module() which is called at
unloading.  The purpose of init_module() is to register the functions
contained within the module to handle system events, for example as device
drivers or interrupt handlers.

	The actions performed by insmod are similar to those of 'ld', at
least as far as linkage goes.  You are free to write to your heart's
content, but you may not use functions contained in libraries, such as libc.
It seems like many newcomers to kernel coding don't realize this.  It sounds
crippling, but you can nonetheless produce some very interesting and
useful modules, and without the overhead of static libraries.

	I've narrowed this down to two main parts: stealthing a module (to
avoid detection), and utilizing basic system resources from within a module.
If you're curious about anything not discussed here, feel free to email me at
the address above.


	[ Stealth ]

	To effectively hide a module we should first determine where it is
likely to be seen.  We obviously should remove any traces of our
modification from /proc/modules, and thereby lsmod.  In addition, we should
ensure that our functions do not appear in the kernel symbol table,
/proc/ksyms.  To be extra careful, we should hide the disk image after we've
loaded the module into memory.

	Removing a module from the system list of modules was first
introduced to me in Phrack 52, in an article by Plaguez entitled 'Weakening
the Linux Kernel'.  This is an excellent article for beginners and I suggest
you read it.  Plaguez's technique requires little more than changing a few
values in memory, which can be referenced with <linux/module.h>.

	Unfortunately Plaguez's technique does not work on the newer 2.2
kernels.  Earlier kernel versions contained this line in kernel/module.c
which allowed his technique:

	if(*q == '\0' && mp->size == 0 && mp->ref == NULL)
		continue; /* don't list modules for kernel syms */

	This is not present in 2.2.

	To remedy this I have written what you will find below.  It simply
takes the specified module out of the module list, leaving the actual module
in memory.  The target module must have already been loaded.  Because its
init_module() always returns non-zero, wipemod unloads itself after running,
so don't bother removing it by hand.


--wipemod.c---------------------------------------------------------------
/*
 * wipemod.c
 * dalai(dalai@insomnia.org)
 *
 * usage: 'insmod wipemod name=target.o'
 *
 * Notice: The target module must already be loaded,
 * and wipemod will unload itself.  Also, because
 * it unloads itself, wipemod cannot restore a module
 * into the list after it has been taken out.
 *
 * This is built for Linux 2.2.
 *
 *	Ignore annoying secondary error messages.
*/

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>

char *name;
MODULE_PARM(name, "s");

int
init_module()
{
	/* Start from our own entry; the newest module heads the list. */
	struct module *lmod = &__this_module;

	if(name == NULL){
		printk("<1>usage: 'insmod wipemod name=target.o'\n");
		return 1;
	}

	while(1){
		if(!lmod->next){
			printk("<1>Failure.  Perhaps the target module isn't loaded?\n");
			return 1;
		}

		if(!strcmp((char *) lmod->next->name, name)){
			if(lmod->next->ndeps != 0)	/* level ndeps */
				lmod->next->ndeps = 0;

			lmod->next = lmod->next->next;

			printk("<1>Success.\n");
			return 1; /* return 1 so it will unload. */
		}

		lmod = lmod->next;
	}
}

void
cleanup_module()
{
        /* This will never be called. */
}
--cut---------------------------------------------------------------------


	This has another useful function; it can be used to remove a broken
module from the listings.  This is very handy when you do something wrong
while creating a module and it refuses to unload, which happens more often
than you may think.  Running this for this purpose is not as safe as
rebooting, as the module is technically still in memory, but it's much
faster.
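
	If the pointer shuffling in wipemod.c is hard to picture, here is a
minimal user-space sketch of the same unlink step on an ordinary singly
linked list.  Nothing here touches the kernel: 'struct node' and
unlink_by_name() are hypothetical stand-ins I made up for 'struct module'
and the loop in init_module(), so treat it as an illustration and not as the
module itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct module. */
struct node {
	const char *name;
	struct node *next;
};

/*
 * Walk the list beginning at 'start' and unlink the first node AFTER
 * 'start' whose name matches.  Returns the unlinked node, or NULL if
 * nothing matched.  The unlinked node itself is left untouched, just
 * as the hidden module stays in memory.
 */
static struct node *
unlink_by_name(struct node *start, const char *name)
{
	struct node *cur;

	for(cur = start; cur->next != NULL; cur = cur->next){
		if(!strcmp(cur->next->name, name)){
			struct node *hit = cur->next;

			cur->next = hit->next;	/* skip over the match */
			return hit;
		}
	}

	return NULL;
}
```

	Given a list wipemod -> target -> serial, calling
unlink_by_name(&head, "target") leaves wipemod -> serial, while the 'target'
node still exists and still points at 'serial'.  Note that the starting node
is never compared, only the ones after it.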


	-= symtabs =-

	Keeping components of your module from being listed in ksyms used to
be handled by 'register_symtab', however that has changed with newer kernel
versions.  There are new ways of doing this now, but why would we want to in
the first place?  First of all it will keep the curious system administrator
from seeing something such as 'hax0r_passwordz()' and its address in the
kernel symbol table.  Second, it will keep any other module from referencing
you, although that occurrence is improbable.

	You can selectively keep parts of your code out of ksyms by simply
declaring the functions you wish to hide as 'static'.  For instance,
'static int return_vals()' would not show up, whereas 'int return_vals()'
would.

	Alternatively you can slip 'EXPORT_NO_SYMBOLS;' into your module
somewhere.  This is defined in <linux/module.h> as this:

	#define EXPORT_NO_SYMBOLS  __asm__(".section __ksymtab\n.previous")

	Installing your module with 'insmod -x' would also be effective, but
that is boring ;).


	[ Using Kernel Resources ]

	After it has been loaded, your code of course becomes part of the
kernel, and can do anything.  In the right hands this commodity is
(root * 10).  As examples of this I'll show you some interesting things that
a module can do, including how to add your own system calls at runtime.

	The list of exported kernel symbols(ones you can readily utilize) is
located in /proc/ksyms.  A prettier version of this list can be viewed
with the 'ksyms' command.  Note that by default 'ksyms' does not display
symbols from "the kernel proper".  You can view all symbols with 'ksyms -a'.

	Even though you can't directly link libraries into your module, you
can do anything from kernel code that you would be able to do with any
library, including libc.  After all, libraries eventually rely on kernel
functions to operate.  As a simple example:


	libc: var = getuid();

	kernel: var = current->uid;


	It may go without saying that in order to use the second example
from above, <linux/sched.h> needs to be included.

	You can see how some inherent system calls handle the absence of
convenient library functions in the kernel source, 'kernel/exit.c' for
example(sys_exit).
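
	To see how thin the library layer can be, the sketch below asks the
kernel for the uid directly instead of going through getuid().  It assumes a
Linux system whose libc provides the generic syscall() wrapper and defines
SYS_getuid in <sys/syscall.h>; raw_getuid() is a name I invented for the
example.  This is user-space code, not module code.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fetch our uid by issuing the system call
 * ourselves, skipping the getuid() library wrapper.
 */
static uid_t
raw_getuid(void)
{
	return (uid_t) syscall(SYS_getuid);
}
```

	On such a system raw_getuid() and getuid() should report the same
value, since the library call is little more than a wrapper around the
kernel's own answer.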


	-= system calls =-

	Much more interesting is the possibility of adding system calls to a
running kernel.  But why would you want to do this?  Its practical use may
not be as well defined as its educational purpose, but it is not
non-existent.  An example of possible use for this would be to provide
temporary portability for compiling and running certain programs on a
platform other than their native one.  Dirty, but not without utility.

	Viewing the assembly source in arch/i386/kernel/entry.S, we see that
several things happen when the switch is made from user mode with the system
call.  Initially registers are saved, a comparison is made against the value
of NR_syscalls to make sure that the requested call is within bounds, and
control is passed to the system call.  The actual call is looked up by
number:  each system call has a number defined in <asm/unistd.h>
(__NR_name, for example __NR_exit), and that number is used as an index into
'void *sys_call_table[]'.
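
	The lookup described above can be modeled in user space with a plain
array of function pointers.  This is only a sketch: call_table, NR_SLOTS,
dispatch() and hook() are hypothetical stand-ins for sys_call_table,
NR_syscalls, the code in entry.S and the assignment a module would make, and
the table has 8 slots instead of the real size.

```c
#include <errno.h>
#include <stddef.h>

#define NR_SLOTS 8		/* stand-in for NR_syscalls */

typedef long (*call_fn)(void);

/* Unused slots point at a routine that just reports "not implemented". */
static long ni_call(void) { return -ENOSYS; }

/* A sample routine to install in a free slot. */
static long my_call(void) { return 42; }

/* Stand-in for sys_call_table[]. */
static call_fn call_table[NR_SLOTS] = {
	ni_call, ni_call, ni_call, ni_call,
	ni_call, ni_call, ni_call, ni_call
};

/* Bounds check first, then an indirect call through the table. */
static long
dispatch(unsigned long nr)
{
	if(nr >= NR_SLOTS)
		return -ENOSYS;

	return call_table[nr]();
}

/* Install 'fn' in slot 'nr' and hand back the previous occupant. */
static call_fn
hook(unsigned long nr, call_fn fn)
{
	call_fn old = call_table[nr];

	call_table[nr] = fn;
	return old;
}
```

	Saving the pointer returned by hook() and putting it back later is
the same save/replace/restore pattern a module would follow between
init_module() and cleanup_module().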

	Knowing the above we can implement our own system call as follows:


--------------------------------------------------------------------------
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/sys.h>

extern void *sys_call_table[];

asmlinkage static int sys_my_func();

void *old_val;

int
init_module()
{
	old_val = (void *) sys_call_table[250];

	sys_call_table[250] = (void *) sys_my_func;

	return 0;
}

asmlinkage static int
sys_my_func()
{
	printk("I am a working system call.\n");
	return 0;
}

void
cleanup_module()
{
	sys_call_table[250] = old_val;
}
--------------------------------------------------------------------------


	And we can call it as such:


		__asm__ __volatile__("movl   $250, %%eax\n\t"
				     "int    $0x80"
				     : : : "eax", "memory");


	Or with _syscall0().


	-= bottom-half handlers =-

	Bottom-half handlers are part of the interrupt mechanism of Linux. 
The purpose behind them is to speed up system operation.  When an interrupt
occurs the main interrupt handler will typically do a small amount of work,
and then return control to the OS.  At a later time the interrupt's
bottom-half will be executed; this is typically the bulk of the interrupt
code.  Doing things this way allows the system to spend a minimal amount of
time within a single interrupt.

	It's very possible to register our own bottom-half handlers, even
without providing support for any actual interrupts.  Using functions already
built into the kernel, we can register a function as a bottom-half, mark it
to be run, and thereby have our code executed as any real bottom-half.

	But why would we want to do this?  Surely by now you know to trust
me when I say there's a purpose behind some weird manipulation of the kernel
that I present.  In this case, we do it so that a desired bit of code is
executed on a relatively constant basis, so that we may repeatedly perform
a small task.  For example, you may want to continuously check /var/adm/utmp
and report when a user logs in/out.

	Bottom-halves are checked for execution upon every return from a
system call, as you can see in arch/i386/kernel/entry.S.  Take a look at
kernel/softirq.c as well.


--------------------------------------------------------------------------
/*
 *	init_bh initializes a function as a handler, mark_bh marks
 *	it to be executed upon the next scout for bottom-halves,
 *	disable_bh uninitializes it.  Each time a bottom-half is run,
 *	it is removed from the queue, therefore we call mark_bh after
 *	each run of the registered function.
 */

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/interrupt.h>

#define EMPTY_BH	30

static void our_half(void *);

int
init_module()
{
	init_bh(EMPTY_BH, (void *) our_half);
	mark_bh(EMPTY_BH);
	
	return 0;
}

static void
our_half(void *null)
{
	/* insert code here... */

	mark_bh(EMPTY_BH);	/* mark to run again */
}

void
cleanup_module()
{
	disable_bh(EMPTY_BH);
}
--------------------------------------------------------------------------
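

	The re-marking trick is easier to see in a small user-space model. 
The sketch below is not the kernel's code: the model_* functions, the two
bitmasks and our_half_model() are simplified, hypothetical stand-ins for
init_bh, mark_bh, disable_bh and the scan in kernel/softirq.c, written only
to show why the handler has to mark itself again.

```c
#include <stddef.h>

#define NR_BH	32
#define SLOT	30		/* same slot number as EMPTY_BH above */

static void (*bh_base[NR_BH])(void);	/* registered handlers */
static unsigned long bh_active;		/* slots marked to run */
static unsigned long bh_mask;		/* slots that are enabled */

static void model_init_bh(int nr, void (*fn)(void))
{
	bh_base[nr] = fn;
	bh_mask |= 1UL << nr;
}

static void model_mark_bh(int nr)
{
	bh_active |= 1UL << nr;
}

static void model_disable_bh(int nr)
{
	bh_mask &= ~(1UL << nr);
}

/* One scan: each pending bit is cleared before its handler runs. */
static void model_run_bh(void)
{
	unsigned long pending = bh_active & bh_mask;
	int nr;

	for(nr = 0; nr < NR_BH; nr++){
		if(pending & (1UL << nr)){
			bh_active &= ~(1UL << nr);
			if(bh_base[nr] != NULL)
				bh_base[nr]();
		}
	}
}

static int runs;	/* how many times the handler has run */

static void our_half_model(void)
{
	runs++;
	model_mark_bh(SLOT);	/* mark to run again on the next scan */
}
```

	After model_init_bh(SLOT, our_half_model) and a single
model_mark_bh(SLOT), every call to model_run_bh() runs the handler exactly
once, until model_disable_bh(SLOT) masks it out.  Without the re-mark inside
the handler it would run only on the first scan.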