How To Debug The Linux Kernel

Debugging the Linux kernel often starts with enabling dynamic debugging and analyzing kernel logs. Doing it effectively takes a structured approach that combines logging, breakpoints, and system analysis. This guide walks through the essential techniques, from basic log inspection to advanced kernel debugging tools.

Kernel debugging is different from user-space debugging. A crash in the kernel can freeze the entire system. You need the right tools and a methodical process to find the root cause. Let’s get started with the fundamentals.

Understanding Kernel Debugging Basics

Before you jump into debugging, you must understand the environment. The Linux kernel runs in privileged mode. A simple mistake can lead to a kernel panic. You need a test system, not your production machine.

Set up a virtual machine or a dedicated test box. This isolation protects your main system. It also allows you to experiment freely without consequences.

Essential Tools For Kernel Debugging

You need a set of tools to debug effectively. Here are the core ones:

GDB (GNU Debugger) – For source-level debugging with KGDB
KGDB – Kernel GDB stub for remote debugging
Kdump – Captures crash dumps for post-mortem analysis
Kprobes – Dynamic breakpoints in kernel functions
Ftrace – Built-in tracer for function calls and events
SystemTap – Scriptable dynamic tracing tool
Perf – Performance counters and profiling

Each tool serves a specific purpose. You will use different tools depending on the bug type. For example, use Ftrace for function tracing and Kdump for crash analysis.

The steps below form a step-by-step process for debugging the kernel. Follow them to identify and fix kernel issues.

Step 1: Enable Kernel Debugging Features

Your kernel must be compiled with debugging options. These options add extra checks and logging. Without them, debugging is much harder.

Configure your kernel with these flags:

CONFIG_DEBUG_KERNEL=y
CONFIG_KGDB=y
CONFIG_KPROBES=y
CONFIG_FTRACE=y
CONFIG_MAGIC_SYSRQ=y

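Later steps depend on a few more options: CONFIG_DYNAMIC_DEBUG for dynamic debug (Step 2), CONFIG_FUNCTION_TRACER for the function tracer (Step 4), CONFIG_KGDB_SERIAL_CONSOLE for kgdboc (Step 7), and debug info (CONFIG_DEBUG_INFO) so that GDB and crash can resolve symbols. Rebuild and install the kernel, then reboot into it. The debugging infrastructure used in the following steps is then available.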
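
Before rebuilding, it can help to check which options the running kernel already has. The sketch below assumes the distribution installs its config as /boot/config-<release> (some expose /proc/config.gz instead). missing_kconfig is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# Print every listed option that is NOT set to =y in a kernel config file.
# Usage: missing_kconfig CONFIG_FILE OPTION...
missing_kconfig() {
    config_file=$1
    shift
    for opt in "$@"; do
        # -x matches the whole line, so CONFIG_KGDB=y is not
        # satisfied by a line such as CONFIG_KGDB_KDB=y.
        grep -qx "${opt}=y" "$config_file" || echo "$opt"
    done
}

# Example (the config path is an assumption about your distribution):
#   missing_kconfig "/boot/config-$(uname -r)" \
#       CONFIG_DEBUG_KERNEL CONFIG_KGDB CONFIG_KPROBES CONFIG_FTRACE CONFIG_MAGIC_SYSRQ
```

An empty result means every option you listed is built in.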

Step 2: Use Dynamic Debugging

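Dynamic debug lets you switch pr_debug() and dev_dbg() call sites on or off at runtime, without rebuilding or rebooting. A disabled call site costs almost nothing, unlike an unconditional printk statement that always logs. You can select messages by module, file, function, or line.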

Enable dynamic debug for a specific module:

echo 'module my_driver +p' > /sys/kernel/debug/dynamic_debug/control

Check available debug messages:

cat /sys/kernel/debug/dynamic_debug/control

This method reduces log noise. You only see messages from the code you are investigating.
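
To confirm that a +p request took effect, you can filter the control file for call sites that are no longer disabled. This is a minimal sketch that assumes the control-file layout of recent kernels, where the third whitespace-separated field holds the flags and =_ means disabled. enabled_callsites is a hypothetical helper name.

```shell
#!/bin/sh
# List dynamic debug call sites whose flags are anything other than "=_".
# Reads control-file text on stdin and skips the "#" header line.
enabled_callsites() {
    awk '!/^#/ && $3 != "=_"'
}

# Example (needs root and a mounted debugfs):
#   enabled_callsites < /sys/kernel/debug/dynamic_debug/control
```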

Step 3: Analyze Kernel Logs With Dmesg

The kernel ring buffer holds recent kernel messages. It has a fixed size, so the oldest entries are overwritten as new ones arrive. Use the dmesg command to view it. This is your first stop when something goes wrong.

Useful dmesg commands:

dmesg -T – Show human-readable timestamps
dmesg -l err – Show only error messages
dmesg -w – Watch for new messages in real-time
dmesg -k – Show kernel messages only

Look for patterns like “Oops”, “BUG”, or “Call Trace”. These indicate serious issues. Save the output for later analysis.
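
The pattern search can be scripted so it works on live output and on saved logs alike. A small sketch follows; scan_klog is a hypothetical helper name, and the three patterns are just the ones named above, not an exhaustive list of crash markers.

```shell
#!/bin/sh
# Print log lines, prefixed with their line numbers, that contain a
# common crash marker. Reads log text on stdin and exits non-zero
# when nothing matches.
scan_klog() {
    grep -nE 'Oops|BUG|Call Trace'
}

# Examples:
#   dmesg | scan_klog
#   scan_klog < saved-dmesg.txt
```

The line numbers make it easy to jump to the same spot in the saved file and read the surrounding context.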

Step 4: Use Ftrace For Function Tracing

Ftrace is a built-in tracer. It records function calls and events. You can trace specific functions or the entire kernel.

Enable function tracing:

echo function > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/trace

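Trace a specific function. The name must appear in available_filter_functions; do_sys_open is only an example and may be missing on newer kernels, where the open path goes through do_sys_openat2: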

echo 'do_sys_open' > /sys/kernel/debug/tracing/set_ftrace_filter
echo function > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/trace

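Ftrace adds almost no overhead while tracing is off, which makes it practical on production systems. Tracing every kernel function is expensive, though, so set a filter first and write nop to current_tracer when you are done. Use it to understand code flow and find performance bottlenecks.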
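
Writing a name the kernel cannot trace to set_ftrace_filter fails, so a script can check the name first. The sketch below assumes the debugfs path used above. trace_one_function and its optional directory argument are hypothetical conveniences, the latter mainly so the logic can be exercised against a scratch directory.

```shell
#!/bin/sh
# Restrict the function tracer to one function, after checking that the
# running kernel can actually trace it.
# Usage: trace_one_function FUNCTION [TRACEFS_DIR]
trace_one_function() {
    func=$1
    tracefs=${2:-/sys/kernel/debug/tracing}
    # Entries look like "name" or "name [module]", so compare field 1.
    if ! awk -v f="$func" '$1 == f { found = 1 } END { exit !found }' \
            "$tracefs/available_filter_functions"; then
        echo "cannot trace $func: not in available_filter_functions" >&2
        return 1
    fi
    echo "$func" > "$tracefs/set_ftrace_filter"
    echo function > "$tracefs/current_tracer"
}

# Example (needs root):
#   trace_one_function vfs_read && cat /sys/kernel/debug/tracing/trace
```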

Step 5: Set Breakpoints With Kprobes

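Kprobes let you attach a probe to almost any kernel instruction at runtime. When the probe fires, it records function arguments, registers, or memory, and execution then continues. This is useful for debugging without recompiling.

Example using a kprobe event through debugfs, where $arg1 and $arg2 are the first two arguments of the probed function: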

echo 'p:my_probe do_sys_open $arg1 $arg2' > /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/my_probe/enable
cat /sys/kernel/debug/tracing/trace

Kprobes work on live systems, and the kernel refuses to probe functions on its blacklist, such as the kprobe code itself. A probe on a hot path still slows every call, so remove probes after use.
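
Removing a probe takes two writes: disable the event, then delete its definition. The sketch below assumes the same debugfs path as the example above. remove_kprobe and its optional directory argument are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Disable and delete a kprobe event created through kprobe_events.
# Usage: remove_kprobe PROBE_NAME [TRACEFS_DIR]
remove_kprobe() {
    name=$1
    tracefs=${2:-/sys/kernel/debug/tracing}
    # A probe must be disabled before it can be deleted.
    echo 0 > "$tracefs/events/kprobes/$name/enable" || return 1
    # Appending "-:name" removes that single probe definition.
    echo "-:$name" >> "$tracefs/kprobe_events"
}

# Example (needs root):
#   remove_kprobe my_probe
```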

Step 6: Capture Crash Dumps With Kdump

When the kernel crashes, Kdump captures a memory dump. This dump contains the state at the moment of failure. You can analyze it offline.

Configure Kdump:

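Reserve memory for the crash kernel: add crashkernel=128M to the boot parameters and reboot
Install the kdump tools: apt install kdump-tools (Debian and Ubuntu; other distributions package them under different names)
Trigger a test crash: echo c > /proc/sysrq-trigger (this panics the machine immediately, so run it only on the test system)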

The dump file is usually written under /var/crash. Use the crash utility to analyze it, together with a vmlinux that contains debug symbols for the crashed kernel. The tool provides a GDB-like interface for kernel analysis.
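
Before triggering a test crash, it is worth confirming that the memory reservation actually reached the kernel. A minimal sketch follows; crashkernel_value is a hypothetical helper name.

```shell
#!/bin/sh
# Print the value of crashkernel= from a kernel command line read on
# stdin. Exits non-zero when the parameter is absent.
crashkernel_value() {
    tr ' ' '\n' | sed -n 's/^crashkernel=//p' | grep .
}

# Example:
#   crashkernel_value < /proc/cmdline
# A value of 1 in /sys/kernel/kexec_crash_loaded additionally shows
# that a capture kernel has been loaded.
```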

Step 7: Remote Debugging With KGDB

KGDB allows you to debug the kernel over a serial connection. You run GDB on a host machine and connect to the target. It is the most interactive method, but the entire target stops while GDB has control.

Setup steps:

Enable KGDB in kernel config
Add kgdboc=ttyS0,115200 to boot parameters
Boot the target and enter KGDB: echo g > /proc/sysrq-trigger
On the host: run gdb vmlinux, then target remote /dev/ttyS0 (use whichever serial device the host is wired to)

You can set breakpoints, step through code, and inspect memory. This is ideal for bugs you can reproduce on demand but cannot explain from logs alone.

Step 8: Use Perf For Performance Issues

Perf is a performance counter tool. It profiles the kernel and user-space. Use it to find hot functions and system bottlenecks.

Record a system-wide profile with call graphs for 10 seconds, then view it:

perf record -a -g -- sleep 10
perf report

Summarize the system calls of one process (replace PID with its process ID):

perf trace -s -p PID

Comparing profiles taken before and after a kernel change helps identify performance regressions.

Common Kernel Bugs And How To Find Them

Different bugs require different approaches. Here are common types and their debugging strategies.

Kernel Panics And Oops Messages

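A kernel panic halts the system. An Oops is a recoverable error: the kernel kills the offending task and keeps running, although its state may no longer be trustworthy, and with panic_on_oops set an Oops escalates to a panic. Both leave a call trace in the logs.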

Analyze the call trace:

Find the function that caused the error
Check the instruction pointer (RIP on x86-64, pc on arm64)
Look for NULL pointer dereferences
Check for buffer overflows

Use scripts/decode_stacktrace.sh from the kernel source tree to translate the entries in a trace to source files and line numbers. It needs the vmlinux, built with debug info, of the kernel that crashed.

Memory Leaks In Kernel Modules

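Memory leaked by kernel code is not reclaimed until reboot, so a leaking module gradually starves the system. Use kmemleak to detect such leaks.

Kmemleak requires a kernel built with CONFIG_DEBUG_KMEMLEAK=y. Trigger a scan and read the report: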

echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak

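Kmemleak reports allocated blocks that nothing references any more, each with the backtrace of its allocation. Occasional false positives are possible, so confirm a report against the code before fixing the module to free the memory.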
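
When you reload a module repeatedly, counting reports between scans shows whether the number of leaks is growing. The sketch below assumes each report begins with a line starting with "unreferenced object", as in current kmemleak output. count_leaks is a hypothetical helper name.

```shell
#!/bin/sh
# Count leak reports in kmemleak output read on stdin.
count_leaks() {
    grep -c '^unreferenced object'
}

# Example (needs root):
#   echo scan > /sys/kernel/debug/kmemleak
#   count_leaks < /sys/kernel/debug/kmemleak
```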

Race Conditions And Locking Issues

Race conditions are hard to reproduce. Use lockdep to detect potential deadlocks.

Enable lockdep in the kernel config. CONFIG_PROVE_LOCKING is the option you choose; it selects CONFIG_LOCKDEP automatically:

CONFIG_LOCKDEP=y
CONFIG_PROVE_LOCKING=y

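Lockdep reports locking violations at runtime, even when the deadlock itself never occurs. It catches inconsistent lock ordering and locks that are used inconsistently between interrupt and process context. It does not detect accesses that should have been locked but were not; for such data races, KCSAN (CONFIG_KCSAN) is the complementary tool.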

Advanced Debugging Techniques

For complex issues, you need advanced methods. These techniques require more setup but provide deeper insights.

Using SystemTap For Scripted Tracing

SystemTap lets you write scripts to probe kernel events. It is similar to DTrace on Solaris.

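Example script that traces open() system calls (on current systems most opens arrive as openat, so you may need probe syscall.openat instead):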

probe syscall.open {
    printf("open(%s)\n", filename)
}

Run with: stap script.stp

SystemTap is powerful but requires kernel debuginfo packages. It can be complex to set up on some distributions.

Kernel Probes (Kprobes) And Return Probes

Kprobes can be used programmatically. You can write kernel modules that register probes. This gives you full control over the debugging process.

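Example pre-handler from such a module. It runs just before the probed instruction executes. The module must also fill in a struct kprobe (symbol_name and pre_handler) and pass it to register_kprobe():

static int handler_pre(struct kprobe *p, struct pt_regs *regs)
{
    /* symbol_name is set when the probe is registered by name */
    pr_info("%s called\n", p->symbol_name);
    return 0;
}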

This approach is for advanced developers. It requires kernel programming knowledge.

Debugging Kernel Modules

Kernel modules are loadable pieces of code. Debugging them is similar to core kernel debugging but with some differences.

Loading And Unloading Modules

Use insmod and rmmod to control modules. Watch dmesg for errors during loading.

Debug module initialization:

insmod my_module.ko
dmesg | tail

If the module crashes, it might hang the system. Use a test environment.

Module Symbols And GDB

When using KGDB, you need to load module symbols:

add-symbol-file my_module.ko 0xffffffffc0000000

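The address above is only an example. Read the real load address from /proc/modules as root (unprivileged reads show zeros), or from /sys/module/my_module/sections/.text. This lets you set breakpoints in module code.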
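
The lookup can be scripted so the GDB command is generated rather than typed by hand. This sketch assumes the usual /proc/modules layout, in which the sixth field is the load address. gdb_symbol_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Print a GDB add-symbol-file command for one module, using the load
# address found in /proc/modules-style text on stdin.
# Exits non-zero when the module is not listed.
# Usage: gdb_symbol_cmd MODULE_NAME KO_PATH < /proc/modules
gdb_symbol_cmd() {
    awk -v mod="$1" -v ko="$2" '
        $1 == mod { printf "add-symbol-file %s %s\n", ko, $6; found = 1 }
        END { exit !found }'
}

# Example (run as root on the target, paste the result into GDB on the host):
#   gdb_symbol_cmd my_module ./my_module.ko < /proc/modules
```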

Debugging With QEMU And Virtual Machines

Virtual machines are ideal for kernel debugging. QEMU supports GDB integration out of the box.

Setting Up QEMU For Kernel Debugging

Boot a kernel in QEMU with GDB support:

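qemu-system-x86_64 -kernel bzImage -append "console=ttyS0 nokaslr" -s -S

The -s flag opens a GDB server on TCP port 1234. The -S flag freezes the CPU at startup until GDB tells it to continue. The nokaslr parameter disables kernel address randomization so that breakpoint addresses match the symbols in vmlinux.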

Connect GDB:

gdb vmlinux
target remote :1234

You can now debug the kernel from the very first instruction. This is useful for boot-time issues.

FAQ: Common Questions About Kernel Debugging

How Do I Start Debugging The Linux Kernel?

Start by enabling kernel debugging options and using dmesg to check logs. Then try dynamic debugging for specific modules. Use ftrace for function tracing if logs are not enough.

What Is The Best Tool For Kernel Debugging?

There is no single best tool. Use dmesg for quick checks, ftrace for tracing, kdump for crash analysis, and KGDB for deep debugging. Choose based on the problem type.

Can I Debug The Kernel On A Production System?

Yes, but with caution. Use dynamic debugging and ftrace as they have low overhead. Avoid KGDB and kprobes on critical systems. Always test debugging tools in a lab first.

How Do I Interpret A Kernel Oops Message?

Look at the “RIP” line for the instruction pointer. Check the “Call Trace” for the function call path. Use decode_stacktrace.sh to map the trace entries to source lines. The error code gives hints about the fault type.

What Is The Difference Between Kdump And KGDB?

Kdump captures a crash dump after a panic. You analyze it offline. KGDB lets you debug live with breakpoints and step execution. Use Kdump for post-mortem analysis and KGDB for interactive debugging.

Conclusion

Debugging the Linux kernel is a skill that improves with practice. Start with the basics: enable debug options, use dmesg, and try dynamic debugging. As you gain confidence, move to ftrace and kprobes. For the toughest bugs, set up KGDB or use Kdump for crash analysis.

Remember to work in a test environment. Kernel debugging can crash your system. With the right tools and a methodical approach, you can fix even the most elusive kernel bugs. Keep learning and experimenting with different techniques.