User-level debugging on Genode via GDB

A convenient solution for debugging individual applications is a feature that is repeatedly asked for by Genode developers. In the past, the answer to this question was rather unsatisfying. There existed a multitude of approaches but none was both simple to use and powerful. With GDB monitor, this has changed. Using this component, debugging an individual Genode application over a remote GDB connection has become possible. This way, most debugging facilities valued and expected from user-level debuggers on commodity operating systems become available to Genode developers.

Traditional approaches

There are several ways of debugging user-level programs on Genode. Probably the most prominent approach is spilling debug messages throughout the code of interest. The colored printf macros PDBG, PINF, PWRN, and PERR become handy in such situations. By default, those messages are targeting core's LOG service. Hence, each debug message appears nicely tagged with the originating process on the kernel debug console. Even though this approach looks like from the stone age, it remains to be popular because it is so intuitive to use.

For debugging the interaction between different processes, however, the classical printf methodology becomes inefficient. Here is where platform-specific debugging facilities enter the picture. Most L4 kernels come with built-in kernel debuggers that allow the inspection of kernel objects such as threads and address spaces. This way, we get a global view on the system from the kernel's perspective. For example, the mapping of virtual memory to physical pages can be revealed, the communication relationships between threads become visible, and the ready queue of the scheduler can be observed. To a certain extent, kernel debuggers had been complemented with useful facilities for debugging user-level programs. For example, the Fiasco kernel debugger comes with a convenient backtrace function that parses the call stack of a given thread. Using the addresses printed in the backtrace, the corresponding code can be matched against the output of the objdump utility that comes with the GNU binutils. Among the kernel debuggers of Genode's supported base platforms, the variants of L4/Fiasco and respectively Fiasco.OC stand out. We often find ourself switching to one of these kernel platforms when we hit a hard debugging problem for the sole reason that the kernel debugger is so powerful.

However, with complex applications, the kernel debugger becomes awkward to use. For example, if an application uses shared libraries, the kernel has no interpretation of them. Addresses that appear as the backtrace of the stack must be manually matched against the loading addresses of the individual shared libraries, the objdump must be used with the offset of the return address from the shared-library's base address. Saying that this process is inconvenient would be a blatant understatement. Of course, sophisticated features like source-level debugging and single-stepping of applications is completely out of the scope of a kernel debugger.

For problems that call for source-level debugging and single-stepping, however, we have found Qemu's GDB stub extremely useful. This stub can be used to attach GDB to a virtual machine emulated by Qemu. By manually loading symbols into GDB, this approach can be used to perform source-level debugging to a certain degree. However, there are a number of restrictions attached to this solution. First, Qemu is not aware of any abstractions of the running operating system. So if the kernel decides to preempt the current thread and switch to another, the single-stepping session comes to a surprising end. Also, Qemu is not aware of the different address spaces. Hence, a breakpoint triggers as soon as the program counter reaches the breakpoint address regardless of the process. If multiple applications use the same virtual addresses (which is usually the case), we get an aliasing problem. This problem can be mitigated by linking each application to a different virtual-address range. However, this effort is hardly recommendable as a general solution. Still, Qemu's GDB stub can save the soul of a developer who has to deal with problems in the category of low-level C++ exception handling.

For debugging higher-level application code and protocols, using GDB on Linux is a viable choice if the application scenario can executed on the base-linux platform. For many problems on Genode, this is apparently the case because most higher-level code is platform-independent. On the Linux base platform, each Genode process runs as an individual Linux process. Consequently, GDB can be attached to such a process using the gdb -p command. To synchronize the point in time of attaching GDB, the utility function wait_for_continue provided by the Linux version of Genodes env library can be utilized. In general, this approach is viable for high-level code. There are even success stories with debugging the program logic of a Genode device driver on Linux even though no actual hardware has been present the Linux platform. However, also this approach has severe limitations (besides being restricted to the base-linux platform). The most prevalent limitation is the lack of thread debugging. For debugging normal Linux applications, GDB relies on certain glibc features (e.g., the way of how threads are managed using the pthread library and the handling of thread-local storage). Since, Genode programs are executed with no glibc, GDB lacks this information.

To summarize, there are plentiful ways of debugging programs on Genode. The fact that Genode supports a range of base platforms opens up a whole range of possibilities of all base platforms combined. But none of those mechanisms is ideal for debugging native Genode applications. GDB monitor tries to fill this gap by enabling GDB to be attached to a Genode process. Once attached, GDB can be used to debugged the process GDB's full power including source-level debugging, breakpoints, single-stepping, backtraces, and call-frame inspection.

GDB monitor concept

In the following, the term target refers to the Genode program to debug. The term host refers to the system where GDB is executed. When using the normal work flow of Genode's run tool, the host is typically a Linux system that executes Genode using Qemu.

GDB monitor is a Genode process that sits in-between the target and its normal parent. As the parent of the target it has full control over all interactions of the target with the rest of the system. I.e., all session requests originating from the target including those that normally refer to core's services are first seen by GDB monitor. GDB monitor, in turn, can decide whether to forward such a session request to the original parent or to virtualize the requested service using a local implementation. The latter is done for all services that GDB monitor needs to inspect the target's address space and thread state. In particular, GDB monitor provides local implementations of the CPU and RM (and ROM) services. Those local implementations use real core services as their respective backend and a actually mere wrappers around the core service functions. However, by having the target to interact with GDB monitor instead of core directly, GDB monitor gains full control over all threads and memory objects (dataspace) and the address space of the target. All session requests that are of no specific interest to GDB monitor are just passed through to the original parent. This way, the target can use services provided by other Genode programs as normally. Furthermore, service announcements of the target are propagated to the original parent as well. This way, the debugging of Genode services becomes possible.

Besides providing a virtual execution environment for the target, the GDB monitor contains the communication protocol code to interact with a remote GNU debugger. This code is a slightly modified version of the so-called gdbserver and uses a Genode terminal session to interact with GDB running on the host. From GDB monitor's point of view, the terminal session is just a bidirectional line of communication with GDB. The actual communication mechanism depends on the service that provides the terminal session on Genode. Currently, there are two services that can be used for this purpose: TCP terminal provides terminal sessions via TCP connections, and Genode's UART drivers provides one terminal session per physical UART. Depending on which of those terminal services is used, the GDB on the host must be attached either to a network port or to a comport of the target, i.e. Qemu.

Building

The source code of GDB monitor builds upon the original gdbserver that comes as part of the GDB package. This 3rd-party source code is not included in Genode's source tree. To download the code and integrate it with Genode, issue the following command

 ./tool/ports/prepare_port gdb

This way, the 3rd-party source code will be downloaded, unpacked, and patched.

To build and use GDB monitor, you will need to enable the ports source-code repository on your <build-dir>/etc/build.conf file (in addition to the default repositories):

If you intend to use the TCP terminal for connecting GDB, you will further need to prepare the lwip package and enable the following repositories in your 'build.conf':

libports

providing the lwIP stack needed by TCP terminal

gems

hosting the source code of TCP terminal

With those preparations made, GDB monitor can be built from within the build directory via

 make app/gdb_monitor

The build targets for the TCP terminal and the UART drivers are server/tcp_terminal and drivers/uart respectively.

Integrating GDB monitor into an application scenario

To integrate GDB monitor into an existing Genode configuration, the start node of the target must be replaced by an instance of GDB monitor. GDB monitor, in turn, needs to know which binary to debug. So we have provide GDB monitor with this information using a config/target node.

For example, the original start node of the Nitpicker GUI server as found in the os/run/demo.run run script looks as follows:

 <start name="nitpicker">
   <resource name="RAM" quantum="1M"/>
   <provides><service name="Gui"/></provides>
 </start>

For debugging the Nitpicker service, it must be replaced with the following snippet (see the debug_nitpicker.run script at ports/run/ for reference):

 <start name="gdb_monitor">
   <resource name="RAM" quantum="4M"/>
   <provides> <service name="Gui"/> </provides>
   <config><target name="nitpicker"/></config>
 </start>

Please note that the RAM quota has been increased to account for the needs of both GDB monitor and Nitpicker. On startup, GDB monitor will ask its parent for a Terminal service. So we have to enhance the Genode scenario with either an UART driver or the TCP terminal.

For using an UART, add the following start entry to the scenario:

 <start name="uart_drv">
   <resource name="RAM" quantum="1M"/>
   <provides> <service name="Terminal"/> </provides>
   <config>
     <policy label_prefix="gdb_monitor" uart="1"/>
   </config>
 </start>

This entry will start the UART driver and defines the policy of which UART to be used for which client. In the example above, the client with the label "gdb_monitor" will receive the UART 1. UART 0 is typically used for the kernel and core's LOG service. So the use of UART 1 is recommended.

For using the TCP terminal, you will need to start the tcp_terminal and a NIC driver (nic_drv). On PC hardware, the NIC driver will further need the PCI driver (pci_drv). For an example of integrating TCP terminal into a Genode scenario, please refer to the tcp_terminal.run script proved at gems/run/.

GDB monitor is built upon the libc and a few custom libc plugins, each coming in the form of a separate shared library. Please make sure to integrate the shared C library (libc.lib.so) along with the dynamic linker (ld.lib.so) in your boot image. For using the TCP terminal, lwip.lib.so (TCP/IP stack) is needed as well.

Examples

The following examples are using the Fiasco.OC kernel on the x86_32 platform. This is the only platform where all debugging features are fully supported at the time of this writing. Please refer to the Section Current limitations and technical remarks for more platform-specific information.

Working with shared libraries

To get acquainted with GDB monitor, the ports repository comes with two example run scripts. The gdb_monitor_interactive.run script executes a simple test program via GDB monitor. The test program can be found at ports/src/test/gdb_monitor/. When looking behind the scenes, the simple program is not simple at all. It uses shared libraries (the libc) plugin and executes multiple threads. So it is a nice testbed for exercising these aspects. The run script can be invoked right from the build directory via make run/gdb_monitor_interactive. It will execute the scenario on Qemu and use the UART to communicate with GDB. Qemu is instructed to redirect the second serial interface to a local socket (using the port 5555):

 -serial chardev:uart
 -chardev socket,id=uart,port=5555,host=localhost,server,nowait,ipv4

The used TCP port is then specified to the GDB as remote target:

 target remote localhost:5555

The gdb_monitor_interactive.run script performs these steps for you and spawns GDB in a new terminal window. From within your build directory, execute the run script via:

 make run/gdb_monitor_interactive

On startup, GDB monitor halts the target program and waits for GDB to connect. Once connected, GDB will greet you with a prompt like this:

 Breakpoint 2, main () at /.../ports/src/test/gdb_monitor/main.cc:67
 67 {
 (gdb)

At this point, GDB has acquired symbol information from the loaded shared libraries and stopped the program at the beginning of its main() function. Now let's set a breakpoint to the puts function, which is called by the test program, by using the breakpoint command:

 (gdb) b puts
 Breakpoint 3 at 0x106e120: file /.../libc-8.2.0/libc/stdio/puts.c, line 53.

After continuing the execution via c (continue), you will see that the breakpoint will trigger with a message like this:

 (gdb) c
 Continuing.
 Breakpoint 3, puts (s=0x10039c0 "in func2()\n")
     at /.../libc-8.2.0/libc/stdio/puts.c:53
 53 {

The following example applies to an older version of Genode and must be revised for recent versions.

Now, you can inspect the source code of the function via the list command, inspect the function arguments (info args command) or start stepping into the function using the next command. For a test of printing a large backtrace including several functions located in different shared libraries, set another breakpoint at the stdout_write function. This function is used by the libc_log backend and provided by the dynamic linker. The backtrace will reveal all the intermediate steps throughout the libc when puts is called.

 (gdb) b stdout_write
 Breakpoint 4 at 0x59d10: file /.../log_console.cc, line 108.
 (gdb) c
 Continuing.
 Breakpoint 4, stdout_write (s=0x1015860 "in func2()\n")
     at /.../genode/base/src/base/console/log_console.cc:108
 108  {
 (gdb) bt
 #0  stdout_write (s=0x1015860 "in func2()\n")
     at /.../genode/base/src/base/console/log_console.cc:108
 #1  0x010c3701 in (anonymous namespace)::Plugin::write (this=0x10c4378, 
     fd=0x10c0fa8, buf=0x6590, count=11)
     at /.../genode/libports/src/lib/libc_log/plugin.cc:93
 #2  0x010937bf in _write (libc_fd=1, buf=0x6590, count=11)
     at /.../genode/libports/src/lib/libc/file_operations.cc:406
 #3  0x0106ec4f in __swrite (cookie=0x10a1048, buf=0x6590 "in func2()\n", n=11)
     at /.../genode/libports/contrib/libc-8.2.0/libc/stdio/stdio.c:71
 #4  0x0106ef5a in _swrite (fp=0x10a1048, buf=0x6590 "in func2()\n", n=11)
     at /.../genode/libports/contrib/libc-8.2.0/libc/stdio/stdio.c:133
 #5  0x01067598 in __sflush (fp=0x10a1048)
     at /.../genode/libports/contrib/libc-8.2.0/libc/stdio/fflush.c:123
 #6  0x010675f8 in __fflush (fp=0x10a1048)
     at /.../genode/libports/contrib/libc-8.2.0/libc/stdio/fflush.c:96
 #7  0x0106a223 in __sfvwrite (fp=0x10a1048, uio=0x1015a44)
     at /.../genode/libports/contrib/libc-8.2.0/libc/stdio/fvwrite.c:194
 #8  0x0106e1ad in puts (s=0x10039c0 "in func2()\n")
     at /.../genode/libports/contrib/libc-8.2.0/libc/stdio/puts.c:68
 #9  0x0100041d in func2 ()
     at /.../genode/ports/src/test/gdb_monitor/main.cc:51
 #10 0x01000444 in func1 ()
     at /.../genode/ports/src/test/gdb_monitor/main.cc:60
 #11 0x01000496 in main ()
     at /.../genode/ports/src/test/gdb_monitor/main.cc:70

To inspect a specific call frame, switch to a particular frame by using the number printed in the backtrace. For example, to print the local variables of the call frame 5:

 (gdb) f 5
 #5  0x01067598 in __sflush (fp=0x10a1048)
     at /.../libc-8.2.0/libc/stdio/fflush.c:123
 123   t = _swrite(fp, (char *)p, n);
 (gdb) info locals
 p = 0x6590 "in func2()\n"
 n = 11
 t = <optimized out>

The test program consists of multiple threads. To see which threads there are, use the info thread command. To switch another thread, use the thread command with the thread number as argument. Please make sure to issue the info threads command prior using the thread command for the first time.

Inspecting a Genode service

As a reference for debugging a native Genode service, the debug_nitpicker.run script provides a ready-to-use scenario. You can invoke it via make run/debug_nitpicker.

Nitpicker is a statically linked program. Hence, no special precautions are needed to obtain its symbol information. As a stress test for GDB monitor, let us monitor the user input events supplied to the Nitpicker GUI server.

First, we need to set pagination to off. Otherwise, we will be repeatedly prompted by GDB after each page scrolled. We will then define a breakpoint for the User_state::handle_event function, which is called for each input event received by Nitpicker:

 (gdb) set pagination off
 (gdb) b User_state::handle_event(Input::Event)

For each call of the function, we want to let GDB print the input event, which is passed as function argument named ev. We can use the commands facility to tell GDB what to do each time the breakpoint triggers:

 (gdb) commands
 Type commands for breakpoint(s) 1, one per line.
 End with a line saying just "end".
 >silent
 >print ev
 >c
 >end

Now, let's continue the execution of the program via the continue command. When moving the mouse over the Nitpicker GUI or when pressing/releasing keys, you should see a message with the event information.

Current limitations and technical remarks

Platform support

At the time of this writing the platform support is available on the following base platforms:

Fiasco.OC on x86_32

This is the primary platform fully supported by GDB monitor. To enable user-land debugging support for the Fiasco.OC kernel a kernel patch (base-foc/patches/foc_single_step_x86.patch) is required, which is applied on ./tool/ports/prepare_port foc.

Fiasco.OC on ARM

GDB Monitor works on this platform but it has not received the same amount of testing as the x86_32 version. Please use it with caution and report any bugs you discover. To enable Fiasco.OC to deliver the correct instruction pointer on the occurrence of an exception, a kernel patch (base-foc/patches/fix_exception_ip.patch) is required.

OKL4 on x86_32

Partially supported. Breaking into a running programs using Control-C, working with threads, printing backtraces, and inspecting target memory works. However, breakpoints and single-stepping are not supported. To use GDB monitor on OKL4, please make sure to have applied the kernel patches in the base-okl4/patched directory.

All required patches are applied to the respective kernel by default when issuing ./tool/ports/prepare_port <platform>.

The other base platforms are not yet covered. We will address them according to the demanded by the Genode developer community.

No simulation of read-only memory

The current implementation of GDB monitor hands out only RAM dataspaces to the target. If the target opens a ROM session, the ROM dataspace gets copied into a RAM dataspace. This is needed to enable GDB monitor to patch the code of the target. Normally, the code is provided via read-only ROM dataspace. So patching won't work. The current solution is the creation of a RAM copy.

However, this circumstance may have subtle effects on the target. For example a program that crashed because it tries to write to its own text segment will behave differently when executed within GDB monitor.

CPU register state during system calls

When intercepting the execution of the target while the target performs a system call, the CPU register state as seen by GDB may be incorrect or incomplete. The reason is that GDB monitor has to retrieve the CPU state from the kernel. Some kernels, in particular Fiasco.OC, report that state only when the thread crosses the kernel/user boundary (at the entry and exit of system calls or then the thread enters the kernel via an exception). For a thread that has already entered the kernel at interception time, this condition does not apply. However, when stepping through target code, triggering breakpoints, or intercepting a busy thread, the observed register state is current.

No support for watchpoints

The use of watchpoints is currently not supported. This feature would require special kernel support, which is not provided by most kernels used as base platforms of Genode.

Memory consumption

GDB monitor is known to be somehow lax with regard to consuming memory. Please don't be shy with over-provisioning RAM quota to gdb_monitor.