16 Sept 2019

Board Bring-Up: An Introduction to gdb, JTAG, and OpenOCD

Lately I've been working on getting a recent-ish U-Boot (2018.07) and Linux kernel (5.0.19) running on an SoC that was released back in 2008: the NXP LPC3240 which is based on the ARM926EJ-S processor (NOTE: the ARM926EJ-S was released in 2001).

Although I had U-Boot working well enough to load and boot Linux, the moment the Linux kernel started printing its boot progress, my console was filled with the sort of garbage that tells an embedded developer they've got the wrong baud rate. Double-checking, and even triple-checking, of the baud rate values, however, showed that every place where it was configured, it had been correctly set to 115200 8N1.

Having a working console is the basis from which the rest of a software developer's board bring-up activities take place! If you can compile a kernel, load it on the board, and get it to print anything, legibly, to the console, then you're already in a really good position. But if there's no working connection via the console, it means more low-level work is needed.

Going down the hierarchy (from easier to harder), if the console isn't working, then you'll need to see if JTAG is a possibility. If a JTAG isn't available, then you'll need to look for an LED to blink. Blinking an LED to debug one's work during board bring-up isn't uncommon, but it can be a lot more painful. With nothing but (perhaps) a single LED, it can be hard (though strictly not impossible) to communicate something as simple as: "the value at 0x4000 4064 is 0x0008 097e, and I've reached <this> point in the code". Thankfully for me, this particular board has a working JTAG, and there is support for this SoC in OpenOCD.

JTAG is a very large specification and has a lot of use-cases. For my purposes, JTAG consists of:
  • extra logic that is added to a chip which implements a very specific state machine
  • a bunch of extra pins (at least 4, but some designs add more) with which to interface to this internal state machine from outside the chip
  • a set of commands (in the state machine) that can be executed by toggling bits on the external pin interface
These commands let you do things such as push individual bits into, or get individual bits out of, a particular device's JTAG scan chain. Depending on how the scan chain is implemented, this could translate to activities such as the ability to read/write registers, and read/write arbitrary values in the memory space (which includes things like peripherals, configuration, etc).

Most development hosts don't have random GPIO lines available for interfacing, therefore a dongle of some sort is needed to go between the desktop machine and the target board's multi-wire JTAG interface. In days past, these dongles would be connected to the development host via serial or parallel interfaces; nowadays they're mostly USB.

Armed with a JTAG dongle, in theory it would be possible to start interacting with the target board directly via JTAG commands. However, this could be very tedious as the JTAG commands are very primitive (i.e. having to follow the state machine precisely, and work 1 bit at a time). One of the more common arrangements is to use gdb, which permits the user to perform higher-level actions (i.e. set a breakpoint, read a given 32-bit memory address, list the register contents, etc) and let the software deal with the details. Note, however, gdb itself does not know how to "speak" JTAG nor does it know how to interact with a JTAG dongle. gdb does, however, speak its own command language called the remote gdbserver protocol. It is OpenOCD which acts as the interpreter between the remote gdbserver protocol on the one hand (e.g. over a network port), and JTAG commands for the target on the other (e.g. over USB to the dongle) marshalling all the data back and forth between the two.

With the target board powered off, plug the JTAG dongle's pins into the board's JTAG connector; connect the development host to the JTAG dongle via USB.

Power on the target board.

Run OpenOCD on the development host. In my specific case the command I invoke is:
$ openocd -f interface/ftdi/olimex-arm-usb-ocd-h.cfg -f board/phytec_lpc3250.cfg
It is important to note that openocd runs as a daemon, and as such, once invoked, does not terminate until explicitly killed. In particular, this command is run in its own terminal, and simply left running until my debugging session is done. All other work that will be done, needs to be performed in other terminals. Perhaps you're thinking: "I'll just run it in the background using an ampersand". That would work, however: as it runs and interacts with gdb and the board, openocd will print out useful information to the terminal. Therefore giving it its own terminal and letting it run independently while keeping it visible is often quite useful. It's always someplace visible on my desktop while debugging.

OpenOCD needs to know what dongle I'm using (it supports a number of JTAG dongles) and it needs to know the board or SoC to which it is connecting (it has support for many SoCs and boards). Implicit in the choice of dongle is the communication protocol (here USB) and dongle characteristics (properties, product ID, etc).  By specifying a target board or SoC, you're letting OpenOCD know things such as how to initialize its connection, the register size, what speed to use, details about how the device needs to be reset, and so on.

More recently, some development boards come with built-in debug circuitry, including a USB connector, already designed into the target board itself. In these cases the JTAG dongle isn't needed. One simply needs to connect the target board directly to the development host via a single USB cable, and start up OpenOCD (and gdb) giving only one piece of information: the board's name. All other details are implied.

Running on a GNU/Linux system, gdb works best with ELF executables. gdb can be coerced into working with raw binaries, but when presented with an ELF file, it is provided with a lot more of the data it needs to do its job. But neither the Linux kernel nor U-Boot are ELF binaries. As part of their default build processes, however, both the Linux kernel and U-Boot build systems generate ELF output in addition to the parts that are actually run. A U-Boot build will produce, for example, u-boot.bin, which is the actual U-Boot binary that is stored wherever the bootloader needs to be placed. But in addition to this, a file called u-boot is produced which is its ELF counterpart. Similarly for the Linux kernel, the kernel itself might be found in arch/arm/Image, but its ELF counterpart is vmlinux.

If you want to debug a Linux kernel via JTAG using gdb, simply invoke:
$ arm-oe-linux-gnueabi-gdb vmlinux

Since the target is an ARM board and my host is an x86 board, I need to invoke the cross-gdb program, not the native one (otherwise it won't be able to make sense of the binary instructions). Since I do so much of my work using OpenEmbedded as a basis, when working independently on U-Boot and the kernel, I simply have OpenEmbedded generate an SDK targeting this particular board, and use it for all my work. When invoking this cross-debugger, I simply provide it with the path to, and the name of, the ELF file containing the kernel I have compiled.

By default openocd listens on port 6666 for tcl connections, port 4444 for telnet connections, and port 3333 for gdb connections. In order to create the link between gdb and openocd, once gdb is up and running you'll need to link them together by issuing the target remote or target extended-remote command:
(gdb) target extended-remote :3333
Of course if you've told openocd to listen to a different port, you'll need to make the necessary adjustments to the connection.

Congratulations! You're now debugging the Linux kernel on your target board via JTAG using gdb! No serial console required!

In my particular case, although I knew the Linux kernel was doing something, I wasn't sure what exactly was going on since the baud rate via my serial console was messed up. Using this setup I was able to dump the kernel's ring buffer, allowing me to see exactly what the kernel was doing and providing me with valuable debugging information of its boot:
(gdb) x /2000bs __log_buf