8 Jan 2021

Device Enumeration on a 1-Wire Bus

Introduction

In my last two posts I discussed using a DS18B20 to obtain the temperature. In digital electronics, most ICs communicate using either the I2C or the SPI protocol. The DS18B20 temperature sensor, however, uses the 1-wire protocol: a standard, though less common, protocol. While reading the datasheet for the DS18B20, one thing that fascinated me about the 1-wire protocol was its algorithm for correctly detecting an arbitrary number of connected devices despite having only one bi-directional data wire, and despite all of these data wires being connected together in one big node.

On Twitter I jokingly lamented the fact that the Linux kernel already has full support for the 1-wire protocol as well as the DS18B20, thus "robbing" me of the opportunity to learn much about either in detail. Therefore, in order to "buy back" some of this opportunity, I decided to write my own code (in userspace) implementing the 1-wire device discovery algorithm. This exercise also forced me to write a "testing" application which mimics the behaviour of an arbitrary number of connected devices.

The project consists of 2 parts:

  • the ROMsearch program which acts as the "master" side of the algorithm, and implements the ROMsearch feature of the 1-wire protocol
  • a tester component which either reads in pre-crafted files or generates random data and acts as a representation of an arbitrary number of connected "devices"

The two programs communicate with each other over a pair of fifos, which act as the "wiring" of the circuit. The code for this project can be found in the ROMsearch repository of my GitHub profile.

Algorithm

I encourage you to read the specification yourself for the exact details. The datasheet I'm using can be found here. There are also other places where the 1-wire device discovery algorithm is described such as here and here.

Every specification sits atop a set of assumptions and facts in order for it to function correctly. Here are the relevant facts and assumptions related to this algorithm:

  • Each device comes from the factory with a unique 64-bit serial number encoded into its ROM.
  • All data is transmitted LSb first.
  • The master initiates all conversations.
  • Multiple devices are permitted to write to the bus at the same time.
  • The data line (DQ) has a weak pullup. As a consequence:
    • if all devices write a 1 at the same time, the master reads 1 on DQ
    • if all devices write a 0 at the same time, the master reads 0 on DQ
    • if 1 or more devices write a 1, and 1 or more devices write a 0 on DQ at the same time, the master reads a 0 on DQ (regardless of the number of devices writing a given value versus the other)
    • if there are no devices trying to write anything to DQ, the master will read a 1
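
The four pull-up rules above boil down to a wired-AND. Here is a minimal sketch of that behaviour in Python (not code from the ROMsearch project; the function name bus_level is made up for illustration). It assumes the usual open-drain arrangement, where a device writing a 0 pulls the line low and a device writing a 1 simply lets go of it:

```python
def bus_level(driven_bits):
    """Return the value the master reads on DQ.

    driven_bits holds the bit each device is writing at this instant.
    Assumption: a device writing 0 pulls the line low, a device writing 1
    releases it, and the weak pull-up holds the line at 1 otherwise.
    """
    return 0 if 0 in driven_bits else 1


print(bus_level([1, 1, 1]))  # every device writes 1
print(bus_level([0, 0]))     # every device writes 0
print(bus_level([1, 0, 1]))  # mixed: a single 0 wins
print(bus_level([]))         # nobody is writing: the pull-up wins
```

A single 0 is enough to pull the line low no matter how many devices are writing a 1, which is exactly what the search algorithm relies on.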

Interestingly enough, this algorithm and the circuitry of the 1-wire bus are designed to expect multiple devices to write to the bus at the same time. Usually much care is taken in the design of a shared bus to make sure that no two devices ever try to write to the bus at the same time, and to detect the collisions that inevitably occur anyway. In the case of a 1-wire system, having multiple devices write to the bus at the same time is a feature, and an integral part of the algorithm's success.

The gist of the algorithm is as follows:

  • the master issues a RESET on the line; this causes all devices to enter their reset state
  • the master sends the ROMsearch command on the line putting all devices into ROMsearch mode; this causes all devices to get ready to send their serial numbers starting with the LSb
  • the master issues a read on the line; all devices put the first bit of their respective serial numbers on the data line simultaneously, and the master records the resulting value
  • the master then issues a second read on the line; all devices put the complement of the first bit of their respective serial numbers on the line, and the master records this value too
  • the master then writes either a 0 or a 1 on the data line; any device whose current actual bit matches the value put on the line by the master will continue being part of the search; any device whose current actual bit doesn't match the value put on the line by the master will stop responding to the master until a RESET is seen
  • all of the devices that are still part of the search get ready to put the next bit, and its complement, on the data line as the master issues two more reads
  • this pattern (2 reads, 1 write) continues until an entire serial number is read, then the whole sequence starts again in an effort to find the next device's serial number (if any are left to discover)

The algorithm assumes that all devices are able to stay in sync. That is, for any given pass through the search algorithm (i.e. the set of 2 reads and a write from the master) all slave devices are working on the same bit position: in the first pass they are all sending their 0th bit, on the 11th pass they're all sending their 10th bit, and so on. The algorithm wouldn't work if one device were sending its 12th bit while another was sending its 5th.

Due to the rules of boolean logic, the "bit and complement" nature of the algorithm, and the fact that all the writes from the devices are logically ANDed together (due to the pull-up circuitry of the 1-wire bus, as described above) the master can interpret each pair of values it reads on the line as follows:

    • 01: all devices still part of the search have a 0 in the current bit position
    • 10: all devices still part of the search have a 1 in the current bit position
    • 00: some of the devices have a 1, others have a 0 in the current bit position
    • 11: there are no devices responding as part of the search (we've reached the end of the serial number)
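
To see where these four cases come from, here is a small Python sketch (purely illustrative; read_pair and MEANING are names I made up, not part of the project). It ANDs together the current bits of the devices still in the search, then ANDs together their complements:

```python
# What each pair of reads tells the master (taken from the list above).
MEANING = {
    (0, 1): "all remaining devices have a 0 here",
    (1, 0): "all remaining devices have a 1 here",
    (0, 0): "fork: some devices have a 0, others have a 1",
    (1, 1): "no devices are responding",
}


def read_pair(bits):
    """Return (first read, second read) for the devices' current bits.

    The first read is the wired-AND of the bits themselves; the second
    is the wired-AND of their complements. An empty list models a bus
    on which nobody is responding, so the pull-up gives 1 both times.
    """
    first = int(all(b == 1 for b in bits))
    second = int(all(b == 0 for b in bits))
    return first, second


for bits in ([0, 0, 0], [1, 1], [0, 1, 1], []):
    pair = read_pair(bits)
    print(bits, pair, MEANING[pair])
```

The only way to read a 1 on both reads is for nothing to be pulling the line low either time, which cannot happen while any device is still sending a bit and then its complement.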

Example

It's time to look at a concrete example. For this example we're going to limit the length of the serial number to 8 bits. In this example there are 4 devices on the bus whose serial numbers are:


Starting with a reset and ROM-search command, all devices are in ROM-search mode and are ready to send the 1st bit of their respective serial numbers. The master performs a read on the data line and all devices send their 1st bit at the same time. This results in a 0 being read by the master.


The master performs a second read and all devices write the complement of their 1st bit at the same time. The master records a 1:


The master has no idea how many devices are connected on the bus, but it does know that this bit of all devices is a 0, because it did not encounter a fork. The master adds a 0 to the serial number it is currently creating:


The master writes a 0 on the data line, and all devices that have a 0 bit in the current position remain part of the search (which in this case is all of them).

Having written a 0 on the data line causes all the devices to move on to their 2nd bit. The same thing happens for the 2nd bit. The master performs two reads on the data line which causes all devices still part of the search to write their 2nd bit and its complement (respectively) on the bus in response to each read pulse. Since all devices, coincidentally, have a 0 in their 2nd bit as well, the master sees another 01, adds a 0 to the serial number it is building, writes a 0 on the data line, and all devices continue to be part of the search:


Now we move to the 3rd bit. The first read yields the following bits. When they're all written to the bus at the same time the master reads a 0:


When all devices write the complement of the 3rd bit of their serial number on the bus the master reads another 0:


Now that the master has seen a 00 it knows that at this bit position (i.e. the 3rd bit position) there is a fork: there is at least one device whose serial number so far is "000" and at least one device whose serial number so far is "100" (partial serial numbers are written here with the 1st bit on the right). The master needs to continue down one path, while remembering to come back and finish the other path. Which path it chooses to finish now is irrelevant.

Let's arbitrarily decide to finish the 0 path now. In doing so, we need to save the partial serial number "100" for later, add a 0 to the current serial number we're building, and write a 0 to the data line.


Writing a 0 to the data line at this point eliminates devices #1 and #4 from the search. With these two devices eliminated, the next set of reads yields:


A 01 lets the master know that all remaining devices have a 0 at the current position. Therefore it writes a 0 to the bus, which keeps both remaining devices in the search, and adds a 0 to the serial number it is building:


This process continues with the 5th bit of the remaining devices. Seeing a 10 in response to a set of reads lets the master know the next bit of all remaining devices is a 1:


The 6th bit:


At the 6th bit position of all the devices that remain along the 0 fork from the last time we encountered a 00, we have found another fork. If we arbitrarily decide to take the 0 fork again at this junction then, in addition to remembering to finish the "100" fork we found earlier, we also have to remember to finish the "110000" fork that we've just found now.


Taking the 0 fork again removes device #3 from the list:

Performing 2 reads yields:


Which we know means that all remaining devices have a 1 in the current position. Writing a 1 to the bus keeps all the current devices and moves us to the next bit:


The next pair of reads, for the 8th bit, yields a 01, so the master knows that all remaining devices have a 0 in this position. It records the 0 and writes a 0 to the bus. Since device #2 has reached the end of its serial number, it takes itself out of the search in response to the write the master has performed (a write which would normally cause the device to get ready to send its next bit).

Now, with no devices left in the search, and due to the pull-up on the bus, when the master performs its next 2 reads it gets:


When the master receives a 11 it knows that it has found the unique serial number of one of the devices attached to its bus: 01010000. We know this as the serial number of device #2.

Remember: the master is blind! We can see that there are 4 devices in total and that their serial numbers all consist of 8 bits, but the master doesn't know either of these things. However, the master is not dumb. It does know that, of all the serial numbers present, none has a 1 bit in either the 1st or 2nd position. The master doesn't have to perform an exhaustive search of the entire bit-space to find all the devices; it only needs to follow up on the forks it encounters each time through.

During the last pass the master encountered a fork; in fact it encountered 2 forks. Each time it found a fork it saved the state of the serial number up until that point, plus the "other path". At the 3rd bit position a fork was found, which indicated to the master that there were two valid paths at this point: "000" and "100". We decided, arbitrarily, to follow the "000" path, so we saved the "100" path for later. The same thing happened at bit position 6. When examining the 6th bit position of the devices that were part of the search up until that point we found another 2 paths: "010000" and "110000". Now we need to go back to find what exists along the "100" and "110000" paths.

At this point the master sends a RESET to all devices and issues the ROM-search command. This puts all devices back into the search (including any we've already found) and gets all devices ready to send their 1st bits again.

However, the difference this time is that the master is not starting from scratch; it already knows that something exists along the "100" and "110000" paths and that no "1"s can possibly exist in bit positions 1 and 2.

We arbitrarily decide to investigate the "100" partial serial number. All devices are ready to send their first bits, so the master performs 2 reads. There's no need to process the result since we already know how it turns out, so the master can simply write a 0 to the bus (which is the value of the 1st bit in the partial serial number "100" that we are currently following). All devices that have a 0 in the 1st position get ready to send the next bit. The master performs 2 reads, ignores this result as well (since we're at the 2nd position of the known partial serial ID "100"), and writes a 0 to the bus again. A third set of 2 reads is performed and the master ignores this value too.

This time, however, a different path is taken than before. Previously at the 3rd bit position we arbitrarily decided to take the "0" path and save the "1" path for later. We are now at "later". This time we're following up on the known partial serial number of "100", so this time we're going to write a 1 for the 3rd bit of the search pattern. When the master writes a 1 on the data bus, devices #1 and #4 will continue on as part of the search, and devices #2 and #3 will drop out.


The number of forks, so far, along this path is zero (although there is a fork to come).

Keeping track of all the forks that are found, and the partial serial numbers that go along with them, will lead to finding all the serial numbers of all the devices attached to the bus.

Notice that when the RESET and ROMsearch were issued most recently, device #2 became part of the search again. But we will never "find" device #2 again because we've already found it by arbitrarily following all the "0" paths the first time, and all subsequent passes through the algorithm will start with already-known partial serial IDs. Given that all serial numbers are assumed to be unique and of the same length, we won't ever "find" an already-found device again.

As a result, all devices are only found once, and alleys that don't lead to valid serial numbers are never explored.
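
The whole walk-through can be condensed into a short simulation. What follows is a sketch in Python under a few assumptions, not the code from the ROMsearch repository: SimulatedBus and search_all are made-up names, the devices are modelled as plain integers, and a device stops responding once it has sent its last bit (so the master sees 11 at the end of a serial number, as in the example above). Pending forks are kept on a simple stack of partial serial numbers stored LSb first.

```python
class SimulatedBus:
    """Hypothetical stand-in for all the devices wired to DQ."""

    def __init__(self, serials, nbits):
        self.serials = list(serials)
        self.nbits = nbits
        self.reset()

    def reset(self):
        # RESET + ROM-search: every device rejoins, starting at the 1st bit.
        self.active = list(self.serials)
        self.pos = 0

    def _bit(self, serial):
        return (serial >> self.pos) & 1

    def read_pair(self):
        # Devices that have sent their last bit no longer drive the bus.
        live = self.active if self.pos < self.nbits else []
        bits = [self._bit(s) for s in live]
        # Wired-AND of the bits, then wired-AND of their complements.
        return int(all(bits)), int(not any(bits))

    def write(self, bit):
        # Devices whose current bit differs drop out until the next reset.
        self.active = [s for s in self.active if self._bit(s) == bit]
        self.pos += 1


def search_all(bus):
    """Find every serial number by following up on each fork exactly once."""
    found = []
    pending = [[]]  # partial serial numbers (LSb first) still to follow
    while pending:
        prefix = pending.pop()
        bus.reset()
        bits = []
        while True:
            pair = bus.read_pair()
            if pair == (1, 1):          # nobody left: end of serial number
                break
            if len(bits) < len(prefix):
                choice = prefix[len(bits)]   # known path: ignore the reads
            elif pair == (0, 1):
                choice = 0
            elif pair == (1, 0):
                choice = 1
            else:                       # 00: fork, save the 1 path for later
                pending.append(bits + [1])
                choice = 0
            bus.write(choice)
            bits.append(choice)
        if bits:
            found.append(sum(b << i for i, b in enumerate(bits)))
    return found


devices = [60, 188, 7, 61, 197]
print(sorted(search_all(SimulatedBus(devices, 8))))  # → [7, 60, 61, 188, 197]
```

Each trip around the outer loop corresponds to one RESET, and each RESET ends with exactly one newly found device, so the number of passes equals the number of devices. The order in which devices turn up depends only on which fork is followed first, which is arbitrary.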

Implementation Details

As mentioned above, this algorithm is implemented as two programs which communicate over a pair of named fifos.

The testing program needs to be started first. It will create a pair of fifos. The ROMsearch program can then be started. It will use these fifos to communicate with the testing program. As the ROMsearch program performs reads and writes to the fifos the testing program will send back appropriate responses based on its list of devices.

Alternatively, you could interact with the testing program "by hand" by writing commands into the "toTesterFifoFd" fifo (echo -n R > toTesterFifoFd) and seeing what comes out the "fmTesterFifoFd" fifo (cat fmTesterFifoFd). In this way you could perform an interactive ROMsearch if you wished. All communication done through the fifos is done using ASCII characters.

Note that in the following example run, the replies coming back from the testing program in response to the reads being performed by the master (cat fmTesterFifoFd) are interspersed with the output of the testing program itself which is set to verbose mode. So you have to search a little to find the two zeros that the tester is sending back to the master when the master issues its two 'r' commands in the following sequence.

$ ./tester &
[1] 9808
devices: 7
bitsize: 8
devices_pG[00] = 170 (0b10101010)  current bit pos:00 → 0
devices_pG[01] = 040 (0b00101000)  current bit pos:00 → 0
devices_pG[02] = 115 (0b01110011)  current bit pos:00 → 1
devices_pG[03] = 085 (0b01010101)  current bit pos:00 → 1
devices_pG[04] = 162 (0b10100010)  current bit pos:00 → 0
devices_pG[05] = 224 (0b11100000)  current bit pos:00 → 0
devices_pG[06] = 024 (0b00011000)  current bit pos:00 → 0
$ ls
Makefile   ROMsearch.o  common.o  data02  data04  fmTesterFifoFd  tester.o
ROMsearch  baddata      data01    data03  data05  tester          toTesterFifoFd
$ cat fmTesterFifoFd &
[2] 9862
$ echo -n V > toTesterFifoFd 

devices_pG[00] = 170 (0b10101010)  current bit pos:00 → 0
devices_pG[01] = 040 (0b00101000)  current bit pos:00 → 0
devices_pG[02] = 115 (0b01110011)  current bit pos:00 → 1
devices_pG[03] = 085 (0b01010101)  current bit pos:00 → 1
devices_pG[04] = 162 (0b10100010)  current bit pos:00 → 0
devices_pG[05] = 224 (0b11100000)  current bit pos:00 → 0
devices_pG[06] = 024 (0b00011000)  current bit pos:00 → 0
$ echo -n R > toTesterFifoFd 
fifo: 0x52 (R) cnt:1 bitPos:0

devices_pG[00] = 170 (0b10101010)  current bit pos:00 → 0
devices_pG[01] = 040 (0b00101000)  current bit pos:00 → 0
devices_pG[02] = 115 (0b01110011)  current bit pos:00 → 1
devices_pG[03] = 085 (0b01010101)  current bit pos:00 → 1
devices_pG[04] = 162 (0b10100010)  current bit pos:00 → 0
devices_pG[05] = 224 (0b11100000)  current bit pos:00 → 0
devices_pG[06] = 024 (0b00011000)  current bit pos:00 → 0
$ echo -n S > toTesterFifoFd 
fifo: 0x53 (S) cnt:1 bitPos:0

devices_pG[00] = 170 (0b10101010)  current bit pos:00 → 0
devices_pG[01] = 040 (0b00101000)  current bit pos:00 → 0
devices_pG[02] = 115 (0b01110011)  current bit pos:00 → 1
devices_pG[03] = 085 (0b01010101)  current bit pos:00 → 1
devices_pG[04] = 162 (0b10100010)  current bit pos:00 → 0
devices_pG[05] = 224 (0b11100000)  current bit pos:00 → 0
devices_pG[06] = 024 (0b00011000)  current bit pos:00 → 0
$ echo -n r > toTesterFifoFd 
fifo: 0x72 (r) cnt:1 bitPos:0
 readState:0 bitPos:0
  in search [00]  bit:0
  in search [01]  bit:0
  in search [02]  bit:1
  in search [03]  bit:1
  in search [04]  bit:0
  in search [05]  bit:0
  in search [06]  bit:0
  <= 0

devices_pG[00] = 170 (0b10101010)  current bit pos:00 → 0
devices_pG[01] = 040 (0b00101000)  current bit pos:00 → 0
devices_pG[02] = 115 (0b01110011)  current bit pos:00 → 1
devices_pG[03] = 085 (0b01010101)  current bit pos:00 → 1
devices_pG[04] = 162 (0b10100010)  current bit pos:00 → 0
devices_pG[05] = 224 (0b11100000)  current bit pos:00 → 0
0devices_pG[06] = 024 (0b00011000)  current bit pos:00 → 0
$ echo -n r > toTesterFifoFd 
fifo: 0x72 (r) cnt:1 bitPos:0
 readState:1 bitPos:0
  in search [00] ~bit:1
  in search [01] ~bit:1
  in search [02] ~bit:0
  in search [03] ~bit:0
  in search [04] ~bit:1
  in search [05] ~bit:1
  in search [06] ~bit:1
  <= 0

devices_pG[00] = 170 (0b10101010)  current bit pos:00 → 0
devices_pG[01] = 040 (0b00101000)  current bit pos:00 → 0
devices_pG[02] = 115 (0b01110011)  current bit pos:00 → 1
devices_pG[03] = 085 (0b01010101)  current bit pos:00 → 1
0devices_pG[04] = 162 (0b10100010)  current bit pos:00 → 0
devices_pG[05] = 224 (0b11100000)  current bit pos:00 → 0
devices_pG[06] = 024 (0b00011000)  current bit pos:00 → 0
$ echo -n 0 > toTesterFifoFd 
fifo: 0x30 (0) cnt:1 bitPos:0
 readState:2 bitPos:0
   removing: 02
   removing: 03

devices_pG[00] = 170 (0b10101010)  current bit pos:01 → 1
devices_pG[01] = 040 (0b00101000)  current bit pos:01 → 0
devices_pG[04] = 162 (0b10100010)  current bit pos:01 → 1
devices_pG[05] = 224 (0b11100000)  current bit pos:01 → 0
devices_pG[06] = 024 (0b00011000)  current bit pos:01 → 0

The testing program takes an optional cmdline argument. This argument names a file whose contents are the data to be used. This data represents a list of unique serial numbers for the search program to find. The first line of the data file specifies the number of devices (N), the second line specifies the bit size of the devices, and each of the subsequent N lines starts with a unique positive integer representing the serial ID of one of the devices connected to the 1-wire bus. If no file is specified, the test program will simply generate random data (time-seeded) to use.
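
As a rough illustration of that layout, here is a Python sketch that reads such data from a string. It is only my reading of the description above and not the tester's actual parser: the function name parse_device_data, the error handling, and the sample text (built from the device list that data01 prints further down) are all assumptions.

```python
def parse_device_data(text):
    """Parse 'N, bit size, then N serial IDs' into (bitsize, serials).

    Assumed layout: line 1 is the device count, line 2 is the bit size,
    and each of the next N lines starts with one positive integer
    (anything after that integer on the line is ignored).
    """
    lines = text.splitlines()
    count = int(lines[0].split()[0])
    bitsize = int(lines[1].split()[0])
    serials = [int(line.split()[0]) for line in lines[2:2 + count]]
    if len(serials) != count:
        raise ValueError("fewer serial IDs than the declared device count")
    if len(set(serials)) != count:
        raise ValueError("serial IDs must be unique")
    if any(s <= 0 or s >> bitsize for s in serials):
        raise ValueError("serial ID is not positive or exceeds the bit size")
    return bitsize, serials


# Hypothetical file contents, reconstructed from the data01 run shown below.
sample = "3\n8\n210\n205\n164\n"
print(parse_device_data(sample))  # → (8, [210, 205, 164])
```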

When generating random data, you can tell the testing program the bit size to use via the -b/--bitsize cmdline option, and you can specify the maximum number of devices to generate using the -m/--max-devices cmdline option. Note that specifying the maximum number of devices doesn't mean that this many devices will be created; a random number of devices will be created, up to the maximum given.

Note: the program has provisions to make sure that the max-devices and bitsize options make sense. You can't set the bitsize to 4, then ask for a maximum of 1000 devices.

$ ./tester -b4 -m1000
the given bit size (4) is not high enough to randomly
generate the requested number of max entries (1000)

please either increase the bit size
or reduce the number of max entries to 7 or less

The help for the testing program is as follows:

$ ./tester -h
ROMsearch 1.0.0
usage: ./tester [<options>] [<testfile>]
  where:
    <testfile>              a file from which to get serial ID data
                            (otherwise the data is generated randomly)
    <options>
      -h|--help             print information about this program and exit successfully
      -b|--bitsize <b>      set the number of bits in the serial ID to <b> (MIN:2 default:8 MAX:64)
      -m|--max-devices <m>  set the maximum number of devices (MIN:1 default:8)

Note that if you specify a testfile to use and try to specify a max-devices and/or bitsize, the max-devices and/or bitsize options will be ignored:

$ ./tester -b8 -m20 data01
WARNING: specifying the bit size and/or max entries on the cmdline
         is not compatible with using pre-generated data from a file
these cmdline options will be ignored in favour of the values from the datafile
devices: 3
bitsize: 8
devices_pG[00] = 210 (0b11010010)  current bit pos:00 → 0
devices_pG[01] = 205 (0b11001101)  current bit pos:00 → 1
devices_pG[02] = 164 (0b10100100)  current bit pos:00 → 0

The testing program understands the following commands on the "toTesterFifoFd" fifo:

  • r
    • The master is performing a read on the bus. The tester replies with a single value: the logical AND of the current bits of the serial numbers of all devices that are still part of the search. If this is the second such 'r' command for the current bit position, the tester instead ANDs together the complements of those bits.
  • R
    • RESET. The tester resets the current bit position back to the beginning (i.e. bit 0, LSb first) and resets the state back to expecting the ROM command (i.e. ROMsearch).
  • S
    • ROMsearch. After coming out of RESET, the devices need to know which function to perform. Currently the only supported function is ROMsearch, but I include it in order to be faithful to the 1-wire protocol.
  • 0 or 1
    • The master is choosing which devices are to continue being part of the search, and which are to go idle until the next RESET. Any device whose serial ID contains the value sent by the master ('0' or '1') at the current bit position remains in the search.
  • Q
    • I've added this command so the master can tell the testing program that it is done, so the tester can terminate gracefully.
  • V
    • Another command I added to toggle the verbosity of the tester.
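
To make the command set concrete, here is a small Python model of how a tester could react to each character. It is a sketch of the behaviour described above, not the tester's source: the class name TesterModel and its attributes are invented, and details such as ignoring 'r', '0' and '1' before an 'S' has been seen are my own assumptions.

```python
class TesterModel:
    """Hypothetical model of the tester's reaction to each ASCII command."""

    def __init__(self, serials, bitsize):
        self.serials = list(serials)
        self.bitsize = bitsize
        self.verbose = False
        self.running = True
        self.handle("R")

    def _driving(self):
        # Devices past their last bit no longer respond.
        return self.active if self.bit_pos < self.bitsize else []

    def handle(self, cmd):
        """Process one command; return the reply character, or None."""
        if cmd == "R":                      # RESET: everyone back at bit 0
            self.active = list(self.serials)
            self.bit_pos = 0
            self.read_state = 0
            self.searching = False
        elif cmd == "S":                    # ROMsearch
            self.searching = True
        elif cmd == "r" and self.searching:
            # First read ANDs the bits, second read ANDs their complements.
            want = 1 if self.read_state == 0 else 0
            bits = [(s >> self.bit_pos) & 1 for s in self._driving()]
            self.read_state = min(self.read_state + 1, 2)
            return "1" if all(b == want for b in bits) else "0"
        elif cmd in ("0", "1") and self.searching:
            keep = int(cmd)                 # devices with this bit stay in
            self.active = [s for s in self.active
                           if ((s >> self.bit_pos) & 1) == keep]
            self.bit_pos += 1
            self.read_state = 0
        elif cmd == "Q":                    # master is done
            self.running = False
        elif cmd == "V":                    # toggle verbosity
            self.verbose = not self.verbose
        return None


# Same devices and command sequence as the interactive session above.
tester = TesterModel([170, 40, 115, 85, 162, 224, 24], 8)
for cmd in "RSrr0":
    print(cmd, "->", tester.handle(cmd))
print(tester.active)
```

Fed the same commands as the interactive session shown earlier, the model answers both 'r' commands with a 0, and the '0' write removes devices 02 and 03 (115 and 85), matching the tester's verbose output.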

The ROMsearch program takes no arguments; it simply looks for the "toTesterFifoFd" and "fmTesterFifoFd" fifos and begins running through the ROMsearch algorithm to find all the devices represented in the testing program. As it finds devices, it prints their serial IDs to stdout.

$ ./tester &
[1] 11651
devices: 5
bitsize: 8
devices_pG[00] = 060 (0b00111100)  current bit pos:00 → 0
devices_pG[01] = 188 (0b10111100)  current bit pos:00 → 0
devices_pG[02] = 007 (0b00000111)  current bit pos:00 → 1
devices_pG[03] = 061 (0b00111101)  current bit pos:00 → 1
devices_pG[04] = 197 (0b11000101)  current bit pos:00 → 1
$ ./ROMsearch 
00011110...060
01100010...197
01011110...188
00000011...007
00011110...061
[1]+  Done                    ./tester
