Abstract
Watchdog timers are key components in ensuring digital controller devices, such as microcontrollers (MCUs), function in a safe manner. The basic functional safety (FS) standard IEC 61508-2 Annex A Table A.10 recommends several diagnostic techniques and measures to control hardware failures in the program sequences of digital devices. Such techniques include a watchdog with a separate time base with or without a time window and a combination of temporal and logical monitoring of program sequences. While each of these has corresponding maximum claimable diagnostic coverage as discussed in Part 3 of this series, all these techniques employ watchdog timers. For this reason, this article will show how to implement these diagnostic functions using watchdog timers.
Introduction
With the prevalence of microcontrollers (MCUs) as processing units in safety-related systems (SRS) comes the need for diagnostic measures that will ensure safe operation. IEC 61508-2 specifies self-test supported by hardware (one channel) as one of the recommended diagnostic techniques for processing units. This measure uses special hardware that increases speed and extends the scope of the failure detection, for instance, a watchdog timer IC that cyclically monitors the output of a certain bit pattern from the MCU.
IEC 61508-2 also specifies two types of watchdogs: one with a separate time base without a time window, and another with a separate time base and a time window. Watchdogs can also be used in other techniques, such as a combination of temporal and logical monitoring of program sequences. This article therefore provides insights into how these program sequence monitoring measures differ in operation and diagnostic coverage when implemented with ADI’s high performance supervisory circuits with watchdog function.
Watchdog with a Separate Time Base Without a Time Window
Part 2 of IEC 61508 describes simple watchdogs as external timing elements with a separate time base. Such devices allow detection of program sequence failures in a computing device, such as an MCU, within a specified interval. The mechanism has two outcomes. If the program sequence is running smoothly, the MCU issues a signal that resets the watchdog before it reaches its timeout. If the program is not running smoothly, the timeout period elapses and the watchdog issues a reset signal to the MCU.
Figure 1a shows an example of a watchdog implementation with a separate time base but without a time window, using the MAX6814. MCUs usually have an internal watchdog timer (WDT), but it cannot be relied on alone to detect a fault, because it is part of the very microcontroller that may be defective. This is a concern with respect to common cause failures (CCF). To address such CCF concerns, a separate WDT is used to ensure the MCU is placed in reset.1,2
Through a flowchart, Figure 1b illustrates the behavior of the watchdog timer as embedded in the MCU’s program execution. Before the flow starts, it’s important to set the watchdog timeout period or the WDT’s maximum reset interval. When such a period or interval is defined, the WDT will run upon execution of the program. The MCU must be able to send a signal to the MAX6814’s WDI pin before it reaches timeout, as the device will issue a reset signal to the MCU if the timeout period is reached. When the MCU resets, the system shall be placed into a safe state.
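The flow described above can be sketched as a small host-side simulation. This is a behavioral model only, not a driver for the MAX6814: the class `SimpleWatchdog`, the helper `run_main_loop`, and the millisecond values used with them are hypothetical names and numbers chosen for illustration.

```python
class SimpleWatchdog:
    """Behavioral model of a timeout-only external watchdog (illustrative)."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms   # maximum allowed interval between kicks
        self.last_kick_ms = 0          # time of the most recent kick
        self.reset_asserted = False    # latched once the timeout has elapsed

    def tick(self, now_ms):
        """Advance time; assert reset if no kick arrived within the timeout."""
        if now_ms - self.last_kick_ms >= self.timeout_ms:
            self.reset_asserted = True
        return self.reset_asserted

    def kick(self, now_ms):
        """Model the MCU signaling the watchdog input at time now_ms."""
        self.tick(now_ms)              # a kick that arrives too late cannot undo a reset
        if not self.reset_asserted:
            self.last_kick_ms = now_ms


def run_main_loop(wdt, loop_period_ms, iterations, hang_at=None):
    """Kick once per loop pass; optionally stop kicking to emulate a stuck program."""
    now = 0
    for i in range(iterations):
        now += loop_period_ms
        if hang_at is not None and i >= hang_at:
            wdt.tick(now)              # program is stuck: time passes, no kick
        else:
            wdt.kick(now)
        if wdt.reset_asserted:
            return "safe_state"        # MCU was reset; the system enters its safe state
    return "running"
```

With a 1600 ms timeout and a 100 ms loop, `run_main_loop(SimpleWatchdog(1600), 100, 50)` keeps running, while passing `hang_at=5` ends in the safe state once the timeout elapses without a kick.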
Such a WDT’s timeout period will capture program sequence issues such as a program sequence that gets stuck in a loop or an interrupt service routine that does not return in time. For instance, only 5 of the 10 subroutines meant to be run on every loop of the software are executed. However, it will not cover other program sequence issues, such as whether execution of the program took longer or shorter than expected or whether the program sections were executed in the correct sequence. The timing issues can be addressed by the next type of WDT.
Watchdog with a Separate Time Base and a Time Window
Because a time window allows the detection of both excessive delays and premature execution, windowed WDTs require the MCU to respond neither later nor earlier than the WDT’s open (also referred to as valid) window specification. Compared to a simple watchdog, a windowed WDT gives more assurance that all subroutines are executed by the program in a timely manner; otherwise, it places the MCU in reset.3
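A minimal behavioral sketch of the window check follows. It assumes a closed window followed by an open window, both measured from the last valid kick; the class name `WindowedWatchdog` and the timing values are hypothetical and do not come from any device datasheet.

```python
class WindowedWatchdog:
    """Behavioral model of a windowed watchdog (illustrative only)."""

    def __init__(self, closed_ms, open_ms):
        self.closed_ms = closed_ms       # kicks arriving before this are too early
        self.open_ms = open_ms           # width of the valid window after the closed one
        self.window_start_ms = 0         # time of the last valid kick
        self.reset_asserted = False      # latched after any window violation

    def kick(self, now_ms):
        """Classify a kick at time now_ms and update the watchdog state."""
        if self.reset_asserted:
            return "in_reset"
        elapsed = now_ms - self.window_start_ms
        if elapsed < self.closed_ms:
            self.reset_asserted = True   # premature execution detected
            return "too_early"
        if elapsed >= self.closed_ms + self.open_ms:
            self.reset_asserted = True   # excessive delay detected
            return "too_late"
        self.window_start_ms = now_ms    # valid kick restarts the window
        return "ok"
```

With a 10 ms closed window and a 40 ms open window, a kick at 20 ms is accepted, a following kick only 5 ms later is rejected as too early, and a first kick at 60 ms is rejected as too late.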
Figure 2 shows an example implementation of program sequence monitoring using the MAX6753, which comes with a windowed watchdog whose periods are configurable through an external capacitor. Figure 3, on the other hand, shows another implementation using the MAX42500, whose watchdog time settings can be configured through I2C. This reduces the number of external components and makes it possible to increase fault coverage through a packet error checking (PEC) byte, as shown in Figure 4. The PEC byte increases diagnostic coverage against I2C communication-related failures such as bus errors, stuck-bus conditions, timing problems, and improper configuration.
While watchdogs with a separate time base and time window offer higher diagnostic coverage than simple WDTs, they still cannot detect whether the software’s subroutines have been executed in the correct sequence. The next diagnostic technique addresses this gap.
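To illustrate how a PEC byte lets a receiver detect a corrupted transfer, the sketch below assumes the PEC follows the SMBus convention: a CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07) computed over every byte of the transaction, including the address byte. Whether a given device uses exactly this scheme should be confirmed in its datasheet. The device address, register, and value in the usage note are hypothetical.

```python
def crc8_pec(data, poly=0x07, init=0x00):
    """CRC-8 (MSB first, no reflection, no final XOR), as used for SMBus PEC."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def build_write_frame(dev_addr_7bit, register, value):
    """Return address byte (write), register, value, and the trailing PEC byte."""
    addr_byte = (dev_addr_7bit << 1) | 0      # R/W bit = 0 for a write
    payload = bytes([addr_byte, register, value])
    return payload + bytes([crc8_pec(payload)])


def frame_is_valid(frame):
    """A frame that ends with its own PEC has a CRC of zero over the whole frame."""
    return crc8_pec(frame) == 0
```

For example, `build_write_frame(0x30, 0x10, 0x5A)` yields a four-byte frame that passes `frame_is_valid`, while flipping any single bit of that frame makes the check fail, so the receiver can reject the write instead of applying a corrupted configuration.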
Combination of Temporal and Logical Monitoring of Program Sequences
Diagnostic techniques involving the combination of temporal and logical monitoring provide high diagnostic coverage for program sequence failures according to IEC 61508-2. An example implementation of this technique involves a windowed watchdog and a capability to check whether the program sequence has been executed in the correct order. For example, the circuit in Figure 2 can be combined with the sequence in Figure 5, where each of the MCU’s program routines is assigned a unique combination of characters and digits. Each unique combination is placed in an array when its routine is executed. After the last routine, the MCU will kick the watchdog (that is, send it a reset signal) only if all words are correctly set in the array.
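The logical part of this scheme can be sketched as follows. The routine names, the token strings, and the `kick` callback are hypothetical placeholders; in a real system the kick would toggle the watchdog input within the valid window.

```python
EXPECTED_SEQUENCE = ["A1", "B2", "C3"]   # hypothetical unique word per routine


def read_inputs(log):
    log.append("A1")                     # each routine records its own word


def compute_control(log):
    log.append("B2")


def update_outputs(log):
    log.append("C3")


def run_cycle(routines, kick):
    """Run one program cycle and kick the watchdog only if the order was correct."""
    log = []
    for routine in routines:
        routine(log)
    if log == EXPECTED_SEQUENCE:
        kick()                           # sequence verified: refresh the watchdog
        return True
    return False                         # no kick: the watchdog will time out
```

If a routine is skipped or runs out of order, the recorded array no longer matches the expected one, the kick is withheld, and the windowed watchdog resets the MCU when its window expires.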
Challenge/Response Architecture
In some systems, more diagnostic coverage may be required to capture failures of the MCU, meaning that simply sending back a pulse within a time window is not enough. In such cases, it may be beneficial to require the MCU to perform a more complex task, such as a calculation, to show that it is fully operational. This is where the MAX42500’s challenge/response watchdog can come into play. In this watchdog mode, a key-value register in the IC must be read as the starting point of the challenge message. The MCU must use this message to calculate the appropriate response and send it back to the watchdog IC so the watchdog can be kicked within the valid window. This type of challenge/response watchdog operates similarly to a simple windowed one, except that the watchdog is refreshed by updating the key register rather than by a rising edge. This is shown in Figure 6. Notably, for the MAX42500’s watchdog timer, the watchdog input is implemented using I2C while the watchdog output is the reset output pin.
The MAX42500 contains a linear-feedback shift key (LFSK) register with a polynomial of $x^8 + x^6 + x^5 + x^4 + 1$ that shifts all bits upward toward the most significant bit (MSB) and inserts the calculated bit as the new least significant bit (LSB). The MCU must compute the response in the same manner and return it to the register of the MAX42500 through I2C. Notably, this polynomial is primitive and is therefore a maximal length feedback polynomial for 8 bits. This ensures that all nonzero 8-bit values (1 to 255) are generated by the polynomial and that the order of the numbers is pseudo-random.4,5
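The shift-and-feedback step can be sketched as below. The tap mapping is an assumption: the polynomial terms x^8, x^6, x^5, and x^4 are taken to correspond to bits 7, 5, 4, and 3 of the key, which are XORed to form the new LSB. The device datasheet remains the authority on the exact bit positions. `read_key` and `write_key` are hypothetical callbacks standing in for the I2C transactions.

```python
def lfsr_next(key):
    """Advance an 8-bit LFSR by one step.

    Bits shift toward the MSB and the feedback bit becomes the new LSB.
    Assumed taps for x^8 + x^6 + x^5 + x^4 + 1: bits 7, 5, 4, and 3.
    """
    feedback = ((key >> 7) ^ (key >> 5) ^ (key >> 4) ^ (key >> 3)) & 1
    return ((key << 1) & 0xFF) | feedback


def visited_states(seed):
    """Return every state reached from seed before the sequence repeats."""
    states = [seed]
    state = lfsr_next(seed)
    while state != seed:
        states.append(state)
        state = lfsr_next(state)
    return states


def respond_to_challenge(read_key, write_key):
    """Read the challenge, compute the next LFSR value, and write it back."""
    challenge = read_key()
    response = lfsr_next(challenge)
    write_key(response)
    return response
```

Starting from any nonzero seed, `visited_states` returns 255 distinct values covering 1 to 255, which is the maximal-length property described above. An all-zero key maps to itself, which is why zero is excluded from the sequence.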
Such a challenge/response can offer more coverage than the combination of temporal and logical program sequence monitoring as it shows that the MCU can still do actual calculations. This is as opposed to an MCU just implementing decision-making routines such as only checking whether the array of words is correct before issuing a signal to reset the watchdog.
Diagnostic Coverage Claims
The basic functional safety standard assigns a maximum claimable diagnostic coverage to each diagnostic measure recommended per block in an SRS. Table 1 lists the program sequence measures from IEC 61508 that utilize watchdog timers.
| Diagnostic Technique/Measure | Maximum DC Considered Achievable |
| --- | --- |
| Watchdog with a separate time base without a time window | Low |
| Watchdog with a separate time base and time window | Medium |
| Combination of temporal and logical monitoring of program sequences | High |
Furthermore, because some implementations may not be covered in the standard, a claimed diagnostic coverage for them can only be validated through fault insertion testing.
Conclusion
This article enumerates three types of diagnostic measures that use watchdog timers as recommended by IEC 61508-2 to address failures in program sequence. The first type of watchdog, which has a separate time base but no time window, can be implemented using a simple watchdog. This diagnostic measure can only claim low diagnostic coverage. The second type of watchdog, which has both a separate time base and a time window, can be implemented by a windowed watchdog. This measure can claim medium diagnostic coverage. To improve diagnostic coverage to high, one can employ logical monitoring in addition to the usual temporal monitoring using watchdogs. A challenge/response windowed watchdog architecture can further increase diagnostic coverage against program sequence failures with its capability to check an MCU’s computational ability.
References
1 “IEC 61508 All Parts, Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems.” International Electrotechnical Commission, 2010.
2 “Top Misunderstandings About Functional Safety.” TÜV SÜD, 2025.
3 “Basics of Windowed Watchdog Operation.” Analog Devices, Inc. December 2021.
4 “Pseudo Random Number Generation Using Linear Feedback Shift Registers.” Maxim, June 2010.
5 Mohammed Abdul Samad AL-khatib and Auqib Hamid Lone. “Acoustic Lightweight Pseudo Random Number Generator based on Cryptographically Secure LFSR.” International Journal of Computer Network and Information Security, Vol. 2, February 2018.
