# Design and Evaluation of Radiation-Hardened Standard Cell Flip-Flops

Oliver Schrape, Marko Andjelković, Anselm Breitenreiter, Steffen Zeidler, Alexey Balashov, and Miloš Krstić

Abstract: Use of a standard non-rad-hard digital cell library in rad-hard design can be a cost-effective solution for space applications. In this paper we demonstrate how a standard non-rad-hard flip-flop, as one of the most vulnerable digital cells, can be converted into a rad-hard flip-flop without modifying its internal structure. We present five variants of a Triple Modular Redundancy (TMR) flip-flop: baseline TMR flip-flop, latch-based TMR flip-flop, True Single-Phase Clock (TSPC) TMR flip-flop, scannable TMR flip-flop and self-correcting TMR flip-flop. For all variants, the multi-bit upsets have been addressed by applying special placement constraints, while the Single Event Transient (SET) mitigation was achieved through the use of customized SET filters and the selection of optimal inverter sizes for the clock and reset trees. The proposed flip-flop variants feature differing performance, which makes it possible to choose the optimal solution for every sensitive node in the circuit, according to the predefined design constraints. Several flip-flop designs have been validated on IHP's 130 nm BiCMOS process, by irradiation of custom-designed shift registers. It has been shown that the proposed TMR flip-flops are robust to soft errors with a threshold Linear Energy Transfer (LET) from $32.4 \text{ MeV} \text{ cm}^2 \text{ mg}^{-1}$ to $62.5 \text{ MeV} \text{ cm}^2 \text{ mg}^{-1}$, depending on the variant.

Index Terms: Single event effect, fault tolerance, triple modular redundancy, ASIC design flow, rad-hard design.

## I. INTRODUCTION

The use of electronic components in the harsh space environment is coupled with many reliability challenges.
Since the components employed in space applications cannot be easily replaced once they are put into operation, their reliability needs to be very high so that they remain functional during the entire mission. Among the main causes of failures in CMOS integrated circuits employed in space are the radiation-induced effects, which can be classified as Total Ionizing Dose (TID) effects and Single Event Effects (SEEs) [1], [2], [3].

Manuscript received April 22, 2021; revised July 14, 2021; accepted August 22, 2021. Date of publication September 9, 2021; date of current version November 9, 2021. This work was supported in part by the European Union's Horizon 2020 Research and Innovation Programme under Grant 870365 (MORAL), in part by the Federal Ministry of Education and Research under Grant 16ME0134 (Scale4Edge), and in part by the European Regional Development Fund and State Brandenburg under Grant 80175745 (SPAD). This article was recommended by Associate Editor F. Rivet. (Corresponding author: Oliver Schrape.) Oliver Schrape, Marko Andjelković, Anselm Breitenreiter, Steffen Zeidler, and Alexey Balashov are with the IHP–Leibniz-Institut für Innovative Mikroelektronik, 15236 Frankfurt (Oder), Germany (e-mail: schrape@ihp-microelectronics.com; andjelkovic@ihp-microelectronics.com; breitenreiter@ihp-microelectronics.com; zeidler@ihp-microelectronics.com; balashov@ihp-microelectronics.com). Miloš Krstić is with the IHP-Leibniz-Institut für Innovative Mikroelektronik, 15236 Frankfurt (Oder), Germany, and also with the Institute of Computer Science, University of Potsdam, 14469 Potsdam, Germany (e-mail: krstic@ihp-microelectronics.com). Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSI.2021.3109080. Digital Object Identifier 10.1109/TCSI.2021.3109080

The TID effects are the result of radiation-induced charge deposition in the transistors' gate oxide, leading to an increase of the leakage current and a threshold voltage shift, and thus causing gradual performance degradation. On the other side, the SEEs are caused by the passage of a single energetic particle (proton, neutron or heavy ion) through an off-state transistor in the circuit [4], [5]. The SEEs can be either hard SEEs (potentially causing permanent failure) or soft SEEs (causing data loss and temporary failure). In order to design a system for space applications, it is necessary to use a technology which is radiation-hardened or at least sufficiently tolerant to radiation-induced effects. While the TID was one of the most critical issues for older CMOS technologies, advanced nano-scale technologies are more tolerant to TID due to very thin gate oxides (less than 10 nm). As reported in [6], many standard CMOS technologies may withstand the total doses up to 300 kRad, while the total doses in many space missions are below 100 kRad. The TID immunity can be further enhanced with the use of enclosed layout transistors (ELT) [7], [8]. Another effect that has been very critical for older technologies is the Single Event Latchup (SEL) [9]. The SEL is a type of hard SEE causing excessive supply current flow, which can destroy the circuit if the supply is not reset. Many modern technologies are inherently immune to SEL, as for example the SOI (Silicon on Insulator) technology. However, some bulk CMOS technologies may be susceptible to SEL [10]. To reduce the probability of SEL occurrence, various design-level approaches can be applied [11]-[13]. Once a technology with sufficient tolerance to TID and SEL effects has been selected, the respective standard digital cell library can be used as a baseline for space applications. However, the soft SEEs remain the major reliability issue. 
The critical soft SEEs are the Single Event Transients (SETs) and the Single Event Upsets (SEUs). A Single Event Transient (SET) is a voltage glitch generated at the output of a combinational gate when an incident particle deposits sufficient charge in the sensitive region of the gate. The SETs may propagate to the sequential cells, and eventually lead to the flipping of a stored logic value, known as a soft error or Single Event Upset (SEU). Alternatively, soft errors can be caused when an energetic particle directly hits the sequential logic (flip-flops and latches). Some recent technologies, such as 14 nm FinFET, have shown improved robustness of standard digital cells to soft errors [14]. Nevertheless, due to the increasing system complexity with technology downscaling, the system Soft Error Rate (SER) increases with every new process node [15]. Since the standard cells in any technology are not designed to be robust against soft errors, it is necessary to apply special design measures in order to mitigate the SET and SEU effects. The radiation-hardened standard cells can be constructed by applying various design and layout techniques, commonly known as Radiation-Hardening-by-Design (RHBD) [16]. The RHBD approaches applicable to standard cells can be divided into two main groups: (i) hardening based on the custom design of standard cells, i.e., design of a new cell structure or modification of the internal structure of an existing cell, and (ii) hardening without modification of the existing cells' internal structure. The custom design of a rad-hard standard cell library can efficiently address the radiation-induced effects, but the effort in designing such a library is very high. On the other hand, the use of an existing standard cell library may provide a solution for a wide range of space applications at significantly lower cost.
The flip-flops are generally considered the most vulnerable standard cells, because a direct particle strike on a non-hardened flip-flop can relatively easily result in a soft error. In addition, when a soft error occurs, it may in some cases remain forever in the system, causing chip malfunction in the worst case. A typical example where such a scenario can occur is a state machine [17]. The contribution of SETs originating in combinational logic is frequently neglected because the probability that they will propagate and be latched in flip-flops is relatively low due to the inherent electrical, logical and temporal masking effects [18]. However, with technology downscaling, supply voltage reduction and increase of clock frequency, the contribution of SETs to the total SER becomes more significant. As demonstrated in [19], at clock frequencies beyond 2 GHz, the combinational SER may exceed the sequential SER in 40 nm technology. This implies that the effect of both SEUs and SETs should be considered jointly in the design of rad-hard standard cells. Various solutions for radiation-hardened flip-flops, based on both custom-designed cells and use of existing standard cells, have been proposed in the literature, as will be elaborated in Section II. However, the use of a single rad-hard flip-flop solution, even if it is highly robust to both SETs and SEUs, is usually not sufficient to meet the stringent design constraints in terms of area, delay and power consumption. The hardening approaches increase the area and power consumption, which are critical constraints in space applications. To obtain an optimal rad-hard design, it is imperative to use a set of rad-hard flip-flops with complementary performance, such that the best solution can be selected for a particular node in the circuit, while complying with the design constraints.
To the best of our knowledge, the current state-of-the-art lacks a comprehensive solution for designing rad-hard flip-flops from non-rad-hard cells that can meet the design constraints in complex systems. In this work we will demonstrate how the standard flip-flops from a commercial non-rad-hard library can be hardened without modifying their internal structure, while still providing a high level of radiation tolerance. The main contributions of this work are: (i) an approach for the design of a set of TMR flip-flops with complementary features, considering synergistically the SETs, SEUs and multiple upsets, as well as the important design aspects such as testability, high performance, low power and self-correction, and (ii) two original flip-flop designs: a self-correcting flip-flop and a scannable flip-flop. In addition, we address the aspects of design flow compliance, characterization of flip-flops, logic synthesis and the Place & Route method. The proposed solutions introduce only minor changes in the regular industry digital design flow. We have used the 130 nm bulk CMOS technology as a proof of concept, which is particularly suitable for space applications. Previous experimental results have shown that the digital cells in this technology can withstand doses up to 50 krad, while doses beyond 200 krad can be tolerated with the ELT approach. Furthermore, the technology is latchup-free at least up to 67 MeV $\mathrm{cm}^2 \mathrm{mg}^{-1}$ [20]. The paper is organized in the following way. In Section II, we will describe the state-of-the-art solutions for rad-hard flip-flops. In Section III, the results of the simulation-based characterization of SET effects in standard combinational cells, which is essential in the design of rad-hard flip-flops, are presented. The proposed TMR flip-flop architectures are introduced in Sections IV and V. In Section VI, the physical implementation and characterization of flip-flops are discussed.
Section VII discusses the design flow aspects. Section VIII describes the test circuits used for validation of the proposed flip-flops, and Section IX presents and discusses the experimental results.

## II. STATE-OF-THE-ART SOLUTIONS FOR RAD-HARD FLIP-FLOPS

The custom design of rad-hard flip-flops is accomplished on the schematic/transistor level (e.g., by using special cell architectures or by increasing the transistors' channel width) and/or by applying special layout methods on the cell level (insertion of guard rings, well contact arrays or dummy transistors, or increase of the distance between the critical transistors) [21]–[23]. One of the commercially most used custom rad-hard flip-flop designs is the Dual Interlocked storage Cell (DICE) approach [24], which employs the dual modular redundancy on the schematic level of the flip-flop. While this approach provides very good hardness for older technologies, it is less effective for highly scaled technologies (e.g., 45 nm) [25], [26]. Moreover, standard DICE flip-flops are not robust to multi-bit upsets. Two improved versions of DICE, T-DICE [27] and F-DICE [28], employ additional transistors to achieve tolerance to multi-bit upsets. In addition, an improved version of DICE employing layout-level hardening, named the LEAP-DICE, has been proposed in [29]. The LEAP-DICE is based on the application of a layout technique known as Layout design through Error Aware Transistor Positioning (LEAP), which leverages the charge sharing mechanism to enhance the robustness to multi-bit upsets. An alternative hardening approach, known as Quatro, has been proposed in [30]. It employs the Cascode Voltage Switch Logic (CVSL) and can provide better performance than DICE at high LET values [31], [32]. A rad-hard flip-flop architecture known as the HIT (Heavy Ion Tolerant) flip-flop has been incorporated in Imec's DARE (Design Against Radiation Effects) library [33], which has been used in space missions.
Although the HIT flip-flops can provide high robustness to soft errors, this approach requires the use of large transistors. However, all aforementioned flip-flop designs suffer from inherent vulnerability to SETs, which may propagate through data lines, as well as clock and reset trees, to the flip-flop inputs. For that reason, additional measures have to be undertaken to ensure sufficient tolerance to SETs. Several custom flip-flop designs have been proposed to address the SETs. An improved DICE implementation with integrated tunable delay elements for SET filtering was proposed in [34]. The SET filters employed in this design require an analog signal for tuning the filterable SET pulse width, which may substantially increase the complexity of the overall system design. An alternative implementation of the DICE flip-flop with SET filters on all inputs, known as Delay-filtered DICE, has been proposed in [35]. Recently, a flip-flop tolerant to SETs and SEUs, based on eight mutually feeding C-elements and a Schmitt trigger, has been proposed, but its functionality was verified only with simulations [36]. In addition, the above-mentioned flip-flop variants are not optimized for low power consumption. To address this issue, several variants based on the True Single-Phase Clock (TSPC) flip-flop have been proposed: TSPC-DICE flip-flop [37], TSPC-Quatro flip-flop [38] and Dual-Modular TSPC flip-flop [39]. However, as stated previously, the custom-designed flip-flops require significant effort first to design the flip-flop, and then to fully characterize it and incorporate it into an existing library. As the design has to be done on transistor level, the sizing and layout must be performed manually. This may be very time-consuming, particularly considering that the library needs to be equipped with different variants of each cell.
As a cost-effective alternative to the hardening methods which require the modification of the cell's structure, hardware redundancy can be applied to an existing non-rad-hard flip-flop without modification of its internal structure. The design of rad-hard flip-flops from non-rad-hard cells in most cases incurs higher area overhead than the custom-designed variants. However, it is well-known that in most digital designs, only a fraction of cells are highly sensitive to soft errors, i.e., have dominant contribution to SER. For example, as shown in [40], up to 50% of logic gates in ISCAS89 benchmark circuits contribute to over 80% of soft errors. Thus, an efficient rad-hard approach is based on selective hardening of the most sensitive cells in the design. By employing such an approach, the area overhead of the standard-cell-based rad-hard flip-flops can be minimized. However, other important issues, such as SET effects, multiple SEUs, power consumption, propagation delay, design-for-testability and compliance with the standard design flow, need to be considered. A well-known solution for a rad-hard flip-flop based on standard non-rad-hard flip-flops employs the system-level Built-in Soft Error Resilient (BISER) concept [41]. In this approach, a standard flip-flop is duplicated and both copies are connected to a C-element, while the input signal for one flip-flop is delayed to mitigate the SETs. The C-element ensures that the previous state is maintained if a soft error occurs in one of the flip-flops. However, this solution is not robust to multi-bit upsets. One of the most widely used system-level radiation-hardening techniques is triple modular redundancy (TMR), based on the concept proposed by Von Neumann in 1956 [42]. This concept is today extensively used in space electronics as one of the main radiation-hardening approaches.
The TMR methodology can be applied at the flip-flop level by replacing a single flip-flop with the respective set of three flip-flops, and using a voter to process the output signal. This basic TMR structure can mask an error in one of the three flip-flops. The basic TMR flip-flop is suitable for applications where SETs are not a serious concern. However, if a transient occurs on the data path during the sensitive clock edge, all three flip-flops would simultaneously capture the transient. One alternative approach for better SET protection is the triplication of the voter itself in addition to the storage cell triplication. This circuit is also known as full-TMR (FTMR). However, the main disadvantage of FTMR is the complex design with three independent inputs and outputs, and triplicated combinational logic paths. The triplication of the entire data path logic in front of the flip-flops leads to a huge increase of silicon area occupation. A second important disadvantage of FTMR is that its scan chain integration would be very complex. Several TMR flip-flop designs have been proposed in the literature as alternatives to the basic and full TMR configurations. One of the well-known improvements of the basic TMR is the placement of filters at the inputs of the flip-flops in order to suppress the incoming SETs. The choice of the filterable SET pulse width is of key importance, because it directly determines the introduced delay. However, in most cases the SET filtering is not considered during the flip-flop design, but at the circuit level, which may increase the design time. Several special TMR flip-flop designs, addressing the design-for-testability and power consumption issues, have also been reported. A TMR implementation of a scan flip-flop was proposed in [43], making it possible to combine fault tolerance and design-for-testability. However, the sensitivity to input SETs was not addressed, and the functionality of the flip-flop was not verified under radiation.
A reconfigurable TMR flip-flop with scan functionality was proposed in [44]. This solution makes it possible to switch off redundant flip-flops in order to save power, while offering the design-for-testability functionality. Nevertheless, this solution addresses neither the SET effects nor the multi-bit upsets. Self-correcting TMR flip-flops have been proposed in [45]-[47]. However, these solutions are highly customized, requiring partial modification of existing cells. An important aspect of the rad-hard design with standard flip-flops is the compatibility with the design flow. Several previous approaches have proposed the modification of the design flow by introducing the processing of the synthesized design netlist, with the aim to convert the standard flip-flops into the TMR configurations [48]–[50]. Such an approach may be very laborious for a complex digital design with a large number of flip-flops. There is a need for alternative TMR cell approaches which provide the radiation-hardness with acceptable power, delay, and area penalties, and which can be integrated in the design with minimum or no modification of the standard industrial design flow. Despite the fact that the hardening of standard flip-flops without modifying their structure offers substantial benefits, this approach has not been widely explored. The current state-of-the-art lacks solutions that can meet the requirements for selective hardening of a complex design. In the following, we aim to address the aforementioned limitations of the rad-hard flip-flop solutions based on standard cells.

## III. SET CHARACTERIZATION OF STANDARD COMBINATIONAL CELLS

The characterization of SET effects in standard cells is essential for the design of rad-hard flip-flops, because the width of SET pulses induced in combinational circuits (clock and reset trees, and logic stages between the flip-flops) directly determines the probability of soft error occurrence.
Therefore, it is necessary to estimate the typical range of SET pulse widths that can be induced in a particular design/technology. This estimation can be done through experiments, device-level simulations, or electrical (SPICE-like) simulations. Based on the SET characterization data, the flip-flops can be designed to filter the most critical SET pulses. Moreover, the SET characterization data is required for selecting the optimal sizes of inverters in clock and reset trees, in order to reduce the probability that SETs from clock and reset trees will propagate to the flip-flops. We have performed a series of electrical simulations to characterize the SET pulse width in standard cells in the investigated 130 nm technology. The analysis was done through electrical simulations, using a commercial tool Cadence Spectre. A bias-dependent current model [51] was used as an SET model. This is an advanced SET current model with improved accuracy compared to the traditional double-exponential current model. The particle LET was specified as input parameter of the model, and the SET pulse width was extracted directly from the simulation waveforms. The timing constants of the current model have been chosen as 10 ps for rise time and 100 ps for fall time. These values have been selected based on comparison with the experimental results for 130 nm technology reported in [52], and they provide a fairly good estimate of the average SET pulses reported in [52]. All simulations have been done for the supply voltage from 0.8 V to 1.2 V, and for temperature from -40 °C to 125 °C. Due to limited space, here we present only the results for nominal supply voltage of 1.2 V and temperature of 25 °C. For the sake of brevity, we present here only the simulation results for three most common combinational gates in digital designs, i.e., INV, NAND2 and NOR2. The SET pulse width as a function of LET, for the investigated gates, is shown in Fig. 1. 
In all cases, the simulation results are for the most sensitive input level(s). For INV, all available driving strengths have been considered since large inverters are required to mitigate the SETs in clock and reset trees. For NAND2 and NOR2, only the smallest (x1) gates are analyzed because they are most sensitive to particle strikes. In all cases an inverter with the minimum driving strength was used as a load, and the SET pulse width was observed at the output of load gate.

Fig. 1. SET characterization results of the 130 nm technology. (a) NAND and NOR gate characteristics.

As can be seen, for the LET range up to 60 MeV cm<sup>2</sup> mg<sup>-1</sup>, the expected SET pulse widths may be up to 700 ps for the investigated simulation conditions. Considering that due to random particle strikes the SET pulse width in a 130 nm combinational chain may vary by 300 ps to 400 ps for a given LET, it is reasonable to expect the SETs up to 1 ns in 130 nm technology, as has been confirmed in [52]. However, as most energetic particles in space have the LET below 30 MeV cm<sup>2</sup> mg<sup>-1</sup>, while the particles with higher LET are very rare, it can be expected that most SETs induced in the investigated technology would be up to 500 ps.

Fig. 2. Logical sections of the proposed ΔTMR concept [53].

By using larger gates available in the standard cell library, the SET pulse width decreases and the threshold LET (minimum LET required to cause a soft error) increases. The threshold LET is around 10 MeV cm<sup>2</sup> mg<sup>-1</sup> for INV\_x16, and around 15 MeV cm<sup>2</sup> mg<sup>-1</sup> for INV\_x20 (not shown here). However, it is important to note that the aforementioned threshold LET values are for individual gates. For a real circuit, the threshold LET values would be higher. Thus, the choice of x16 or x20 inverters provides robustness to most low-LET particles which are abundant in space.
Nevertheless, the INV\_x8 may also be considered an optimal solution if the area overhead is a critical design constraint. The SETs generated in INV\_x8 are around 250 ps at LET of 30 MeV cm<sup>2</sup> mg<sup>-1</sup>. It can be seen that the SET pulse widths in x8 inverters are halved in comparison to those of the x1 inverters, which means that the probability of SET latching is halved. On the other hand, for x16 inverters the SET pulse width at 30 MeV cm<sup>2</sup> mg<sup>-1</sup> is around 200 ps, which brings a minor improvement in SET robustness over x8 inverters, but at the expense of twice the area. Thus, as a trade-off between area and SET robustness, INV\_x8 can be a good choice for the clock and reset trees. It is necessary to emphasize that while gate upsizing is the most efficient approach for hardening the clock and reset trees, it may not be sufficient for hardening other combinational circuits in the design. To address the SETs in combinational logic, selective mitigation is done by combining gate upsizing with various in-circuit SET filtering methods.

## IV. BASELINE $\Delta$TMR FLIP-FLOP

The primary requirement for a rad-hard flip-flop is to be robust to both SETs (originating in combinational logic between flip-flop stages) and SEUs (caused by direct particle strikes on flip-flops). We denote such a flip-flop as a $\Delta$TMR flip-flop. A general design of a $\Delta$TMR flip-flop is illustrated in Fig. 2. It consists of four main sections, i.e., D-SET Filter Section, FF-Section, Voter Section, and Driver Section.

Fig. 3. SET filter variants for general TMR approach: (a) baseline SET filter architecture (D1D2), (b) guard-gate based D-SET filter variant after [58], (c) alternative SET filter with the use of AND-OR-MUX suppressor [59].

The robustness to SETs is achieved by an integrated transient filter on the data path (D-SET Filter Section), while the robustness to SEUs and multiple SEUs is achieved by triplication and
a Multiple-Node Charge Collection (MNCC)-aware layout arrangement. The proposed design has a common clock signal and a pin interface similar to that of the existing unhardened flip-flops in the standard cell library. The $\Delta$TMR cells are modeled and characterized as common flip-flops and can be easily selected for mapping during the gate-level synthesis step. As can be seen in the concept scheme (see Fig. 2), we have chosen to delay the internal data inputs in order to filter the transients, based on the nature of the transient effect and an analysis of the architectures of different redundant circuits [12], [54]. This approach has two main advantages over insertion of delay elements in the clock line. First, the hardness is achieved by the arrangement of the internal cells, and the robustness does not depend on the amount of insertion delay of a clock signal. In delaying-the-clock approaches [55]–[57], the individual clock tree or clock skew group latency determines the transient filter size. Second, the insertion of delay elements in the data line is more beneficial in terms of power consumption. However, developing complex TMR cells as standard cells requires particular attention during the design phase. In particular, the baseline component selection and the arrangement for every section have direct impact on the performance in terms of power, delay overhead, and area occupation when integrated in complex systems. It is therefore worth introducing each section briefly. The D-SET filter section can be implemented in different configurations with the use of internal delay elements. The delays can be realized by alternating low- and high-drive inverter types so that the delay in the filter structure matches the expected pulse width ($\delta$). Alternatively, explicit delay cells can be selected to obtain the targeted $\delta$ delay. Both approaches can be implemented in a purely digital manner with the use of available standard cell gates.
The most popular concept for a D-SET filter section is shown in Fig. 3 (a). It consists of three identical delay elements (inverter chain, or special delay cell arrangements). The primary data input D is directly fed to the internal output D0 without any delay, whereas the signal for the second output D1 is processed by one delay element dly0 shifting the data by $\delta$ in time. Finally, the signal for the third output D2 is driven by a pair of delay elements, i.e., dly1 and dly2, which shift the primary data input D by $2\delta$. As a consequence, if a transient shorter than $\delta$ is propagating on the data path, it will not be present on all three internal data inputs of the flip-flop at the same time. In contrast, if the transient is longer than the implemented $\delta$ delay of the D-SET section, the overlap of two faulty internal delayed signals might be captured by the sensitive clock edge, leading to an internal double fault or multiple bit upset (MBU). Both scenarios are illustrated in the signal timing diagram in Fig. 4. The dashed vertical lines mark the sensitive edge of the flip-flop with respect to the clock signal CK.

Fig. 4. SET filter timing diagram: filtered SET (green), captured SET (red).

In this work we denote this transient filter structure as a D1D2 arrangement. It is the selected baseline D-SET filter concept for the $\Delta$TMR cells. The size of the $\delta$ delay is set to 0.5 ns based on the investigation presented in Section III. In general, this section mainly determines the timing windows for setup and hold time characterization, which has a large impact on timing performance, as discussed in later sections. To address this, alternative transient filter realizations can be selected for TMR architectures (see Fig. 3 (b), and (c)). A more customized circuit solution is the use of guard-gates or C-elements in combination with a delay element as published in [58].
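The filtering behavior of the D1D2 arrangement described above can be illustrated with a small behavioral sketch. It is a simplified model and not the authors' implementation: it assumes ideal delay elements, an ideal rectangular SET pulse on an otherwise constant data input, an ideal majority voter, and it ignores the setup and hold windows of the flip-flops. The function names and the 10 ps sweep step are hypothetical.

```python
def sample_d1d2(set_start_ps, set_width_ps, clock_edge_ps, delta_ps=500, nominal=0):
    """Return the values (D0, D1, D2) seen by the three flip-flops at the clock edge.

    D0 is the undelayed input, D1 is delayed by delta, D2 by 2 * delta.
    The input holds `nominal` except during the SET pulse, where it is inverted.
    """
    def d(t):
        hit = set_start_ps <= t < set_start_ps + set_width_ps
        return (nominal ^ 1) if hit else nominal

    return tuple(d(clock_edge_ps - k * delta_ps) for k in range(3))


def majority(a, b, c):
    """Ideal 2-out-of-3 majority vote on single bits."""
    return (a & b) | (b & c) | (a & c)


def set_is_masked(set_width_ps, delta_ps=500, nominal=0, step_ps=10):
    """Sweep the clock edge across the disturbed interval.

    Returns True if the voted value equals the nominal value for every
    clock edge position, i.e. the SET never reaches the voter output.
    """
    horizon = set_width_ps + 2 * delta_ps + step_ps
    for edge in range(-step_ps, horizon + step_ps, step_ps):
        captured = sample_d1d2(0, set_width_ps, edge, delta_ps, nominal)
        if majority(*captured) != nominal:
            return False
    return True
```

With $\delta$ = 500 ps, a 300 ps transient corrupts at most one of the three captured values for any clock edge position, whereas a 700 ps transient can corrupt D0 and D1 at the same edge, which corresponds to the double fault scenario of Fig. 4. The guard-gate and AND-OR-MUX alternatives of Fig. 3 (b) and (c) are not modeled here.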
However, C-elements or guard-gates as standalone standard cells are more frequently used in the asynchronous world and may require additional standard cell design. Instead, a triple AND-OR-MUX arrangement together with the $\delta$ delay acting as a transient suppressor is proposed in [59]. This solution is suitable for designs in which the area occupation is not a limiting factor, because of the number of combinational gates it requires. The second section is the Flip-Flop section (FF-Section), which, together with the Voter Section, mainly contributes to the total propagation delay of a $\Delta$TMR flip-flop. The selected baseline flip-flop is immediately triplicated for TMR, which increases the power consumption and area occupation already by a factor of three. As a consequence, potential cell candidates for the $\Delta$TMR flip-flop solution should be selected considering the introduced overhead. Moreover, different application fields should be addressed as well, and covered by the library content. Candidates for low-power design, high-speed applications and scan chain implementation are supposed to be part of the additional RHBD standard cell library.

Fig. 5. Schemes of the guard-gate based majority voter: (a) transistor level of inverting voter variant with three guard-gate inverters (GGI), (b) gate-level arrangement with additional output inverter.

To start with, we have selected standard D-flip-flops with asynchronous reset from the unhardened library for triplication. They offer a good trade-off between speed performance and power consumption and are often selected during the gate-level synthesis step in most applications. The third section is the Voter Section. This module can be implemented in various architectures. We have chosen AND/OR-based variants as a starting point. The NAND-based solutions require less silicon area and are more suitable for timing-critical designs due to shorter propagation delay.
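As a minimal sketch of the two voter styles mentioned above, the following Python snippet compares an AND/OR majority function with a NAND-only realization and checks that they agree on all eight input combinations. The specific gate arrangements (three 2-input ANDs into a 3-input OR, and three 2-input NANDs into a 3-input NAND) are textbook forms assumed here for illustration; the actual netlists of the voters in the library are not given in the text.

```python
from itertools import product


def nand(*bits):
    """N-input NAND on single bits (0 or 1)."""
    return 0 if all(bits) else 1


def voter_and_or(a, b, c):
    """Majority as a sum of products: AB + BC + AC."""
    return (a & b) | (b & c) | (a & c)


def voter_nand(a, b, c):
    """Same majority function built from NAND gates only (De Morgan)."""
    return nand(nand(a, b), nand(b, c), nand(a, c))


def voters_agree():
    """Exhaustively compare both realizations over all 3-bit inputs."""
    return all(
        voter_and_or(*bits) == voter_nand(*bits)
        for bits in product((0, 1), repeat=3)
    )
```

Both functions return the value held by at least two of the three inputs, so a single upset flip-flop output is outvoted. The sketch says nothing about area or delay, which depend on the transistor-level implementation of the cells.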
There are several purely digital voter architectures made of more complex AND/OR gate arrangements [60], or multiplexers in combination with XOR gates [61]. The voter can also be implemented with the half-adder available in the standard cell library. However, the selection of the optimal voter design is always a trade-off between parameters such as power, delay and area. An alternative but more customized architecture is the use of parallel-connected 2-input guard-gates for the voter section. This circuit is characterized by a low transistor count of only 12 devices and a short propagation delay. The scheme is illustrated in Fig. 5. As can be seen, if one input (e.g., B) is unequal to A and C, only the third guard-gate inverter will drive the inverted output value.

The Driver Section is the last section and decouples the complex internal cells from the external circuitry in terms of fan-out and connected load. It is possible to use different types of driving gates.

We name all implementations based on this concept $\Delta$TMR gates. They are arranged, developed and modeled as common standard cells. In this work, several $\Delta$TMR variants are discussed. They differ in D-SET filter size and in layout arrangement, such as internal component spacing.

#### V. ALTERNATIVE ΔTMR FLIP-FLOP VARIANTS

The baseline $\Delta$TMR variant introduced in Section IV does not satisfy all relevant design requirements. Namely, due to hardware triplication, the power consumption increases by $3\times$, which may conflict with low-power requirements. Furthermore, the baseline $\Delta$TMR may introduce a large delay overhead due to the added majority voter, leading to performance degradation. In addition, it does not support testability features. In order to address the requirements in terms of propagation delay, design-for-testability, and low power consumption, we introduce several alternative flip-flop variants. All flip-flop variants presented in the following are derived from $\Delta$TMR, i.e., they are robust to both SETs and SEUs.

#### A. Latch-Based $\Delta$TMR Flip-Flop (L-$\Delta$TMR)

An alternative solution is a decomposition of the internal D-flip-flop standard cell of the FF-Section into a standard cell master-slave arrangement made of D-latches [53]. These variants have nearly the same delay performance as the baseline $\Delta$TMR solution, but provide additional access to the internal nodes. The initial version of the L-$\Delta$TMR flip-flop consists of existing unhardened standard cells and an integrated, robustly sized clock inverter (INVX) (cf. Fig. 6). This increases the power consumption but provides a robust clock phase for the slave stages. The voter section is designed similarly to that of the baseline $\Delta$TMR cells, with slight modifications in the use of 2-input AND and 2-input OR gates instead. For experimental results, different variants with and without a D-SET filter section are designed.

Fig. 6. Gate-level scheme of the L-ΔTMR flip-flop (without D-SET filter).

In a further improvement of the L-$\Delta$TMR cells, integrated multiplexers are connected between each pair of master and slave outputs. The selector chooses the output according to the clock phase CK. Preliminary experiments have shown that a standard cell arrangement reduces the cell propagation delay by 22%, whereas the operating current is slightly increased by 2% due to the additional multiplexers [53]. Moreover, if the three multiplexers are internally arranged with transmission gates, a guard-gate-based voter is used, and the opposite sensitive latch type is selected, a delay reduction of about 61% can be obtained in comparison to the initial L-$\Delta$TMR design.
Nevertheless, the benefit still depends on the performance of the existing standard cell gates of the unhardened library, i.e., the baseline D-flip-flop or latch, and the combinational gates. However, the baseline $\Delta$TMR and the L-$\Delta$TMR concepts are common arrangements that use two internal clock phases. They can be easily developed from existing standard cell library elements, with some additional custom design if required.

#### B. TSPC-$\Delta$TMR Flip-Flop

The use of faster True-Single Phase Clock (TSPC) flip-flop variants as baseline cells benefits from shorter propagation delays due to the lower number of transistors and the use of a single clock phase for operation. TSPC is a well-known dynamic logic design style based on clocked CMOS (C<sup>2</sup>MOS), which was introduced in 1973 [62]. In the late 1980s, this high-speed CMOS circuit technique was established, e.g., [63], [64]. A popular scheme after [64], used as the selected baseline cell, can be seen in Fig. 7. This inverting TSPC flip-flop requires only nine MOS transistors. With the use of such a cell in the FF-Section, a compact, high-speed and low-area TMR flip-flop is obtainable.

Fig. 7. The selected TSPC baseline cell (modified from [65]) after [64].

Baseline flip-flops are implemented as standard cell gates in IHP's 130 nm technology and compared to the standard D-flip-flop available in the unhardened standard cell library. As published in [65], a TSPC flip-flop candidate is nearly twice as fast as the reference D-flip-flop and saves 41% of the silicon area. However, there are two drawbacks related to the TSPC architecture which have to be mentioned. First, the internal arrangement limits the maximum transition time of the clock signal; a slower clock edge might lead to the capture of an intermediate glitch on the data path, leading to a fault at the primary output.
Second, the data is not refreshed or stored by feedback devices in the aforementioned baseline configurations. As a consequence, the data is not stabilized for low frequencies or long clock phases. For the selected technology, a maximum clock transition of 1 ns and an operating frequency above 50 MHz are sufficient.

Based on the above-mentioned observations, a fast TSPC-$\Delta$TMR cell is developed. It consists of an alternative D-SET filter section made of a delay cell/guard-gate arrangement. The inverting FF-Section is realized by the nine-device TSPC flip-flops depicted in Fig. 7. The voter is a pure NAND gate implementation due to its shorter propagation time in comparison with the classical AND/OR-based variant. Finally, the output signal needs to be negated by an output inverter. As published in [65], the TSPC-$\Delta$TMR arrangement requires only 130 ps more than a single standard reference D-flip-flop under worst-case conditions, based on post-layout extracted propagation delays. Finally, a normalized overhead of only 5.8 for the area, 7.4 for the energy, and a slight improvement in terms of propagation delay $t_{pg}$ of 0.9 can be obtained in comparison to the baseline, unhardened standard D-flip-flop DFF_STD (see Table I).

Fig. 8. A scannable ΔTMR flip-flop gate-level scheme (S-ΔTMR-II).

#### C. Scannable $\Delta$TMR Flip-Flop (S-$\Delta$TMR)

With respect to design-for-testability flow compatibility, the standard scan chain implementation is fully supported by adequate scannable $\Delta$TMR flip-flop counterparts. In a first approach (S-$\Delta$TMR-I), the TMR cell is directly realized by available unhardened scan D-flip-flops according to the concept illustrated in Fig. 2 and with the use of the D1D2 D-SET filter of Fig. 3 (a). All internal "scan-enable" ports (SE) of the redundant flip-flops are connected to a common primary SE input of the $\Delta$TMR flip-flop. However, the basic function of the voter is to correct a single fault.
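The single-fault correction of the voter, and its consequence for observing the internal flip-flops, can be checked with a short behavioural sketch. The function names and the fault model (exactly one internal flip-flop holding the wrong value) are illustrative assumptions and not part of the cell library.

```python
def majority(a, b, c):
    """2-out-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


def single_deviation_states(value):
    """All states (ff0, ff1, ff2) in which exactly one flip-flop differs from value."""
    states = []
    for i in range(3):
        state = [value] * 3
        state[i] ^= 1  # flip exactly one of the three stored bits
        states.append(tuple(state))
    return states


def masked_at_voted_output(value):
    """True if every single deviation gives the same voted output as the fault-free state."""
    return all(majority(*s) == value for s in single_deviation_states(value))


if __name__ == "__main__":
    for value in (0, 1):
        print(value, single_deviation_states(value), masked_at_voted_output(value))
```

For both stored values the check returns `True`: a deviation in one internal flip-flop cannot be observed at the voted output alone.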
This nature of the voter is a drawback if scan flip-flops are simply triplicated and the voter provides a single output value. In that case, the voter masks the individual flip-flop states programmed by the scan-in function, leading to an undetectable fault. Instead, a scannable TMR flip-flop should provide an inner scan chain in order to obtain detailed access to the internal flip-flop cells. Moreover, the proposed S-$\Delta$TMR cells are equipped with a separate scan-out port SO. An AND gate controls the output depending on the mode selected by SE to save power during normal operation. In addition, combinational gates are added along the inner scan chain in order to meet the hold time of the second and third flip-flop in this configuration.

The second variant (S-$\Delta$TMR-II) is a TMR arrangement with multiplexers in front of the data inputs of the standard flip-flops, which select the corresponding data signal depending on the SE signal. In this case, the D-SET filter is integrated, as can be seen in Fig. 8. The hold time is already met due to the delay elements inside the D1D2 filter.

For both approaches, the scan patterns can be easily generated by linking the detailed architectural content of the S-$\Delta$TMR cell during automatic test pattern generation (ATPG). Alternatively, a second cell model (.lib file) with an internal 3-bit scan-chain behavior can be used instead.

#### D. Self-Correcting $\Delta$TMR Flip-Flop (SC-S-$\Delta$TMR)

Fig. 9. Block scheme of a self-correcting ΔTMR flip-flop.

A critical design requirement, especially in highly scaled technologies, is low power consumption. One approach to reduce the dynamic power of a digital circuit is to put the clock into an inactive steady state, i.e., clock gating. As a consequence, the affected registers keep their stored value until the next clock edge arises, leading to an update or refreshing of the registered data.
Nevertheless, this low-power feature has a dangerous side effect if clock gating is applied to a system that is exposed to radiation. The induced SEUs would accumulate over time and might lead to a fault in the complete TMR flip-flop, bringing the system to a deadlock in the worst case. As a consequence, the use of activated self-correcting registers while the clock is gated is one option to address this issue. One solution is to place the majority voter inside a feedback loop in order to enable self-correction in the slave stages. Approaches based on C-elements/guard-gates are published in [45], [46], whereas a single extra latch is proposed in [47].

We reuse the existing asynchronous set and reset functions in order to correct the state of the flip-flop when the clock phase is low. As illustrated in Fig. 9, control signals are generated by additional control modules (SC\_CTRL), which process the individual asynchronous control signals for the three flip-flops. These SC\_CTRL units generate the proper settings for the self-correction, i.e., RN = 0 and SN = 1 or vice versa, for each control module individually. The two control signals are not allowed to be in the same logic state. Moreover, inverting guard-gates (GGI) are added internally in order to avoid transient propagation, which could affect all flip-flops at the same time.

This candidate is additionally equipped with the scan functionality described in the previous section. As a consequence, we are able to write unequal internal flip-flop settings during scan mode and start the self-correction feature by releasing the SE signal during electrical measurements. As an example, the waveform of an analog transient simulation of the self-correction functionality is shown in Fig. 10. As can be seen, the correction only starts after the falling clock edge.

Fig. 10. Transient simulation of a self-correcting ΔTMR flip-flop.

#### VI. DEVELOPMENT OF TMR-BASED FLIP-FLOPS

Depending on the complexity, the cells are created either directly at netlist level or at transistor level if deeper investigations are required. The TMR cell layouts are developed using a commercial place-and-route tool. The cell frames are constrained by the placement and routing grid of the existing compatible unhardened standard cell library.

#### A. Physical Implementation

The most critical issue with respect to the implementation of robust TMR cells is the placement of the internal flip-flops. Based on the referenced NASA-Boeing research [66], two issues are addressed. First, a simultaneous flipping of two flip-flops needs to be avoided, which may occur when particle energy is deposited between two adjacent flip-flops in the same row. Second, energy deposition between flip-flops of different rows, i.e., "rail stacking of voted flip-flops" [66], needs to be prevented as well. If such an effect occurs, a double fault or an internal multi-bit upset could arise. As a consequence, measures at layout level have to be taken in order to maintain the robustness of the proposed $\Delta$TMR flip-flops. For this work, we denote these measures as Multiple Node Charge Collection (MNCC)-aware placement.

Different layout concepts are proposed for each specific type of $\Delta$TMR cell. One of the main goals is compliance with the selected unhardened standard cell library, which is used together with the novel flip-flops in order to implement complex systems. Nevertheless, the internal complexity of the $\Delta$TMR cells increases the routing congestion on lower metal layers and complicates the pin access. Moreover, the MNCC-aware placement results in additional spacing between sensitive regions, which enlarges the final $\Delta$TMR cell frames.
The placement of each individual sequential cell maintains a minimum spacing $\Delta x_{seq}$ of several micrometers. Moreover, a minimum section spacing $\Delta x_{sec}$ between active combinational logic is targeted. For this work, both types of spacing are defined as the edge-to-edge spacing of two cell frames for simplification.

In addition to the spacing, the following rules are considered independent of a 1-, 2-, or 3-row layout arrangement. First, the routing between filled gaps (empty spare area) is kept at the lowest possible routing layer in order to free upper signal routing channels at the application stage. Second, the cells can be placed below the required vertical power and ground stripes without much coupling impact on the cells' performance. Third, the pin access is maintained at the upper routing layer with additional cut-outs in order to relax the pin routing. Finally, only middle routing tracks are used in order to leave space for the metal enclosure requirements of the vias of vertical stripes. Fig. 11 shows the simplified abstracts of the cell placement for different row arrangements.

Fig. 11. Internal standard cell placement of $\Delta$TMR cells.

The layouts of the baseline $\Delta$TMR, the TMR scan flip-flops S-$\Delta$TMR-I, and the TSPC-$\Delta$TMR cells follow the 1-row placement concept illustrated in Fig. 11 (a). The flip-flops are distributed in the cell frame in order to keep the specified $\Delta x$ spacings in between. The spare area between the sensitive regions is filled with standard filler or decoupling cells, if applicable, in order to stabilize the power and ground supply. The initial self-correcting variant SC-S-$\Delta$TMR is implemented in a 2-row arrangement as depicted in Fig. 11 (b). This is required in order to keep the spacing between the different flip-flop-related sections. As for the 1-row arrangement, the spare area is aligned such that the cells can be placed freely in a design.
Finally, the experimental layout of the L-$\Delta$TMR cells is shown in Fig. 11 (c). This time the latches are distributed over three rows in order to separate them from each other. As a result, the layout is highly dense with a utilization above 97%. The decision for a 3-row layout allows a placement between the vertical stripes and uses fewer metal layers in comparison to a similar 1-row arrangement.

The layout information of all cells is provided in a .lef file, which contains signal and pin shape definitions, blockages, and special net routing.

#### B. Characterization

Automation of digital standard cell library characterization is a global trend in the evolution of EDA tools. We have selected the Cadence Liberate characterization tool to generate models for our complex $\Delta$TMR cells. As a result, extracted timing and power values are provided in a .lib file with the information of each cell under different conditions, such as input transition, output load, power supply, and ambient temperature.

Our $\Delta$TMR flip-flops are equipped with integrated D-SET filters. Their internal topology has a direct impact on the overall setup and hold time of the flip-flops. As an example, the longest input-to-register path exists from the primary input D to the data input of the third internal D-flip-flop, ff2/D, when a D1D2 transient filter arrangement is selected. The larger the $\delta$ delay, the more the performance (speed) of the entire $\Delta$TMR flip-flop is limited. In particular, the two $\delta$-delay stages in front of the third flip-flop mainly determine the total setup time ($t_{setup}$) of the entire $\Delta$TMR flip-flop as follows:

$$t_{\text{setup}}(\Delta \text{TMR}_{\text{D1D2\_SET}}) \ge t_{\text{setup}}(\text{ff}_2) + 2\delta, \tag{1}$$

whereas the hold time is dominated by the first flip-flop with

$$t_{\text{hold}}(\Delta \text{TMR}_{\text{D1D2\_SET}}) \ge t_{\text{hold}}(\text{ff}_0). \tag{2}$$

As can be seen from Eqs.
(1) and (2) above, a specified $\delta$ delay of 0.5 ns results in a total setup time $t_{setup}(\Delta TMR_{D1D2\_SET})$ above 1 ns. In other words, the data needs to be stable for more than one nanosecond before it is captured by the clock edge, which is longer than for a standard unhardened flip-flop or a TMR arrangement without D-SET filter section.

For setup/hold measurements, the data D and clock signal CK are shifted in time and their slope is changed according to the input transition look-up table definitions. The offset between D and CK is reduced stepwise until the probe node (e.g., Q) becomes unequal to the expected value, i.e., until a setup/hold violation has occurred. In a TMR arrangement with D1D2 SET filter, the dominant flip-flop is $ff_1$ for setup/hold characterization. Any internal setup violation of $ff_2$ and any hold violation of $ff_0$ is hidden from the characterization tool when the probe node is set to the primary output Q during the setup/hold characterization phase. The voter masks the internal fault, and the final result is already present if two out of three flip-flops have the same value. As a consequence, internal flip-flop output nodes have to be selected as probe nodes during this characterization phase, depending on the selected D-SET filter architecture [67].

The template files with timing and power arc definitions are modified with additional initial conditions, probe definitions, and pre-vector specifications in order to improve the quality of results of the extracted power and timing data for our complex $\Delta$TMR flip-flops [68]. With a dedicated characterization setup for the $\Delta$TMR cells, an accurate .lib file is generated. The behavioral model (.v file) is generated according to the characterization results. Together with the abstract layout information (.lef), the required library bundle for a digital design is created.

#### VII. DESIGN FLOW FOR ROBUST APPLICATIONS

The proposed rad-hard flip-flop cells are designed such that no changes are required at register-transfer level (RTL), and the development follows the vertical digital design approach. The internal structure of the TMR flip-flops is invisible when using them in a design. During gate-level synthesis, the flip-flops are mapped to the novel RHBD $\Delta$TMR flip-flops provided in an additional .lib file, i.e., a standard cell library description. The place-and-route process is not influenced in general; the placement of the $\Delta$TMR cells is straightforward and does not require special care.

Fig. 12. Integration of ΔTMR shift register chips in one 68-pin PGA.

However, some particular attention is necessary in order to maintain the desired robustness during implementation. Depending on the requirements, special SET-filter cells are additionally connected as output drivers for critical signals (e.g., domain-crossing signals from the digital to the analog domain). Similarly, in order to mitigate SETs on the most sensitive commonly shared signals, such as clock or asynchronous control signals, the RHBD variants of SET filters are added directly on the input side to prevent transient propagation and the possible occurrence of MBUs. The architecture of such filters is made of well-known guard-gate/delay-cell configurations, i.e., a subset of the scheme shown in Fig. 3 (b), whereas the robustness is achieved by dedicated gate sizing based on the presented technology-related SET investigations. Nevertheless, a synchronous reset strategy would reduce the risk by orders of magnitude, and it can be eliminated completely by adding robust SET filters on the entire reset network. Similarly, the clock tree must also consist of large buffers or inverters in order to increase the node capacitance, which on the one hand has an additional impact on the power consumption. On the other hand, the decision to use a common clock network is also advantageous.
It is common to imbalance the clock sinks aggressively in order to utilize the different skews to meet the specified critical timing requirements or to save power. To our knowledge, this might be challenging if clock sinks are separated into different clock groups (e.g., three for TMR), which have to be rebalanced (skewed) for timing but at the same time have to maintain the transient filter size ($\delta$) in order to keep the radiation robustness. However, this optimization strategy is directly supported by our approach.

#### VIII. TEST ICS

The presented variants are implemented and fabricated in IHP's 130 nm technology. Each cell under test (CUT) is developed as a standard cell and arranged in a 1024-bit-wide shift register IC design for electrical verification and radiation testing. The shift registers are each implemented as a single chip, following the classical digital design flow. The clock tree and the asynchronous reset tree are realized with robust inverter cells. It has to be emphasized that no extra effort is made for SET mitigation at the chip level. The shift register designs do not contain additional filter structures on the input or output side.

Several variants of $\Delta$TMR have been implemented with different SET filter sizes ($\delta$) and different spacings ($\Delta x_{seq|sec}$). Moreover, as published in [53], additional test devices with the L-$\Delta$TMR approach are fabricated as well. The most promising solutions for the scannable flip-flop architecture are also implemented. Table I lists a subset of the most important candidates of all $\Delta$TMR architectures. The TSPC variant and the self-correcting architecture SC-S-$\Delta$TMR were only electrically verified with on-wafer tests and are therefore not listed in the table.

TABLE I. FLIP-FLOP IMPLEMENTATIONS OVERVIEW

| Campaign No. | CUT | $\delta$-size [ns] | Min $t_{setup}$ overhead | Min $\Delta_{seq}$ [µm] | Min $\Delta_{sec}$ [µm] | Norm. A | Norm. $t_{pg}$ | Norm. E | Description |
|---|---|---|---|---|---|---|---|---|---|
| 1 | DFF_STD | n/a | n/a | n/a | n/a | 1.0 | 1.0 | 1.0 | Standard D-flip-flop |
| 1 | DTMR_01 | 0.18 | $2\delta$ | 15 | 0.5 | 11.9 | 1.7 | 14.4 | $\Delta$TMR with $\delta$, small spacing |
| 2 | DTMRR00 | – | – | 10 | 10 | 6.7 | 1.9 | 4.6 | $\Delta$TMR w/o $\delta$, larger spacing |
| 2 | DTMRR05 | 0.5 | $2\delta$ | 14 | 10 | 8.1 | 1.9 | 7.4 | $\Delta$TMR, larger $\delta$, spacing |
| 2 | LDTMRR00 | – | – | 5 | n/a | 7.9 | 1.8 | 5.5 | L-$\Delta$TMR w/o $\delta$ |
| 2 | LDTMRR05 | 0.5 | $2\delta$ | 5 | n/a | 7.9 | 1.8 | 8.4 | L-$\Delta$TMR with $\delta$ |
| 3 | DTMRR05N | 0.5 | $2\delta$ | 14 | 10 | 8.1 | 1.9 | 7.4 | DTMRR05 + deep N-well |
| 3 | SDTMRB05 | 0.5 | $2\delta$ | 19 | 9 | 10.8 | 2.1 | 7.9 | S-$\Delta$TMR-I with $\delta$, spacing |
| 3 | SDTMRN05 | 0.5 | $2\delta$ | 19 | 9 | 10.8 | 2.0 | 7.8 | SDTMRB05 w/o async. re-/set |

Fig. 13. Cross-section as a function of effective LET (LET<sub>eff</sub>) for all irradiated shift registers. (b) Experimental results for campaign no. 3, untilted and 45°-angled.

#### IX. RADIATION MEASUREMENTS

After successful on-wafer measurements, four different candidates were integrated in one 68-pin ceramic PGA package for radiation test campaigns (see Fig. 12). These open packages were exposed to different ions with a fluence of at least $2 \cdot 10^7\,\mathrm{cm}^{-2}$ and an LET between 3 and 67 MeV cm<sup>2</sup> mg<sup>-1</sup> at the cyclotron of the Université catholique de Louvain (UCL) in Belgium. Four different test patterns were applied and the errors were counted while irradiating the devices under test. The solid-0/solid-1 tests were applied in order to assess the sensitivity of the test circuits to faults that force the outputs to logic low and high levels, respectively. Both test setups were immune to soft errors in the clock tree, whereas solid-1 was still sensitive to events on the asynchronous reset network. In addition, two types of checkerboard tests were executed in order to investigate the faults in clock or reset trees.

In total, three different campaigns were performed as listed in Table I. The radiation test results of the first two campaigns are illustrated in diagram (a) of Fig. 13, where the cross-section $\sigma$ per bit was calculated as follows:

$$\sigma = \frac{\text{\#Errors}}{\text{Fluence} \times \text{\#Flip-flops}} \tag{3}$$

Most tests were performed with a fluence of $3 \cdot 10^7\,\text{cm}^{-2}$. Some tests at lower LET only used a fluence of $2 \cdot 10^7\,\text{cm}^{-2}$. For better illustration, we defined a global worst-case detection limit of $4.88 \times 10^{-11}\,\text{cm}^2$ corresponding to the minimum fluence. For both the $\Delta$TMR device DTMRR05 and the L-$\Delta$TMR device LDTMRR05, no SEU was detected at 62.5 MeV cm<sup>2</sup> mg<sup>-1</sup> and 46.1 MeV cm<sup>2</sup> mg<sup>-1</sup>, respectively. As can be seen, the proposed variants of $\Delta$TMR and L-$\Delta$TMR with integrated D-SET filter had a higher LET threshold in comparison to the initial variant DTMR_01.
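Eq. (3) and the quoted detection limit can be reproduced with a few lines of Python. The sketch assumes that the detection limit corresponds to a single observed error at the given fluence and that each shift register contains 1024 flip-flops, as stated in Section VIII. The function names are illustrative.

```python
N_FLIP_FLOPS = 1024  # flip-flops per shift register (Section VIII)


def cross_section_per_bit(errors, fluence_per_cm2, n_flip_flops=N_FLIP_FLOPS):
    """Eq. (3): errors / (fluence * number of flip-flops), in cm^2 per bit."""
    return errors / (fluence_per_cm2 * n_flip_flops)


def detection_limit(fluence_per_cm2, n_flip_flops=N_FLIP_FLOPS):
    """Cross-section of one single error (assumed definition of the limit)."""
    return cross_section_per_bit(1, fluence_per_cm2, n_flip_flops)


if __name__ == "__main__":
    # Minimum fluence of 2e7 ions per cm^2 used at lower LET
    print(f"{detection_limit(2e7):.2e} cm^2")
```

For the minimum fluence of $2 \cdot 10^7\,\text{cm}^{-2}$ this gives about $4.88 \times 10^{-11}\,\text{cm}^2$, which matches the worst-case detection limit stated above.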
Moreover, the results underline the importance of the D-SET filter section. In both cases without any $\delta$ delay, the first errors were already seen at 16.1 MeV cm<sup>2</sup> mg<sup>-1</sup>, when the circuits were irradiated with lower-energy ions.

The experimental results of the third campaign, with the scannable flip-flops and an improved DTMRR05 variant with a deep N-well layout for better SEL robustness and noise performance, are shown in Fig. 13 (b). The diagram shows two groups of experiments. The solid black curves indicate the results of all four test patterns, whereas the blue lines show the resulting cross sections of the 45°-tilted experiments. The detection limit was determined to be $2.44 \times 10^{-11}\,\text{cm}^2$ for the non-angled experiments and $6.21 \times 10^{-11}\,\text{cm}^2$ for the angled experiments. Angled strikes increase the cross section because they result in longer trajectories, i.e., more deposited charge. Nevertheless, most particles are below an effective LET of 30 MeV cm<sup>2</sup> mg<sup>-1</sup>, and the cross section for 45°-angled strikes does not increase for LET below 30 MeV cm<sup>2</sup> mg<sup>-1</sup>. As a consequence, the proposed architectures are suitable solutions for most radiation-hard applications.

Based on the experimental results, it can be observed that the proposed flip-flop variants provide a high level of robustness to soft errors and could be considered for space applications. Since most energetic particles in the space environment have an LET below 30 MeV cm<sup>2</sup> mg<sup>-1</sup>, all proposed flip-flop designs will be robust under these conditions. This performance is comparable to some well-known custom-designed flip-flops. For example, the threshold LET of HIT flip-flops in Imec's DARE library ranges from 33.1 to 59 MeV cm<sup>2</sup> mg<sup>-1</sup>, as reported in [33].

#### X. CONCLUSION

In this paper, several design variants for robust TMR flip-flops based on standard cells from a non-rad-hard digital library are presented. All proposed architectures are arranged and modeled as digital design flow compliant standard cells. They feature a dedicated internal component placement and an integrated transient filter on the data path. Designs for a baseline TMR flip-flop, a latch-based TMR flip-flop, a TSPC TMR flip-flop, a scannable TMR flip-flop, and a self-correcting architecture using the proposed $\Delta$TMR approach are presented. The most promising subset of the $\Delta$TMR designs is implemented in a 130 nm CMOS technology and realized as shift registers for radiation experiments. The results show a threshold LET of 32.4, 46.1, or 62.5 MeV cm<sup>2</sup> mg<sup>-1</sup>, depending on the selected variant. When 45°-angled strikes are applied, no increase of the cross section for LET below 30 MeV cm<sup>2</sup> mg<sup>-1</sup> is observed. With the presented $\Delta$TMR flip-flops, it will be possible to design complex digital ASICs for space applications more efficiently. As each variant brings a certain trade-off between robustness and overhead, the selection of the most suitable variant for a given circuit node depends on the application requirements.

#### REFERENCES

- [1] P. E. Dodd, M. R. Shaneyfelt, J. R. Schwank, and J. A. Felix, "Current and future challenges in radiation effects on CMOS electronics," *IEEE Trans. Nucl. Sci.*, vol. 57, no. 4, pp. 1747–1763, Aug. 2010.
- [2] E. Ibe, H. Taniguchi, Y. Yahagi, K. Shimbo, and T. Toba, "Impact of scaling on neutron-induced soft error in SRAMs from a 250 nm to a 22 nm design rule," *IEEE Trans. Electron Devices*, vol. 57, no. 7, pp. 1527–1538, Jul. 2010.
- [3] J. R. Schwank, M. R. Shaneyfelt, and P. E.
Dodd, "Radiation hardness assurance testing of microelectronic devices and integrated circuits: Radiation environments, physical mechanisms, and foundations for hardness assurance," *IEEE Trans. Nucl. Sci.*, vol. 60, no. 3, pp. 2074–2100, Jun. 2013.
- [4] P. E. Dodd and L. W. Massengill, "Basic mechanisms and modeling of single-event upset in digital microelectronics," *IEEE Trans. Nucl. Sci.*, vol. 50, no. 3, pp. 583–602, Jun. 2003.
- [5] R. C. Baumann, "Radiation-induced soft errors in advanced semiconductor technologies," *IEEE Trans. Device Mater. Rel.*, vol. 5, no. 3, pp. 305–316, Sep. 2005.
- [6] H. J. Barnaby, "Total-ionizing-dose effects in modern CMOS technologies," *IEEE Trans. Nucl. Sci.*, vol. 53, no. 6, pp. 3103–3121, Dec. 2006.
- [7] W. J. Snoeys, T. A. P. Gutierrez, and G. Anelli, "A new NMOS layout structure for radiation tolerance," in *Proc. IEEE Nucl. Sci. Symp. Conf. Rec.*, vol. 2, Nov. 2001, pp. 822–826.
- [8] L. Chen and D. M. Gingrich, "Study of N-channel MOSFETs with an enclosed-gate layout in a 0.18 μm CMOS technology," *IEEE Trans. Nucl. Sci.*, vol. 52, no. 4, pp. 861–867, Aug. 2005.
- [9] R. R. Troutman, *Latchup in CMOS Technology: The Problem and Its Cure* (The Springer International Series in Engineering and Computer Science). Springer, 1986. [Online]. Available: https://www.springer.com/de/book/9780898382150
- [10] C. J. Marshall *et al.*, "Mechanisms and temperature dependence of single event latchup observed in a CMOS readout integrated circuit from 16–300 K," *IEEE Trans. Nucl. Sci.*, vol. 57, no. 6, pp. 3078–3086, Dec. 2010.
- [11] M. Nicolaidis, "A low-cost single-event latchup mitigation scheme," in *Proc. 12th IEEE Int. On-Line Test. Symp. (IOLTS)*, Jul. 2006, p. 5.
- [12] V. Petrović, G. Schoof, and Z. Stamenković, "Fault-tolerant TMR and DMR circuits with latchup protection switches," *Microelectron. Rel.*, vol. 54, no. 8, pp. 1613–1626, Aug. 2014. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0026271414001267
- [13] T. Aoki, "A practical high-latchup immunity design methodology for internal circuits in the standard cell-based CMOS/BiCMOS LSIs," *IEEE Trans. Electron Devices*, vol. 40, no. 8, pp. 1432–1436, Aug. 1993.
- [14] N. Seifert *et al.*, "Soft error rate improvements in 14-nm technology featuring second-generation 3D tri-gate transistors," *IEEE Trans. Nucl. Sci.*, vol. 62, no. 6, pp. 2570–2577, Dec. 2015.
- [15] K. Lilja, "Environment and devices SER—Modeling neutrons and heavy ion SER, from planar CMOS to FinFETs," in *Proc. IEEE Nucl. Space Radiat. Effects Conf. (NSREC)*, Jul. 2016.
- [16] R. C. Lacoe, "Improving integrated circuit performance through the application of hardness-by-design methodology," *IEEE Trans. Nucl. Sci.*, vol. 55, no. 4, pp. 1903–1925, Aug. 2008.
- [17] S. Weidling and M. Goessel, "Fault tolerant linear state machines," in *Proc. 15th Latin Amer. Test Workshop (LATW)*, Mar. 2014, pp. 1–6.
- [18] P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi, "Modeling the effect of technology trends on the soft error rate of combinational logic," in *Proc. Int. Conf. Dependable Syst. Netw.*, 2002, pp. 389–398.
- [19] N. N. Mahatme *et al.*, "Impact of technology scaling on the combinational logic soft error rate," in *Proc. IEEE Int. Rel. Phys. Symp.*, Jun. 2014, pp. 5F.2.1–5F.2.6.
- [20] A. Simevski, P. Skoncej, C. Calligaro, and M. Krstic, "Scalable and configurable multi-chip SRAM in a package for space applications," in *Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Nanotechnol. Syst. (DFT)*, Oct. 2019, pp. 1–6.
- [21] B. Narasimham *et al.*, "Quantifying the reduction in collected charge and soft errors in the presence of guard rings," *IEEE Trans. Device Mater. Rel.*, vol. 8, no. 1, pp. 203–209, Mar. 2008.
- [22] J.-J. Chen *et al.*, "Novel layout technique for single-event transient mitigation using dummy transistor," *IEEE Trans. Device Mater. Rel.*, vol. 13, no. 1, pp. 177–184, Mar. 2013.
- [23] J. Furuta, K. Kobayashi, and H. Onodera, "Impact of cell distance and well-contact density on neutron-induced multiple cell upsets," in *Proc. IEEE Int. Rel. Phys. Symp. (IRPS)*, Apr. 2013, pp. 6C.3.1–6C.3.4.
- [24] T. Calin, M. Nicolaidis, and R. Velazco, "Upset hardened memory design for submicron CMOS technology," *IEEE Trans. Nucl. Sci.*, vol. 43, no. 6, pp. 2874–2878, Dec. 1996.
- [25] O. A. Amusan *et al.*, "Single event upsets in deep-submicrometer technologies due to charge sharing," *IEEE Trans. Device Mater. Rel.*, vol. 8, no. 3, pp. 582–589, Sep. 2008.
- [26] T. D. Loveless *et al.*, "Neutron- and proton-induced single event upsets for D- and DICE-flip/flop designs at a 40 nm technology node," *IEEE Trans. Nucl. Sci.*, vol. 58, no. 3, pp. 1008–1014, Jun. 2011.
- [27] M. D'Alessio, M. Ottavi, and F. Lombardi, "Design of a nanometric CMOS memory cell for hardening to a single event with a multiple-node upset," *IEEE Trans. Device Mater. Rel.*, vol. 14, no. 1, pp. 127–132, Mar. 2014.
- [28] S. Campitelli, M. Ottavi, S. Pontarelli, A. Marchioro, D. Felici, and F. Lombardi, "F-DICE: A multiple node upset tolerant flip-flop for highly radioactive environments," in *Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Nanotechnol. Syst. (DFTS)*, Oct. 2013, pp. 107–111.
- [29] H.-H. K. Lee *et al.*, "LEAP: Layout design through error-aware transistor positioning for soft-error resilient sequential cell design," in *Proc. IEEE Int. Rel. Phys. Symp. (IRPS)*, May 2010, pp. 203–212.
- [30] S. M. Jahinuzzaman, D. J. Rennie, and M. Sachdev, "A soft error tolerant 10T SRAM bit-cell with differential read capability," *IEEE Trans. Nucl. Sci.*, vol. 56, no. 6, pp. 3768–3773, Dec. 2009.
- [31] Y.-Q. Li *et al.*, "A quatro-based 65-nm flip-flop circuit for soft-error resilience," *IEEE Trans. Nucl. Sci.*, vol. 64, no. 6, pp. 1554–1561, May 2017.
- [32] S.
Jagannathan et al., "Single-event tolerant flip-flop design in 40-nm bulk CMOS technology," *IEEE Trans. Nucl. Sci.*, vol. 58, no. 6, pp. 3033–3037, Dec. 2011. - [33] S. Redant. The Dare Library Family. DARE User Day. ESA/ESTEC. Accessed: Feb. 15, 2011. [Online]. Available: https://www.esa.int - [34] J. E. Knudsen and L. T. Clark, "An area and power efficient radiation hardened by design flip-flop," *IEEE Trans. Nucl. Sci.*, vol. 53, no. 6, pp. 3392–3399, Dec. 2006. - [35] R. Naseer and J. Draper, "DF-DICE: A scalable solution for soft error tolerant circuit design," in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2006, p. 4. - [36] A. Yan, Y. Hu, J. Song, and X. Wen, "Single-event double-upset self-recoverable and single-event transient pulse filterable latch design for low power applications," in *Proc. Design, Automat. Test Eur. Conf. Exhib. (DATE)*, Mar. 2019, pp. 1679–1684. - [37] S. M. Jahinuzzaman and R. Islam, "TSPC-DICE: A single phase clock high performance SEU hardened flip-flop," in *Proc.* 53rd IEEE Int. Midwest Symp. Circuits Syst., Aug. 2010, pp. 73–76. - [38] S. M. Jahinuzzaman, D. J. Rennie, and M. Sachdev, "Soft error robust impulse and TSPC flip-flops in 90 nm CMOS," in *Proc. 2nd Microsyst. Nanoelectron. Res. Conf.*, Oct. 2009, pp. 45–48. [39] S. Gupta and J. Mekie, "Soft error resilient and energy efficient dual - [39] S. Gupta and J. Mekie, "Soft error resilient and energy efficient dual modular TSPC flip-flop," in *Proc. 32nd Int. Conf. VLSI Design*, 18th Int. Conf. Embedded Syst. (VLSID), Jan. 2019, pp. 341–346. - [40] V. Srinivasan, A. L. Sternberg, A. R. Duncan, W. H. Robinson, B. L. Bhuva, and L. W. Massengill, "Single-event mitigation in combinational logic using targeted data path hardening," *IEEE Trans. Nucl.* Sci., vol. 52, no. 6, pp. 2516–2523, Dec. 2005. - [41] M. Zhang et al., "Sequential element design with built-in soft error resilience," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 14, no. 12, pp. 1368–1378, Dec. 
2006. - [42] J. von Neumann, "The institute for advanced study Princeton," in Automata Studies. (AM-34), vol. 34. Princeton, NJ, USA: Princeton Univ. Press, California Institute of Technology, 1956. - [43] R. Oliveira, A. Jagirdar, and T. J. Chakraborty, "A TMR scheme for SEU mitigation in scan flip-flops," in *Proc. 8th Int. Symp. Qual. Electron. Design (ISQED)*, Mar. 2007, pp. 905–910. - [44] L. Cassano, A. Bosio, and G. Di Natale, "A novel adaptive fault tolerant flip-flop architecture based on TMR," in *Proc. 19th IEEE Eur. Test Symp.* (ETS), May 2014, pp. 1–2. - [45] N. D. Hindman et al., "High speed redundant self-correcting circuits for radiation hardened by design logic," in *Proc. Eur. Conf. Radiat. Effects Compon. Syst.*, Sep. 2009, pp. 465–472. - [46] C. Ramamurthy, A. Gujja, V. Vashishtha, S. Chellappa, and L. T. Clark, "Muller C-element self-corrected triple modular redundant logic with multithreading and low power modes," in *Proc. 17th Eur. Conf. Radiat. Effects Compon. Syst. (RADECS)*, Oct. 2017, pp. 1–4. - [47] A. J. Drake, A. Kleinosowski, and A. K. Martin, "A self-correcting soft error tolerant flop-flop," in *Proc. 12th NASA Symp. VLSI Design*, Coeur d'Alene, ID, USA, 2005, pp. 4–5. - [48] N. D. Hindman, L. T. Clark, D. W. Patterson, and K. E. Holbert, "Fully automated, testable design of fine-grained triple mode redundant logic," *IEEE Trans. Nucl. Sci.*, vol. 58, no. 6, pp. 3046–3052, Dec. 2011. - [49] V. Petrovic and M. Krstic, "Design flow for radhard TMR flip-flops," in *Proc. IEEE 18th Int. Symp. Design Diag. Electron. Circuits Syst.*, Apr. 2015, pp. 203–208. - [50] L. A. C. Benites and F. L. Kastensmidt, "Automated design flow for applying triple modular redundancy (TMR) in complex digital circuits," in *Proc. IEEE 19th Latin-Amer. Test Symp. (LATS)*, Mar. 2018, pp. 1–4. - [51] J. S. Kauppila *et al.*, "A bias-dependent single-event compact model implemented into BSIM4 and a 90 nm CMOS process design kit," *IEEE Trans. Nucl. Sci.*, vol. 
56, no. 6, pp. 3152–3157, Dec. 2009. - [52] B. Narasimham et al., "Characterization of digital single event transient pulse-widths in 130-nm and 90-nm CMOS technologies," *IEEE Trans. Nucl. Sci.*, vol. 54, no. 6, pp. 2506–2511, Dec. 2007. - [53] O. Schrape, A. Breitenreiter, C. Schulze, S. Zeidler, and M. Krstic, "Radiation-hardness-by-design latch-based triple modular redundancy flip-flops," in *Proc. IEEE 12th Latin Amer. Symp. Circuits Syst.* (LASCAS), Feb. 2021, pp. 1–4. - [54] R. Weigand, "Single event effect mitigation in digital integrated circuits for space," in *Proc. Top. Workshop Electron. Part. Phys.*, Aachen, Germany, 2010. - [55] A. Gujja, S. Chellappa, C. Ramamurthy, and L. T. Clark, "Redundant skewed clocking of pulse-clocked latches for low power soft error mitigation," in *Proc. 15th Eur. Conf. Radiat. Effects Compon. Syst.* (RADECS), Sep. 2015, pp. 1–7. - [56] S. Kumar, S. Chellappa, and L. T. Clark, "Temporal pulse-clocked multibit flip-flop mitigating SET and SEU," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, May 2015, pp. 814–817. - [57] O. Schrape, A. Breitenreiter, M. Andjelkovic, and M. Krstic, "D-SET mitigation using common clock tree insertion techniques for triple-clock TMR flip-flop," in *Proc. 21st Euromicro Conf. Digit. Syst. Design (DSD)*, Prague, Czech Republic, Aug. 2018, pp. 201–205. [58] S. Rezgui, J. J. Wang, E. C. Tung, B. Cronquist, and J. McCollum, - [58] S. Rezgui, J. J. Wang, E. C. Tung, B. Cronquist, and J. McCollum, "New methodologies for set characterization and mitigation in flash-based FPGAs," *IEEE Trans. Nucl. Sci.*, vol. 54, no. 6, pp. 2512–2524, Dec. 2007. - [59] F. Smith, "A new methodology for single event transient suppression in flash FPGAs," *Microprocess. Microsyst.*, vol. 37, no. 3, pp. 313–318, May 2013. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0141933112001937 - [60] P. Balasubramanian and N. 
E Mastorakis, "Power, delay and area comparisons of majority voters relevant to TMR architectures," 2016, arXiv:1603.07964. [Online]. Available: http://arxiv.org/abs/1603.07964 - [61] T. Ban and L. A. de Barros Naviner, "A simple fault-tolerant digital voter circuit in TMR nanoarchitectures," in *Proc. 8th IEEE Int. NEWCAS Conf.*, Jun. 2010, pp. 269–272. - [62] B. Razavi, "TSPC logic [a circuit for all seasons]," *IEEE Solid-State Circuit Mag.*, vol. 8, no. 4, pp. 10–13, Nov. 2016. [63] Y. Ji-Ren, I. Karlsson, and C. Svensson, "A true single-phase-clock - [63] Y. Ji-Ren, I. Karlsson, and C. Svensson, "A true single-phase-clock dynamic CMOS circuit technique," *IEEE J. Solid-State Circuits*, vol. SSC-22, no. 5, pp. 899–901, Oct. 1987. - [64] J. Yuan and C. Svensson, "High-speed CMOS circuit technique," *IEEE J. Solid-State Circuits*, vol. 24, no. 1, pp. 62–70, Feb. 1989. - [65] O. Schrape, M. Andjelkovic, A. Breitenreiter, A. Balashov, and M. Krstic, "Design concept for radiation-hardening of triple modular redundancy TSPC flip-flops," in *Proc. 23rd Euromicro Conf. Digit. Syst. Design (DSD)*, Aug. 2020, pp. 616–621. - [66] M. P. Baze, J. C. Killens, R. A. Paup, and W. P. Snapp, "SEU hardening techniques for retargetable, scalable, sub-micron digital circuits and libraries," in *Proc. 13th Biennial Single Effects Symp.*, Manhattan Beach, CA, USA, 2002. - [67] O. Schrape, A. Breitenreiter, S. Zeidler, and M. Krstic, "Aspects on timing modeling of radiation-hardness by design standard cell-based ΔTMR flip-flops," in *Proc. 22nd Euromicro Conf. Digit. Syst. Design* (DSD), Aug. 2019, pp. 639–642. - [68] A. Balashov and O. Schrape, "Aspects of library characterization of digital standard cells with complex circuit topology," in *Proc. Cadence User Conf.*, 2018. Oliver Schrape received the Diploma degree in computer science from Humboldt University of Berlin, Germany, in 2008. Since 2007, he has been with the Department of System Architectures, IHP. 
His research interests include high-speed digital design, design automation for differential logic applications, and design methodologies and techniques for fault-tolerant and radiation-hardness-by-design applications.

Steffen Zeidler received the Diploma degree from the University of Potsdam in 2007 and the Dr.-Ing. degree from Brandenburg University of Technology, Cottbus, Germany, in 2013. Since 2007, he has been a member of the Graduate School DEDIS Nano, Brandenburg University of Technology. Since 2014, he has been leading the Test Service Team. His research interests include asynchronous design, design-for-testability, fault-tolerant design, and test of asynchronous circuits.

Marko Andjelković received the Dipl.-Ing. degree in electronics from the Faculty of Electronic Engineering, University of Nis, Serbia, in 2008. Since 2010, he has been a Scientific Researcher with the Faculty of Electronic Engineering, University of Nis, where he worked on characterization of custom-made dosimeters, evaluation of dosimetric properties of commercial semiconductor components, and design of readout electronics for experimental evaluation of various types of dosimeters. Since 2016, he has been with IHP, where he is employed as a Research Associate with the System Architectures Department. His research interests include characterization and modeling of [...]

Anselm Breitenreiter [...] in computer engineering from Technical University Berlin, Germany, in 2015. After his graduation, he joined IHP, Frankfurt, Germany, where he is a member of Prof. Miloš Krstić's Department of System Architectures and Fault Tolerant Computing. His main research interests include methods for fault susceptibility analysis and partial fault tolerance and their integration into the digital design flow.

Alexey Balashov was born in Russia, in 1977. He received the Ph.D. degree in solid-state microelectronics from Moscow Institute of Electronic Engineering, Russia, in 2005.
Before joining IHP, Frankfurt, Germany, he was with the Analog Design Group, Cadence Design Systems, and the Analog Design Group, Freescale Semiconductor. Since 2013, he has been with the Design Kit Group, Technology Department, IHP.

Miloš Krstić received the Dr.-Ing. degree in electronics from Brandenburg University of Technology, Cottbus, Germany, in 2006. Since 2001, he has been with IHP, Frankfurt, Germany, where he leads the Department of System Architectures. Since 2016, he has also been a Professor of design and test methodology with the University of Potsdam. For the last few years, his work has mainly focused on fault-tolerant architectures and design methodologies for digital systems integration. He has managed many international and national research and development projects at IHP (GALAXY, EMPHASE, IC-NAO, ENROL, RTU-ASIC, SEPHY, DIFFERENT, VHiSSi, and RESCUE). He also leads and coordinates space activities at IHP. He has published more than 200 journal and conference papers and registered nine patents.