# A 4-Transistors/1-Resistor Hybrid Synapse Based on Resistive Switching Memory (RRAM) Capable of Spike-Rate-Dependent Plasticity (SRDP)

Valerio Milo, Student Member, IEEE, Giacomo Pedretti, Student Member, IEEE, Roberto Carboni, Student Member, IEEE, Alessandro Calderoni, Nirmal Ramaswamy, Senior Member, IEEE, Stefano Ambrogio, Member, IEEE, and Daniele Ielmini, Senior Member, IEEE

Abstract-Mimicking the cognitive functions of the brain in hardware is a primary challenge for several fields, including device physics, neuromorphic engineering, and biological neuroscience. A key element in cognitive hardware systems is the ability to learn via biorealistic plasticity rules, combined with the area scaling capability to enable integration of high-density neuron/synapse networks. For this purpose, resistive switching memory (RRAM) devices have recently attracted strong interest as potential synaptic elements. Here, we present a novel hybrid 4-transistors/1-resistor synapse capable of spike-rate-dependent plasticity. The frequency-dependent learning behavior of the synapse is shown by experiments on HfO<sub>2</sub> RRAM devices. Unsupervised learning, update, and recognition of one or more visual patterns in sequence are demonstrated at the level of the neural network, thus supporting the feasibility of hybrid CMOS/RRAM integrated circuits matching the learning capability of the human brain.

*Index Terms*—Neuromorphic networks, online learning, pattern learning, resistive switching memory (RRAM), spike-rate dependent plasticity (SRDP).

## I. INTRODUCTION

NEUROMORPHIC computing is attracting increasing interest for cognitive functions, such as pattern recognition [1] and natural language processing [2]. In a neuromorphic circuit, integrate-and-fire (I&F) neurons are connected by synapses, and usually process information by an event-driven spiking activity [3]. Spikes serve for both carrying the information and inducing plasticity in the synapses, which forms the basis for learning. Brain-inspired learning rules are generally based on the timing of the spike arriving from the presynaptic neuron, or PRE, and the spike delivered by the postsynaptic neuron, or POST [Fig. 1(a)]. For instance, in spike-timing-dependent plasticity (STDP), the change of synaptic weight is dictated by the delay between the PRE and POST spikes.

Fig. 1. (a) Sketch of a biological synapse connecting PRE- and POST-synaptic neurons. (b) Schematic of corresponding PRE-synapse-POST circuit. 4T1R synapse is capable of LTP via $M_1/M_2$ branch, which is controlled by PRE spikes at average frequency $f_{\rm PRE}$ induced by external stimuli, and LTD via $M_3/M_4$ branch, which is activated by PRE and POST noise spikes at average frequencies $f_3$ and $f_4$, respectively.

Manuscript received August 5, 2017; revised December 3, 2017; accepted January 23, 2018. Date of publication April 25, 2018; date of current version November 30, 2018. This work was supported by the European Research Council through the European Union's Horizon 2020 Research and Innovation Program under Grant 648635. (*Corresponding author: Daniele Ielmini.*)

V. Milo, G. Pedretti, R. Carboni, and D. Ielmini are with the Dipartimento di Elettronica, Informazione e Bioingegneria, Italian Universities Nanoelectronics Team, Politecnico di Milano, 20133 Milano, Italy (e-mail: valerio.milo@polimi.it; giacomo.pedretti@polimi.it; roberto.carboni@polimi.it; daniele.ielmini@polimi.it).

A. Calderoni and N. Ramaswamy are with Micron Technology, Inc., Boise, ID 83707 USA (e-mail: acaldero@micron.com; dramaswamy@micron.com).

S. Ambrogio was with the Dipartimento di Elettronica, Informazione e Bioingegneria, Italian Universities Nanoelectronics Team, Politecnico di Milano, 20133 Milano, Italy. He is now with IBM Research-Almaden, San Jose, CA 95120 USA (e-mail: stefano.ambrogio@ibm.com).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TVLSI.2018.2818978

STDP has been demonstrated to occur in certain synapses in the brain [4], [5], and is currently among the most popular approaches for unsupervised training of neural networks [6]–[8].

Other learning rules have been considered to be responsible for learning in biological neural networks. According to the Bienenstock–Cooper–Munro theory [9], synaptic plasticity is governed by the PRE- and POST-spike frequencies, rather than the timing of a pair of PRE and POST spikes. A high frequency of PRE and POST spikes leads to potentiation, while a low frequency leads to depression. This spike-rate-dependent plasticity (SRDP) has been recognized as a biorealistic learning rule [10], and linked to triplet-based learning rules [11], where potentiation relies on the temporal occurrence of three spikes [12]. Integrated circuits capable of learning by STDP or SRDP rules generally require complicated and large synaptic blocks hosting multiple transistors and capacitors [13], [14]. To enable small-area synapses, and hence high-density neural circuits, emerging memories such as resistive switching memory (RRAM) and phase change memory (PCM) have recently attracted strong interest [15]–[26]. The development of RRAM-based SRDP synapses is still a major challenge for neuromorphic engineering [27]–[32].

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/

This paper presents an RRAM-based SRDP synapse with a 4-transistors/1-resistor (4T1R) structure. In the synapse, the RRAM provides the synaptic weight for spike-based communication, whereas potentiation/depression is achieved via three-spike overlapping according to a modified triplet rule.

We implement this scheme in a 4T1R synapse prototype and provide extensive experimental characteristics. Our data demonstrate pattern learning by SRDP in hardware, by separately showing depression of background synapses and potentiation of pattern synapses. To corroborate our experimental results, we simulate an $8 \times 8$, two-layer neural network, showing learning of a single visual pattern for various configurations of the initial weights, and investigate the learning efficiency of the network as a function of the noise frequency. Finally, our simulations demonstrate online learning of two visual patterns submitted in sequence, evidencing real-time adaptation of SRDP-based 4T1R synapses.

## II. SYNAPSE STRUCTURE

Fig. 1(b) illustrates the circuit architecture of the SRDP synapse in this paper. The synapse consists of a hybrid CMOS/RRAM structure, combining four MOS transistors and a bipolar-switching RRAM device [16], [33], and serving as connection between a PRE and a POST [27]. In the synaptic circuit, the transistors are arranged in two branches, namely, transistors $M_1$ and $M_2$, which are responsible for synaptic long-term potentiation (LTP), and transistors $M_3$ and $M_4$, which are responsible for synaptic long-term depression (LTD). The RRAM device is connected in series with the parallel combination of branches $M_1/M_2$ and $M_3/M_4$. The PRE spike is applied to the gate of $M_1$ and, after a delay $\Delta t_D$, to the gate of $M_2$. The gate of $M_3$ is driven by random noise PRE spiking. The POST consists of an I&F circuit, which delivers a fire spike to the top electrode (TE) of the RRAM device when the internal potential resulting from integration exceeds a certain threshold [21], [34]. The POST also generates a negative noise spike that is alternatively submitted to the TE and, after inversion, to the gate of $M_4$. The POST multiplexer activates the fire channel at every POST fire, temporarily inhibiting the noise channel to the TE. Noise spikes can be obtained by tunable random number generator circuits, e.g., by amplification of thermal noise, 1/f noise [35], or random telegraph noise [36], or by random set processes in RRAM devices [37], [38].

Fig. 2. Illustrative description of spike timing inducing (a) LTP for high-frequency PRE spiking activity, (b) no LTP for low-frequency PRE spiking activity, and (c) stochastic LTD via PRE and POST noise spikes.

The hybrid CMOS/RRAM structure of our synapse has some key advantages compared to the previous approaches in which SRDP was demonstrated by specific RRAM materials, such as Ag<sub>2</sub>S [28], Ag/AgInSbTe/Ag [29], Pt/FeO<sub>x</sub>/Pt [30], Al/TiO<sub>2-x</sub>/AlO<sub>x</sub>/Al [31], and Ag/SiON [32]. In particular, our synapse relies on memory-grade RRAM technology with fast switching, long endurance, and long-term retention, which might be used in a multipurpose system on chip for several functions, including embedded nonvolatile memory for code/data storage, generation of random keys for hardware security functions, such as a physical unclonable function [39], and neuromorphic synapse/neuron circuits for on-chip cognitive computation.

## III. SYNAPSE POTENTIATION AND DEPRESSION

### A. Potentiation at High PRE-Spike Frequency

Fig. 3. Measured and calculated cumulative distributions of resistance R (a) before and (b) after learning. (c) Measured and calculated average R for increasing $f_{PRE}$. (d) Number of overlapping PRE spikes activating $M_1$ and $M_2$ as a function of $f_{PRE}$.

Synapse potentiation takes place at high frequency of PRE spiking, as shown in Fig. 2(a). In fact, if the PRE frequency is higher than $\Delta t_D^{-1}$ ($f_{PRE} > \Delta t_D^{-1}$), there is a strong probability for the gate of $M_1$ (activated by a spike at time *t*) and the gate of $M_2$ (activated by a previous spike delayed by $\Delta t_D$) to be stimulated at the same time. The repeated and simultaneous activation of $M_1$ and $M_2$, which form a NAND-type series branch, results in current spikes which are integrated in the I&F circuit and finally cause fire. The fire spike is then delivered to the TE of the RRAM such that the overlapping spikes at $M_1$, $M_2$, and TE induce a set process of the resistive device, hence an LTP event. Note that the positive fire spike is also applied to the gate of $M_4$ after inversion, which deactivates the $M_3/M_4$ branch. In summary, a high PRE spiking frequency causes LTP through the $M_1/M_2$ branch. This result supports the need for a triplet of spikes (PRE–PRE–POST) to induce a frequency-dependent potentiation of a synapse [11], [12].
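The coincidence condition for potentiation can be explored numerically. The following is a minimal sketch, not the authors' simulator: it assumes that time is discretized into 1-ms bins, that PRE spikes are independent random events with probability $f_{PRE}$ times the bin width in each bin, and that the delay is $\Delta t_D = 10$ ms. The function names `bernoulli_spike_train`, `count_overlaps`, and `mean_overlaps` are hypothetical, and neither the I&F neuron nor RRAM switching is modeled.

```python
import random


def bernoulli_spike_train(rate_hz, duration_s, dt_s=1e-3, rng=None):
    """Return one 0/1 flag per time bin, with P(spike) = rate_hz * dt_s (assumed model)."""
    rng = rng or random.Random()
    p = min(1.0, rate_hz * dt_s)
    n_bins = int(round(duration_s / dt_s))
    return [1 if rng.random() < p else 0 for _ in range(n_bins)]


def count_overlaps(train, delay_bins):
    """Count bins where M1 (spike now) and M2 (same train, delayed) are both driven."""
    return sum(
        1
        for t in range(delay_bins, len(train))
        if train[t] and train[t - delay_bins]
    )


def mean_overlaps(rate_hz, trials=200, duration_s=0.75, delay_s=10e-3,
                  dt_s=1e-3, seed=0):
    """Average number of M1/M2 coincidences over many random training windows."""
    rng = random.Random(seed)
    delay_bins = int(round(delay_s / dt_s))
    total = 0
    for _ in range(trials):
        train = bernoulli_spike_train(rate_hz, duration_s, dt_s, rng)
        total += count_overlaps(train, delay_bins)
    return total / trials
```

Calling `mean_overlaps(f)` for increasing `f` shows how quickly the average number of $M_1$/$M_2$ coincidences in a training window grows with the PRE rate under these assumptions.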

### B. Depression at Low PRE-Spike Frequency

As shown in Fig. 2(b), PRE spiking at low frequency ($f_{PRE} \ll \Delta t_D^{-1}$) cannot activate the NAND-type $M_1/M_2$ branch, and thus, LTP cannot take place. On the other hand, random noise spikes from the PRE and the POST can simultaneously activate $M_3$ and $M_4$, respectively, as shown in Fig. 2(c). Since the negative POST noise is applied to the TE, the simultaneous noise spiking of PRE and POST leads to a stochastic reset process of the RRAM device, hence a synaptic LTD event. As a result, the SRDP synapse undergoes LTP or LTD depending on the competition between the spike-controlled activation of the $M_1/M_2$ and the $M_3/M_4$ branches, respectively [27].
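The competition between the two branches can be summarized as a small state machine. The sketch below is an idealized illustration under stated assumptions, not the circuit itself: the synapse is treated as binary, set and reset are taken as instantaneous whenever their conditions are met, and the resistance values are the approximate LRS and HRS levels quoted in the characterization section. The class name `Synapse4T1R` and the encoding of the TE signal as +1 (fire), -1 (negative POST noise), or 0 (idle) are hypothetical.

```python
from dataclasses import dataclass

LRS_OHM = 20e3    # approximate low resistance state quoted in the paper
HRS_OHM = 150e3   # approximate high resistance state quoted in the paper


@dataclass
class Synapse4T1R:
    """Idealized binary model of the two-branch synapse (illustrative only)."""
    resistance: float = HRS_OHM

    def step(self, m1, m2, m3, m4, te):
        """Apply one time bin.

        m1..m4 are booleans for the gate drive of each transistor;
        te is +1 for a POST fire spike, -1 for a negative POST noise spike,
        and 0 when the top electrode is idle.
        """
        if te > 0:
            # The fire spike reaches M4 after inversion and turns the LTD branch off.
            m4 = False
        if m1 and m2 and te > 0:
            self.resistance = LRS_OHM   # set transition: LTP
        elif m3 and m4 and te < 0:
            self.resistance = HRS_OHM   # reset transition: LTD
        return self.resistance
```

For example, `Synapse4T1R().step(True, True, False, False, +1)` potentiates the synapse, while a later `step(False, False, True, True, -1)` depresses it again; any bin in which only one transistor of a branch is driven leaves the state unchanged.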

Note that the 2-branch, 4T1R structure might be relatively expensive from the viewpoint of area consumption, e.g., compared to 1T1R synapses [21] and 2T1R synapses [22] for STDP. However, this is the minimum structure to serve the function of online potentiation/depression from rate-coded spiking information.

## IV. SYNAPSE CHARACTERISTICS

The potentiation/depression dynamics of the 4T1R synapse was studied by individually testing each branch by an integrated 2T1R structure, consisting of two MOS transistors and a HfO<sub>2</sub> RRAM device in series [40]. The bipolar-switching RRAM used in these experiments had a Ti TE and a TiN bottom electrode. The active material was Si-doped HfO<sub>2</sub> deposited with an amorphous phase. The Ti TE also plays the role of creating an oxygen exchange layer, by inducing an oxygen-vacancy-rich layer by oxygen gettering [40]. The TiN layer served as inert bottom electrode to prevent breakdown during the bipolar switching operation of the device. In addition, the size of MOS transistors used in the structure was  $W/L = 3 \ \mu m/1.45 \ \mu m$  [41].

To demonstrate the synaptic potentiation induced by a high-frequency PRE spiking, we characterized the LTP branch applying a constant positive voltage of 2 V to the TE, while the gate of $M_1$ was stimulated by a train of random spikes with amplitude 3.2 V, pulsewidth 1 ms, and average frequency $f_{PRE}$. The same train was delayed by a time $\Delta t_D = 10$ ms, then applied to the gate of $M_2$. The $M_2$ pulse amplitude was also reduced to 1.6 V to limit the overall current to a compliance level $I_C = 50 \ \mu A$ during set process for a controlled LTP. The RRAM device was prepared in a high resistance state (HRS) of about 150 k$\Omega$ to check the LTP statistics during a 0.75-s-long training process with a given value of $f_{PRE}$. The training experiment was repeated 1000 times on the same devices for each value of $f_{PRE}$. Fig. 3 shows the measured and calculated distributions of R (a) before and (b) after each training process, for increasing $f_{PRE}$. The initial distribution in Fig. 3(a) corresponds to the initial HRS, as obtained by a reset pulse of -1.6 V applied to the TE with gate voltage 3.2 V applied to $M_1$ and $M_2$. The distributions in Fig. 3(b), after training, show increasing fractions of low resistance state (LRS) for increasing $f_{PRE}$, with an average LRS resistance of 20 k$\Omega$. In particular, note that the probability of set transition is high only for $f_{PRE} \ge 100$ Hz, corresponding to an average time between two consecutive spikes of about $\Delta t_D$. Fig. 3(a) and (b) also show calculated distributions obtained by our stochastic simulator of RRAM synapse [21], derived from an analytical model of the bipolar RRAM [42]. The distributions were accurately predicted by calculating the probability for spike overlap within the 0.75-s-long training sequence, and assuming an R-dependent variability for LRS and HRS [21]. Fig. 3(c) summarizes the results by showing the measured and calculated average R as a function of $f_{PRE}$. The transition to the LTP regime occurs abruptly at $f_{\text{PRE}} = \Delta t_D^{-1}$.

Note that the SRDP synapse in Fig. 3 works as a binary synapse, namely, the RRAM device in the 4T1R structure is always found in either LRS or HRS. This is because of the rather abrupt transitions of set and reset processes in the adopted HfO<sub>2</sub> RRAM [21]. However, the adoption of RRAM devices with materials capable of gradual set/reset processes, such as  $Pr_{1-x}Ca_xMnO_3$  [43] or  $TaO_x/TiO_x$  bilayers [44], might result in analog SRDP of the synapse, with advantages in terms of gray-scale learning [16].

Fig. 4. Measured and calculated average R resulting from LTD branch characterization for increasing PRE-noise frequency $f_3$ at fixed POST-noise frequency $f_4$.

These results can be understood by the increasing probability for spike overlapping at $M_1$ and $M_2$ for increasing $f_{PRE}$, as shown in Fig. 3(d). Both experiments and calculations show that the overlap probability increases with $f_{PRE}^2$, as expected for the joint probability of two independent spikes in the Poissonian train exciting the LTP branch at the same time [27].
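The quadratic dependence can be checked with a back-of-the-envelope estimate. As a simplifying assumption that is not taken from the paper's simulator, divide time into bins of one pulse width $t_w = 1$ ms and let each bin contain a spike independently with probability $p = f_{PRE}\, t_w$. An overlap in a given bin requires a spike in that bin and another one $\Delta t_D$ earlier, so that

$$
P_{\text{overlap}} = p \cdot p = \left(f_{PRE}\, t_w\right)^2,
\qquad
N_{\text{overlap}} \approx \frac{T}{t_w}\left(f_{PRE}\, t_w\right)^2 = T\, t_w\, f_{PRE}^2 ,
$$

where $T$ is the training time and $N_{\text{overlap}}$ is the expected number of overlaps. For $T = 0.75$ s and $t_w = 1$ ms, this estimate gives $N_{\text{overlap}} \approx 7.5$ at $f_{PRE} = 100$ Hz and $N_{\text{overlap}} \approx 0.075$ at $f_{PRE} = 10$ Hz, which illustrates why overlaps become rare at low PRE rates within a finite training window.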

To demonstrate LTD, we tested the same 2T1R synapse by stimulating one transistor  $(M_3)$  by a spike train of amplitude 3.2 V at variable frequency  $f_3$ , while the other transistor  $(M_4)$ was stimulated by a spike train of amplitude 1.6 V and average frequency  $f_4 = 10$  Hz. The same pulse sequence of the gate of  $M_4$  was applied after inversion to the TE. This training sequence was maintained for 6000 epochs, equivalent to 6 s, and each experiment was repeated five times after preparing the device in the LRS. Fig. 4 shows the measured and calculated R as a function of  $f_3$ , indicating a transition to the LTD regime for  $f_3 > f_4$ , as the overlap probability becomes sufficiently large to allow for at least one reset transition [27].
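A rough estimate of how likely a noise coincidence is during the LTD experiment can be sketched as follows. This is an illustration under assumptions that go beyond the text: the two noise trains are taken as independent, each 1-ms epoch contains a spike with probability equal to the rate times the epoch width, and a single coincidence is assumed sufficient for reset. The function names are hypothetical.

```python
def expected_coincidences(f3_hz, f4_hz, duration_s, pulse_s=1e-3):
    """Expected number of epochs in which PRE noise and POST noise spike together."""
    n_bins = int(round(duration_s / pulse_s))
    return n_bins * (f3_hz * pulse_s) * (f4_hz * pulse_s)


def coincidence_probability(f3_hz, f4_hz, duration_s, pulse_s=1e-3):
    """Probability of at least one PRE/POST noise coincidence within the training time."""
    n_bins = int(round(duration_s / pulse_s))
    p_bin = min(1.0, f3_hz * pulse_s) * min(1.0, f4_hz * pulse_s)
    return 1.0 - (1.0 - p_bin) ** n_bins


# Sweep the PRE-noise rate at the fixed POST-noise rate used in the experiment.
for f3 in (1, 5, 10, 20, 50, 100):
    print(f3, round(coincidence_probability(f3, 10.0, 6.0), 3))
```

Under these assumptions the probability of at least one coincidence rises monotonically with $f_3$ at fixed $f_4 = 10$ Hz and 6 s of training, which is qualitatively in line with the onset of LTD at higher $f_3$.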

Note that the particular choice of frequency operation for potentiation and depression is dictated by the analogy with biological systems, e.g., experiments on synaptic plasticity *in vitro* [10]. However, by tuning $\Delta t_D$, $f_{PRE}$, and noise frequencies $f_3$ and $f_4$, it is possible to freely vary the operation frequency, e.g., for accelerated training of neural networks. The ultimate frequency for the SRDP synapse is in the range of 1 GHz, because the RRAM switching time is limited to a fraction of a nanosecond [45], [46].

## V. EXPERIMENTAL DEMONSTRATION OF LEARNING

To prove the feasibility of unsupervised learning by SRDP at the level of synaptic network, we considered the use of the SRDP synapse within a feed-forward perceptron-like neural network, where the input information is coded into the spiking frequency. Note, however, that the applicability of SRDP synapses is not restricted to a particular neuromorphic system or architecture. Indeed, SRDP synapses are generically suitable for the training of any spiking neural network, e.g., feed-forward or recurrent networks, in the presence of rate-coded spikes.



Fig. 5. Illustrative scheme of a two-layer perceptron neural network capable of pattern learning according to SRDP rule where high and low PRE spiking rates lead to pattern potentiation and background depression, respectively.

Fig. 5 depicts the considered two-layer perceptron, where the PREs in the first layer generate spikes at high or low frequency, depending on their position being within or outside of a pattern, assumed to correspond to a reference image. The PRE spikes are submitted to a single POST in the second layer via SRDP synapses. Thanks to the SRDP behavior, synapses in the pattern will experience LTP because of the high spiking frequency, whereas synapses in the background (i.e., outside of the pattern) will undergo LTD due to the low PRE spiking frequency overwhelmed by random noise spiking. The SRDP algorithm was applied to integrated 2T1R structures used alternatively as LTD and LTP branches in the 4T1R synapse. LTD and LTP operation schemes were applied for 1 s, each on the same 2T1R structure. As a reference synaptic network, we adopted an array of  $8 \times 8$  SRDP synapses that were initially prepared in a random state with resistance between LRS and HRS levels.

Fig. 6(a) shows the visual pattern that was considered as input for image learning demonstration. The training procedure consists of two phases: in the first phase (LTD), random noise images, such as the one in Fig. 6(b), were submitted for 1 s to all synapses to achieve LTD.

Starting from the initial synaptic weight distribution in Fig. 6(c), the first training phase resulted in LTD as demonstrated by the HRS weights in Fig. 6(d). In the second phase, the LTP mode was adopted by stimulating background and pattern synapses with random spikes at low frequency ($f_{PRE} = 5$ Hz) and high frequency ($f_{PRE} = 150$ Hz), respectively, for 1 s. The final weight distribution in Fig. 6(e) demonstrates learning of the pattern shown in Fig. 6(a), thanks to the spiking frequency being higher than $\Delta t_D^{-1}$.

Fig. 7(a) shows the measured synaptic weights 1/R as a function of time during the two phases of training. In the first period, both pattern and background synapses approach low weight due to noise-induced stochastic LTD.

In the second period, synaptic weights in the pattern increase due to LTP process induced by SRDP, while background synapses remain at a low conductance due to low-frequency spiking. Fig. 7(b) shows the corresponding average synaptic weights for the pattern and the background as a function of time, clearly indicating the LTD and LTP phases.

Fig. 6. Illustration of (a) input pattern and (b) example of random noise image submitted during the training process. Color plots of synaptic weights (c) initially prepared in a random state between LRS and HRS, (d) after LTD phase, and (e) as a result of pattern presentation during the LTP phase of the training process.

Fig. 7. (a) Time evolution of measured pattern (red) and background (cyan) conductance showing synaptic LTD within 1 s due to PRE and POST noise spiking and the selective potentiation of synapses in the pattern because of high-frequency PRE stimulation during the following 1-s-long LTP phase. (b) Mean evolution of measured pattern and background synaptic weights as a function of time supporting background depression and pattern potentiation.

## VI. SIMULATION STUDY

### A. Synapse Operation

To support the experimental study of the 4T1R synapse, we carried out extensive simulations at the level of the single device and of the neural network. We first calculated a color map, reported in Fig. 8, showing synaptic weight change $R_0/R$ as a function of $f_{\text{PRE}}$ and the reciprocal of time delay $\Delta t_D^{-1}$ by setting an initial intermediate resistance $R_0 = 100 \text{ k}\Omega$ and a training time of 1 s. Ideally, LTP transition should take place for any $f_{\text{PRE}} \ge \Delta t_D^{-1}$. However, since the training time is limited to 1 s, no conductance change is observed when $f_{\text{PRE}}$ and $\Delta t_D^{-1}$ assume low values, because no spike overlap events occur. In addition, the map shows that LTD transition can also be observed for $f_{\text{PRE}} < \Delta t_D^{-1}$ provided that PRE and POST noise rates, both set to $\Delta t_D^{-1}/10$, are sufficiently high.

### B. Single Pattern Learning

Fig. 8. Calculated color map of synapse conductance change $R_0/R$ for variable $f_{\text{PRE}}$ and $\Delta t_D^{-1}$ evidencing LTP (red), LTD (blue), and no weight change (green).

To further corroborate the SRDP learning by the 4T1R synapse, we simulated the two-layer perceptron network in Fig. 5. The same $8 \times 8$ pattern of Fig. 6(a) was adopted for simplicity. Fig. 9(a) shows the sequence of spikes submitted at each of the 64 channels, evidencing different spiking frequencies at the pattern ($f_{PRE} = 100 \text{ Hz}$) and background ($f_{PRE} = 1 \text{ Hz}$). Fig. 9(b) shows the distributions of time intervals between consecutive spikes for pattern and background, showing the exponential decay with interval length that is typical of random Poissonian events. Fig. 9(c) shows the distribution of interspike times for PRE and POST noise spiking with rates of $f_3 = 50$ Hz and $f_4 = 10$ Hz, respectively. Fig. 10 shows the calculated synaptic weights in a color plot at times (a) 0 s, (b) 5 s, and (c) 10 s, and (d) the detailed time evolution of the calculated 1/R during the whole training process. Initial weights are uniformly distributed between LRS and HRS. Pattern synapses are potentiated within about 1 s from the start of training, while background synapses approach low weight more slowly, as the noise spiking activity has lower frequency compared to $f_{PRE}$ in the pattern. Note that the pattern synapses may be temporarily disturbed from their high weight due to stochastic noise. We quantified this disturbance as a probability of 1% for pattern synapses to have low weight during training, under the conditions of this simulation. Also, we calculated the synaptic weights as a function of time during training under the same conditions as Fig. 10, except for the initial distribution being prepared in HRS (Fig. 11) or LRS (Fig. 12). In the first case, learning only requires LTP of pattern synapses, whereas in the second case complete learning requires LTD of the background synapses, and thus requires a longer time.
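The single-pattern learning experiment can be mimicked with a strongly simplified toy model. The sketch below is not the authors' simulator and makes several labeled assumptions: weights are binary (1 for LRS, 0 for HRS), time advances in 1-ms bins, the I&F neuron is replaced by the assumption that any $M_1$/$M_2$ overlap immediately triggers a POST fire, each synapse receives its own independent PRE noise, and the "X" image is a hypothetical layout on the two diagonals of the $8 \times 8$ grid. All function names are hypothetical.

```python
import random
from collections import deque


def x_pattern(n=8):
    """Hypothetical 'X' image: pixels on the two diagonals of an n x n grid."""
    return [1 if (r == c or r + c == n - 1) else 0
            for r in range(n) for c in range(n)]


def simulate(duration_s=10.0, dt_s=1e-3, delay_s=10e-3,
             f_pattern=100.0, f_background=1.0, f3=50.0, f4=10.0, seed=1):
    """Return (pattern, weights) after training; weights are 1 (LRS) or 0 (HRS)."""
    rng = random.Random(seed)
    pattern = x_pattern()
    rates = [f_pattern if p else f_background for p in pattern]
    weights = [rng.randint(0, 1) for _ in pattern]      # random initial state
    delay_bins = int(round(delay_s / dt_s))
    # history[0] always holds the PRE spikes emitted delay_bins ago (M2 gate signal)
    history = deque([[0] * len(pattern) for _ in range(delay_bins)],
                    maxlen=delay_bins)
    for _ in range(int(round(duration_s / dt_s))):
        now = [1 if rng.random() < r * dt_s else 0 for r in rates]
        delayed = history[0]
        overlap = [a and b for a, b in zip(now, delayed)]
        history.append(now)
        if any(overlap):
            # Simplification: an overlap is assumed to cause a POST fire, which
            # sets the overlapping synapses and blocks the noise channel.
            for i, hit in enumerate(overlap):
                if hit:
                    weights[i] = 1
        elif rng.random() < f4 * dt_s:
            # POST noise spike: synapses whose PRE noise coincides are reset.
            for i in range(len(weights)):
                if rng.random() < f3 * dt_s:
                    weights[i] = 0
    return pattern, weights


def mean_weight(pattern, weights, inside):
    """Average weight of the synapses inside (True) or outside (False) the pattern."""
    vals = [w for p, w in zip(pattern, weights) if bool(p) == inside]
    return sum(vals) / len(vals)
```

Running `pattern, weights = simulate()` and comparing `mean_weight(pattern, weights, True)` with `mean_weight(pattern, weights, False)` gives a qualitative view of pattern potentiation versus background depression; the numerical values should not be expected to match the paper's figures, since device variability and neuron dynamics are omitted.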



Fig. 9. (a) PRE spikes as a function of time showing high- and low-frequency stimulation for pattern and background input channels, respectively. Distributions of time intervals between two consecutive spikes for (b) pattern/background channels and (c) PRE/POST noise channels.



Fig. 10. Color plots of weights at times (a) t = 0 s, (b) t = 5 s, and (c) t = 10 s. (d) Time evolution of calculated synaptic weights initialized in a random state between LRS and HRS levels. The evolution of conductance as a function of time evidences fast potentiation of pattern synapses (red) and a slower depression of background synapses (cyan). Black and blue lines: time evolution of mean pattern and background synapses, respectively.



Fig. 11. Color plots of weights at times (a) t = 0 s, (b) t = 5 s, and (c) t = 10 s. (d) Evolution of calculated synaptic weights as a function of time starting from initial HRS weights. Synaptic evolution reveals a very fast pattern learning since background is already fully depressed.

### C. Impact of Noise on Learning Efficiency

Fig. 12. Color plots of weights at times (a) t = 0 s, (b) t = 5 s, and (c) t = 10 s. (d) Time evolution of calculated synaptic weights, which are initially prepared in LRS, evidencing a slower pattern learning in comparison with the previous two cases because all background synapses need to be depressed.

Noise plays a leading role in SRDP by inducing LTD. On the other hand, noise affects all synapses to the same extent, and thus may also disturb pattern learning. To study the impact of noise on learning, we evaluated the efficiency of the perceptron network as a function of PRE noise frequency $f_3$ and POST noise frequency $f_4$. The learning efficiency was evaluated by calculating the learning probability $P_{\text{learn}}$, defined as the probability of POST fire in response to the submission of the pattern after the training stage, and the error probability $P_{\text{error}}$, defined as the probability of POST fire in response to the submission of an input random noise [21], [47]. The pattern in Fig. 6(a) was used for the training phase, which lasted 5000 epochs, equivalent to 5 s. Fig. 13 shows (a) the calculated $P_{\text{learn}}$ and (b) $P_{\text{error}}$ in a color plot as a function of $f_3$ and $f_4$. $P_{\text{learn}}$ becomes very close to 1 as either $f_3$ or $f_4$ decreases, thus making noise disturbance negligible. As $f_3$ and $f_4$ increase, $P_{\text{learn}}$ decreases because noise spikes make the learning process strongly unstable. On the other hand, $P_{\text{error}}$ shows the opposite behavior, as a low noise rate induces no LTD; thus, any random noise may excite synapses in the LRS and cause false fire. A high noise frequency instead causes strong LTD and suppression of false fires, although true fires are also affected. We identified the noise rates for the best tradeoff between efficient learning and low false fires, which can be found along the curve with a constant geometric average $(f_3 f_4)^{1/2} = 40$ Hz.

### D. Online Learning of Sequential Patterns

Fig. 13. Calculated color maps showing the effect of PRE and POST noise average frequencies $f_3$ and $f_4$ on (a) learning probability and (b) error probability of "X" pattern via a perceptron neural network with RRAM-based synapses capable of SRDP. Optimal performance is achieved if $f_3$ and $f_4$ obey the tradeoff relation described by indicated curve.

Fig. 14. (a) Raster plot of PRE spikes evidencing the change of input pattern at time 5 s. Color plots of weights at (b) t = 0 s, (c) t = 5 s, and (d) t = 10 s during learning of a sequence of images with PRE and POST noise spiking rates equal to 50 and 20 Hz, respectively. (e) Time evolution of synaptic weights showing a fast potentiation of "X" weights and a gradual depression of background synapses within 5 s. At 5 s, the "X" pattern is replaced with the "C" pattern and all weights adapt to new submitted pattern according to SRDP learning rule.

Fig. 15. (a) Raster plot of PRE input spikes due to sequential patterns. Color plots of synaptic weights at (b) t = 0 s, (c) t = 5 s, and (d) t = 10 s during an online learning process with PRE and POST low-frequency noise spiking at 10 and 5 Hz, respectively. (e) Time evolution of synaptic weights evidencing final potentiation of synapses in both patterns since the first stored pattern "X" cannot be erased without sufficiently strong noise activity.

One of the advantages of bidirectional SRDP, i.e., the availability of both LTP and LTD, is online learning, where the synaptic network learns the currently submitted pattern and is capable of erasing, or forgetting, any previously stored pattern [16], [48]. To support the capability of online learning, we simulated the presentation of two different patterns in sequence to our perceptron network. Fig. 14(a) shows the spiking sequence submitted by the PRE layer, including a first phase with pattern "X" for 5 s, followed by a second phase where pattern "C" was submitted for 5 s. Fig. 14 also shows the color maps of $8 \times 8$ synaptic weights at times (b) 0 s, (c) 5 s, and (d) 10 s, evidencing accurate learning of the submitted patterns. Fig. 14(e) shows the synaptic weights as a function of time, indicating convergence to LRS or HRS of pattern synapses or background synapses, respectively, in each phase. In particular, as pattern "X" starts being excited at low frequency at 5 s, the corresponding synapses are depressed by PRE and POST random noise spiking activities at 50 and 20 Hz, respectively. Therefore, as the input pattern is changed, our neural network is capable of forgetting the first pattern and adapting to the second one through SRDP in the 4T1R synapses, provided that the noise spiking activity is properly tuned. However, if the online learning process is carried out with PRE and POST noise spike rates that are too low, equal to 10 and 5 Hz, respectively, the PRE input spike trains shown in Fig. 15(a) lead from initial random weights to the simultaneous potentiation of synapses within both "X" and "C" patterns [Fig. 15(b)-(d)], thus preventing a selective online adaptation of synaptic weights to the visual patterns submitted after the first one.
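The difference between the two noise settings can be illustrated with a simple forgetting estimate. The sketch below is a rough model, not a result from the paper: it assumes that a previously potentiated synapse is reset at the first coincidence of independent PRE and POST noise spikes, that spikes are 1 ms wide, and that coincidences therefore arrive at an average rate of $f_3 f_4$ times the pulse width. The function name `retained_fraction` is hypothetical.

```python
import math


def retained_fraction(f3_hz, f4_hz, elapsed_s, pulse_s=1e-3):
    """Estimated fraction of old-pattern synapses still in LRS after elapsed_s.

    Assumes resets occur at rate f3 * f4 * pulse_s and are not counteracted
    by potentiation (the old pattern is only weakly stimulated).
    """
    reset_rate_hz = f3_hz * f4_hz * pulse_s
    return math.exp(-reset_rate_hz * elapsed_s)


# Compare the two noise settings discussed for the 5-s second phase.
strong_noise = retained_fraction(50.0, 20.0, 5.0)
weak_noise = retained_fraction(10.0, 5.0, 5.0)
print(f"50/20 Hz noise: {strong_noise:.3f} of old synapses retained")
print(f"10/5 Hz noise:  {weak_noise:.3f} of old synapses retained")
```

Under these assumptions, noise at 50 and 20 Hz erases almost all of the old pattern within 5 s, whereas noise at 10 and 5 Hz leaves most of it in place, which is qualitatively in line with the behavior described for Figs. 14 and 15.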

## VII. CONCLUSION

This paper presents a novel synapse architecture for SRDP, which is considered a fundamental learning rule in the human brain. The hybrid synapse combines one RRAM device with four MOS transistors arranged in two NAND-type branches, serving the LTP and LTD functions in SRDP. Noise is used to induce LTD of synapses connected to neurons spiking at low frequency. The synapse is demonstrated by experiments on integrated 2T1R structures, while extensive simulations support stable learning of one or more patterns by SRDP and the ability to properly tune the spiking frequency of noise sources to enable high learning accuracy.

## VIII. LIST OF DIFFERENCES

A preliminary design of a hybrid CMOS/RRAM synapse with the 4T1R structure, capable of SRDP, was reported in [27].

With respect to the previous report, in this paper, we present a broader experimental analysis of potentiation/depression characteristics of the synapse, providing a comprehensive study of online pattern learning of neural networks equipped with 4T1R synapses.

In particular, the experimental demonstration of pattern learning in Figs. 6 and 7 is originally shown in this paper. The simulation study of synapse potentiation as a function of PRE-spike frequency and internal delay  $\Delta t_D$  in Fig. 8 is originally reported in this paper. The simulation study of unsupervised pattern learning in Figs. 10–12 is originally reported in this paper. The simulation study of online learning for various random noise spiking shown in Figs. 14 and 15 is originally reported in this paper.

## REFERENCES

- [1] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," *Nature*, vol. 521, pp. 436–444, May 2015, doi: 10.1038/nature14539.

- [2] J. Hirschberg and C. D. Manning, "Advances in natural language processing," *Science*, vol. 349, no. 6245, pp. 261–266, Jul. 2015, doi: 10.1126/science.aaa8685.
- [3] G. Indiveri and S.-C. Liu, "Memory and information processing in neuromorphic systems," *Proc. IEEE*, vol. 103, no. 8, pp. 1379–1397, Aug. 2015, doi: 10.1109/JPROC.2015.2444094.
- [4] G. Q. Bi and M. M. Poo, "Synaptic modifications in cultured hippocampal neurons: Dependence on spike timing, synaptic strength, and postsynaptic cell type," *J. Neurosci.*, vol. 18, no. 24, pp. 10464–10472, Dec. 1998.
- [5] L. F. Abbott and S. B. Nelson, "Synaptic plasticity: Taming the beast," *Nature Neurosci.*, vol. 3, no. 11, pp. 1178–1183, Nov. 2000, doi: 10.1038/81453.
- [6] T. Masquelier and S. J. Thorpe, "Unsupervised learning of visual features through spike timing dependent plasticity," *PLoS Comput. Biol.*, vol. 3, no. 2, p. e31, Feb. 2007, doi: 10.1371/journal.pcbi.0030031.
- [7] M. Suri et al., "CBRAM devices as binary synapses for low-power stochastic neuromorphic systems: Auditory (cochlea) and visual (retina) cognitive processing applications," in *IEDM Tech. Dig.*, Dec. 2012, pp. 235–238, doi: 10.1109/IEDM.2012.6479017.
- [8] P. U. Diehl and M. Cook, "Unsupervised learning of digit recognition using spike-timing-dependent plasticity," *Frontiers Comput. Neurosci.*, vol. 9, p. 99, Aug. 2015, doi: 10.3389/fncom.2015.00099.
- [9] E. L. Bienenstock, L. N. Cooper, and P. W. Munro, "Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex," *J. Neurosci.*, vol. 2, no. 1, pp. 32–48, Jan. 1982.
- [10] M. F. Bear, "A synaptic basis for memory storage in the cerebral cortex," *Proc. Natl. Acad. Sci. USA*, vol. 93, no. 24, pp. 13453–13459, Nov. 1996.
- [11] J. Gjorgjieva, C. Clopath, J. Audet, and J. P. Pfister, "A triplet spike timing dependent plasticity model generalizes the Bienenstock-Cooper-Munro rule to higher-order spatiotemporal correlations," *Proc. Natl. Acad. Sci. USA*, vol. 108, no. 48, pp. 19383–19388, Nov. 2011, doi: 10.1073/pnas.1105933108.
- [12] J.-P. Pfister and W. Gerstner, "Triplets of spikes in a model of spike timing-dependent plasticity," *J. Neurosci.*, vol. 26, no. 38, pp. 9673–9682, Sep. 2006, doi: 10.1523/JNEUROSCI.1425-06.2006.
- [13] G. Indiveri, F. Corradi, and N. Qiao, "Neuromorphic architectures for spiking deep neural networks," in *IEDM Tech. Dig.*, Dec. 2015, pp. 68–71, doi: 10.1109/IEDM.2015.7409623.
- [14] N. Qiao and G. Indiveri, "Scaling mixed-signal neuromorphic processors to 28 nm FD-SOI technologies," in *Proc. IEEE Biomed. Circuits Syst. Conf. (BioCAS)*, Oct. 2016, pp. 552–555, doi: 10.1109/BioCAS.2016.7833854.
- [15] T. Serrano-Gotarredona, T. Masquelier, T. Prodromakis, G. Indiveri, and B. Linares-Barranco, "STDP and STDP variations with memristors for spiking neuromorphic learning systems," *Frontiers Neurosci.*, vol. 7, p. 2, Feb. 2013, doi: 10.3389/fnins.2013.00002.
- [16] G. Pedretti *et al.*, "Memristive neural network for on-line learning and tracking with brain-inspired spike timing dependent plasticity," *Sci. Rep.*, vol. 7, Jul. 2017, Art. no. 5288, doi: 10.1038/s41598-017-05480-0.
- [17] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, and W. Lu, "Nanoscale memristor device as synapse in neuromorphic systems," *Nano Lett.*, vol. 10, no. 4, pp. 1297–1301, 2010, doi: 10.1021/nl904092h.
- [18] S. Ambrogio, S. Balatti, F. Nardi, S. Facchinetti, and D. Ielmini, "Spike-timing dependent plasticity in a transistor-selected resistive switching memory," *Nanotechnology*, vol. 24, no. 38, p. 384012, Sep. 2013, doi: 10.1088/0957-4484/24/38/384012.
- [19] S. Yu, Y. Wu, R. Jeyasingh, D. Kuzum, and H.-S. P. Wong, "An electronic synapse device based on metal oxide resistive switching memory for neuromorphic computation," *IEEE Trans. Electron Devices*, vol. 58, no. 8, pp. 2729–2737, Aug. 2011, doi: 10.1109/TED.2011.2147791.
- [20] K. Seo, I. Kim, S. Jung, M. Jo, S. Park, J. Park, J. Shin, K. P. Biju, J. Kong, K. Lee, B. Lee, and H. Hwang, "Analog memory and spike-timing-dependent plasticity characteristics of a nanoscale titanium oxide bilayer resistive switching device," *Nanotechnology*, vol. 22, no. 25, p. 254023, Jun. 2011, doi: 10.1088/0957-4484/22/25/254023.
- [21] S. Ambrogio *et al.*, "Neuromorphic learning and recognition with one-transistor-one-resistor synapses and bistable metal oxide RRAM," *IEEE Trans. Electron Devices*, vol. 63, no. 4, pp. 1508–1515, Apr. 2016, doi: 10.1109/TED.2016.2526647.
- [22] Z. Wang, S. Ambrogio, S. Balatti, and D. Ielmini, "A 2-transistor/1-resistor artificial synapse capable of communication and stochastic learning in neuromorphic systems," *Frontiers Neurosci.*, vol. 8, p. 438, Jan. 2015, doi: 10.3389/fnins.2014.00438.

- [23] S. Yu, B. Gao, Z. Fang, H. Yu, J. Kang, and H.-S. P. Wong, "A low energy oxide-based electronic synaptic device for neuromorphic visual systems with tolerance to device variation," *Adv. Mater.*, vol. 25, no. 12, pp. 1774–1779, Mar. 2013, doi: 10.1002/adma.201203680.
- [24] E. Covi, S. Brivio, A. Serb, T. Prodromakis, M. Fanciulli, and S. Spiga, "Analog memristive synapse in spiking networks implementing unsupervised learning," *Frontiers Neurosci.*, vol. 10, p. 482, Oct. 2016, doi: 10.3389/fnins.2016.00482.
- [25] M. Hansen, F. Zahari, M. Ziegler, and H. Kohlstedt, "Double-barrier memristive devices for unsupervised learning and pattern recognition," *Frontiers Neurosci.*, vol. 11, p. 91, Feb. 2017, doi: 10.3389/fnins.2017.00091.
- [26] M. V. Nair, L. K. Muller, and G. Indiveri, "A differential memristive synapse circuit for on-line learning in neuromorphic computing systems," *Nano Futures*, vol. 1, no. 3, p. 035003, Nov. 2017, doi: 10.1088/2399-1984/aa954a.
- [27] V. Milo *et al.*, "Demonstration of hybrid CMOS/RRAM neural networks with spike time/rate-dependent plasticity," in *IEDM Tech. Dig.*, Dec. 2016, pp. 440–443, doi: 10.1109/IEDM.2016.7838435.
- [28] T. Ohno, T. Hasegawa, T. Tsuruoka, K. Terabe, J. K. Gimzewski, and M. Aono, "Short-term plasticity and long-term potentiation mimicked in single inorganic synapses," *Nature Mater.*, vol. 10, pp. 591–595, Aug. 2011, doi: 10.1038/nmat3054.
- [29] Y. Li, Y. Zhong, J. Zhang, L. Xu, Q. Wang, H. Sun, H. Tong, X. Cheng, and X. Miao, "Activity-dependent synaptic plasticity of a chalcogenide electronic synapse for neuromorphic systems," *Sci. Rep.*, vol. 4, p. 4906, May 2014, doi: 10.1038/srep04906.
- [30] W. He *et al.*, "Enabling an integrated rate-temporal learning scheme on memristor," *Sci. Rep.*, vol. 4, p. 4755, Apr. 2014, doi: 10.1038/srep04755.
- [31] M. Ziegler, C. Riggert, M. Hansen, T. Bartsch, and H. Kohlstedt, "Memristive Hebbian plasticity model: Device requirements for the emulation of Hebbian plasticity based on memristive devices," *IEEE Trans. Biomed. Circuits Syst.*, vol. 9, no. 2, pp. 197–206, Apr. 2015, doi: 10.1109/TBCAS.2015.2410811.
- [32] Z. Wang *et al.*, "Memristors with diffusive dynamics as synaptic emulators for neuromorphic computing," *Nature Mater.*, vol. 16, no. 1, pp. 101–108, Jan. 2017, doi: 10.1038/nmat4756.
- [33] A. Calderoni, S. Sills, and N. Ramaswamy, "Performance comparison of O-based and Cu-based ReRAM for high-density applications," in *Proc. Int. Memory Workshop (IMW)*, 2014, pp. 1–4, doi: 10.1109/IMW.2014.6849351.
- [34] E. Chicca, F. Stefanini, C. Bartolozzi, and G. Indiveri, "Neuromorphic electronic circuits for building autonomous cognitive systems," *Proc. IEEE*, vol. 102, no. 9, pp. 1367–1388, Sep. 2014, doi: 10.1109/JPROC.2014.2313954.
- [35] Z. Wei *et al.*, "True random number generator using current difference based on a fractional stochastic model in 40-nm embedded ReRAM," in *IEDM Tech. Dig.*, Dec. 2016, pp. 107–110, doi: 10.1109/IEDM.2016.7838349.
- [36] C.-Y. Huang, W. C. Shen, Y.-H. Tseng, Y.-C. King, and C.-J. Lin, "A contact-resistive random-access-memory based true random number generator," *IEEE Electron Device Lett.*, vol. 33, no. 8, pp. 1108–1110, Aug. 2012, doi: 10.1109/LED.2012.2199734.
- [37] S. Gaba, P. Sheridan, J. Zhou, S. Choi, and W. Lu, "Stochastic memristive devices for computing and neuromorphic applications," *Nanoscale*, vol. 5, no. 13, pp. 5872–5878, 2013, doi: 10.1039/c3nr01176c.
- [38] S. Balatti, S. Ambrogio, Z. Wang, and D. Ielmini, "True random number generation by variability of resistive switching in oxide-based devices," *IEEE J. Emerg. Sel. Topics Circuits Syst.*, vol. 5, no. 2, pp. 214–221, Jun. 2015, doi: 10.1109/JETCAS.2015.2426492.
- [39] S. Balatti *et al.*, "Physical unbiased generation of random numbers with coupled resistive switching devices," *IEEE Trans. Electron Devices*, vol. 63, no. 5, pp. 2029–2035, May 2016, doi: 10.1109/TED.2016.2537792.
- [40] S. Balatti *et al.*, "Voltage-controlled cycling endurance of HfO<sub>x</sub>-based resistive-switching memory (RRAM)," *IEEE Trans. Electron Devices*, vol. 62, no. 10, pp. 3365–3372, Oct. 2015, doi: 10.1109/TED.2015.2463104.
- [41] Z. Wang, S. Ambrogio, S. Balatti, S. Sills, A. Calderoni, N. Ramaswamy, and D. Ielmini, "Postcycling degradation in metal-oxide bipolar resistive switching memory," *IEEE Trans. Electron Devices*, vol. 63, no. 11, pp. 4279–4287, Nov. 2016, doi: 10.1109/TED.2016.2604370.

- [42] S. Ambrogio, S. Balatti, D. C. Gilmer, and D. Ielmini, "Analytical modeling of oxide-based bipolar resistive memories and complementary resistive switches," *IEEE Trans. Electron Devices*, vol. 61, no. 7, pp. 2378–2386, Jul. 2014, doi: 10.1109/TED.2014.2325531.
- [43] J.-W. Jang, S. Park, G. W. Burr, H. Hwang, and Y.-H. Jeong, "Optimization of conductance change in Pr<sub>1-x</sub>Ca<sub>x</sub>MnO<sub>3</sub>-based synaptic devices for neuromorphic systems," *IEEE Electron Device Lett.*, vol. 36, no. 5, pp. 457–459, May 2015, doi: 10.1109/LED.2015.2418342.
- [44] I.-T. Wang, Y.-C. Lin, Y.-F. Wang, C.-W. Hsu, and T.-H. Hou, "3D synaptic architecture with ultralow sub-10 fJ energy per spike for neuromorphic computation," in *IEDM Tech. Dig.*, Dec. 2014, pp. 665–668, doi: 10.1109/IEDM.2014.7047127.
- [45] A. C. Torrezan, J. P. Strachan, G. Medeiros-Ribeiro, and R. S. Williams, "Sub-nanosecond switching of a tantalum oxide memristor," *Nanotechnology*, vol. 22, no. 48, p. 485203, 2011, doi: 10.1088/0957-4484/22/48/485203.
- [46] H. Y. Lee *et al.*, "Evidence and solution of over-reset problem for HfO<sub>x</sub> based resistive memory with sub-ns switching speed and high endurance," in *IEDM Tech. Dig.*, Dec. 2010, pp. 460–463, doi: 10.1109/IEDM.2010.5703395.
- [47] S. Ambrogio *et al.*, "Novel RRAM-enabled 1T1R synapse capable of low power STDP via burst-mode communication and real time unsupervised machine learning," in *Symp. VLSI Tech. Dig.*, 2016, pp. 196–197, doi: 10.1109/VLSIT.2016.7573432.
- [48] S. Ambrogio *et al.*, "Unsupervised learning by spike timing dependent plasticity in phase change memory (PCM) synapses," *Frontiers Neurosci.*, vol. 10, p. 56, Mar. 2016, doi: 10.3389/fnins.2016.00056.



**Valerio Milo** (S'17) received the B.S. and M.S. degrees in electrical engineering from the Politecnico di Milano, Milano, Italy, in 2012 and 2015, respectively, where he is currently pursuing the Ph.D. degree in electrical engineering.

His current research interests include the modeling of the emerging resistive switching memory devices and their applications for neuromorphic computing systems.



**Giacomo Pedretti** (S'17) received the B.S. and M.S. degrees in electrical engineering from the Politecnico di Milano, Milano, Italy, in 2013 and 2016, respectively, where he is currently pursuing the Ph.D. degree in electrical engineering.

His current research interests include design and characterization of neuromorphic networks for beyond-CMOS computing systems.



**Roberto Carboni** (S'16) received the B.S. and M.S. degrees in electrical engineering from the Politecnico di Milano, Milano, Italy, in 2013 and 2016, respectively, where he is currently pursuing the Ph.D. degree in electrical engineering.

His current research interests include characterization and modeling of resistive switching and magnetoresistive memories.



**Alessandro Calderoni** received the Laurea degree (*cum laude*) in electrical engineering from the Politecnico di Milano, Milano, Italy, in 2006.

He is currently a Senior Device Engineer with the Emerging Memory Cell Technology Team, Micron Technology, Boise, ID, USA. His current research interests include the characterization of various emerging memory devices and selectors for high-density applications.



**Stefano Ambrogio** (M'16) received the B.S., M.S. (*cum laude*), and Ph.D. degrees in electrical engineering from the Politecnico di Milano, Milano, Italy, in 2010, 2012, and 2016, respectively.

He is currently a Post-Doctoral Researcher with IBM Research-Almaden, San Jose, CA, USA. His current research interests include nonvolatile memory and cognitive computing.

Dr. Ambrogio was a recipient of the IEEE EDS Rappaport Award in 2015.



**Nirmal Ramaswamy** (M'07–SM'09) received the bachelor's degree in metallurgical engineering from IIT Madras, Chennai, India, and the M.S. and Ph.D. degrees in material science and engineering from Arizona State University Downtown Phoenix campus, Phoenix, AZ, USA.

He is currently a fellow of the Emerging Memory Research and Development Team, Micron Technology, Boise, ID, USA. His current research interests include various emerging memory technologies for high-density applications.



**Daniele Ielmini** (SM'09) received the Ph.D. degree in nuclear engineering from the Politecnico di Milano, Milano, Italy, in 2000.

In 2002, he joined the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, as an Assistant Professor. He became an Associate Professor in 2010 and has been a Professor since 2016. His current research interests include emerging nanoelectronic devices, such as phase change memory and resistive switching memory.

Dr. Ielmini was a recipient of the Intel Outstanding Researcher Award in 2013, the ERC Consolidator Grant in 2014, and the IEEE EDS Rappaport Award in 2015.