# DESIGN OF LFSR BASED FAST ERROR-RESILIENT TERNARY CONTENT ADDRESSABLE MEMORY

E. Prasanthi<sup>1</sup>, Syed Neha<sup>2</sup>, Tanguturu Manasa<sup>2</sup>, Sajjanapu Anunya<sup>2</sup>, Vankivolu Sudeepthi<sup>2</sup>, Sandrapalli Chandrakala<sup>2</sup>

1 Assistant Professor, Dept. of ECE, Geethanjali Institute of Science and Technology, Andhra Pradesh

2 UG Students, Dept. of ECE, Geethanjali Institute of Science and Technology, Andhra Pradesh.

### **Abstract**

LFSR-based ternary content-addressable memory (TCAM) on field-programmable gate arrays (FPGAs) is used for packet classification in software-defined networking (SDN) and OpenFlow applications. The SRAMs that implement the TCAM contents constitute the major part of a TCAM design on FPGAs and are vulnerable to soft errors. Protecting LFSR-based TCAMs against soft errors without compromising the critical path delay, while maintaining high search performance, is challenging. This work presents a low-cost, low-response-time technique for the protection of LFSR-based TCAMs. The technique uses simple single-bit parity for fault detection, which adds minimal critical path overhead. It exploits the binary-encoded TCAM table that LFSR-based TCAMs maintain for update purposes to implement a low-response-time error-correction mechanism at low cost. The error-correction process is carried out in the background, so lookup operations can be performed simultaneously and a high search performance is maintained.

Keywords: TCAM, LFSR, FPGA, Error Resilient, Look Up Table, SRAM

## 1. Introduction

The design of an LFSR-based fast error-resilient ternary content-addressable memory (TCAM) is motivated by the need for efficient searching of data patterns together with error resilience in memory architectures. Traditional TCAMs search data patterns quickly but are not designed to handle errors in the data. In contrast, LFSR-based fast error-resilient TCAMs perform efficient searches while also tolerating errors in the data. This design is particularly relevant for networking devices such as routers and switches, where fast and accurate packet processing is critical. Networking devices use TCAMs for routing table lookups, access control list matching, and other tasks that require fast searching of data patterns. However, errors can occur in the data because of transmission errors or other sources, which can result in incorrect routing decisions or security breaches. LFSR-based fast error-resilient TCAMs address these issues by enabling efficient searching while tolerating errors in the data. The use of LFSRs also provides a high degree of flexibility in the size of the search key and in the number of errors that can be tolerated. The LFSR can be designed to generate multiple pseudo-random sequences for comparison with the search key, which allows for a higher level of error tolerance. Overall, the design of LFSR-based TCAMs is motivated by the need for efficient and error-resilient memory architectures in networking devices and other applications where fast and accurate searching of data patterns is critical.

Content-addressable memory (CAM) allows the stored content to be searched in parallel in a single cycle, achieving a high search performance. A binary CAM stores and searches data in only two states, "0" and "1", whereas a ternary CAM (TCAM) represents data in three states: "0", "1", and the don't-care state "x".
TCAMs are extensively used in network systems for packet classification and filtering. Modern static random access memory (SRAM)-based field-programmable gate array (FPGA) technology offers the flexibility and reconfigurability, together with the high performance, required in software-defined networking (SDN) and OpenFlow network accelerators for big data. Owing to disturbances from high-energy neutron particles, circuits on SRAM-based FPGAs are susceptible to single-event upsets (SEUs). On-chip embedded memory is known to be the most vulnerable to SEUs in advanced process technologies because of its increasingly small and highly compact memory cells. An SEU in embedded memory generates a transient error that persists until the corrupted data is overwritten. SEUs can result in either single-bit upsets (SBUs) or multiple-bit upsets (MBUs). Cuckoo hashing offers a low-cost solution for implementing efficient binary CAMs on FPGAs. The TCAM function in most SRAM-based FPGA solutions is defined by the content of the configured embedded memories [i.e., block RAM (BRAM) and distributed RAM (distRAM)], so a transient error may lead to a false match or mismatch and return an incorrect match address. Accordingly, in the case of a soft error, the affected SRAM word should be overwritten to restore the correct matching information for lookups. However, protecting SRAM-based TCAM solutions without compromising the critical path delay, while maintaining high search performance, is challenging.

This brief presents a low-cost, low-response-time, and easy-to-integrate technique for the protection of SRAM-enabled TCAMs without compromising the search performance. Error detection is carried out simply, using single-bit parity checking, at a minimal delay and logic overhead. The proposed error-correction technique exploits the redundant binary-encoded TCAM table, which SRAM-based TCAM solutions maintain for update purposes, to correct soft errors. It maintains a high search performance because the error-correction mechanism runs in the background, allowing search operations to be performed simultaneously. The proposed error-correction technique has a low response time, ensuring a fault-free TCAM design for lookups during almost the entire processing time.

## 2. Literature Survey

Mahendra, Telajala Venkata (2020) [1] introduced a precharge-free ternary content-addressable memory (PF-TCAM). The proposed searching approach increases the search rate by halving the match-line (ML) evaluation time: it eliminates the precharge phase before every search by performing the search in a half cycle.

Zhong, Hongtao (2021) [2] proposed a dynamic TCAM using nanoelectromechanical (NEM) relays (DyTAN) and utilized one-shot refresh (OSR) to solve the memory refresh problem. By exploiting the unique NEM relay characteristics, DyTAN outperforms existing works in the balance between density, speed, and power efficiency. Compared with the 16T SRAM-based TCAM, the 5T CMOS dynamic TCAM, the 2T2R TCAM, and the 2FeFET TCAM, evaluations show that DyTAN reduces the write energy by up to $2.3\times$, $1.3\times$, $131\times$, and $13.5\times$, and improves the search energy-delay product (EDP) by up to $12.7\times$, $1.7\times$, $1.3\times$, and $2.8\times$, respectively.

Irfan, Muhammad (2020) [3] proposed a novel power-aware reconfigurable FPGA-based TCAM architecture that enables only a portion of the hardware to perform the search operation. They performed an extensive design space exploration to find the number of banks on Xilinx FPGAs that provides the maximum power saving. Moreover, they proposed a solution to bank overflow that uses a backup CAM (BUC) to handle the overflowed CAM entries.

Luo, Jin (2021) [4] proposed a ferroelectric tunnel FET (FeTFET) based content-addressable memory (CAM) cell with only one transistor, experimentally demonstrated for the first time on a 14-nm FinFET technology node. By exploiting and modulating the non-volatile ferroelectric polarization for entry storage and the unique ambipolar tunneling current for the input search query, the XNOR-like matching operation of a ternary CAM can be realized in one Fe-FinTFET without the need for twin complementary circuit branches. Moreover, benefiting from the ferroelectric multi-domain feature and the intrinsic steep slope of the tunneling mechanism, a multi-bit CAM function for high density can also be experimentally implemented in the fabricated single Fe-FinTFET device. Based on the proposed FeTFET CAM design, Hamming and Manhattan distance computing are demonstrated with high energy efficiency, showing great potential for area- and energy-efficient machine learning.

Ullah, Inayat, Joon-Sung Yang, and Jaeyong Chung (2020) [5] presented a low-cost and low-response-time technique for the protection of SRAM-based TCAMs. This technique uses simple single-bit parity for fault detection, which has a minimal critical path overhead. It exploits the binary-encoded TCAM table maintained in SRAM-based TCAMs for update purposes to implement a low-response-time error-correction mechanism at low cost. The error-correction process is carried out in the background, allowing lookup operations to be performed simultaneously and thus maintaining a high search performance. The proposed technique provides protection against soft errors with a response time of 293 ns while maintaining a search rate of 222 million searches per second on a 1024 × 40 TCAM on an Artix-7 FPGA.

Kazemi, Arman (2021) [6] proposed a novel distance function that can be natively evaluated with multi-bit content-addressable memories (MCAMs) based on ferroelectric FETs (FeFETs) to perform a single-step, in-memory nearest-neighbor (NN) search. They evaluated the efficacy of FeFET MCAMs in the context of few-shot learning applications with different datasets. As an example, they achieved 78.54% accuracy for a 5-way, 5-shot classification task on the mini-ImageNet dataset (only 1.5% lower than software-based implementations) when using a 3-bit MCAM for NN search.
They considered the effects of FeFET threshold voltage variations on the application accuracy and analyzed the area and search energy requirements of FeFET MCAMs for accurate operation. Their results indicate that MCAMs require 2× lower area and search energy than TCAMs to achieve the same accuracy. Furthermore, they experimentally demonstrated a bit implementation of FeFET MCAM using AND arrays from GLOBALFOUNDRIES to further validate the design concept.

Wang, Xuehong (2021) [7] proposed a concept that is silicon-verified using 180 nm CMOS technology with transition-metal-oxide (TMO) RRAM integrated at the back-end-of-line (BEOL). It achieves a considerable ML-ratio of 1860 and a low IMATCH of 11.15 nA on average. In a typical search operation, it shows a negligible ML drop in the match case and a large ML swing range in the mismatch case. The SPICE simulation results further show that it can support a long word length (WDL) of 256 at a clock rate of 100 MHz for search operations, which demonstrates its promise for highly parallel nonvolatile TCAM.

Karunakar, Gaddameedi (2022) [8] focused on the implementation of an Error-Resilient Ternary Content-Addressable Memory (ER-TCAM) for fast data storage with high-speed read-write operations. Linear feedback shift registers (LFSRs) were introduced to generate random pattern sequences with address synchronization properties. The LFSR-based ER-TCAM provided optimal data storage with strong self-error detection and correction properties.

Lim, Sehee (2022) [9] proposed a novel FeFET TCAM that is free from the write problems of previous FeFET TCAMs and tolerant to process variations. They also proposed a novel match line scheme to improve search energy and time by reducing the match line capacitance and the amount of discharged voltage in the match evaluation phase. Simulation results based on an industry-compatible 28 nm technology with the Preisach FeFET model show that the proposed FeFET TCAM achieves the highest search yield.

Wang, Xianggao (2022) [10] developed a nonvolatile ternary content-addressable memory (TCAM) with a cell size of 0.01 µm² utilizing the Ge-based memory diode (MD), reported as the most area-efficient TCAM design. The MDs have a high current ratio between ON and OFF states and a large rectifying ratio, showing potential for use in large-dimension TCAM arrays. In addition, parallel search was demonstrated experimentally with a 2-bit MD-TCAM array, and the electrical characterization showed the expected results. With the help of a sub-ns ultrafast measurement system, it was confirmed that the search energy of the MD-TCAM can reach as low as 1.0 fJ/bit/mismatch and that one search operation can be performed within 200 ps.

Lee, Jae Seong, and Woo Young Choi (2021) [11] proposed a nanoelectromechanical-switch-based TCAM (NEMTCAM) for the first time. In the proposed unit NEMTCAM cell, a single nanoelectromechanical (NEM) memory switch replaces two static random access memory cells. The design exploits monolithic 3-D (M3D) integration and nonvolatility.

Yin, Xunzhao (2022) [12] proposed a hybrid ferroelectric NAND-NOR (HFNN) TCAM design to further improve the energy efficiency.
An HFNN-based segmented architecture is proposed to reduce the search delay and energy by pipelining the search operation.

Hussain, Sheikh Wasmir (2023) [13] proposed a shared ML scheme (SMS) based on a two-step evaluation CAM cell in order to reduce ML transitions per word while improving density in the comparison units. A 64 × 32-bit SMS-CAM in 45-nm CMOS technology dissipates 1.06 fJ/bit/search and achieves a 100.77-ps search time. At the cost of a 6.55% delay penalty over a conventional CAM, the proposed CAM reduces energy by 12.38% to 30.48% relative to self-power-off gating, conventional, and early-terminate ML precharge designs. The low energy-delay yields a better trade-off budget besides decreasing the macro area of the SMS-CAM, which could be useful in search applications of network devices.

Vicuña, Kevin (2022) [14] explored the performance of non-volatile ternary content-addressable memories (NV-TCAMs) that exploit the double-barrier magnetic tunnel junction (DMTJ), evaluated comparatively against the single-barrier MTJ (SMTJ)-based solution. The comparison is performed at the circuit level, considering different memory words. Overall, simulation results show that the DMTJ-based NV-TCAM is a good alternative to the SMTJ-based NV-TCAM, mainly owing to the improvement in the search operation.

Yang, Ling (2022) [15] proposed an in-memory search prototype based on phase change memory (PCM). First, using the PCM, a highly compact ($8F^2$) and low-energy (0.3 fJ/bit/search) nonvolatile ternary content-addressable memory (TCAM) is demonstrated, achieving a 6.6× energy saving and a 50× cell area saving over the 16T SRAM-TCAM. Thanks to the non-volatility and massive parallelism of the PCM TCAM in computing Hamming distance, frequent memory access is alleviated, enabling the search operation in situ. The prototype shows a high throughput of 256 GB/s in a data duplication application, obtaining more than 250× energy saving and a 42× improvement in throughput over a CPU. Their study provided a low-area-overhead, low-energy, and high-throughput solution for the storage system to perform search in situ.

## 3. Proposed Methodology

Soft errors are a major concern for modern electronic circuits and, in particular, for memories. A soft error can change the contents of the bits stored in a memory and cause a system failure. The soft error rate in terrestrial applications is low. For example, it has been estimated that for a 65-nm static random access memory (SRAM), the bit error rate is on the order of $10^{-9}$ errors per year. That would translate to only one error per year for a system that uses 1 Gbit of memory. However, even such a low error rate is a big concern for critical applications such as communication networks, in which network elements such as routers have to provide a high level of reliability and availability.
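The one-error-per-year figure follows from multiplying the per-bit rate by the memory size. As a quick check, write $r$ for the per-bit error rate and $n$ for the number of bits (both symbols are introduced here only for illustration), and take 1 Gbit as $10^{9}$ bits:

$$
E = r \times n = 10^{-9}\ \frac{\text{errors}}{\text{bit} \cdot \text{year}} \times 10^{9}\ \text{bits} = 1\ \frac{\text{error}}{\text{year}}
$$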

The proposed architecture offers a significant improvement in the area-time product. This section gives a detailed analysis of the proposed LFSR-based ER-TCAM method. The suggested design for ER-TCAM error detection is shown in Figure 1. When a lookup search key is applied, the bits of each LFSR word that is read are XORed to produce an error signal. The error signals from the $N$ LFSRs of the TCAM architecture generate a $\log_2 N$-bit error code that uniquely identifies the damaged LFSR. The error code and the associated search-key bit pattern are sent to the error-correction module.

Figure 2 illustrates the ER-TCAM architecture for error correction, which primarily consists of an LFSR that stores the binary-encoded data of the TCAM table, an error-correction vector (ECV) calculation unit, an address generation unit (AGU), and a read/write controller. Each cycle of the MOD-$D$ counter produces a new $\log_2 D$-bit sequence. The LFSR address is formed so that the $\log_2 N$ bits of the LFSR ID serve as the most significant bits and point to the beginning of the matching sub-block in the LFSR, while the lower $\log_2 D$ bits from the counter select the LFSR words within the sub-block. In this manner, the AGU can access the whole set of binary-encoded words of the matching partition of the TCAM table. Each TCAM word read is matched against the $C$-bit pattern to produce one match bit per cycle, so it takes $D$ clock cycles to calculate the match bits and the corresponding parity bit, which together make up the ECV. To write the calculated ECV over the damaged LFSR word, the read/write controller asserts the write-enable signal of the relevant LFSR.

The ER-TCAM permits search operations during the error-correction process because the LFSRs that implement the TCAM function remain available for lookup operations. The ER-TCAM configures these LFSRs as simple dual-port RAM, which supports simultaneous read and write operations. Because the ECV is written through the write port of the LFSR after it has been calculated, the error-correction procedure entirely overlaps the search operations of the ER-TCAM. The LFSR that stores the binary-encoded TCAM table can itself suffer a soft error, but because it is quite small, its error probability is very low compared with that of the LFSRs that implement the TCAM. The ER-TCAM is nevertheless able to protect the LFSR that stores the binary-encoded TCAM table, with relatively small memory and error-correction delay overhead.
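The detection and correction steps described above can be sketched in software. The following is a minimal behavioural model, not the authors' hardware description: it assumes even parity, represents each binary-encoded rule as a hypothetical `(value, care_mask)` pair over a $C$-bit slice, and uses illustrative function names (`compute_ecv`, `error_signal`, `agu_address`) that do not come from the paper.

```python
from typing import List, Tuple

# One TCAM rule restricted to a C-bit slice: (value, care_mask).
# A care_mask bit of 1 means "compare this bit"; 0 means don't care.
Rule = Tuple[int, int]


def parity(bits: List[int]) -> int:
    """Even parity of a bit list (XOR of all bits)."""
    p = 0
    for b in bits:
        p ^= b
    return p


def match_bit(rule: Rule, pattern: int) -> int:
    """1 if the C-bit pattern matches the rule on every cared-for bit."""
    value, care = rule
    return int(((pattern ^ value) & care) == 0)


def compute_ecv(partition: List[Rule], pattern: int) -> List[int]:
    """Recompute D match bits plus one parity bit for a given pattern.

    Hardware would produce one match bit per clock cycle (D cycles in total).
    """
    bits = [match_bit(rule, pattern) for rule in partition]
    return bits + [parity(bits)]


def error_signal(word: List[int]) -> int:
    """XOR of all bits read, parity included; 1 flags an odd number of flips."""
    return parity(word)


def agu_address(block_id: int, count: int, log2_d: int) -> int:
    """Block ID forms the MSBs; the MOD-D counter value forms the low bits."""
    return (block_id << log2_d) | count


# Hypothetical partition with D = 4 rules over a C = 4 bit slice.
partition = [(0b1010, 0b1111), (0b1000, 0b1100), (0b0000, 0b0000), (0b0110, 0b1110)]
pattern = 0b1011

stored = compute_ecv(partition, pattern)   # word as originally written
corrupted = stored.copy()
corrupted[1] ^= 1                          # inject a single-bit upset

if error_signal(corrupted):                # parity check flags the word
    repaired = compute_ecv(partition, pattern)
else:
    repaired = corrupted
```

Single-bit parity only detects an odd number of flipped bits in a word, which is why the sketch injects exactly one upset. The recomputed vector is derived purely from the binary-encoded table, so the corrupted word is never consulted during repair.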



Figure 1. Error detection mechanism of ER-TCAM.



Figure 2. Error-correction architecture of ER-TCAM.

### 3.1 Working of TCAM

A TCAM is a large and dynamic store of key-record (k, R) pairs. That is, a TCAM holds, at any time, many entries, each being a (k, R) pair. These entries may be inserted into, searched for, retrieved from, and deleted from the TCAM at any time. A record R is always associated with its own key k and is uniquely identified by it. Consequently, a primary requirement in the design of a TCAM is to constantly maintain the association between each key and its record [9]. This must hold regardless of the TCAM instructions (insertion, deletion, search, retrieval, and so on) that are performed on the (k, R) pairs and that make the TCAM a dynamic data structure. The TCAM is like an SRAM, organized as a two-dimensional array of memory cells, as shown in Figure 3. But unlike a RAM, which merely stores bits in its simple memory cells, a CAM additionally includes a considerable amount of extra hardware in each RAM cell.



Figure 3. Schematic block diagram of a TCAM.

The main job of this extra circuitry is to compare, in parallel, the stored bit in each CAM cell with an external search query bit and to combine the individual bit comparison results into a word comparison result. Each row in the CAM array is called a slot. A slot is just like a word in a RAM, although the word is much longer in a CAM. A processor can store arbitrary binary data in each slot and can specify an external search key, or search operand, in a separate comparand register outside the array of CAM slots. Each bit in the comparand register is compared with the corresponding bit of every slot in the stored array in parallel. Since multiple matches may occur in many applications, the output of the parallel search in an $N$-slot CAM is provided by an $N$-bit Response Register whose $i$-th bit is set if the search key matches the content of the $i$-th slot in the CAM. The $N$-bit Response Register is often followed by a priority encoder to allow sequential readout of the content of all the matching slots. Alternatively, the slot numbers of all slots that match the search key are explicitly listed as the CAM output.
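The search path described above (comparand register, Response Register, priority encoder) can be modelled in a few lines. This is a behavioural sketch under stated assumptions: the function names are illustrative, slots are plain integers, and the priority encoder is assumed to favour the lowest-numbered slot, which the text does not specify.

```python
from typing import List, Optional


def search(slots: List[int], comparand: int) -> List[int]:
    """Build the N-bit Response Register: bit i is 1 if slot i matches.

    Hardware compares every slot at once; the loop only models the result.
    """
    return [int(word == comparand) for word in slots]


def priority_encode(response: List[int]) -> Optional[int]:
    """Return the lowest-numbered matching slot, or None if nothing matches."""
    for i, bit in enumerate(response):
        if bit:
            return i
    return None


def matching_slots(response: List[int]) -> List[int]:
    """Alternative output: explicit list of all matching slot numbers."""
    return [i for i, bit in enumerate(response) if bit]


# Hypothetical 4-slot CAM in which two slots hold the same word.
slots = [0x3A, 0x7F, 0x3A, 0x00]
response = search(slots, 0x3A)
first_hit = priority_encode(response)
all_hits = matching_slots(response)
```

Here `response` plays the role of the Response Register, `first_hit` is what a priority encoder would report first, and `all_hits` corresponds to the alternative of listing every matching slot number.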

The TCAM can be implemented on a CAM in a fairly straightforward manner because the association between a record and its dedicated unique key is maintained statically and automatically: both the key and the record are packed together in the same slot. Each (k, R) pair may be stored in one of the equally sized slots [10]. The basic slot sizes commercially available are 36 bits, 48 bits, etc. Wider slots can be obtained by combining a few slots, although only a limited expansion is possible.

Because CAMs do not use external address lines to select the matching data, there is no theoretical limit to the number of chips that can be connected for depth expansion (number of slots). For building a dictionary machine (DM), each slot has to be partitioned into two fields, namely, the key (field 1) and the record (field 2). All keys should be of the same size, as should all records. For the purpose of searching by partial content (all searches in the DM are by keys and not by records), each bit in each slot of the array, as well as in the comparand register that stores the search key, may be masked to enable or disable its participation in a search. This masking facility is not available in a binary CAM; it is available only in a ternary CAM (TCAM), where each bit can take three values: zero, one, or "don't care". The "don't care" bits do not participate in the search (match) process. Thus, all the slots in the CAM can be searched in parallel by partial content, namely, the key. In the case of the TCAM, since the keys are all unique, only one slot (one (k, R) pair) will match a search key. The content of this single matching slot may simply be read out and the record field separated. Unfortunately, although it offers very high-speed search, CAM has several limitations; some are general and some are specific to building a TCAM. Commercial CAM chips have limited word length and a limited capacity of slots. The former does not allow arbitrary size and variable length of records, creating a problem of flexibility. The latter does not allow large dictionaries to be built, creating a problem of scalability. As the third important limitation for building a DM, the four DM instructions, namely, MAX, MIN, NEAR (k), and CNT, cannot be performed through parallel search and require a time-consuming linear search. Besides these three limitations specific to the design of a DM, two general limitations of CAM are well known: owing to the considerable additional circuitry associated with each RAM cell in the array, CAM suffers from the twin problems of excessive cost and excessive power consumption.
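The masked, key-only search described in this section can be illustrated as follows. The sketch assumes hypothetical 8-bit key and record fields packed into one slot, a per-slot care mask for stored "don't care" bits, and a search mask on the comparand; none of these widths or names come from the paper.

```python
from typing import List

KEY_BITS = 8      # hypothetical width of the key field (field 1)
RECORD_BITS = 8   # hypothetical width of the record field (field 2)
KEY_ONLY_MASK = ((1 << KEY_BITS) - 1) << RECORD_BITS
RECORD_MASK = (1 << RECORD_BITS) - 1


def pack(key: int, record: int) -> int:
    """Pack a (k, R) pair into one slot, key in the upper bits."""
    return (key << RECORD_BITS) | record


def ternary_search(slots: List[int], care_masks: List[int],
                   comparand: int, search_mask: int) -> List[int]:
    """Return the slot numbers that match on every unmasked, cared-for bit."""
    hits = []
    for i, (word, care) in enumerate(zip(slots, care_masks)):
        active = care & search_mask          # bits that take part in the match
        if ((word ^ comparand) & active) == 0:
            hits.append(i)
    return hits


slots = [pack(0x12, 0xAA), pack(0x34, 0xBB), pack(0x56, 0xCC)]
# Slot 2 stores "don't care" in the low nibble of its key.
care_masks = [0xFFFF, 0xFFFF, 0xF0FF]

# Search by key only: the record field is masked out of the comparison.
exact_hits = ternary_search(slots, care_masks, pack(0x34, 0), KEY_ONLY_MASK)
record = slots[exact_hits[0]] & RECORD_MASK  # separate the record field

# Key 0x5F matches slot 2 because its low key nibble is "don't care".
wildcard_hits = ternary_search(slots, care_masks, pack(0x5F, 0), KEY_ONLY_MASK)
```

Masking the record field in the comparand is what makes the search a partial-content search by key, while the per-slot care mask models the third, "don't care" state that a binary CAM lacks.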

## 4. Results and Discussion

The simulation results were validated using the Vivado tool. The area, delay, and power values are shown in the figures below.



Figure 4. RTL schematic representation of the proposed system



Figure 5. Simulation output waveforms



Figure 6. Output Area



Figure 7. Output Delay



Figure 8. Output Power

Table 1. Comparison between the existing SRAM-TCAM and the proposed ER-TCAM.

| Metric                | Existing SRAM-TCAM | Proposed ER-TCAM |
|-----------------------|--------------------|------------------|
| LUTs                  | 950                | 826              |
| LUT-FFs               | 389                | 332              |
| Time delay (ns)       | 0.735              | 0.239            |
| Power consumption (W) | 3.05               | 2.13             |
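To make the comparison in Table 1 easier to read, the relative reductions can be derived directly from the tabulated values. The short script below is only an arithmetic aid: the percentages it computes are not stated in the paper, and the helper name `reduction_percent` is illustrative.

```python
# (existing SRAM-TCAM, proposed ER-TCAM) values copied from Table 1.
table = {
    "LUTs": (950, 826),
    "LUT-FFs": (389, 332),
    "Time delay (ns)": (0.735, 0.239),
    "Power consumption (W)": (3.05, 2.13),
}


def reduction_percent(existing: float, proposed: float) -> float:
    """Relative reduction of the proposed design versus the existing one."""
    return 100.0 * (existing - proposed) / existing


reductions = {
    metric: round(reduction_percent(existing, proposed), 2)
    for metric, (existing, proposed) in table.items()
}

for metric, value in reductions.items():
    print(f"{metric}: {value}% lower")
```

Under these figures, the reductions come to roughly 13% in LUTs, 15% in LUT-FFs, 67% in delay, and 30% in power.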

## 5. Conclusion

This paper focuses on the implementation of ER-TCAM for fast data storage with high-speed read-write operations. LFSR cells were introduced to create random pattern sequences with address synchronization features. The LFSR-based ER-TCAM provides optimal data storage with strong self-error detection and correction properties. The simulations showed that the proposed method outperformed the existing SRAM-TCAM approach in terms of area, delay, and power. Further, this work can be extended to parallel memory allocation systems that use LFSR-based ER-TCAM modules for improved performance.

### References

- [1] Mahendra, Telajala Venkata, et al. "Energy-efficient precharge-free ternary content addressable memory (TCAM) for high search rate applications." IEEE Transactions on Circuits and Systems I: Regular Papers 67.7 (2020).
- [2] Zhong, Hongtao, et al. "DyTAN: Dynamic ternary content addressable memory using nanoelectromechanical relays." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 29.11 (2021).
- [3] Irfan, Muhammad, et al. "RPE-TCAM: Reconfigurable power-efficient ternary content-addressable memory on FPGAs." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 28.8 (2020).
- [4] Penchalaiah, Usthulamuri, and VG Siva Kumar. "Design and Implementation of Low Power and Area Efficient Architecture for High Performance ALU." Parallel Processing Letters 32.01n02 (2022): 2150017.
- [5] S. V. G. Kumar, M. Vadivel, U. Penchalaiah, P. Ganesan and T. Somassoundaram, "Real Time Embedded System for Automobile Automation," 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), Pondicherry, India, 2019, pp. 1-6, doi: 10.1109/ICSCAN.2019.8878820.
- [6] Kazemi, Arman, et al. "Fefet multi-bit content-addressable memories for in-memory nearest neighbor search." IEEE Transactions on Computers 71.10 (2021).
- [7] Wang, Xuehong, et al. "A 4T2R RRAM bit cell for highly parallel ternary content addressable memory." IEEE Transactions on Electron Devices 68.10 (2021).
- [8] Karunakar, Gaddameedi, and Biroju Papachary. "Implementation of LFSR based Fast Error-Resilient Ternary Content-Addressable Memory." 2022 International Conference on Augmented Intelligence and Sustainable Systems (ICAISS). IEEE, 2022.
- [9] Lim, Sehee, et al. "Cross-Coupled Ferroelectric FET-Based Ternary Content Addressable Memory With Energy-Efficient Match Line Scheme." IEEE Transactions on Circuits and Systems I: Regular Papers (2022).
- [10] Wang, Xianggao, et al. "A Highly Compact Nonvolatile Ternary Content Addressable Memory (TCAM) With Ultralow Power and 200-ps Search Operation." IEEE Transactions on Electron Devices 69.8 (2022).
- [11] Lee, Jae Seong, and Woo Young Choi. "Nanoelectromechanical-switch-based ternary content-addressable memory (NEMTCAM)." IEEE Transactions on Electron Devices 68.10 (2021).
- [12] Yin, Xunzhao, et al. "Ferroelectric ternary content addressable memories for energy efficient associative search." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2022).
- [13] Hussain, Sheikh Wasmir, et al. "SMS-CAM: Shared matchline scheme for content-addressable memory." Integration 88 (2023).
- [14] Vicuña, Kevin, et al. "DMTJ-Based Non-Volatile Ternary Content Addressable Memory for Energy-Efficient High-Performance Systems." 2022 IEEE 13th Latin America Symposium on Circuits and System (LASCAS). IEEE, 2022.
- [15] Yang, Ling, et al. "In-Memory Search with Phase Change Device-based Ternary Content Addressable Memory." IEEE Electron Device Letters 43.7 (2022).