With a marked increase in interest in artificial intelligence (AI) and in-memory computing, resistive random access memory (ReRAM) may be the key to unlocking the ability to mimic the human brain — but challenges remain.
Last year’s IEDM brought together many of the latest research papers on advancing various types of memory, both emerging and established. Not surprisingly, many of them focused on how memory can improve in-memory computing, artificial intelligence, and machine learning (ML) — and even mimic the human brain.
ReRAM has long been associated with neuromorphic computing, and Weebit Nano has expressed interest in pursuing the technology, even though it is secondary to the company’s other business priorities.
Meanwhile, the University of Michigan has been developing ReRAM prototypes for at least a decade. Wei D. Lu, a professor in the university’s department of electrical engineering and computer science, explained that ReRAM offers the potential for high-density non-volatile storage and efficient in-memory computing, and that ReRAM-based accelerators can alleviate the von Neumann bottleneck. His IEDM presentation outlined example devices and how parallelism can address the power, latency, and cost requirements of growing AI models and edge-computing applications.
A CPU that exploits parallelism will still encounter memory bottlenecks. While GPUs allow faster memory access, Lu said a new computing architecture is needed to fundamentally improve throughput and computing efficiency. A memory processing unit (MPU) can significantly improve parallelism by colocating memory and logic, enabling device-level computing and better supporting in-memory computing.
An MPU can significantly improve parallelism by colocating memory and logic to achieve device-level, in-memory computing (Image courtesy of the University of Michigan)
Lu said ReRAM’s potential for in-memory computing lies in using ReRAM arrays as computing fabrics, since they can perform learning and inference functions locally. ReRAM also supports bidirectional data flow, and larger neural networks can be built from modular systems that tile MPU architectures for higher performance.
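The idea of using a ReRAM array as a computing fabric rests on a simple physical principle: applying input voltages to the rows of a crossbar and summing the resulting currents on the columns performs an analog matrix-vector multiplication in a single step, via Ohm’s and Kirchhoff’s laws. A minimal NumPy sketch of that computation (the array size and conductance range here are illustrative assumptions, not figures from Lu’s work):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4x3 crossbar: each cell's conductance G[i, j] (in siemens)
# encodes one weight; rows carry inputs, columns collect outputs.
G = rng.uniform(1e-6, 1e-4, size=(4, 3))

# Input vector applied as row voltages (in volts).
V = np.array([0.2, 0.0, 0.1, 0.3])

# Ohm's law per cell and Kirchhoff's current law per column:
# the current on column j is I_j = sum_i V_i * G[i, j], so the
# whole array computes a matrix-vector product in one step.
I = V @ G

# A digital reference computation gives the same result.
assert np.allclose(I, np.dot(V, G))
```

The parallelism Lu describes comes from the fact that every cell contributes simultaneously; the array’s latency does not grow with the number of weights, unlike the sequential memory fetches of a von Neumann machine.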
Handling ReRAM challenges
But there are several major issues with ReRAM devices. First, readout circuits based on high-precision analog-to-digital converters pose significant challenges. Second, performance can be degraded by non-ideal devices and device-to-device variation. Third, the asymmetric and nonlinear conductance changes observed in ReRAM devices can significantly reduce training accuracy, Lu said.
Potential solutions to the first problem include multi-range quantization and binary neural networks. Lu said architecture-aware training can address the performance issues arising from device non-idealities, while a 2T2R architecture that implements binary weights helps address the third problem. Hybrid-precision training can also tackle the second and third problems, since it offers significant gains in performance and computational efficiency by training large neural networks in lower-precision formats.
Phase-change memory (PCM) is another candidate for enhancing in-memory computing. IBM Research Europe has been investigating the use of PCM to resolve the issue of temperature sensitivity in analog in-memory computing. According to Irem Boybat, a member of IBM Research’s in-memory computing team, the rapid growth of AI neural networks raises concerns about computational efficiency: deep learning requires enormous computational power, and disruptive computing paradigms must be adopted if the “artificial intelligence revolution” is to last.
“The size of language models is growing exponentially,” Boybat said. The process involves moving massive amounts of data between memory and the processing unit, which is costly and leaves a significant carbon footprint.
Analog in-memory computing blurs the line between processing and memory by performing certain processing tasks within the memory itself, exploiting the physical characteristics of the device. Boybat said PCM is a good candidate for in-memory computing because it stores information densely and consumes little power in static storage. IBM Research demonstrated two PCM-based in-memory computing chips over the past year.
Temperature sensitivity remains the focus of the research team, and mushroom-type PCM cells were used for the retention studies. On-chip temperature sensing and resistive heaters indicate that retention problems are not anticipated in the range of 30°C to 80°C. IBM Research’s studies examine the effects of temperature fluctuations and conductance drift on multi-level PCM in-memory computing.
With the assistance of the IBM Research AI Hardware Center, the team observed that although PCM conductance is temperature-sensitive, the normalized conductance distribution remains relatively constant over the time-temperature profile used. The team developed a robust statistical model describing the effect of temperature on conductance drift, validated through PCM conductance measurements.
They employed more than 1 million PCM devices to show that simple compensation schemes can be applied across a wide range of networks with high accuracy at temperatures between 33°C and 80°C.
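The kind of compensation scheme described above can be illustrated with the standard PCM drift law, G(t) = G(t₀)·(t/t₀)^(−ν), where the drift coefficient ν generally grows with temperature. The sketch below assumes a single uniform ν for simplicity — IBM’s actual model is statistical and temperature-dependent — and shows why a global rescaling estimated from reference cells can undo a common multiplicative drift:

```python
import numpy as np

def drifted_conductance(g0, t, t0=1.0, nu=0.05):
    """Standard PCM drift model: G(t) = G(t0) * (t / t0) ** (-nu).
    nu (the drift coefficient) is assumed uniform here for illustration."""
    return g0 * (t / t0) ** (-nu)

rng = np.random.default_rng(1)
g0 = rng.uniform(1e-6, 1e-4, size=1000)   # programmed conductances

t = 3600.0                                 # read one hour after programming
g_t = drifted_conductance(g0, t)

# Simple global compensation: rescale all read conductances by the inverse
# of the average drift factor, estimated here from the cells themselves
# (in hardware, from dedicated reference cells programmed to known levels).
ref_factor = np.mean(g_t / g0)
g_comp = g_t / ref_factor

# Because drift is a common multiplicative factor in this idealized case,
# compensation recovers the programmed values.
assert np.allclose(g_comp, g0)
```

In practice, device-to-device variation in ν is what makes the problem statistical rather than exact, which is why IBM characterized the normalized conductance distribution across more than a million devices.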
Mimicking the human brain
BIC will circumvent the von Neumann bottleneck in the medium to long term (Image courtesy of the Institute of Microelectronics, Chinese Academy of Sciences)
Another active research area beyond memory itself is the development of neural networks that more closely resemble the human brain. Research on ReRAM-based brain-inspired computing (BIC), presented by Liu Ming on behalf of researchers from the Institute of Microelectronics of the Chinese Academy of Sciences and Fudan University, is driven by the growing use of AI computation, which Liu said is expected to double every three months.
That growth makes brain-inspired hardware essential to sustaining progress. Although new memory technologies can enhance the computing hierarchy in the near term, BIC will circumvent the von Neumann bottleneck in the longer term. BIC encompasses both in-memory computing and neuromorphic computing.
Understanding BIC requires distinguishing between the neural networks of computer science — AI algorithms — and those of neuroscience and biology. An artificial neural network (ANN) processes continuous signals in the spatial domain, while a spiking neural network (SNN) is more biologically plausible because it is modeled on how the brain functions. Liu said ReRAM is a suitable platform for BIC thanks to its rich switching dynamics, enabling large-scale integration, low-power peripherals, and specialized architectures for building BIC chips and systems.
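The ANN/SNN distinction is easiest to see in the neuron model itself. Where an ANN neuron outputs a continuous activation, an SNN neuron integrates input over time and communicates through discrete spikes. A minimal leaky integrate-and-fire (LIF) neuron — a common textbook SNN building block, not a model from Liu’s paper — can be sketched as:

```python
def lif_neuron(inputs, tau=10.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Leaky integrate-and-fire neuron: the membrane potential leaks toward
    zero, integrates input current, and emits a spike on crossing threshold."""
    v = 0.0
    spikes = []
    for i in inputs:
        v += dt * (-v / tau + i)     # leaky integration step
        if v >= v_thresh:
            spikes.append(1)         # emit a spike...
            v = v_reset              # ...and reset the membrane potential
        else:
            spikes.append(0)
    return spikes

# A constant input drives periodic spiking: information is carried by
# spike timing rather than amplitude, which is what makes SNNs event-driven.
out = lif_neuron([0.3] * 20)
```

The event-driven nature of this dynamic — nothing is computed or communicated between spikes — is what the integrated multicore SNN chips Liu describes aim to exploit for efficiency.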
Liu said that after a decade of research, progressing from simulation, an integrated multicore SNN chip with event-driven representation may soon become reality, and the computational density and efficiency of ReRAM-based SNNs present a great opportunity to improve performance. However, to design BIC chips suitable for real-world applications, much research remains to be done at the architecture level.
ReRAM’s characteristics make it a favored candidate for AI applications that mimic the human brain. However, magnetoresistive random access memory (MRAM) also drew considerable attention at IEDM 2021, with a full day of talks as well as two IEEE Magnetics Society events exploring the link between magnetism and microelectronics to help advance the technology.
In ferroelectric random access memory (FRAM), CEA-Leti announced the demonstration of what it claims is the first 16-kbit array at the 130nm node, moving the technology closer to commercialization. The ultra-low-power, high-endurance, fast, CMOS-compatible back-end-of-line (BEOL) FRAM uses a novel HfO2-based ferroelectric material that is more eco-friendly than PZT because it is lead-free.
Possible applications include embedded uses such as Internet of Things (IoT) devices and wearables. The research is supported by the EU 3eFERRO project, which aims to develop new ferroelectric materials that make FRAM an attractive non-volatile memory option for IoT applications.
While most IEDM research papers applied new memory technologies to cutting-edge fields such as AI, neuromorphic computing, and in-memory computing, advancing existing memories (such as DRAM) remains the primary focus for many researchers.
Intel presented a number of research papers at IEDM focused on improving scaling and adding new features on-chip. Intel’s components research group described work to address the design, process, and assembly challenges of hybrid bonded interconnects, putting forward an approach to increase interconnect density in packages by 10×. Earlier, in July, Intel announced Foveros Direct, which allows bump pitches below 10 microns and provides an order-of-magnitude increase in interconnect density for 3D stacking.
Other papers describe how Intel is preparing for the expected post-FinFET era by stacking multiple CMOS transistors, aiming for logic scaling improvements of 30 to 50 percent by fitting more transistors per square millimeter to continue advancing Moore’s Law. In another effort targeting the coming angstrom era, research showed how novel materials only a few atoms thick can be used to build transistors that overcome the limitations of traditional silicon channels, potentially packing millions more transistors into the same chip area.
Intel also outlined research to bring new functions to silicon by integrating GaN-based power switches with silicon-based CMOS on 300mm wafers, enabling low-loss, high-speed power delivery to the CPU while reducing motherboard components and board space.