Past Projects
I. Brain-Inspired Neuromorphic Computation
A. Energy Efficient Deep Learning
With the increasing availability of compute power, state-of-the-art CNNs are growing rapidly in size, making them prohibitively expensive to deploy in power-constrained environments. To enable ubiquitous use of deep learning techniques, we are investigating techniques that reduce the size of a model with minimal degradation in accuracy. Algorithmic techniques we have devised to this end include identifying redundancy among the weights of a model in a single shot, discretized learning and inference methods, and strategies that change the network architecture to enable efficient inference. We have also developed hardware-driven approximation methods that leverage the inherent error resilience of CNNs for model compression.
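The single-shot redundancy idea can be illustrated with a minimal sketch (assumptions: activations are collected as a samples-by-filters matrix in one forward pass, and the variance threshold is illustrative rather than the exact criterion of the cited work):

```python
# Minimal sketch: the number of principal components needed to explain most
# of the variance in a layer's activations suggests how many filters that
# layer actually needs, exposing redundancy in a single pass.
import numpy as np

def significant_dimensions(activations, var_threshold=0.999):
    """activations: (num_samples, num_filters) matrix of per-filter responses."""
    centered = activations - activations.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)  # singular values
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)       # cumulative explained variance
    return int(np.searchsorted(cum, var_threshold) + 1)

acts = np.random.randn(1000, 32) @ np.random.randn(32, 64)  # 64 filters, rank 32
print(significant_dimensions(acts))  # ~32: half the filters are redundant
```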
Publications:
- Garg, Isha, Priyadarshini Panda, and Kaushik Roy. "A low effort approach to structured CNN design using PCA." arXiv preprint arXiv:1812.06224 (2018).
- Panda, Priyadarshini, Abhronil Sengupta, and Kaushik Roy. "Conditional deep learning for energy-efficient and enhanced pattern recognition." 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 2016.
- Chakraborty, Indranil, et al. "PCA-driven Hybrid network design for enabling Intelligence at the Edge." arXiv preprint arXiv:1906.01493 (2019).
- P. Panda, A. Ankit, P. Wijesinghe, and K. Roy, “FALCON: Feature Driven Selective Classification for Energy-Efficient Image Recognition,” IEEE Transactions on CAD, 2016.
- Sarwar, Syed Shakib, et al. "Energy-efficient neural computing with approximate multipliers." ACM Journal on Emerging Technologies in Computing Systems (JETC) 14.2 (2018): 16.
- P. Panda, K. Roy, “Attention Tree: Learning Hierarchies of Visual Features for Large-Scale Image Recognition”, arXiv preprint, arxiv.org/abs/1608.00611.
Student Researchers: Jason Allred, Aayush Ankit, Indranil Chakraborty, Isha Garg, Deboleena Roy
B. Continual and Incremental Learning
The capacity to learn new things without forgetting existing knowledge is innate to humans. Neural networks, however, suffer from catastrophic forgetting, which makes it hard to grow networks and learn from newly arriving data in a fluid manner. We are exploring techniques that utilize stochasticity or architectural enhancements to enable lifelong learning.
Publications:
- Allred, Jason M., and Kaushik Roy. "Stimulating STDP to Exploit Locality for Lifelong Learning without Catastrophic Forgetting." arXiv preprint arXiv:1902.03187 (2019).
- Roy, Deboleena, Priyadarshini Panda, and Kaushik Roy. "Tree-CNN: A hierarchical deep convolutional neural network for incremental learning." arXiv preprint arXiv:1802.05800 (2018).
- Panda, Priyadarshini, et al. "Asp: Learning to forget with adaptive synaptic plasticity in spiking neural networks." IEEE Journal on Emerging and Selected Topics in Circuits and Systems 8.1 (2017): 51-64.
- Sarwar, Syed Shakib, Aayush Ankit, and Kaushik Roy. "Incremental learning in deep convolutional neural networks using partial network sharing." arXiv preprint arXiv:1712.02719 (2017).
Student Researchers: Aayush Ankit, Jason Allred, Deboleena Roy
C. Mimicking neural and synaptic computations by spintronic devices
We are currently exploring spintronic devices that offer a direct mapping to neural and synaptic functionalities in artificial and spiking neural networks. Unsupervised/supervised neural computing platforms based on such spintronic devices can potentially lead to ultra-low-power and compact pattern recognition systems.
Publications:
- G. Srinivasan, A. Sengupta, K. Roy, "Magnetic Tunnel Junction Based Long-Term Short-Term Stochastic Synapse for a Spiking Neural Network with On-Chip STDP Learning", Scientific Reports, 2016.
- A. Sengupta, P. Panda, P. Wijesinghe, Y. Kim, K. Roy, "Magnetic Tunnel Junction Mimics Stochastic Cortical Spiking Neurons", Scientific Reports, 2016.
- A. Sengupta, M. Parsa, B. Han, K. Roy, "Probabilistic Deep Spiking Neural Systems Enabled by Magnetic Tunnel Junction", IEEE Transactions on Electron Devices, 2016.
Student Researchers: Jason Allred, Priyadarshini Panda
II. Spin-Transfer Torque Devices, Circuits and Systems
A. Modeling spintronic devices
Modeling a typical spintronic device requires a self-consistent solution of the magnetization dynamics and electron transport equations. We have developed a simulation framework spanning the device to circuit levels for analyzing novel spin-based systems and architectures. The current focus is on modeling new physical phenomena, for example, the use of topological insulators, the spin-Hall effect, and electric-field-assisted switching of spin devices.
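For reference, one common form of the governing magnetization dynamics (the exact torque terms depend on the device being modeled) couples the Landau-Lifshitz-Gilbert equation with a Slonczewski spin-transfer torque term:

```latex
% Free-layer magnetization dynamics: precession about the effective field,
% Gilbert damping, and a Slonczewski spin-transfer torque term.
% (eta: spin polarization efficiency, J: charge current density,
%  t_F: free-layer thickness, M_s: saturation magnetization,
%  \hat{m}_p: pinned-layer polarization direction)
\frac{d\hat{m}}{dt} =
    -\gamma\, \hat{m} \times \vec{H}_{\mathrm{eff}}
    + \alpha\, \hat{m} \times \frac{d\hat{m}}{dt}
    + \frac{\eta \hbar \gamma J}{2 e t_F M_s}\, \hat{m} \times (\hat{m} \times \hat{m}_p)
```

Solving this self-consistently with a transport model (e.g., for the MTJ conductance) yields the device-level behavior that the circuit-level simulator consumes.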
Publications:
- A. K. Reza, X. Fong, Z. A. Azim, K. Roy, "Modeling and Evaluation of Topological Insulator/Ferromagnet Heterostructure Based Memory", IEEE TED, 2016.
- X. Fong, Y. Kim, K. Yogendra, D. Fan, A. Sengupta, A. Raghunathan, K. Roy, "Spin-Transfer Torque Devices for Logic and Memory: Prospects and Perspectives", IEEE TCAD (Keynote Paper).
- X. Fong, S. Gupta, N. Mojumder, S H. Choday, and K. Roy, "KNACK: A Hybrid Spin-Charge Mixed-Mode Simulator for Evaluating Different Genres of STT-MRAMs", SISPAD 2011.
Student Researchers: Zubair Al Azim, Saima Sharmin, Akhilesh Jaiswal, Ahmed Reza
B. Spintronic Memories
STT-MRAMs are projected as the leading emerging memory technology that can replace silicon-based memories in last-level cache applications. However, slow write speed, high write energy consumption, and various failure mechanisms remain the major challenges for STT-MRAMs. We are exploring low-power and reliable memories using alternative switching mechanisms, such as the spin-Hall effect, and architectural solutions, such as error-correcting codes.
Publications:
- X. Fong, Y. Kim, R. Venkatesan, S. H. Choday, A. Raghunathan, and K. Roy, “Spin-transfer Torque Memories: Devices, Circuits and Systems”, Proceedings of the IEEE, 2016.
- A. Jaiswal, X. Fong, K. Roy "Comprehensive Scaling Analysis of Current Induced Switching in Magnetic Memories Based on In-Plane and Perpendicular Anisotropies", IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), 2016.
- S. Sharmin, A. Jaiswal, K. Roy, "Modeling and Design Space Exploration for Bit-Cells Based on Voltage-Assisted Switching of Magnetic Tunnel Junctions", IEEE TED, 2016.
Student Researchers: Saima Sharmin, Akhilesh Jaiswal
C. Boolean Computation
Spintronic devices offer advantages such as non-volatility, small footprint, and near-zero standby leakage. These key advantages make them attractive for future logic and memory design. All Spin Logic (ASL) is a recently proposed logic style that uses spintronic devices for logic applications. We are exploring an automated synthesis methodology for designing logic circuits using ASL, in addition to a new spin-orbit torque-based domino-style spin logic (SOT-DSL).
Publications:
- Z. Pajouhi, S. Venkataramani, K. Yogendra, A. Raghunathan, K. Roy, "Exploring Spin-Transfer-Torque Devices for Logic Applications", IEEE TCAD, 2015.
- M.-C. Chen, Y. Kim, K. Yogendra, K. Roy, "Domino-Style Spin–Orbit Torque-Based Spin Logic", IEEE Magnetic Letters, 2015.
Student Researchers: Mei-Chin Chen
D. Non-Boolean Computation
Non-Boolean neuromorphic computing systems, in which the core neuronal and synaptic functionalities are emulated by spintronic devices, require a rethinking of circuits and systems for proper interfacing and functioning of the network. This project is focused on building a device-circuit-system simulation framework for such neural pattern recognition systems.
Publications:
- G. Srinivasan, A. Sengupta, K. Roy, "Magnetic Tunnel Junction Enabled All-Spin Stochastic Spiking Neural Network", DATE 2017. (Invited Paper).
- A. Sengupta, A. Banerjee, K. Roy, "Hybrid Spintronic-CMOS Spiking Neural Network With On-Chip Learning: Devices, Circuits and Systems", Physical Review Applied, 2016. (Featured in MIT Technology Review: Emerging Technology from arXiv and DoD R&E Science and Technology News Bulletin)
- A. Sengupta, K. Roy, "A Vision for All-Spin Neural Networks: A Device to System Perspective", IEEE Transactions on Circuits and Systems-I: Regular Papers, 2016. (ISCAS 2016 Special Issue)
Student Researchers: Abhronil Sengupta, Gopalakrishnan Srinivasan
III. Spintronic Sensors for Interconnects
Spintronic devices can function as efficient low-power sensors for high-speed, long-distance interconnect architectures. We are currently exploring the use of domain-wall-based spin-torque sensors at the receiving end of a current-mode interconnect scheme. The use of Magnetic Tunnel Junctions as detectors in an optical interconnect scheme is also being explored.
Publications:
- Z. A. Azim, A. Sharma, K. Roy, "Buffered Spin-Torque Sensors for Minimizing Delay and Energy Consumption in Global Interconnects", IEEE Magnetic Letters, 2016.
- Z. A. Azim, A. Sengupta, S. S. Sarwar, K. Roy, "Spin-Torque Sensors for Energy Efficient High-Speed Long Interconnects", IEEE TED, 2016.
- Z. A. Azim, X. Fong, T. Ostler, R. Chantrell, K. Roy, "Laser Induced Magnetization Reversal for Detection in Optical Interconnects", IEEE EDL, 2015.
Student Researchers: Zubair Al Azim
IV. Approximate Computing
Approximate computing relies on the ability of many systems and applications to self-heal or to tolerate some loss of quality or optimality in the computed result. The main idea is to exploit the inherent error resilience of a system to achieve energy efficiency; in other words, to trade accuracy for energy consumption. Such a trade-off is, in most cases, also accompanied by performance improvements such as faster operation and reduced area.
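As a toy illustration of this trade-off (a sketch, not one of the specific designs in the publications below), a truncation-based approximate multiplier drops the low-order bits of its operands, shrinking the partial-product array at the cost of a small, bounded error:

```python
# Toy truncation-based approximate multiplier: zeroing the k least significant
# bits of each operand lets hardware skip the corresponding partial products
# (saving energy and area) while introducing a bounded relative error.
def approx_mul(a: int, b: int, k: int = 4) -> int:
    a_t = (a >> k) << k
    b_t = (b >> k) << k
    return a_t * b_t

exact = 1234 * 5678
approx = approx_mul(1234, 5678)
print(exact, approx, abs(exact - approx) / exact)  # error well under 1%
```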
Publications:
- S. S. Sarwar, S. Venkataramani, A. Raghunathan, K. Roy, “Multiplier-less Artificial Neurons Exploiting Error Resiliency for Energy-Efficient Neural Computing”, DATE 2016.
- G. Srinivasan, P. Wijesinghe, S. S. Sarwar, A. Jaiswal, K. Roy, “Significance Driven Hybrid 8T-6T SRAM for Energy-Efficient Synaptic Storage in Artificial Neural Networks”, DATE 2016.
- P. Wijesinghe, C. M. Liyanagedera and Kaushik Roy, "Fast, Low Power Evaluation of Elementary Functions Using Radial Basis Function Networks", DATE 2017.
Student Researchers: Gopalakrishnan Srinivasan, Syed Shakib Sarwar, Parami Wijesinghe
V. Oscillator Based Non-Boolean Computation
Spin Torque Nano Oscillators (STNOs) can efficiently perform computations that are unsuitable or inefficient in von Neumann computing models. Their oscillation frequency can reach the tens-of-gigahertz range while operating at low input currents. These features, together with the ability to obtain frequency locking through a variety of techniques, make STNOs an attractive candidate for non-Boolean computations such as edge detection in an image, associative computing, and pattern recognition.
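The frequency/phase locking these applications rely on can be illustrated with a generic Kuramoto-style phase model (a deliberate simplification; real STNO behavior follows the full magnetization dynamics):

```python
# Two phase oscillators with detuned natural frequencies lock to a constant
# phase offset once the coupling strength exceeds half the detuning; below
# that, their phase difference keeps drifting. Parameters are illustrative.
import numpy as np

def final_phase_diff(coupling, w1=1.00, w2=1.05, dt=1e-3, steps=200_000):
    th1, th2 = 0.0, 0.5
    for _ in range(steps):
        d1 = w1 + coupling * np.sin(th2 - th1)
        d2 = w2 + coupling * np.sin(th1 - th2)
        th1 += dt * d1
        th2 += dt * d2
    return (th2 - th1) % (2 * np.pi)

print(final_phase_diff(0.01))  # K < 0.025: unlocked, drifting difference
print(final_phase_diff(0.10))  # K > 0.025: locked near arcsin(0.25) rad
```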
Publications:
- K. Yogendra, D. Fan, K. Roy, "Magnetic Pattern Recognition using Injection Locked Spin Torque Nano-Oscillators", IEEE TED 2016.
- C. M. Liyanagedera, K. Yogendra, D. Fan, K. Roy, "Spin torque nano-oscillator based Oscillatory Neural Network", IJCNN 2016. (Best student paper nomination).
- K. Yogendra, D. Fan, K. Roy, "Coupled Spin Torque Nano Oscillators for Low Power Neural Computation", IEEE TMAG, 2015.
Student Researchers: Yong Shim, Chamika Liyanagedera, Minsuk Koo
VI. Transistors in Sub-10nm Technologies
Sub-10nm technologies are expected to exhibit severe short-channel effects along with new leakage mechanisms such as direct source-to-drain tunneling. To reduce leakage and obtain the highest performance in these deeply scaled devices and circuits, we are investigating various types of transistors, such as FinFETs, Schottky-barrier FETs, and tunneling field-effect transistors (TFETs). We are exploring the design space of such transistors to optimize them for sub-10nm technology and analyzing their behavior in circuits.
Publications:
- A. Sharma, A. K. Reza, K. Roy, “Proposal of an Intrinsic-Source Broken-Gap Tunnel-FET to Reduce Band-Tail Effects on Subthreshold-Swing: A Simulation Study”, IEEE TED, 2016.
- A. Sharma, A. ArunGoud, J. P. Kulkarni, K. Roy, “Source-underlapped GaSb-InAs TFETs with applications to Gain Cell Embedded DRAMs”, IEEE TED, 2016
- W.-S. Cho, K. Roy, "The effects of direct source-to-drain tunneling and variation in the body thickness on (100) and (011) sub-10nm Si double-gate transistors," IEEE EDL, 2015.
Student Researchers: Ankit Sharma
VII. Ultralow Voltage Subthreshold Circuits and Systems
For ultralow-power and portable applications, we investigate the design of digital subthreshold logic with transistors operated in the subthreshold region (the supply voltage, corresponding to logic 1, is less than the threshold voltage of the transistor). The subthreshold leakage current of the device is used for computation. Standard design techniques suitable for superthreshold design can be used in the subthreshold region. However, it has been shown that a complete co-design at all levels of the hierarchy (device, circuit, and architecture) is necessary to reduce overall power consumption while achieving acceptable performance (hundreds of kilohertz) in the subthreshold regime of operation.
VIII. Designing Circuits Beyond Traditional Silicon
Scaling of silicon technology continues, while research into novel materials for technology generations beyond the year 2015 has begun. Carbon nanotubes (CNTs), with their excellent carrier mobility, are a promising candidate. We investigated different carbon-nanotube-based field-effect transistors (CNFETs) in search of an optimal switch. Schottky Barrier (SB) CNFETs, MOS CNFETs, and state-of-the-art Si MOSFETs were systematically compared from a circuit/system design perspective. We are developing compact models for carbon-nanotube-based transistors and for interconnects with metallic nanotubes, as well as tools and software to evaluate the circuit/system-level performance of nanotube-based digital circuits. We are also working on other technology options beyond traditional silicon, including nanowires, III-V high-mobility transistors, computation with nanomagnets, spin FETs, and two-terminal molecular devices.
The chief areas of research in this field include:
- Carbon nanotube based circuits (both BTBT CNT and CNTFET)
- Polysilicon TFT based on carbon nanotubes
- Magnetic Quantum Cellular Automata (MQCA) based circuits and architectures
- Subthreshold memory design
Student Researchers: Charles Augustine, Sumeet Gupta, Niladri Mojumder, Xuanyao Fong, Kerem Camsari, Jaydeep Kulkarni, Jing Li, Arijit Raychowdhury, Qikai Chen
IX. Signal Processing Architectures Using Magnetic QCA (MQCA)
A tremendous amount of research effort has been put into scaling the MOSFET to the ultimate limit of molecular dimensions. At the same time, several other devices have been investigated as replacements for silicon. Magnetic Quantum Cellular Automata (MQCA) is one of the promising candidates. We are developing compact models for MQCA, as well as software to evaluate the circuit/system-level performance of signal processing circuits using MQCA.
Student Researchers: Charles Augustine, Xuanyao Fong, Sumeet Gupta, Arijit Raychowdhury
X. High Performance and Low Power Flexible Electronics
Low Temperature Polycrystalline Silicon Thin Film Transistors (LTPS TFTs) have been widely used in Active Matrix Liquid Crystal Displays (AMLCDs) as pixel-switching elements with high supply voltages (10~20V). These devices offer low performance (due to their thick silicon body and gate oxide) and are hence unsuitable for high-performance or low-power digital circuits. In keeping with the general trend of CMOS technology, LTPS TFTs can also be scaled to the sub-micron regime. The scaled device can achieve higher performance than standard TFTs and is a promising candidate for both subthreshold and superthreshold operation.
The chief areas of research in this field can be summarized below:
- Device modeling and optimization for super- and sub-threshold operation (extended to organic TFTs)
- Modeling and analysis of the inherent process variation (grain-boundary-induced variation) in scaled TFT technology
- Statistical timing analysis and variation-tolerant circuit/architecture design that accounts for the inherent variations to improve yield
Student Researchers: Jing Li, Himanshu Markendeya, Selin Baytok, Swaroop Ghosh, Aditya Bansal
XI. Green Computing (Ultra-Low-Power Electronics)
Student Researchers: Jaydeep Kulkarni, Charles Augustine, Xuanyao Fong, Arijit Raychowdhury, Nilanjan Banerjee, Amit Agarwal, Yiran Chen, Animesh Datta, Swarup Bhunia, H. Mahmoodi
XII. Low Power, Variation Aware System design
The two major issues facing today's IC designers are power and process variation. Addressing both concurrently is challenging, since low-power schemes and process-tolerant methods impose contradictory requirements on system design. Targeting these issues simultaneously requires novel design methodologies. We observed that for a certain class of systems (especially digital signal processing systems), by allowing the right trade-offs between output quality and power requirements, voltage scaling for low power dissipation can be achieved even in the presence of process parameter variations. In this research, we investigate and develop novel designs for such systems.
Student Researchers: Georgios Karakonstantis, Charles Augustine, Debabrata Mohapatra, Jung-Hwan Choi, Xuanyao Fong, Patrick Ndai, Nilanjan Banerjee
XIII. Memory Technology and Design
A. Training Methodologies for Deep Spiking Neural Networks
Spiking Neural Networks (SNNs), regarded as the third generation of neural nets, attempt to more closely mimic certain types of computations performed in the human brain to achieve higher energy efficiency in cognitive tasks. SNNs encode input information in the temporal domain using sparse spiking events. This intrinsic sparse, spike-based information processing can be exploited to improve the energy efficiency of neuromorphic hardware implementations. However, SNN training algorithms are much less well developed, leading to a gap between the accuracy offered by SNNs and that of their ANN counterparts. We proposed training methodologies that overcome the discontinuous nature of spike trains and effectively utilize spike timing information to train large-scale SNNs that yield accuracy comparable to ANNs on complex recognition tasks.
ANN-to-SNN Conversion
To circumvent the training difficulty posed by the non-differentiable dynamics of spiking neurons, we proposed an ANN-to-SNN conversion scheme for realizing deep SNNs. The scheme trains standard ANN architectures such as VGG and ResNet using ReLU activations and gradient-descent error backpropagation. The trained weights are then mapped to an SNN composed of Integrate-and-Fire (IF) spiking neurons, with suitable weight and threshold balancing mechanisms to minimize accuracy loss during SNN inference. Our work is among the first to demonstrate near-lossless ANN-to-SNN conversion and competitive accuracy on ImageNet.
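A minimal sketch of the conversion idea, using max-pre-activation threshold balancing as one common heuristic (the cited works describe the actual schemes and normalizations):

```python
# Reuse trained ReLU weights in an integrate-and-fire (IF) layer and set the
# firing threshold to the largest pre-activation seen on calibration data.
# The IF output spike rate then approximates ReLU(x @ W) / threshold.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(100, 10))        # "trained" ANN weights (illustrative)
calib = rng.random((256, 100))        # calibration inputs in [0, 1]
threshold = np.max(calib @ W)         # layer-wise threshold balancing

def if_layer_rates(x, T=500):
    v = np.zeros(W.shape[1])
    counts = np.zeros(W.shape[1])
    for _ in range(T):
        in_spikes = (rng.random(x.shape) < x).astype(float)  # rate coding
        v += in_spikes @ W
        fired = v >= threshold
        counts += fired
        v[fired] -= threshold         # reset by subtraction
    return counts / T

x = rng.random(100)
print(np.corrcoef(if_layer_rates(x), np.maximum(x @ W, 0))[0, 1])  # near 1
```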
Publications:
- Sengupta, A., Ye, Y., Wang, R., Liu, C. and Roy, K. "Going deeper in spiking neural networks: VGG and residual architectures," Frontiers in neuroscience, 13, 2019.
- P. Panda, K. Roy, "Unsupervised Regenerative Learning of Hierarchical Features in Spiking Deep Networks for Object Recognition", IJCNN 2016.
- J. Allred, K. Roy, "Unsupervised Incremental STDP Learning using Forced Firing of Dormant or Idle Neurons", IJCNN 2016.
Student Researchers: Abhronil Sengupta
Spike-Based Error Backpropagation
To effectively incorporate spike timing information, we proposed a spike-based error backpropagation algorithm for directly training SNNs, using the low-pass filtered spike train as a differentiable approximation of the Leaky Integrate-and-Fire (LIF) spiking neuron. We demonstrated competitive accuracy with ~10× lower inference latency than that obtained with the conversion approaches.
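The core trick can be sketched as follows (filter kernel and constants are illustrative, not the papers' exact formulation):

```python
# For gradient purposes, the discontinuous spike train s[t] is replaced by
# its exponentially low-pass filtered trace a[t]; a[t] is smooth in the
# underlying membrane dynamics, so errors can be backpropagated through it,
# with a surrogate derivative standing in for the spiking nonlinearity.
import numpy as np

def filtered_trace(spikes, tau=20.0):
    decay = np.exp(-1.0 / tau)
    a, acc = np.zeros(len(spikes)), 0.0
    for t, s in enumerate(spikes):
        acc = acc * decay + s
        a[t] = acc
    return a

spikes = (np.random.rand(50) < 0.2).astype(float)
print(filtered_trace(spikes)[:10])   # smooth, differentiable proxy for s[t]
```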
Publications:
- Lee, C., Panda, P., Srinivasan, G. and Roy, K. "Training deep spiking convolutional neural networks with stdp-based unsupervised pre-training followed by supervised fine-tuning," Frontiers in neuroscience, 12, 2018.
- Lee, C., Sarwar, S. S., Panda, P., Srinivasan, G. and Roy, K. "Enabling Spike-based Backpropagation in State-of-the-art Deep Neural Network Architectures," arXiv preprint arXiv:1903.06379, 2019.
Student Researchers: Chankyu Lee, Gopal Srinivasan
Spike Timing Dependent Plasticity (STDP)
We proposed a bio-plausible STDP-based unsupervised training methodology for both fully-connected and convolutional SNNs to enable on-chip training and inference in edge devices.
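The underlying update is the classic pair-based STDP rule (parameters illustrative):

```python
# Pair-based STDP: potentiate when the presynaptic spike precedes the
# postsynaptic spike (dt > 0), depress otherwise, with exponentially
# decaying magnitude in the spike-time difference.
import numpy as np

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """dt = t_post - t_pre (ms); returns the synaptic weight change."""
    if dt > 0:
        return a_plus * np.exp(-dt / tau_plus)    # causal pair: potentiate
    return -a_minus * np.exp(dt / tau_minus)      # anti-causal pair: depress

for dt in (5.0, 20.0, -5.0, -20.0):
    print(dt, round(stdp_dw(dt), 5))
```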
Publications:
- Srinivasan, G., Roy, S., Raghunathan, V. and Roy, K. "Spike timing dependent plasticity based enhanced self-learning for efficient pattern recognition in spiking neural networks," In 2017 International Joint Conference on Neural Networks (IJCNN), p. 1847-1854, IEEE, May 2017.
- Lee, C., Srinivasan, G., Panda, P. and Roy, K. "Deep spiking convolutional neural network trained with unsupervised spike timing dependent plasticity," IEEE Transactions on Cognitive and Developmental Systems, 2018.
- Srinivasan, G., Panda, P. and Roy, K. "STDP-based unsupervised feature learning using convolution-over-time in spiking neural networks for energy-efficient neuromorphic computing," ACM Journal on Emerging Technologies in Computing Systems (JETC), 14(4), p.44, 2017.
Student Researchers: Chankyu Lee, Sourjya Roy, Gopal Srinivasan
Stochastic Spike Timing Dependent Plasticity (Stochastic-STDP)
We proposed STDP-based stochastic learning rules, incorporating Hebbian and anti-Hebbian mechanisms, for achieving energy- and memory-efficient on-chip training and inference in SNNs composed of binary and quaternary synaptic weights. We also demonstrated efficient realization of stochastic STDP-trained binary SNNs enabled by CMOS and emerging device technologies.
Publications:
- Srinivasan, G., Sengupta, A. and Roy, K. "Magnetic tunnel junction based long-term short-term stochastic synapse for a spiking neural network with on-chip STDP learning," Scientific reports, 6, p.29545, 2016.
- Srinivasan, G. and Roy, K. "ReStoCNet: Residual Stochastic Binary Convolutional Spiking Neural Network for Memory-Efficient Neuromorphic Computing," Frontiers in Neuroscience, 13, p.189, 2019.
- Sengupta, A., Srinivasan, G., Roy, D. and Roy, K. "Stochastic inference and learning enabled by magnetic tunnel junctions," In 2018 IEEE International Electron Devices Meeting (IEDM), p. 15-6, IEEE, December 2018.
Student Researchers: Deboleena Roy, Gopal Srinivasan
B. Compute-in-Memory using CMOS SRAM and DRAM Arrays
"In-memory computing" is a promising approach to achieving significant throughput and energy benefits. We are currently exploring novel ideas for incorporating compute capabilities inside SRAM and DRAM. We have shown that digital bulk bitwise and arithmetic operations, as well as analog binary and multi-bit dot-product computations, can be performed inside SRAM arrays. Moreover, in-memory full addition has been demonstrated using commodity DRAM banks. We are currently exploring methodologies for adding compute functionality to embedded DRAM gain cells. We run neuromorphic/machine learning applications (e.g., ANNs, SNNs, k-NN) on such in-memory computing systems to evaluate their performance and energy benefits against conventional von Neumann systems.
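A functional model of the bulk bitwise idea (logical outcome only; the real mechanism is analog, sensing the bitline and bitline-bar discharge when two wordlines are activated together):

```python
# With two SRAM rows activated simultaneously, the bitline stays high only
# if BOTH cells in a column store 1 (AND), while bitline-bar stays high only
# if both store 0 (NOR) -- two row reads fused into one in-memory operation.
import numpy as np

def in_memory_and_nor(row_a, row_b):
    bl = row_a & row_b                # sensed on the bitline
    blb = (~row_a & ~row_b) & 1       # sensed on bitline-bar
    return bl, blb

a = np.array([1, 1, 0, 0], dtype=np.uint8)
b = np.array([1, 0, 1, 0], dtype=np.uint8)
print(in_memory_and_nor(a, b))        # AND: [1 0 0 0], NOR: [0 0 0 1]
```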
Publications:
- A. Agrawal, A. Jaiswal, B. Han, G. Srinivasan, and K. Roy, “Xcel-RAM: Accelerating binary neural networks in high-throughput SRAM compute arrays,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 8, pp. 3064-3076, Aug. 2019.
- A. Jaiswal, I. Chakraborty, A. Agrawal, and K. Roy, “8T SRAM cell as a multi-bit dot-product engine for beyond von-Neumann computing,” arXiv preprint arXiv:1802.08601, 2018.
- A. Agrawal, A. Jaiswal, C. Lee, and K. Roy, “X-SRAM: Enabling in-memory Boolean computations in CMOS static random access memories,” IEEE Transactions on Circuits and Systems I: Regular Papers, no. 99, pp. 1-14, 2018.
Student Researchers: Amogh Agrawal, Mustafa Ali, Sangamesh Kodge
C. Spin-based devices and Memristive crossbars as in-Memory Computing Primitives
Spin-based and resistive memories are promising candidates to replace CMOS-based memory technologies due to their non-volatility. We are currently exploring methodologies for adding compute capabilities to such non-volatile memories. Additionally, we develop memristor-based accelerator architectures for a wide variety of Machine Learning (ML) inference workloads. We are also exploring methods to model non-idealities in memristor crossbars and their effect on application accuracy.
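In the ideal case, a crossbar performs an analog matrix-vector multiplication in a single step; the sketch below shows the ideal operation plus a crude, purely illustrative wire-resistance degradation (the cited works solve the full resistive network instead):

```python
# Ideal memristive crossbar: row voltages V and conductances G give bitline
# currents I = V @ G in one step. The position-dependent term is a toy model
# of IR drop: cells farther from the drivers contribute slightly less.
import numpy as np

rng = np.random.default_rng(1)
G = rng.uniform(1e-6, 1e-4, size=(64, 32))    # conductances (siemens)
V = rng.uniform(0.0, 0.5, size=64)            # wordline voltages

I_ideal = V @ G

r_wire = 2.0                                   # ohms per cell segment (toy)
pos = np.arange(64)[:, None] + np.arange(32)[None, :]
I_degraded = V @ (G / (1.0 + G * r_wire * pos))

print(np.max(np.abs(I_ideal - I_degraded) / I_ideal))  # relative deviation
```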
Publications:
- Aayush Ankit, et al. “PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference”. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '19). ACM, New York, NY, USA, 715-731.
- A. Jaiswal, A. Agrawal, and K. Roy, “In-situ, in-memory stateful vector logic operations based on voltage controlled magnetic anisotropy,” Scientific Reports, vol. 8, no. 1, p. 5738, 2018.
- I. Chakraborty, D. Roy and K. Roy, "Technology Aware Training in Memristive Neuromorphic Systems for Nonideal Synaptic Crossbars," in IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 5, pp. 335-344, Oct. 2018.
- Amogh Agrawal, Chankyu Lee and Kaushik Roy, “X-CHANGR: Changing Memristive Crossbar Mapping for Mitigating Line-Resistance Induced Accuracy Degradation in Deep Neural Networks,” arXiv preprint arXiv:1907.00285, 2019.
Student Researchers: Aayush Ankit, Indranil Chakraborty, Amogh Agrawal, Mustafa Ali
D. ROM-Embedded RAM Structures in SRAM and STT-MRAM and their adoption to Accelerate Neuromorphic Applications
Embedding ROM storage in RAM arrays provides an almost cost-free opportunity to perform transcendental functions and other lookup-table (LUT) based computations in memory. We have developed RAM arrays with the ability to store ROM data in the same array using different memory technologies like CMOS SRAM and STT-MRAM. Moreover, we adopt such ROM-embedded RAM structures in accelerator architectures to accelerate Neuromorphic applications that depend heavily on such LUT-based operations.
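A sketch of the LUT-in-memory idea (table size and function are illustrative):

```python
# A transcendental function (here tanh) is precomputed into a ROM table once;
# at run time the input is quantized to an index, and evaluation becomes a
# single memory read instead of iterative arithmetic.
import numpy as np

N, x_min, x_max = 256, -4.0, 4.0
rom = np.tanh(np.linspace(x_min, x_max, N))    # stored in the ROM portion

def lut_tanh(x):
    idx = int(round((x - x_min) / (x_max - x_min) * (N - 1)))
    return rom[min(max(idx, 0), N - 1)]

print(lut_tanh(0.73), np.tanh(0.73))           # agree within quantization error
```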
Publications:
- A. Agrawal, A. Ankit and K. Roy, "SPARE: Spiking Neural Network Acceleration Using ROM-Embedded RAMs as In-Memory-Computation Primitives," in IEEE Transactions on Computers, vol. 68, no. 8, pp. 1190-1200, 1 Aug. 2019.
- A. Agrawal and K. Roy, "RECache: ROM-Embedded 8-Transistor SRAM Caches for Efficient Neural Computing," 2018 IEEE International Workshop on Signal Processing Systems (SiPS), Cape Town, 2018, pp. 19-24.
- X. Fong, R. Venkatesan, D. Lee, A. Raghunathan and K. Roy, "Embedding Read-Only Memory in Spin-Transfer Torque MRAM-Based On-Chip Caches," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 3, pp. 992-1002, March 2016.
- D. Lee and K. Roy, "Area Efficient ROM-Embedded SRAM Cache," in IEEE Transactions on Very Large-Scale Integration (VLSI) Systems, vol. 21, no. 9, pp. 1583-1595, Sept. 2013.
- D. Lee, X. Fong, and K. Roy, “R-MRAM: A ROM-Embedded STT MRAM Cache,” IEEE Electron Device Letters 34 (10), 1256-1258, 2013.
Student Researchers: Amogh Agrawal
E. Embedded Cache Memories
Embedded cache memories are expected to occupy 90% of the total die area of a system-on-chip. Nano-scaled SRAM bitcells with minimum-sized transistors are vulnerable to inter-die as well as intra-die process variations. Intra-die process variations include random dopant fluctuation (RDF), line edge roughness (LER), etc. We are developing device-, circuit-, and architecture-level techniques for robust, low-power nano-scaled memory technologies.
The chief areas of research in this field can be summarized below:
- Process variation tolerant design of nano scaled SRAM bitcell and peripheral design
- Process variation aware design of STT MRAM bitcell/architecture (STT-Spin Transfer Torque)
Student Researchers: Jaydeep Kulkarni, Ashish Goel, Ik-joon Chang, Jing Li, S. Mukhopadhyay, Chris Kim, Hamid Mahmoodi, Amit Agarwal, Aditya Bansal, Saakshi Gangwal, Dheepa Lekshmanan
XIV. Process Variations and Error Resilience
Student Researchers: Patrick Ndai, Ashish Goel, Kunhyuk Kang, Swaroop Ghosh, Nilanjan Banerjee, Myeong-Eun Hwang, Keejong Kim, Ali Keshavarzi, Naran Sirisantana, Seung-Hoon Choi, Yonghee Im, Woopyu Jeong, Shiyou Zhao, Dinesh Somashekhar, Cassandra Crotty Neau, Chris Hyungil Kim, Swarup Bhunia, Bipul Paul, Hamid Mahmoodi, Saibal Mukhopadhyay, Amit Agarwal, Yiran Chen, Animesh Datta, James D. Gallagher
XV. Reliability
Achieving a satisfactory level of lifetime reliability has become challenging in scaled technologies due to increased reliability issues such as Negative Bias Temperature Instability (NBTI), Time Dependent Dielectric Breakdown (TDDB), Hot Carrier Injection (HCI), and electromigration. In this lab, we are working on modeling and analyzing some of the most alarming reliability concerns, such as NBTI. In addition, we are developing elegant techniques to alleviate reliability problems at both the circuit and design-synthesis levels.
Student Researchers: Sang Phill Park, George Panagopoulos, Niladri Mojumder, Kunhyuk Kang, Keejong Kim, Saakshi Gangwal
XVI. Scaled CMOS Devices/Circuits (Double-Gate Technology)
Student Researchers: Ashish Goel, Jaydeep Kulkarni, Niladri Mojumder, Deepha Lekshmanan, Qikai Chen, Saakshi Gangwal, Tamer Cakici, Amit Agarwal, Cassondra Neau, Chris Kim, Hamid Mahmoodi, Hari Ananthanarayanan, Jae-Joon Kim, Liqiong Wei, Rongtian Zhang, Saibal Mukhopadhyay
XVII. VLSI Test and Fault Tolerance
Sub-wavelength lithography has led to large variations in transistor geometries and flat-band voltage, while intrinsic variations in nano-scaled devices, such as line-edge roughness (LER), random dopant fluctuations (RDF), or body-thickness variations in thin-body SOI, have led to large spatial variations in transistor threshold voltage. Such variations, along with higher levels of integration, can lead to a large spread in circuit delay, power, and robustness across different dies. Parameter variations adversely affect minimum-geometry circuits such as SRAM cells, leading to read, write, access-time, and hold failures, while logic circuits may experience parametric failures such as excess delay or leakage. To reduce test cost and improve yield, design and test should be considered together. We propose an integrated on-chip tester/BIST and design-for-test circuitry to reduce test cost; sensors to detect process corners; and design and post-silicon techniques to avoid/repair failures and improve yield.
The overall design and test approach is as follows:
- Modeling of process variation and failures
- Design-for-testability
- On-chip tester/BIST
- Post-silicon self-calibration and self-repair for improved yield
- Pre-silicon self-repair for logic
Student Researchers: Ashish Goel, Jing Li, Mesut Meterelliyoz, Swaroop Ghosh, Arijit Raychowdhury, Qikai Chen, Swarup Bhunia, Saibal Mukhopadhyay, Naran Sirisantana, Ali Keshavarzi, Zhanping Chen, Xiaodong Zhang, Khurram Muhammad
XVIII. Adversarial Attacks and Robustness
The ability to maliciously perturb any input to a deep learning model and cause a misclassification with high confidence reveals a lack of security and credibility in what the models have learned. We are working on methods that explain why adversarial attacks succeed and how we can make models more robust to them.
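The canonical example of such an attack is the fast gradient sign method (FGSM): perturb the input by a small step in the direction that increases the loss, x_adv = x + eps * sign(dL/dx). A self-contained sketch on a toy logistic model, where the gradient is explicit:

```python
# FGSM on a toy logistic "classifier": moving the input along the sign of the
# loss gradient reliably lowers the model's confidence in the true class.
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=16), 0.1          # toy model parameters
x, y = rng.normal(size=16), 1.0          # input and its true label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p = sigmoid(w @ x + b)
grad_x = (p - y) * w                     # d(cross-entropy)/dx for this model
x_adv = x + 0.1 * np.sign(grad_x)        # eps = 0.1

print(p, sigmoid(w @ x_adv + b))         # confidence in class 1 drops
```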
Publications:
- Panda, Priyadarshini, and Kaushik Roy. "Explainable learning: Implicit generative modelling during training for adversarial robustness." arXiv preprint arXiv:1807.02188 (2018).
- Sharmin, Saima, et al. "A Comprehensive Analysis on Adversarial Robustness of Spiking Neural Networks." IJCNN 2019.
- Panda, Priyadarshini, Indranil Chakraborty, and Kaushik Roy. "Discretization based Solutions for Secure Machine Learning against Adversarial Attacks." IEEE Access (2019).
Student Researchers: Indranil Chakraborty, Chankyu Lee, Wachirawit Ponghiran, Saima Sharmin
XIX. Stochastic Computing: Algorithms to Devices
Stochastic computing algorithms find widespread utility in a range of applications including stochastic neural networks, Bayesian inference, and optimization problems (graph coloring, traveling salesman problem) where exact solutions may not be required and error resiliency is built into the algorithms. CMOS-based realizations of stochastic algorithms are area- and power-intensive due to the need for expensive random number generators to implement the stochastic operations. We have devised non-von Neumann architectures using the inherent stochastic switching characteristics of Magnetic Tunnel Junctions (MTJs) in the presence of thermal noise for energy-efficient realization of the stochastic algorithms.
A. Stochastic Neural Networks
We proposed stochastic Spiking Neural Networks (SNNs) realized using MTJs as stochastic spiking neurons that switch probabilistically based on their input, wherein the binary weights are trained offline using error backpropagation. In addition, we proposed SNNs composed of stochastic binary weights trained using a hardware-friendly Spike Timing Dependent Plasticity (STDP) based probabilistic learning algorithm, which can be enabled by MTJ-based synaptic crossbar arrays with a high energy barrier to realize state-compressed hardware with on-chip learning capability. Advantageously, the use of high-energy-barrier MTJs (30-40 kT, where k is the Boltzmann constant and T is the operating temperature) not only allows compact stochastic primitives, but also enables the same device to be used as a stable memory element meeting data retention requirements. Such stochastic MTJ-based realizations can potentially be an order of magnitude more energy-efficient than CMOS-only implementations.
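A common software abstraction of such an MTJ "stochastic bit" (constants illustrative; device-level models derive the switching probability from thermally activated magnetization dynamics):

```python
# The probability that the MTJ switches grows sigmoidally with the input
# stimulus, so a neuron built from it fires with probability sigmoid(drive).
import numpy as np

rng = np.random.default_rng(0)

def mtj_neurons(drive, beta=4.0):
    p_switch = 1.0 / (1.0 + np.exp(-beta * drive))
    return (rng.random(np.shape(drive)) < p_switch).astype(int)

drive = np.array([-1.0, -0.25, 0.0, 0.25, 1.0])        # per-neuron input
print([mtj_neurons(drive).tolist() for _ in range(5)])  # probabilistic spikes
```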
Publications:
- Srinivasan, G., Sengupta, A. and Roy, K. "Magnetic tunnel junction based long-term short-term stochastic synapse for a spiking neural network with on-chip STDP learning," Scientific Reports, 6, p.29545, 2016.
- Sengupta, A., Panda, P., Wijesinghe, P., Kim, Y. and Roy, K. "Magnetic Tunnel Junction Mimics Stochastic Cortical Spiking Neurons," Scientific Reports, 6, p.30039, 2016.
- Srinivasan, G. and Roy, K. "ReStoCNet: Residual Stochastic Binary Convolutional Spiking Neural Network for Memory-Efficient Neuromorphic Computing," Frontiers in Neuroscience, 13, p.189, 2019.
- Sengupta, A., Srinivasan, G., Roy, D. and Roy, K. "Stochastic inference and learning enabled by magnetic tunnel junctions," In 2018 IEEE International Electron Devices Meeting (IEDM), p. 5-6, IEEE, December 2018.
- Sengupta, A., Parsa, M., Han, B. and Roy, K. "Probabilistic deep spiking neural systems enabled by magnetic tunnel junction," IEEE Transactions on Electron Devices, 63(7), pp.2963-2970, 2016.
- Liyanagedera, C.M., Sengupta, A., Jaiswal, A. and Roy, K. "Stochastic spiking neural networks enabled by magnetic tunnel junctions: From nontelegraphic to telegraphic switching regimes," Physical Review Applied, 8(6), p.064017, 2017.
Student Researchers: Bing Han, Chamika Liyanagedera, Maryam Parsa, Deboleena Roy, Gopal Srinivasan
B. Stochastic Optimization
We have also demonstrated the effectiveness of the stochastic MTJ-based compute primitive for efficiently realizing Bayesian inference and the Ising computing model (a variant of the Boltzmann machine) to solve difficult combinatorial optimization problems such as the traveling salesman and graph coloring problems. The proposed stochastic MTJ-based device can act as a “natural annealer”, helping the algorithms move out of local minima and arrive at near-optimal solutions.
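A toy software analogue of this "natural annealing" behavior (a sketch, with Glauber-style stochastic updates standing in for the device's thermal switching):

```python
# Spins of a random Ising instance are updated with a Boltzmann (sigmoidal)
# probability at a decreasing temperature: early randomness lets the state
# escape local minima, later low temperature settles it near an optimum.
import numpy as np

rng = np.random.default_rng(2)
n = 16
J = rng.choice([-1.0, 1.0], size=(n, n))
J = np.triu(J, 1); J = J + J.T                 # symmetric, zero diagonal

def energy(s):
    return -0.5 * s @ J @ s

s = rng.choice([-1.0, 1.0], size=n)
for step in range(4000):
    T = 3.0 * (1.0 - step / 4000) + 0.05       # annealing schedule (toy)
    i = rng.integers(n)
    p_up = 1.0 / (1.0 + np.exp(-2.0 * (J[i] @ s) / T))
    s[i] = 1.0 if rng.random() < p_up else -1.0
print(energy(s))                                # low-energy configuration
```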
Publications:
- Shim, Y., Chen, S., Sengupta, A. and Roy, K. "Stochastic spin-orbit torque devices as elements for bayesian inference," Scientific reports, 7(1), p.14101, 2017.
- Shim, Y., Jaiswal, A. and Roy, K. "Ising computation based combinatorial optimization using spin-Hall effect (SHE) induced stochastic magnetization reversal," Journal of Applied Physics, 121(19), p.193902, 2017.
- Shim, Y., Sengupta, A. and Roy, K. "Biased Random Walk Using Stochastic Switching of Nanomagnets: Application to SAT Solver," IEEE Transactions on Electron Devices, 65(4), pp.1617-1624, 2018.
- Wijesinghe, P., Liyanagedera, C. and Roy, K. "Analog approach to constraint satisfaction enabled by spin orbit torque magnetic tunnel junctions," Scientific reports, 8(1), p.6940, 2018.
Student Researchers: Shuhan Chen, Chamika Liyanagedera
XX. Neuromorphic Computing Enabled by CMOS and Emerging Device Technologies
A. Deep Neural Networks Enabled by Emerging Technologies
In the current era of ubiquitous autonomous intelligence, there is a growing need to move Artificial Intelligence (AI) to the edge to cope with the increasing demand for autonomous systems like drones, self-driving cars, and smart wearables. Deploying deep neural networks in resource-constrained edge devices necessitates a significant rethinking of the conventional von Neumann architecture. We have proposed non-von Neumann architectures enabled by emerging device technologies such as Magnetic Tunnel Junctions (MTJs), Ag-Si memristors, and Resistive Random Access Memories (ReRAMs) for efficiently realizing Analog Neural Networks (ANNs) and bio-plausible Spiking Neural Networks (SNNs).
Publications:
- Sengupta, A., Shim, Y. and Roy, K. "Proposal for an all-spin artificial neural network: Emulating neural and synaptic functionalities through domain wall motion in ferromagnets," IEEE transactions on biomedical circuits and systems, 10(6), pp.1152-1160, 2016.
- Chakraborty, I., Roy, D. and Roy, K. "Technology Aware Training in Memristive Neuromorphic Systems for Nonideal Synaptic Crossbars," IEEE Transactions on Emerging Topics in Computational Intelligence, 2(5), pp.335-344, 2018.
- Ankit, A., Hajj, I.E., Chalamalasetti, S.R., Ndu, G., Foltin, M., Williams, R.S., Faraboschi, P., Hwu, W.M.W., Strachan, J.P., Roy, K. and Milojicic, D.S. "PUMA: A programmable ultra-efficient memristor-based accelerator for machine learning inference," In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 715-731, ACM, April 2019.
- Sengupta, A., Banerjee, A. and Roy, K. "Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems," Physical Review Applied, 6(6), p.064003, 2016.
- Sengupta, A., Ankit, A. and Roy, K. "Efficient Neuromorphic Systems and Emerging Technologies: Prospects and Perspectives," In Emerging Technology and Architecture for Big-data Analytics, pp. 261-274, Springer, Cham, 2017.
- Ankit, A., Sengupta, A., Panda, P. and Roy, K. "Resparc: A reconfigurable and energy-efficient architecture with memristive crossbars for deep spiking neural networks," In Proceedings of the 54th Annual Design Automation Conference 2017, p. 27, ACM, June 2017.
Student Researchers: Aayush Ankit, Aparajita Banerjee, Indranil Chakraborty, Deboleena Roy
B. Spin Orbit Torque (SOT)-MTJs, Ag-Si Memristors, and CMOS as ‘Stochastic Bits’ for non-Von Neumann Neural Computing
In addition to the neural hardware architecture, the nature of computing (deterministic versus stochastic) has a substantial influence on computational efficiency. Deterministic neuronal and synaptic models require multi-bit precision to store the parameters governing their dynamics. We proposed ‘stochastic bit’ enabled neurons and synapses (stochastic during training and deterministic during inference) that compute probabilistically with one-bit precision for state-compressed neuromorphic computing. We presented energy-efficient realizations of the ‘stochastic bit’ using SOT-MTJ, Ag-Si memristor, and CMOS technologies. We demonstrated the efficacy of ‘stochastic bits’ enabled neural networks using binary ANNs and SNNs for energy- and memory-efficient training and/or inference on-chip.
Publications:
- Sengupta, A., Parsa, M., Han, B. and Roy, K. "Probabilistic deep spiking neural systems enabled by magnetic tunnel junction," IEEE Transactions on Electron Devices, 63(7), p.2963-2970, 2016.
- Sengupta, A., Panda, P., Wijesinghe, P., Kim, Y. and Roy, K. "Magnetic tunnel junction mimics stochastic cortical spiking neurons," Scientific reports, 6, p.30039, 2016.
- Srinivasan, G., Sengupta, A. and Roy, K. "Magnetic tunnel junction based long-term short-term stochastic synapse for a spiking neural network with on-chip STDP learning," Scientific reports, 6, p.29545, 2016.
- Srinivasan, G., Sengupta, A. and Roy, K. "Magnetic tunnel junction enabled all-spin stochastic spiking neural network," In Design, Automation & Test in Europe Conference & Exhibition (2017), p. 530-535, IEEE, March 2017.
- Liyanagedera, C.M., Sengupta, A., Jaiswal, A. and Roy, K. "Stochastic spiking neural networks enabled by magnetic tunnel junctions: From nontelegraphic to telegraphic switching regimes," Physical Review Applied, 8(6), p.064017, 2017.
- Wijesinghe, P., Ankit, A., Sengupta, A. and Roy, K. "An all-memristor deep spiking neural computing system: A step toward realizing the low-power stochastic brain," IEEE Transactions on Emerging Topics in Computational Intelligence, 2(5), pp.345-358, 2018.
- Roy, D., Srinivasan, G., Panda, P., Tomsett, R., Desai, N., Ganti, R., and Roy, K. "Neural Networks at the Edge," IEEE International Conference on Smart Computing, Washington, DC, USA, pp. 45-50, 2019.
Student Researchers: Aayush Ankit, Bing Han, Chamika Liyanagedera, Maryam Parsa, Deboleena Roy, Gopal Srinivasan
XXI. Recurrent Liquid State Machines for Spatiotemporal Pattern Recognition
Liquid State Machines (LSMs) are simple networks consisting of randomly connected spiking neurons (both recurrent and feed-forward) that can learn complex tasks with very few trainable parameters. Such sparse, randomly interconnected recurrent SNNs exhibit highly non-linear dynamics that transform spatiotemporal inputs into rich high-dimensional representations based on current and past context. These random representations can be efficiently interpreted by an output (or readout) layer with trainable parameters. We proposed training and inference methodologies for single- and multi-liquid (ensemble) LSMs and demonstrated their efficacy on recognition tasks (image, speech, and gesture recognition) and reinforcement learning tasks. In addition, we developed analytical tools for explaining LSM dynamics and performance.
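A minimal LSM sketch (sizes, sparsity, and time constants illustrative):

```python
# A fixed random recurrent reservoir of leaky spiking units projects input
# spike trains into a high-dimensional state; only a linear readout is
# trained, here by ridge regression on a toy two-class rate-discrimination task.
import numpy as np

rng = np.random.default_rng(3)
N, T = 200, 100
W_in = rng.normal(0, 1.0, (N,)) * (rng.random(N) < 0.3)
W_res = rng.normal(0, 0.5, (N, N)) * (rng.random((N, N)) < 0.1)

def liquid_state(inp):
    v, trace = np.zeros(N), np.zeros(N)
    for t in range(T):
        spikes = (v > 1.0).astype(float)
        v = 0.9 * v * (1 - spikes) + W_res @ spikes + W_in * inp[t]
        trace = 0.95 * trace + spikes          # low-pass state for the readout
    return trace

X = np.array([liquid_state((rng.random(T) < r).astype(float))
              for r in [0.1] * 20 + [0.4] * 20])
y = np.array([0.0] * 20 + [1.0] * 20)
W_out = np.linalg.solve(X.T @ X + 1e-3 * np.eye(N), X.T @ y)  # ridge readout
print(((X @ W_out > 0.5) == y).mean())         # training accuracy near 1.0
```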
Publications:
- Panda, P. and Roy, K. "Learning to generate sequences with combination of hebbian and non-hebbian plasticity in recurrent spiking neural networks," Frontiers in neuroscience, 11, p.693, 2017.
- Panda, P. and Srinivasa, N. "Learning to recognize actions from limited training examples using a recurrent spiking neural model," Frontiers in neuroscience, 12, p.126, 2018.
- Srinivasan, G., Panda, P. and Roy, K. "Spilinc: spiking liquid-ensemble computing for unsupervised speech and image recognition," Frontiers in neuroscience, 12, p.524, 2018.
- Wijesinghe, P., Srinivasan, G., Panda, P. and Roy, K. "Analysis of Liquid Ensembles for Enhancing the Performance and Accuracy of Liquid State Machines," Frontiers in neuroscience, 13, p.504, 2019.
- Ponghiran, W., Srinivasan, G. and Roy, K. "Reinforcement Learning with Low-Complexity Liquid State Machines," arXiv preprint arXiv:1906.01695, 2019.