Scalable and Fault Resilient Physical Neural Networks on a Single Chip

Published in CASES, 2014

Recommended citation: W. Shi, Y. Wen, Z. Liu, X. Zhao, D. Boumber, R. Vilalta and L. Xu, “Scalable and Fault Resilient Physical Neural Networks on a Single Chip”, CASES 2014

This paper presents the design and implementation of a physical neural network that is resilient to permanent hardware faults. To achieve scalability, it uses tiled neuron clusters in which neuron outputs are efficiently forwarded to their target neurons using source-based spanning tree routing. To achieve fault resilience in the face of an increasing number of permanent hardware failures, the design proactively preserves neural network computing performance by selectively replicating performance-critical neurons. Furthermore, the paper presents a spanning tree recovery solution that mitigates the disruption to the distribution of neuron outputs caused by failed neuron clusters. The proposed neuron cluster design is implemented in Verilog. We studied the fault resilience of the described design using an RBM neural network trained to classify handwritten digit images. Results demonstrate that our approach can achieve improved fault resilience by replicating only the 5% most important neurons.
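The idea of source-based spanning tree routing with recovery around a failed cluster can be illustrated with a small software sketch. This is not the paper's Verilog design or its actual recovery algorithm. It assumes a 2D mesh of clusters addressed as `(row, col)`, builds the tree with a plain breadth-first search, and "recovers" by simply rebuilding the tree while skipping failed clusters. The function names `build_spanning_tree` and `path_from_source` are hypothetical and chosen only for illustration.

```python
from collections import deque


def build_spanning_tree(rows, cols, source, failed=frozenset()):
    """Build a BFS spanning tree rooted at `source` over a rows x cols mesh.

    Assumption: clusters are connected to their four mesh neighbours.
    Failed clusters are skipped, so the tree routes around them.
    Returns a dict mapping each reachable cluster to its parent
    (the source maps to None).
    """
    if source in failed:
        raise ValueError("source cluster has failed")
    parent = {source: None}
    queue = deque([source])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nxt = (nr, nc)
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and nxt not in failed and nxt not in parent:
                parent[nxt] = (r, c)
                queue.append(nxt)
    return parent


def path_from_source(parent, target):
    """Return the tree path from the source to `target`, or None if unreachable."""
    if target not in parent:
        return None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# Tree for a healthy 3x3 mesh, then a rebuilt tree after cluster (0, 1) fails.
healthy = build_spanning_tree(3, 3, source=(0, 0))
recovered = build_spanning_tree(3, 3, source=(0, 0), failed={(0, 1)})

print(path_from_source(healthy, (0, 2)))
print(path_from_source(recovered, (0, 2)))
```

In this sketch the output of a neuron in cluster `(0, 0)` still reaches cluster `(0, 2)` after the failure, only over a longer path. A hardware design would presumably repair the tree locally rather than rebuild it from scratch; the full rebuild here is a simplification.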


DOI: 10.1145/2656106.2656126

Recommended citation (BibTeX):

@inproceedings{DBLP:conf/cases/ShiWLZBVX14,  
  author    = {Weidong Shi and  
               Yuanfeng Wen and  
               Ziyi Liu and  
               Xi Zhao and  
               Dainis Boumber and  
               Ricardo Vilalta and  
               Lei Xu},  
  title     = {Fault resilient physical neural networks on a single chip},  
  booktitle = {2014 International Conference on Compilers, Architecture and Synthesis  
               for Embedded Systems, {CASES} 2014, Uttar Pradesh, India, October  
               12-17, 2014},  
  pages     = {24:1--24:10},  
  year      = {2014},  
  crossref  = {DBLP:conf/cases/2014},  
  url       = {https://doi.org/10.1145/2656106.2656126},  
  doi       = {10.1145/2656106.2656126},  
  timestamp = {Tue, 06 Nov 2018 11:07:42 +0100},  
  biburl    = {https://dblp.org/rec/bib/conf/cases/ShiWLZBVX14},  
  bibsource = {dblp computer science bibliography, https://dblp.org}  
}