Hardware and Architecture, Artificial Intelligence, Computer Engineering
This project studies applications for space stations, satellites, and spacecraft that enable crew members or autonomous systems to make accurate decisions throughout a mission based on real-time data acquisition.
This project aims to investigate and propose solutions that enable hardware and software designs to operate in radiation-hostile environments, i.e., space environments.
This project aims to investigate and propose hardware (ISA) and software (compiler) solutions to improve the performance of Machine Learning applications based on neural networks. This involves both architectural changes and the development of architecture-specific specialized libraries.
Jonas Gava, Alex Hanneman, Geancarlo Abich, Rafael Garibotti, Sergio Cuenca-Asensi, Rodrigo Possamai Bastos, Ricardo Reis, and Luciano Ost Institute of Electrical and Electronics Engineers (IEEE)
Geancarlo Abich, Anderson Ignacio Silva, Jonas Gava, Altamiro Amadeu Susin, Ricardo Reis, and Luciano Ost IEEE
Integrating Machine Learning (ML) inference models into edge computing devices has introduced several challenges related to improving power efficiency, performance, and reliability. As the susceptibility of these models to radiation-induced soft errors is a significant concern, applying lightweight mitigation techniques is key, mostly due to power and memory constraints inherent to edge devices. In this regard, assessing the potential power and performance penalties associated with deploying soft error mitigation techniques on customized ML inference models running in such resource-constrained devices is crucial. This paper, therefore, investigates the performance and power consumption implications of applying software-based mitigation techniques on ML inference models optimized for edge devices. The experiments demonstrate that implementing the RAT technique reduced the susceptibility to radiation-induced soft errors by up to 3.2×, with low performance and power consumption costs w.r.t. a P-TMR technique.
Geancarlo Abich, Anderson Ignacio da Silva, José Eduardo Thums, Rafael da Silva, Altamiro Amadeu Susin, Ricardo Reis, and Luciano Ost IEEE
Incorporating Machine Learning (ML) inference models into edge computing devices has presented some performance and reliability enhancement challenges. Multi-threaded ML models demonstrated power efficiency and performance gains when deployed on high-performance multicore platforms, including powerful general-purpose processors and graphics processing units (GPUs). However, there is a relative lack of investigation on the potential impact of parallel pre-trained ML models when executed in resource-constrained edge devices, which rely on low energy and reduced memory footprint processors. With that in mind, this work presents the impact of multi-threaded parallelism on the performance, power consumption, and soft error reliability of different ML inference models. Results show that parallel models enhanced the performance per watt by up to 2.6× while reducing their susceptibility to radiation-induced soft errors by up to 6.6× w.r.t. the original sequential versions.
Geancarlo Abich, Rafael Garibotti, Ricardo Reis, and Luciano Ost Institute of Electrical and Electronics Engineers (IEEE)
Driven by the success of machine learning algorithms for recognizing and identifying objects, there are significant efforts to exploit convolutional neural networks (CNNs) in edge devices. The growing adoption of CNNs in safety-critical embedded systems (e.g., autonomous vehicles) increases the demand for safe and reliable models. In this sense, this brief investigates the soft error reliability of two CNN inference models considering single event upsets (SEUs) occurring in register files, RAM, and Flash memory sections. The results show that SEUs in Flash memory sections tend to lead to more critical faults than bit-flips in RAM sections and register files.
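To make the SEU fault model above concrete, the following is a minimal sketch of a single-bit-flip injection campaign, assuming a toy dot-product classifier in place of a real CNN. The function names (`flip_bit`, `inject_seu`, `infer`, `classify`, `campaign`) and the outcome labels ("masked", "tolerable", "critical") are hypothetical choices for illustration; they are not the tooling or the fault classification used in the paper.

```python
import random
from collections import Counter


def flip_bit(value: int, bit: int, width: int = 8) -> int:
    """Return `value` with one bit inverted, truncated to `width` bits."""
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def inject_seu(memory: bytearray, rng: random.Random) -> tuple[int, int]:
    """Flip one randomly chosen bit of `memory` in place; return (index, bit)."""
    index = rng.randrange(len(memory))
    bit = rng.randrange(8)
    memory[index] = flip_bit(memory[index], bit)
    return index, bit


def infer(weights: bytearray, inputs: list[int], classes: int) -> list[int]:
    """Toy 'inference' (assumption): one dot product per class over 8-bit weights."""
    n = len(inputs)
    return [sum(weights[c * n + i] * inputs[i] for i in range(n))
            for c in range(classes)]


def classify(golden: list[int], observed: list[int]) -> str:
    """Compare a faulty run against the fault-free (golden) run."""
    if observed == golden:
        return "masked"      # the fault had no visible effect
    if observed.index(max(observed)) == golden.index(max(golden)):
        return "tolerable"   # scores changed, predicted class did not
    return "critical"        # predicted class changed


def campaign(weights: bytearray, inputs: list[int], classes: int,
             runs: int, seed: int = 0) -> Counter:
    """Inject exactly one bit-flip per run and count the outcomes."""
    rng = random.Random(seed)
    golden = infer(weights, inputs, classes)
    outcomes = Counter()
    for _ in range(runs):
        faulty = bytearray(weights)  # fresh copy so faults do not accumulate
        inject_seu(faulty, rng)
        outcomes[classify(golden, infer(faulty, inputs, classes))] += 1
    return outcomes


if __name__ == "__main__":
    weights = bytearray([10, 20, 30, 40, 50, 60])  # 2 classes x 3 inputs
    print(campaign(weights, [1, 2, 3], classes=2, runs=1000))
```

In this sketch the weights stand in for read-only model parameters; a fuller campaign would also target registers and intermediate buffers, which is where the per-region comparison in the abstract comes from.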
Geancarlo Abich, Rafael Garibotti, Jonas Gava, Ricardo Reis, and Luciano Ost IEEE
Convolutional neural networks (CNNs) have been incorporated into resource-constrained edge devices to intelligently manage and process local data coming from a variety of sensors. Thread parallelism has been used to boost the performance of neural networks, but only a few works address the effect of these parallel modifications on the soft error reliability of the underlying models running on edge devices. In this sense, this work aims to assess the soft error reliability of a multi-threaded version of a CNN model developed based on the Arm CMSIS-NN kernels. Results show that the developed threaded CNN model increases performance at the cost of a low memory footprint overhead. The proposed multi-threaded CNN model also provides better soft error reliability w.r.t. the original sequential version.
Geancarlo Abich, Jonas Gava, Rafael Garibotti, Ricardo Reis, and Luciano Ost Institute of Electrical and Electronics Engineers (IEEE)
Deep neural networks (DNNs) are being incorporated in resource-constrained IoT devices, which typically rely on reduced memory footprint and low-performance processors. While DNNs’ precision and performance can vary and are essential, it is also vital to deploy trained models that provide high reliability at low cost. To achieve an unyielding reliability and safety level, it is imperative to provide electronic computing systems with appropriate mechanisms to tackle soft errors. This paper, therefore, investigates the relationship between soft errors and model accuracy. In this regard, an extensive soft error assessment of the MobileNet model is conducted considering precision bitwidth variations (2, 4, and 8 bits) running on an Arm Cortex-M processor. In addition, this work promotes the use of a register allocation technique (RAT) that allocates the critical DNN function/layer to a pool of specific general-purpose processor registers. Results obtained from more than 4.5 million fault injections show that RAT gives the best relative performance, memory utilization, and soft error reliability trade-offs w.r.t. a more traditional replication-based approach. Results also show that the MobileNet soft error reliability varies depending on the precision bitwidth of its convolutional layers.
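The abstract above does not say which replication scheme serves as the baseline. As one common instance of a replication-based approach, the sketch below shows triple modular redundancy (TMR) with a bitwise majority vote; `majority_vote` and `tmr_call` are hypothetical names, and this is an illustrative assumption rather than the authors' implementation.

```python
from typing import Callable


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 vote: each result bit takes the value held by at least two copies."""
    return (a & b) | (a & c) | (b & c)


def tmr_call(func: Callable[..., int], *args: int) -> int:
    """Execute `func` three times and vote on the integer results."""
    return majority_vote(func(*args), func(*args), func(*args))


if __name__ == "__main__":
    stored = 0b1010_0110
    corrupted = stored ^ (1 << 3)  # one replica hit by a single bit-flip
    # The two intact replicas outvote the corrupted one.
    print(bin(majority_vote(stored, corrupted, stored)))
```

Replicating data three times triples the storage of whatever is protected, and executing a function three times roughly triples its run time, which is the kind of overhead a lighter technique is compared against.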
Geancarlo Abich, Rafael Garibotti, Vitor Bandeira, Felipe da Rosa, Jonas Gava, Felipe Bortolon, Guilherme Medeiros, Fernando G. Moraes, Ricardo Reis, and Luciano Ost Institution of Engineering and Technology (IET)
Luciano Ost, Loughborough University, Epinal Way, Loughborough, Leicestershire LE11 3TU, UK. Email: firstname.lastname@example.org
Soft error resilience has become an essential design metric in electronic computing systems as advanced technology nodes have become less robust to high-charged particle effects. Designers, therefore, should be able to assess this metric considering several software stack components running on top of commercial processors, early in the design phase. With this in mind, researchers are using virtual platform (VP) frameworks to assess this metric due to their flexibility and high simulation performance. In this regard, herein, this goal is achieved by analysing the soft error consistency of a just-in-time fault injection simulator (OVPsim-FIM) against fault injection campaigns conducted with event-driven simulators (i.e. more realistic and accurate platforms) considering single and multicore processor architectures. Reference single-core fault injection campaigns are performed on RTL descriptions of Arm Cortex-M0 and M3 processors, while the gem5 simulator is used for multicore Arm Cortex-A9 scenarios. Campaigns consider different open-source and commercial compilers as well as real software stacks including FreeRTOS/Linux kernels and 52 applications. Results show that OVPsim-FIM is more than 1000× faster than cycle-accurate simulators and up to 312× faster than event-driven simulators, while preserving the soft error analysis accuracy (i.e. mismatch below 10%) for single and multicore processors.
Geancarlo Abich, Ricardo Reis, and Luciano Ost IEEE
Machine learning (ML) algorithms are being incorporated in resource-constrained IoT platforms, which typically rely on reduced memory footprint and low performance processors. While performance improvement, customized, and reduced-precision implementations of such algorithms have been studied extensively, their susceptibility to soft errors caused by radiation particles is still an open question. This work contributes by investigating the impact of precision bitwidth on the soft error reliability of the MobileNet convolutional neural network (CNN) when executed on an Arm Cortex-M processor. Results obtained from more than 500k fault injections show that the soft error reliability varies depending on the precision bitwidth of the convolutional layers.
Geancarlo Abich, Jonas Gava, Ricardo Reis, and Luciano Ost IEEE
Machine learning (ML) algorithms have provided straightforward solutions to a wide range of applications. The high computational demand of such algorithms limits their adoption in resource-constrained devices, typically relying on reduced memory footprint and low-power components (e.g., processors). While performance improvement, customized, and reduced-precision implementations of ML algorithms have been studied extensively, their susceptibility to soft errors caused by radiation particles is still an open question. In this regard, this work contributes to the soft error reliability assessment of a convolutional neural network (CNN) developed based on the Arm CMSIS-NN library. Results show that the soft error reliability varies depending on the instruction set architecture and the layer where the faults are injected.
Jean Carlo Hamerski, Geancarlo Abich, Ricardo Reis, Luciano Ost, and Alexandre Amory IEEE
Current multiprocessor systems might comprise dozens of processors, requiring runtime management to provide performance while complying with system constraints such as energy consumption, thermal balance, and fault tolerance. Self-adaptive multiprocessor systems have been proposed to cover some key aspects such as hardware abstraction, programming models, and modular software architecture. This paper provides a modular middleware to assist the development of self-adaptive services and applications on MPSoC environments. Based on key design patterns, the proposed middleware uses highly efficient features of object-oriented programming for embedded systems and specific compiler optimization options. The results show a case study of a self-adaptive application on top of the proposed middleware. Additional experiments demonstrate a reduction in execution time of 3% to 19%, with a memory footprint overhead of 8.7%, when compared to a previous middleware without the features to ease modular software design.
Felipe T. Bortolon, Geancarlo Abich, Sergio Bampi, Ricardo Reis, Fernando Moraes, and Luciano Ost IEEE
Software reliability is an essential design metric in emerging large-scale multiprocessor embedded systems. Designers should identify the soft error susceptibility of multiple applications executing in parallel early at design time to ensure reliable system operation. This work proposes a non-intrusive fault injection engine that enables bespoke soft error analysis, making it possible to identify and understand soft error propagation through the processing elements (PEs). The proposed fault injection campaign evaluates the impact of soft errors considering real benchmarks in an RTL model of a distributed-memory NoC-based multiprocessor. Experiments demonstrate that 19% of soft errors are propagated to other PEs, where 31.6% of them led to erroneous computation and 58.4% to a system crash. Thus, the fault analysis must consider not only the local effect of a fault on the processor and memory but also how the fault propagates to other system components.
Jean Carlo Hamerski, Geancarlo Abich, Ricardo Reis, Luciano Ost, and Alexandre Amory IEEE
Shared memory and message passing are traditional parallel programming models used on multiprocessor system-on-chip (MPSoC) environments. These models are traditionally meant for static scenarios where all communicating entities and their intercommunication patterns are known a priori by the software engineer. System design following such programming models has become complex due to the dynamic behavior of applications at runtime. The goal of this work is to incorporate a publish-subscribe programming model into an MPSoC framework to decouple application development in time and space. The modified MPSoC framework is composed of a FreeRTOS kernel running on homogeneous processing elements distributed over a network-on-chip. The results show a reduction of around 2% to 30% in DTW application execution time, and a low memory footprint overhead, when comparing the original MPI primitives with the publish-subscribe programming model.
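The decoupling that a publish-subscribe model provides can be sketched with a minimal in-process broker. The `Broker` class and its methods below are hypothetical and unrelated to the FreeRTOS-based framework in the paper; the sketch assumes topics are plain strings and that messages published before any subscriber exists are buffered.

```python
from collections import defaultdict
from typing import Any, Callable


class Broker:
    """Toy topic-based broker (assumption: single process, synchronous delivery)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)
        self._pending: dict[str, list[Any]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        """Register `callback` and deliver any messages buffered for `topic`."""
        self._subscribers[topic].append(callback)
        for message in self._pending.pop(topic, []):
            callback(message)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver `message` to current subscribers; return how many received it."""
        subscribers = self._subscribers.get(topic, [])
        if not subscribers:
            # Time decoupling: keep the message until someone subscribes.
            self._pending[topic].append(message)
            return 0
        for callback in subscribers:
            callback(message)
        return len(subscribers)


if __name__ == "__main__":
    broker = Broker()
    received: list[int] = []
    broker.publish("samples", 1)                 # no subscriber yet, buffered
    broker.subscribe("samples", received.append)
    broker.publish("samples", 2)
    print(received)
```

The publisher never names a receiver (space decoupling), and the buffered first message shows that the two sides need not be active at the same moment (time decoupling). With explicit send/receive primitives, both endpoints have to be known when the code is written.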
G. Abich, M. G. Mandelli, F. R. Rosa, F. Moraes, L. Ost, and R. Reis IEEE
The ever-increasing complexity of both embedded application workloads and multiprocessor platforms increases the demand for efficient mapping heuristics capable of allocating several application workloads at runtime. Most proposed mapping techniques are bespoke implementations that target an in-house operating system developed for a particular architecture, which restricts their adoption on other platforms. This work proposes a FreeRTOS extension that supports distributed task mapping heuristics, making it possible to balance application workloads in multiprocessor architectures at runtime. The proposed extension is validated through a trustworthy number of scenarios considering large-scale Cortex-M-based multiprocessor systems executing up to 600 application tasks.
G. Abich Wiley