ARDiS

ARDiS is the first open-source and portable framework to provide a unified, architecture-agnostic platform for running system-level resource management (RM) techniques directly on real hardware.

ARDiS eliminates the need to “reinvent the wheel,” enabling researchers to design, implement, and evaluate sophisticated RM strategies, including machine learning-based approaches, with minimal effort and maximum reproducibility.

Paper link
GitHub link

AdaPT

AdaPT is a fast emulation framework that extends PyTorch to support DNN inference based on approximate multipliers, as well as approximation-aware retraining. AdaPT can be deployed seamlessly and is compatible with most DNNs. The framework can be evaluated on several DNN models and application fields, including CNNs, LSTMs, and GANs, for any approximate multiplier.
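The general idea of emulating an approximate multiplier in software can be sketched as follows. This is a minimal illustration and not AdaPT's API: the function names and the multiplier itself (one that clears the two least-significant bits of each 8-bit operand) are assumptions chosen for the example. All products are precomputed into a lookup table, so a dot product only needs table lookups and exact additions.

```python
BITS = 8
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1  # signed 8-bit range: -128..127


def approx_mul(a: int, b: int) -> int:
    """Hypothetical approximate multiplier: clear the two LSBs of each operand.

    This stands in for the behavioral model of any approximate multiplier.
    """
    return (a & ~0b11) * (b & ~0b11)


# Precompute all 256 x 256 products once; emulation then reduces to lookups.
LUT = [[approx_mul(a, b) for b in range(LO, HI + 1)] for a in range(LO, HI + 1)]


def lut_mul(a: int, b: int) -> int:
    """Look up the approximate product of two signed 8-bit integers."""
    return LUT[a - LO][b - LO]


def approx_dot(xs, ws) -> int:
    """Dot product with approximate multiplications and exact accumulation."""
    return sum(lut_mul(x, w) for x, w in zip(xs, ws))
```

Swapping in a different multiplier only requires changing approx_mul and rebuilding the table; the rest of the emulation stays the same.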

Citation:
Dimitrios Danopoulos, Georgios Zervakis, Kostas Siozios, Dimitrios Soudris, Jörg Henkel, "AdaPT: Fast Emulation of Approximate DNN Accelerators in PyTorch." arXiv preprint arXiv:2203.04071 (2022).

Available via GIT

CoRQ

CoRQ is an on-chip partial reconfiguration manager for FPGAs. It makes it possible to enqueue different reconfiguration jobs and to prefetch the bitstreams into on-chip RAM to achieve fast reconfiguration of the FPGA.

Available via GIT

LoopBreaker

LoopBreaker can be used to quickly disable partial reconfiguration regions in multi-tenant FPGAs. It can also be used to extract the different parts of the bitstream, which makes out-of-order reconfiguration of the FPGA possible.

Available via GIT

Design for Reliability

Under the following link you can find our "Degradation-Aware Cell Libraries", our recent physics-based model for "Instantaneous Aging Effects", our physics-based aging model "Interdependencies of Degradation Effects", which models the joint impact of BTI and HCI aging effects, and our "Thermal-Aware Cell Libraries". Our Degradation-Aware Cell Libraries and Thermal-Aware Cell Libraries are fully compatible with existing EDA tool flows, such as synthesis and timing/power analysis. Therefore, they can be used directly without any further modifications.

Available for download at: Dependable Hardware

lpACLib: An Open-Source Library for Low-Power Approximate Computing Modules

Introduction and Short Description

“lpACLib” is an open-source library of Low-Power Approximate Computing Modules (such as adders and multipliers of different bit-widths), available for download at: https://sourceforge.net/projects/lpaclib/.

It contains both synthesizable VHDL descriptions and behavioral implementations in C (MATLAB implementations are in progress). Besides our novel designs, it also contains implementations of several state-of-the-art arithmetic modules and their approximate versions, together with their area, power, and quality characterization. One of the key purposes of this open-source library is to facilitate research and development in approximate computing at higher abstraction levels, and to facilitate reproducible research and comparisons. For instance, these approximate arithmetic modules can be used in different combinations to develop novel approximate accelerators or more complex approximate circuits. This also saves precious research and development time and eliminates much of the redundancy in the typical design workflow for approximate blocks (the non-trivial task of re-implementing the state of the art).

Citation

In case of usage, please refer to our corresponding DAC 2016 publication:

Muhammad Shafique, Rehan Hafiz, Semeen Rehman, Walaa El-Harouni, Jörg Henkel, "A Low Latency Generic Accuracy Configurable Adder", in 53rd ACM/EDAC/IEEE Design Automation Conference & Exhibition (DAC), 2016.

Features

The “lpACLib” library contains the VHDL description of accurate and approximate versions of several arithmetic modules and accelerators. Moreover, it also provides the corresponding software behavioral models/implementations developed in C (MATLAB implementations in progress) to enable quality characterization.

This library contains the following components.

State-of-the-Art Implementations:

P. Kulkarni, P. Gupta and M. Ercegovac, “Trading Accuracy for Power with an Underdesigned Multiplier Architecture”, in 24th International Conference on VLSI Design, Chennai, India, 2011, pp. 346-351.
V. Gupta, D. Mohapatra, S. P. Park, A. Raghunathan and K. Roy, “IMPACT: IMPrecise adders for low-power approximate computing”, in International Symposium on Low Power Electronics and Design (ISLPED), Fukuoka, Japan, 2011, pp. 409-414.
V. Gupta, D. Mohapatra, A. Raghunathan and K. Roy, “Low-Power Digital Signal Processing Using Approximate Adders”, in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 32, no. 1, pp. 124-137, Jan. 2013.

Disclaimer:
We have implemented these designs according to the truth tables provided in the respective papers. Results may differ depending on the implementation style, the synthesis configuration, and the technology used.

List of Implementations (more will be coming soon…):
The library includes the following new designs (the complete list is given in the “readme.txt” file in the HW folder; also see the Snapshot):

Accurate 1-bit and multi-bit adders
Five approximate 1-bit adders and several multi-bit adders
Accurate 2x2, 4x4, 8x8, 16x16 multipliers
Two approximate designs for 2x2 multipliers and their accuracy-configurable versions
Approximate 4x4, 8x8, 16x16 multipliers using accurate adders for partial product summation
Other approximate multiplier designs using mixes of approximate adders and 2x2 multipliers
Accurate and different approximate designs of SAD (Sum of Absolute Differences) accelerator
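How a larger approximate multiplier can be composed from approximate 2x2 blocks with accurate partial product summation can be sketched behaviorally as follows. This is an illustrative model and is not taken from the library's VHDL or C sources: the function names are hypothetical, and the 2x2 block follows the underdesigned multiplier described by Kulkarni et al. (cited above), which returns 7 instead of 9 for 3 x 3 so that the result fits in three bits.

```python
def mul2x2_approx(a: int, b: int) -> int:
    """Approximate 2x2 multiplier: exact except that 3 x 3 yields 7, not 9."""
    assert 0 <= a < 4 and 0 <= b < 4
    return 7 if (a == 3 and b == 3) else a * b


def mul_approx(a: int, b: int, bits: int) -> int:
    """Build a bits x bits multiplier (bits = 2, 4, 8, 16, ...) recursively
    from approximate 2x2 blocks, summing partial products with exact adds."""
    if bits == 2:
        return mul2x2_approx(a, b)
    half = bits // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    high = mul_approx(a_hi, b_hi, half)
    cross = mul_approx(a_hi, b_lo, half) + mul_approx(a_lo, b_hi, half)
    low = mul_approx(a_lo, b_lo, half)
    # Accurate adders combine the four partial products.
    return (high << bits) + (cross << half) + low
```

In this model an error only appears when some pair of aligned 2-bit digits is 3 x 3; for example, 5 x 6 is computed exactly, while 15 x 15 is not.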

Extensions:
In the future, we will keep extending this library with more components and more state-of-the-art implementations. We cordially welcome other research groups, researchers, and developers to contribute to this open-source library, extend it with more design options and implementations, or refine the existing designs and provide different synthesis configurations and results. If you are interested, please contact Dr. Muhammad Shafique (swahlah∂yahoo.com).

Detailed Comments

See the “readme” file under the HW folder for detailed comments related to the hardware circuits.
Each file lists all the necessary files (sub-circuits) needed for its compilation.
Testbenches (“testbench.vhd”) are provided for several circuits. These may require editing depending on the tool environment and settings.
“pt_script.tcl” is for running the PrimeTime power analysis tool. The places in the file that require editing before a run are marked.
“script.tcl” is for running Synopsys Design Compiler for area and power analysis. The places in the file that require editing before a run are marked.
Note: for SAD circuits the files are different as they include clocking for delay analysis.

Library of Approximate Adders

Introduction

We provide MATLAB and Verilog models of GeAr and of previously proposed adders (ACA-I, ETAII, ACA-II and GDA) at https://sourceforge.net/projects/approxadderlib/

GeAr is a low latency Generic Accuracy Configurable Adder that provides a higher number of potential configurations compared to state-of-the-art approximate adders, thus enabling a high degree of flexibility and trade-off between performance and output quality.
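A behavioral sketch of such a configurable adder is given below. It is not the MATLAB or Verilog code from the library; it assumes the usual GeAr parameterization with N total bits, R result bits per sub-adder, and P previous (carry prediction) bits, and the function name is hypothetical. Each sub-adder adds an (R + P)-bit window with a carry-in of zero; the first sub-adder contributes all of its sum bits, and every later one contributes only its top R bits.

```python
def gear_add(a: int, b: int, n: int, r: int, p: int) -> int:
    """Approximate n-bit addition with (r + p)-bit sub-adders (carry-in = 0).

    Assumed parameters: n = operand width, r = result bits per sub-adder,
    p = prediction bits taken from the lower neighbor window.
    """
    length = r + p
    if length > n or (n - length) % r != 0:
        raise ValueError("n, r, p do not form a valid configuration")
    k = (n - length) // r + 1            # number of sub-adders
    window = (1 << length) - 1
    result = 0
    for i in range(k):
        lo = i * r                       # lowest bit of this sub-adder's window
        s = ((a >> lo) & window) + ((b >> lo) & window)
        if i == 0:
            result |= s & window         # first sub-adder: all r + p sum bits
        else:
            top_r = (s >> p) & ((1 << r) - 1)
            result |= top_r << (lo + p)  # later sub-adders: top r bits only
    result |= (s >> length) << n         # carry-out of the last sub-adder
    return result
```

With r + p = n there is a single sub-adder and the result is exact. With n = 8, r = 2, p = 2, the sum 15 + 1 comes out as 0 instead of 16, because the carry generated in the lowest bits has to cross both prediction bits of the next window and is therefore lost.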

These MATLAB and Verilog models allow software programmers as well as hardware designers to evaluate their code and designs. To the best of our knowledge, this is the first open-source library of approximate adders that facilitates reproducible comparisons and further research and development in this direction across various layers of design abstraction.

This work is the result of a collaborative effort between the Vision Image and Signal Processing (VISpro) Lab at SEECS-NUST and the Chair for Embedded Systems (CES) at Karlsruhe Institute of Technology (KIT), Germany.

Citation

In case of usage, please refer to our corresponding DAC 2015 publication:

Muhammad Shafique, Waqas Ahmad, Rehan Hafiz, Jörg Henkel, "A Low Latency Generic Accuracy Configurable Adder", in 52nd ACM/EDAC/IEEE Design Automation Conference & Exhibition (DAC), 2015.

Features

All MATLAB functions, and hence the adders, are parameterizable and can be configured to construct any adder configuration.
The error probability function can be used to calculate the error probability of the GeAr adder as well as of the previously proposed adders, i.e., ACA-I, ETAII, ACA-II and GDA.

Main Contributors

Muhammad Shafique
Waqas Ahmad
Rehan Hafiz
Jörg Henkel

Software Guide

The software directory contains two folders: MATLAB R2013a (containing the MATLAB functions and code) and ISE Design Suite 14.5 (containing the Verilog HDL code). The MATLAB directory contains the functions for the GeAr adder and for previously developed adders such as ACA-I, ETAII, ACA-II and GDA. It also contains the error probability calculation function. The input and output parameters of each function are documented in the function itself and in an accompanying text file. The ISE directory contains the code for GeAr, ACA-I, ETAII, ACA-II and GDA for fixed configurations. Each adder comes with a text file that describes the configurations, inputs, and outputs of the code.

MatEx: Efficient Transient and Peak Temperature Computation for Compact Thermal Models

In many-core systems, run-time scheduling decisions, such as task migration, core activation/deactivation, voltage/frequency scaling, etc., are typically used to optimize resource usage. Such run-time decisions change the power consumption, which can in turn result in transient temperatures much higher than in any steady-state scenario. Therefore, to be thermally safe, it is important to evaluate the transient peaks before making resource management decisions. This paper presents a method for computing these transient peaks in just a few milliseconds, which makes it suitable for run-time usage. This technique works for any compact thermal model consisting of a system of first-order differential equations, for example, RC thermal networks. Instead of using regular numerical methods, our algorithm is based on analytically solving the differential equations using matrix exponentials and linear algebra. This results in a mathematical expression which can easily be analyzed and differentiated to compute the maximum transient temperatures. Moreover, our method can also be used to efficiently compute all transient temperatures for any given time resolution without accuracy losses. We implement our solution as an open-source tool called MatEx. Our experimental evaluations show that the execution time of MatEx for peak temperature computation can be bounded to no more than 2.5 ms for systems with 76 thermal nodes, and to no more than 26.6 ms for systems with 268 thermal nodes, which is three orders of magnitude faster than the state-of-the-art for the same settings.
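The analytical solution described above can be sketched for an RC thermal network as follows. The notation is an assumption made for this illustration and does not come from the text: A is the matrix of thermal capacitances, B the matrix of thermal conductances, P the power vector (taken as constant from t = 0 on), G the vector of conductances to the ambient, and T_init the temperature vector at t = 0.

```latex
% Assumed notation: A (capacitances), B (conductances), P (constant power),
% G (conductances to ambient), T_amb (ambient temperature), T_init = T(0).
\begin{align}
A\,\dot{T}(t) + B\,T(t) &= P + T_{\mathrm{amb}}\,G \\
T_{\mathrm{steady}} &= B^{-1}\left(P + T_{\mathrm{amb}}\,G\right) \\
T(t) &= T_{\mathrm{steady}} + e^{Ct}\left(T_{\mathrm{init}} - T_{\mathrm{steady}}\right),
\qquad C = -A^{-1}B \\
\dot{T}_k(t) &= \left[\,C\,e^{Ct}\left(T_{\mathrm{init}} - T_{\mathrm{steady}}\right)\right]_k = 0
\end{align}
```

Substituting the third line into the first confirms that it solves the system. The last line is the stationarity condition for thermal node k: under these assumptions, candidate peak times are the roots of that expression, compared against the temperature at t = 0 and the steady-state value.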

Citation:
Santiago Pagani, Jian-Jia Chen, Muhammad Shafique, and Jörg Henkel, "MatEx: Efficient Transient and Peak Temperature Computation for Compact Thermal Models", in Proceedings of the 18th IEEE/ACM Design, Automation & Test in Europe (DATE), Grenoble, France, March 2015.

Paper
Source Code

Thermal Safe Power (TSP)

Chip manufacturers provide the Thermal Design Power (TDP) for a specific chip. The cooling solution is designed to dissipate this power level. But because TDP is not necessarily the maximum power that can be applied, chips are operated with Dynamic Thermal Management (DTM) techniques. To avoid excessive triggers of DTM, usually, system designers also use TDP as power constraint. However, using a single and constant value as power constraint, e.g., TDP, can result in big performance losses in many-core systems. Having better power budgeting techniques is a major step towards dealing with the dark silicon problem.
This paper presents a new power budget concept, called Thermal Safe Power (TSP), which is an abstraction that provides safe power constraint values as a function of the number of simultaneously operating cores. Executing cores at any power consumption below TSP ensures that DTM is not triggered. TSP can be computed offline for the worst cases, or online for a particular mapping of cores. Our simulations show that using TSP as power constraint results in 50.5% and 14.2% higher average performance, compared to using constant power budgets (both per-chip and per-core) and a boosting technique, respectively. Moreover, TSP results in dark silicon estimations which are more optimistic than estimations using constant power budgets.

Citation:
Santiago Pagani, Heba Khdr, Waqaas Munawar, Jian-Jia Chen, Muhammad Shafique, Minming Li, and Jörg Henkel, "TSP: Thermal Safe Power - Efficient power budgeting for Many-Core Systems in Dark Silicon", in IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), New Delhi, India, October 2014.

Paper
Source Code

Non-Research Software

The software available on this site is provided "as is" and any expressed or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall CES, the Karlsruhe Institute of Technology (KIT), or any of their contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.

Crowd-Sourced Bluetooth Tracker Apps

BLELoc and Crowd Blue Locator are two proof-of-concept crowdsourced Bluetooth tracking applications for Android OS. They were implemented as part of the Software-Entwicklung (PSE) course. These applications track the Bluetooth devices around an app user anonymously (unknown to the tracking user) and send a location and timestamp to the device owner when trackers encounter a searched Bluetooth device. Owners can register their paired Bluetooth devices in the apps and enable or disable the search. The apps can be used to track, for instance, personal Bluetooth headphones, wireless speakers, smartwatches, fitness bands, other phones/tablets (when Bluetooth is on), smart pens, Bluetooth mice and keyboards, and other pairable IoT devices, independent of the manufacturer.

The two apps differ in where they discard irrelevant observations. Crowd Blue Locator reports all Bluetooth device observations to our server. The server automatically discards all observations of unsearched devices and sends the observations of searched devices to their owners. BLELoc shares a list of all searched devices and an encrypted list of observations for all searched devices. Device owners can only decode observations of their own registered devices. The BLELoc app discards observations of unsearched devices on the smartphone.

The apps are shared in the following state:

Proof-of-concept, i.e., not optimized.
Require paired devices for privacy reasons (do not work with Bluetooth beacons).
May reduce the battery life of Bluetooth devices and the tracking smartphones.

To install the apps, the installation of third-party apps must be enabled.

Download

Crowd Blue Locator
BLELoc