Prof. Dr. Jürgen Teich

Department of Computer Science

Our research centers on the systematic design (CAD) of hardware/software systems, ranging from embedded systems to HPC platforms. One principal research direction is domain-specific computing, which tackles the highly complex programming and design challenge of parallel heterogeneous computer architectures. Domain-specific computing strictly separates the concerns of algorithm development and target architecture implementation, including parallelization and low-level implementation details. The key idea is to exploit the knowledge inherent in a particular problem area or field of application, i.e., a particular domain, in a well-directed manner and thus to master the complexity of heterogeneous systems. Such domain knowledge can be captured by suitable abstractions, augmentations, and notations, e.g., libraries, domain-specific languages (DSLs), or combinations of both (e.g., embedded DSLs implemented via template metaprogramming). On this basis, patterns can be utilized to transform and optimize the input description in a goal-oriented way during compilation and, finally, to generate code for a specific target architecture. Thus, DSLs provide high productivity and typically also high performance. We develop DSLs and target platform languages to capture both domain and architecture knowledge, which is utilized during the different phases of compilation, parallelization, mapping, and code generation for a wide variety of architectures, e.g., multi-core processors, GPUs, MPSoCs, and FPGAs. All these steps usually go along with optimizing and exploring the vast space of design options and trading off multiple objectives, such as performance, cost, energy, or reliability.
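The DSL idea described above can be illustrated with a minimal sketch (all names here are hypothetical, not part of our actual frameworks): domain operations build an expression tree instead of executing eagerly, so that rewrite rules encoding domain knowledge can optimize the description before target code is emitted.

```python
class Expr:
    """Base class: operators build an expression tree instead of computing."""
    def __add__(self, other):
        return BinOp("+", self, other)
    def __mul__(self, other):
        return BinOp("*", self, other)

class Var(Expr):
    def __init__(self, name):
        self.name = name
    def emit(self):
        return self.name

class Const(Expr):
    def __init__(self, value):
        self.value = value
    def emit(self):
        return str(self.value)

class BinOp(Expr):
    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs
    def simplify(self):
        # Domain knowledge applied as a rewrite rule: x * 1 -> x
        if self.op == "*" and isinstance(self.rhs, Const) and self.rhs.value == 1:
            return self.lhs
        return self
    def emit(self):
        node = self.simplify()
        if node is not self:
            return node.emit()
        return f"({self.lhs.emit()} {self.op} {self.rhs.emit()})"

# The domain expert writes the algorithm; the "backend" generates code.
x, y = Var("x"), Var("y")
expr = (x + y) * Const(1)
print(expr.emit())  # the x*1 rewrite fires, so this prints "(x + y)"
```

In a real flow, emit() would target C, CUDA, or an HLS dialect depending on the chosen architecture; the intermediate tree form is what makes such retargeting and goal-oriented transformation possible.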

Research projects

  • Diffusion-weighted imaging and quantitative susceptibility mapping of the breast, liver, prostate, and brain
  • Development of new MRI pulse sequences
  • Development of new MRI post-processing schemes
  • Joint evaluation of new MR methods with radiology
  • Domain-specific Computing for Medical Imaging
  • Hipacc – the Heterogeneous Image Processing Acceleration Framework
  • AI Laboratory for System-level Design of ML-based Signal Processing Applications
  • Architecture Modeling and Exploration of Algorithms for Medical Image Processing

  • Neural Approximate Accelerator Architecture Optimization for DNN Inference on Lightweight FPGAs

    (Third Party Funds Single)

    Term: 1. May 2024 - 30. April 2027
    Funding source: DFG-Einzelförderung / Sachbeihilfe (EIN-SBH)

    Embedded Machine Learning (ML) is a fast-growing field comprising ML algorithms, hardware, and software capable of performing on-device sensor data analysis at extremely low power, thus enabling a range of always-on, battery-powered applications and services. Running ML-based applications on embedded edge devices attracts enormous research and business interest for many reasons, including accessibility, privacy, latency, cost, and security. Embedded ML is primarily represented by artificial intelligence (AI) at the edge (EdgeAI) and on tiny, ultra-resource-constrained devices, a.k.a. TinyML. TinyML demands energy efficiency and low latency while retaining accuracy at acceptable levels, thus mandating optimization of the software and hardware stack.
    GPUs form the default platform for DNN training workloads due to the high degree of parallelism offered by their massive number of processing cores. However, a GPU is often not an optimal solution for DNN inference acceleration because of its high energy cost and lack of reconfigurability, especially for highly sparse models or customized architectures. Field Programmable Gate Arrays (FPGAs), on the other hand, can offer lower latency and higher efficiency than GPUs while providing high customizability and faster time-to-market, combined with a potentially longer useful life than ASIC solutions.
    In the context of TinyML, NA³Os focuses on a neural approximate accelerator-architecture co-search targeting specifically lightweight FPGA devices. This project investigates design techniques to optimally and automatically map DNNs to resource-constrained FPGAs while exploiting principles of approximate computing. Our particular topics of investigation include:

    • Efficient mapping of DNN operations onto approximate hardware components (e.g., multipliers, adders, DSP Blocks, BRAMs).
    • Techniques for fast and automated design space exploration of mappings of DNNs defined by a set of approximate operators and a set of FPGA platform constraints.
    • Investigation of a hardware-aware neural architecture co-search methodology targeting FPGA-based DNN accelerators.
    • Evaluation of robustness vs. energy efficiency tradeoffs.
    • Finally, all developed methods shall be evaluated experimentally by providing a proper synthesis path and comparing the quality of generated solutions with state-of-the-art solutions.
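    To make the approximate-computing principle concrete, the following sketch (illustrative only, not one of the project's actual operators) models a truncation-based approximate multiplier, which shrinks the partial-product logic of a hardware multiplier at the price of a bounded error, and sweeps the accuracy it trades away:

```python
def approx_mul(a: int, b: int, trunc_bits: int = 2) -> int:
    """Approximate multiply: zero the trunc_bits lowest bits of each operand.

    Truncating operand bits reduces the partial-product hardware a
    multiplier needs, introducing a bounded, data-dependent error.
    """
    mask = ~((1 << trunc_bits) - 1)
    return (a & mask) * (b & mask)

def mean_relative_error(trunc_bits: int) -> float:
    """Average relative error over a full 8-bit unsigned operand sweep."""
    total, count = 0.0, 0
    for a in range(1, 256):
        for b in range(1, 256):
            exact = a * b
            total += abs(exact - approx_mul(a, b, trunc_bits)) / exact
            count += 1
    return total / count

# More truncated bits -> cheaper hardware, larger mean error.
for t in (1, 2, 3):
    print(f"trunc_bits={t}: mean relative error = {mean_relative_error(t):.4f}")
```

    A design space exploration as targeted by the project would weigh such error statistics against the FPGA resources (LUTs, DSP blocks, BRAM) each operator variant consumes.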
  • Open-Source Design Tools for the Co-Development of AI Algorithms and AI Chips

    (Third Party Funds Single)

    Term: 1. May 2024 - 30. April 2027
    Funding source: Bundesministerium für Bildung und Forschung (BMBF)
    URL: https://www.elektronikforschung.de/projekte/di-edai

    Motivation

    Chip design is the essential step when developing microelectronics for specific products and applications. Competence in chip design can strengthen Germany's innovation and competitiveness and increase its technological sovereignty in Europe. In order to leverage this potential, the German and European chip design ecosystem is to be expanded. To this end, the BMBF has launched the Microelectronics Design Initiative with four key areas of focus: a strong network as a central exchange platform, training and further education for talented individuals and specialists, research projects to strengthen design capabilities, and expanding research structures.

    Project Goals

    The aim of the project is to develop modern AI chips that are designed with a particular focus on security, trustworthiness, and energy efficiency in various application scenarios. Another goal is to implement a seamless transition from software-based AI algorithm development to efficient hardware implementation. The focus here is on the close linking of AI and hardware in the design process as well as the development of various AI accelerators and corresponding architectures. The end result should be an automated design methodology that extends from the AI software to the AI hardware.

    The focus of our chair within DI-EDAI is, in particular, on the development of a co-exploration approach that optimizes both neural network models and associated AI-specific microprocessor extensions, taking into account non-functional requirements (e.g., cost, speed, accuracy, energy, security). The results, in the form of hardware blocks and EDA software, shall be published as open source and contribute to creating an ecosystem for designing sustainable and transparent AI systems.
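    A core ingredient of any such co-exploration is multi-objective selection among candidate design points. The following sketch (hypothetical numbers; all objectives assumed to be minimized) shows the Pareto-filtering step that retains only non-dominated (model, hardware-extension) combinations:

```python
def dominates(p, q):
    """p dominates q if p is no worse in every objective and better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def pareto_front(points):
    """Keep only design points not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (model, hardware-extension) design points, each scored as
# (energy in mJ, latency in ms, error rate) -- all to be minimized.
candidates = [(5.0, 2.0, 0.08), (4.0, 3.0, 0.05),
              (6.0, 2.5, 0.09), (4.5, 1.8, 0.10)]
print(pareto_front(candidates))  # the dominated point (6.0, 2.5, 0.09) is dropped
```

    In the project, objectives such as security or cost would enter the same dominance check, and the front is what the designer ultimately chooses from.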

  • Automatic Cross-Layer Synthesis of High Performance, (Ultra-)Low Power Hardware Implementations from Data Flow Specifications by Integration of Emerging FeFET Technology

    (Third Party Funds Single)

    Term: 1. March 2024 - 1. March 2027
    Funding source: Deutsche Forschungsgemeinschaft (DFG)
    URL: https://www.cs12.tf.fau.de/forschung/projekte/hiloda-nets/

    High-throughput data and signal processing applications are preferably specified as dataflow networks, as these naturally allow parallelism to be exploited both globally (at the level of a network of communicating actors) and locally at the actor level, e.g., by implementing each actor as a hardware circuit. Today, a few system-level design approaches exist to aid an algorithm designer in compiling a dataflow network to a set of processors or, alternatively, in synthesizing the network directly in hardware to achieve high processing speeds. But embedded systems, particularly in the context of IoT applications, have additional requirements: safe operation, even in an environment of intermittent power shortages, and in general (ultra-)low power consumption. Altogether, these requirements seem contradictory.

    Our proposed project, named HiLoDa (High performance, (ultra-)Low power Dataflow) Nets, attacks this apparent discrepancy and conflict in requirements by a) introducing, exploiting, and integrating emerging FeFET technology for the design of actor networks for the first time, i.e., by investigating and designing persistable FIFO-based memory units. b) In particular, circuit devices able to operate in a mixed volatile/non-volatile mode shall be modeled, characterized, and designed. c) By combining the system-level concept of dataflow, which is based on self-scheduled activation of computations, with emerging CMOS-compatible FeFET technology, inactive actors or even subnets shall gain the capability of self-powering (down and wakeup). In addition, for a continuously safe mode of operation, a power-down must also be triggered upon any intermittent shortage of the power supply. Analogously, actors shall perform an auto-wakeup after recovery from a power shortage, subject to fireability.

    HiLoDa Nets will be able to combine high clock-speed data processing of each synthesized actor circuit in power-on mode and automatic state retention using FeFET technology in power-off mode, self-triggered during time intervals of either data unavailability or power shortage. d) A fully automatic cross-layer synthesis from system-level dataflow specification to optimized circuit implementation involving FeFET devices shall be developed. This includes e) the DSE (design space exploration) of actor clusterings at the system level to explore individual power domains for the optimization of throughput, circuit cost, energy savings, and endurance. Finally, f) HiLoDa Nets shall be compared to conventional CMOS technology implementations with respect to energy consumption for applications such as spiking neural networks. Likewise, shutdown (backup) and recovery latencies from power shortages shall be evaluated and optimized.
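    The self-scheduled execution model underlying such dataflow networks can be sketched as follows (names are illustrative; one token consumed per input and firing): an actor fires only when every input FIFO holds a token, and an actor that is not fireable is exactly the kind of inactive unit whose state would be retained in FeFET storage while powered down.

```python
from collections import deque

class Actor:
    """A dataflow actor: consumes one token per input FIFO, produces one."""
    def __init__(self, name, inputs, output, func):
        self.name, self.inputs, self.output, self.func = name, inputs, output, func
    def fireable(self):
        # Self-scheduling rule: fireable iff every input FIFO is non-empty.
        return all(q for q in self.inputs)
    def fire(self):
        tokens = [q.popleft() for q in self.inputs]
        self.output.append(self.func(*tokens))

def run(actors):
    """Self-scheduled execution: fire any fireable actor until none is left."""
    fired = True
    while fired:
        fired = False
        for a in actors:
            if a.fireable():
                a.fire()
                fired = True

# A two-actor pipeline: pairwise addition feeding a doubling actor.
src_a, src_b, mid, out = deque([1, 2, 3]), deque([10, 20, 30]), deque(), deque()
add = Actor("add", [src_a, src_b], mid, lambda x, y: x + y)
dbl = Actor("dbl", [mid], out, lambda x: 2 * x)
run([add, dbl])
print(list(out))  # each input pair summed, then doubled
```

    In a HiLoDa-style hardware realization, each actor would be a synthesized circuit and each deque a persistable FIFO memory unit, with power gating driven by the same fireability condition.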

  • Optimization and Toolchain for Embedding AI

    (Third Party Funds Single)

    Term: 1. March 2023 - 28. February 2026
    Funding source: Industrie

    Artificial Intelligence (AI) methods have quickly progressed from research to productive applications in recent years. Typical AI models (e.g., deep neural networks) impose high memory demands and computational efforts, both for training and when making predictions during operation. This conflicts with the typically limited resources of embedded controllers used in automotive or industrial applications. To comply with these limitations, AI models must be streamlined on different levels to be applicable to a given embedded target hardware, e.g., by architecture and feature selection, pruning, and other compression techniques. Currently, adapting a model to fit the target hardware requires iterative, manual changes in a “trial-and-error” manner: the model is designed, trained, and compiled to the target hardware while applying different optimization techniques; it is then checked for compliance with the hardware constraints, and the cycle is repeated if necessary. This approach is time-consuming and error-prone.

    Therefore, this project, funded by the Schaeffler Hub for Advanced Research at Friedrich-Alexander-Universität Erlangen-Nürnberg (SHARE at FAU), seeks to establish guidelines for hardware selection and a systematic toolchain for optimizing and embedding AI in order to reduce the current efforts of porting machine learning models to automotive and industrial devices.
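    The kind of manual loop described above can be automated. The following sketch (hypothetical helper names and a toy cost model of 4 bytes per stored nonzero weight) raises magnitude-pruning sparsity stepwise until the model's estimated footprint fits a given memory budget:

```python
def prune(weights, sparsity):
    """Magnitude pruning: zero the fraction `sparsity` of smallest weights."""
    k = int(len(weights) * sparsity)
    kept = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[k:])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]

def footprint_bytes(weights, bytes_per_nonzero=4):
    """Toy cost model: only nonzero weights are stored, 4 bytes each."""
    return sum(w != 0.0 for w in weights) * bytes_per_nonzero

def fit_to_budget(weights, budget_bytes, step=0.1):
    """Raise sparsity stepwise until the pruned model fits the budget."""
    sparsity, pruned = 0.0, weights
    while footprint_bytes(pruned) > budget_bytes and sparsity < 1.0:
        sparsity += step
        pruned = prune(weights, sparsity)
    return pruned, sparsity

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.3, 0.02, -0.6]
pruned, s = fit_to_budget(w, budget_bytes=20)   # budget: 5 nonzero weights
print(f"sparsity={s:.1f}, nonzero weights={footprint_bytes(pruned) // 4}")
```

    A real toolchain would replace the toy cost model with the compiled footprint on the actual target and re-train between pruning steps to recover accuracy.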

  • HYPNOS – Co-Design of Persistent, Energy-efficient and High-speed Embedded Processor Systems with Hybrid Volatility Memory Organisation

    (Third Party Funds Group – Sub project)

    Overall project: DFG Priority Programme (SPP) 2377 - Disruptive Memory Technologies
    Term: 21. September 2022 - 21. September 2025
    Funding source: DFG / Schwerpunktprogramm (SPP)
    URL: https://spp2377.uos.de/

    This project is funded by the German Research Foundation (DFG) within the Priority Program SPP 2377 "Disruptive Memory Technologies".

    HYPNOS explores how emerging non-volatile memory (NVM) technologies could beneficially replace not only main memory in modern embedded processor architectures, but potentially also one or multiple levels of the cache hierarchy or even the registers. It further investigates how to optimize such a hybrid-volatile memory hierarchy to offer attractive speed and energy tradeoffs for a multitude of application programs while providing persistence of data structures and processing state in a simple and efficient way.

    On the one hand, completely non-volatile (memory) processors (NVPs), which have emerged for IoT devices, are known to suffer from the long write latencies of current NVM technologies as well as from orders-of-magnitude lower endurance than, e.g., SRAM, thus prohibiting operation at GHz speeds. On the other hand, existing NVM main memory computer solutions require the programmer to explicitly persist data structures through the cache hierarchy.

    HYPNOS (named after the Greek god of sleep) systematically attacks this intertwined performance/endurance/programmability gap by taking a hardware/software co-design approach:

    Our investigations include techniques for

    a) design space exploration of hybrid NVM memory processor architectures with respect to speed and energy consumption, including hybrid (mixed-volatile) register and cache-level designs,

    b) offering instruction-level persistence for (non-transactional) programs in case of, e.g., instantaneous power failures through low-cost and low-latency control unit (hardware) design of checkpointing and recovery functions, and additionally providing

    c) application-programmer (software) persistence control on a multi-core HYPNOS system for user-defined checkpointing and recovery from these and other errors or access conflicts, backed by size-limited hardware transactional memory (HTM).

    d) The explored processor architecture designs and different types of NVM technologies will be systematically evaluated for achievable speed and energy gains and for testing co-designed backup and recovery mechanisms (e.g., wakeup latencies), using a gem5-based multi-core simulation platform and ARM processors with HTM instruction extensions.

    As benchmarks, i) simple data structures, ii) sensor (peripheral device) I/O, and finally iii) transactional database applications shall be investigated and evaluated.
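    The backup/recovery contract described in b) can be sketched abstractly (a behavioral model, not HYPNOS's hardware design): on a power-failure signal, volatile state is persisted to an NVM region, and on wakeup execution resumes from the last consistent checkpoint instead of restarting.

```python
class NVProcessor:
    """Behavioral model of checkpointing into NVM and recovery from it."""
    def __init__(self):
        self.registers = {"pc": 0, "acc": 0}   # volatile state
        self.nvm = None                        # persisted checkpoint

    def step(self, increment):
        self.registers["acc"] += increment
        self.registers["pc"] += 1

    def power_failure(self):
        """Backup: persist a consistent snapshot of the volatile state."""
        self.nvm = dict(self.registers)
        self.registers = {"pc": 0, "acc": 0}   # volatile contents are lost

    def wakeup(self):
        """Recovery: restore the last checkpoint from NVM, if any."""
        if self.nvm is not None:
            self.registers = dict(self.nvm)

cpu = NVProcessor()
for i in (1, 2, 3):
    cpu.step(i)
cpu.power_failure()    # power is cut: registers are wiped, checkpoint survives
cpu.wakeup()           # execution resumes from pc=3, acc=6
cpu.step(4)
print(cpu.registers)   # {'pc': 4, 'acc': 10}
```

    The project's hardware design aims to make exactly this backup and restore low-cost and low-latency, transparent to (non-transactional) programs at the instruction level.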

  • GRK2475: Cybercrime and Forensic Computing

    (Third Party Funds Group – Overall project)

    Term: 1. October 2019 - 30. September 2028
    Funding source: DFG / Graduiertenkolleg (GRK)
    URL: https://www.cybercrime.fau.de/

    New information technologies always open up new ways of committing crimes as well, frequently subsumed under the term "cybercrime". Given the dependence of highly developed societies on (critical) IT infrastructures, this form of crime today threatens the stability of our economic and social system. However, the new information technologies also open up new possibilities for law enforcement, such as automated data collection and analysis on the Internet, or surveillance programs covertly implanted into IT systems (trojans).

    The effectiveness of these new methods of so-called "forensic computing" regularly raises the question of their impact on the fundamental rights of those affected. The confinement of legal jurisdiction to nation states creates additional problems. In this project, established researchers from computer science and law have joined forces to systematically explore the still rather ill-defined research field of cybercrime, including the criminal liability for and prosecution of cybercrime, to uncover fundamental interrelations, and to make the field as a whole more tractable.

    Research in the Research Training Group therefore has the potential to shape the technical and methodological standards for handling digital traces, their use in law enforcement, and national as well as international legal interpretation and legislation for many years to come. At the same time, we counteract the shortage of scientifically and methodologically trained specialists in industry, public administration, and law enforcement agencies in this area.

  • Cybercrime and Forensic Computing -- Hardware Security

    (Third Party Funds Group – Sub project)

    Overall project: Research Training Group 2475: Cybercrime and Forensic Computing
    Term: 1. October 2019 - 1. October 2028
    Funding source: DFG / Graduiertenkolleg (GRK)
    URL: https://www.cybercrime.fau.de
    This project is funded by the German Research Foundation (DFG) within the Research Training Group 2475 "Cybercrime and Forensic Computing".
    Cybercrime is becoming an ever greater threat in view of the growing societal importance of information technology. At the same time, new opportunities are emerging for law enforcement, such as automated data collection and analysis on the Internet or via surveillance programs. But how should the fundamental rights of those affected be dealt with when "forensic computing" is used? The RTG "Cybercrime and Forensic Computing" brings together experts in computer science and law to investigate the research field of "prosecution of cybercrime" in a systematic way.
    At the Chair of Computer Science 12, aspects of hardware security are investigated. The focus is on researching techniques to extract information and traces from technical devices via side channels. Beyond the actual processing of input data into output data, the physical implementation of a system emits additional, so-called side-channel information to its environment. Known side channels include, for example, the data-dependent timing behavior of an algorithm's implementation, as well as power consumption, electromagnetic radiation, and temperature.
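    A classic example of the data-dependent timing behavior mentioned above is a byte-wise comparison that returns early on the first mismatch; the sketch below (illustrative, with loop iterations standing in for measurable execution time) contrasts it with a constant-time variant:

```python
def leaky_compare(secret: bytes, guess: bytes) -> tuple[bool, int]:
    """Early-exit comparison; the step count models measurable time."""
    steps = 0
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            # Early return: timing leaks how many leading bytes matched.
            return False, steps
    return len(secret) == len(guess), steps

def constant_time_compare(secret: bytes, guess: bytes) -> tuple[bool, int]:
    """Accumulate differences so every comparison takes the same time."""
    steps, diff = 0, len(secret) ^ len(guess)
    for s, g in zip(secret, guess):
        steps += 1
        diff |= s ^ g
    return diff == 0, steps

secret = b"cafe"
print(leaky_compare(secret, b"xxxx")[1])          # 1 step: mismatch position leaks
print(leaky_compare(secret, b"caxx")[1])          # 3 steps: two correct prefix bytes
print(constant_time_compare(secret, b"xxxx")[1])  # always 4 steps
```

    An attacker who can time many guesses can recover the secret prefix byte by byte from the leaky variant; analogous leakage through power consumption or electromagnetic emission is what the researched extraction techniques exploit.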

