Current Offers for Student Theses and Projects

In our research groups, topics are often in preparation that are not yet listed here. In some cases, it is also possible to define a topic that matches your particular interests. To do so, simply contact a staff member from the relevant research area. If you also have general questions about carrying out a thesis at LIS, please contact Dr. Thomas Wild.

For students interested in an Ingenieurpraxis (engineering internship):

We are happy to supervise engineering internships carried out in industry, provided the topic fits our own research areas. However, we do not offer engineering internships at the chair itself, because in our view students should gain industry experience early on.

 

Title
------

Approximate Computing for FPGA-based Image Processing

Description

Digital image processing in professional applications places ever-higher demands on hardware, pushing the computing power and power consumption of FPGA devices to their limits. Approximate computing refers to a set of methods that perform calculations approximately rather than exactly. As a result, fewer FPGA resources are used, more functions can be implemented in existing FPGA devices, and the energy efficiency of the calculations improves. However, approximate computing always degrades the quality of the application's output, so an optimization process must be found that maximizes utility while keeping degradation below a tolerable limit.
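
The trade-off between saved resources and lost quality can be illustrated with a minimal sketch. It assumes one particular approximation technique that the topic description does not name: an unsigned multiplier that ignores the lowest bits of each operand, as a stand-in for a narrower hardware multiplier. The function names and the mean-relative-error quality metric are illustrative assumptions, not part of the topic.

```python
def approx_mul(a: int, b: int, drop_bits: int) -> int:
    """Multiply two unsigned integers after zeroing the lowest `drop_bits` bits of each."""
    mask = ~((1 << drop_bits) - 1)
    return (a & mask) * (b & mask)


def mean_relative_error(drop_bits: int, width: int = 8) -> float:
    """Average relative error over all nonzero operand pairs of the given bit width."""
    total, count = 0.0, 0
    for a in range(1, 1 << width):
        for b in range(1, 1 << width):
            exact = a * b
            total += abs(exact - approx_mul(a, b, drop_bits)) / exact
            count += 1
    return total / count


def max_drop_bits(tolerance: float, width: int = 8) -> int:
    """Largest truncation whose mean relative error stays within `tolerance`."""
    best = 0
    for k in range(width + 1):
        if mean_relative_error(k, width) <= tolerance:
            best = k
    return best
```

Calling `max_drop_bits(0.05)` searches for the most aggressive truncation whose average error stays within 5% for 8-bit operands. A real design flow would replace this metric with an application-level one, such as image quality after processing.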

Supervisor:

------

Coherence for Near Memory Computing

Short description:
The goal of this seminar is to provide a survey of cache coherence mechanisms for near-memory computing.

Description

Hitting a wall is not a pleasant thing, and computer systems have faced many walls in recent decades. Having broken through the memory wall in the mid-1990s and the power wall in 2004, they now face the next crucial barrier to scalability. Although systems can be scaled to hundreds or thousands of cores through NoCs, performance does not scale accordingly because data and tasks are not located close to each other. We now face the locality wall.

The newest trend to tackle this issue is data-task migration and processing in or near memory.

The goal of this seminar is to investigate how cache coherence mechanisms adapt to such near-memory computing operations.

Contact

srivatsa.akshay@tum.de

Supervisor:

------

A Survey on Hybrid/Adaptive Coherency Protocols

Short description:
The goal of this seminar is to survey hybrid, architecture-specific extensions to traditional cache coherence mechanisms that optimize application performance.

Description

Hardware-supported coherent architectures allow for faster coherency messages and easier programming models. For large systems with varying application demands, however, the scalability and performance of such schemes may be suboptimal. The goal of this seminar topic is to perform a detailed analysis of different hybrid/adaptive coherency schemes in the scope of NoC-based distributed shared memory MPSoCs.

Contact

srivatsa.akshay@tum.de

Supervisor:

------

Data and Task Distribution for Streaming Applications

Description

Dynamic data and task distribution is an interesting topic for streaming applications, because such distributions could potentially be learned and applied in future iterations. The goal of this seminar is to survey and analyze existing approaches.

Contact

Sven Rheindt, Room: N2140, Phone +49.89.289.28387, sven.rheindt@tum.de

Supervisor:

------


Evaluation Algorithms for Dynamic Data Migration

Description

Dynamic data migration depends on the evaluation of memory accesses at runtime, which requires processing a huge amount of data. The goal of this seminar is to survey and analyze existing approaches. Exact as well as heuristic algorithms should be taken into account.

Contact

Sven Rheindt, Room: N2140, Phone +49.89.289.28387, sven.rheindt@tum.de

Supervisor:

------


Near Memory Graphcopy

Description

Copying graphs is well suited for near-memory hardware acceleration. The goal of this seminar is to survey and analyze existing approaches.

Contact

Sven Rheindt, Room: N2140, Phone +49.89.289.28387, sven.rheindt@tum.de

Supervisor:

------


Motion Estimation Algorithms in Autonomous Driving

Description

Biologically, the human brain is sensitive to visual changes observed by the eyes. After observing such changes, the brain makes the required high-level behavioral decisions. In autonomous driving, various camera sensors help to perceive motion in the surroundings, e.g. of pedestrians, cyclists, and automobiles. This motion perception includes localizing road users, estimating their velocities, tracking them, and estimating their trajectories. Motion estimation has been a very well-researched domain in machine vision over the past 30 years. There are studies focusing on motion estimation using disparity estimation, e.g. dense optical flow and stereo global matching. This seminar topic will focus on a comparative study of the available state-of-the-art algorithms for dense optical flow and stereo global matching on continuous video frames.

Contact

Munish Jassi
munish.jassi@nxp.com

Supervisor:

------


Static WCET Analysis for Multi-Core Systems

Description

Knowing the worst-case execution time (WCET) is indispensable for the development of real-time systems. Several methods exist to approximate the WCET on single-core platforms. However, when multiple tasks run simultaneously on a multi-core platform, these methods can no longer provide a reliable estimate. The goal of this seminar is to summarize the major problems that arise when analysing multi-core applications, along with some methods to solve them.

Contact

Dirk Gabriel
Room N2117
Tel. 089 289 28578
dirk.gabriel@tum.de

Supervisor:

------

Architectures for Neuromorphic Computing

Description

The goal of neuromorphic computers is to mimic the behaviour of the human nervous system or brain. Since the behaviour of neurons differs greatly from how classical computer systems work, there is a need for new architectures. The approaches range from specialized CMOS designs through MOSFET-based architectures to memristor-based approaches. The goal of this seminar is to present the challenges posed by neuromorphic computing and how different architectures approach them.

Supervisor:

------


Energy Efficiency of Neural Networks

Description

Deep and convolutional neural networks are currently the de facto standard in machine learning, and in recent years there have been great advances in their performance. However, with the wide adoption of these techniques in data centers around the world, energy efficiency is becoming an increasingly important aspect. Therefore, the goal of this seminar is to provide an overview of neural network implementations in software and hardware with regard to their energy efficiency.

Supervisor:

------


An Introduction to Finite Length Codes for SoCs

Description

High data integrity is key in modern SoC communication. However, due to ever-decreasing feature sizes, modern silicon devices are becoming more vulnerable to transient faults. At the same time, on-chip communication operates on rather small chunks of data, in contrast to traditional unreliable communication scenarios such as wireless communication. Therefore, conventional measures such as the channel capacity introduced by Shannon no longer hold, paving the way for new methods of quantifying channels and codes alike that take the code length into account. The goal of this seminar is to provide an introduction to the field and methods of finite-length codes.

Supervisor:

------

Meltdown: Concept, Cause and Effect

Description

When researchers published their discovery of the side-channel attacks Meltdown and Spectre on modern CPUs at the beginning of 2018, an entire industry was forced to rethink the state-of-the-art techniques used to increase the processing power of its designs. This seminar shall present the core concepts of modern processors, the exploits of these concepts that lead to Meltdown, and mitigation techniques.

Supervisor:

------


Multicore Power Proxies and Models

Keywords:
multicore, processor, modeling, power proxy, linear regression

Description

To optimize the power consumption of multicore processors, accurate power information is needed. Such information can either be measured directly (with sensors) or determined indirectly through so-called power proxies. Because sensors are very expensive in terms of area, power proxies are commonly used. Such power proxies combine a design-time power model with run-time activity information, e.g. performance counters. In this seminar, you will learn about state-of-the-art power proxies and identify the current challenges in using them.
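
As a minimal sketch of the idea, the following code fits a linear power model at design time and evaluates it at run time from counter values. Everything specific here is an assumption for illustration: the two counters (instructions per cycle and cache misses per kilo-cycle), the synthetic measurements, and the function names. Published power proxies differ in their choice of counters and model form.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_power_proxy(samples, power):
    """Design time: least-squares fit of power = w0 + sum(w_i * counter_i)."""
    X = [[1.0] + list(s) for s in samples]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * p for r, p in zip(X, power)) for i in range(k)]
    return solve(XtX, Xty)  # normal equations


def estimate_power(weights, counters):
    """Run time: evaluate the fitted model on current counter values."""
    return weights[0] + sum(w * c for w, c in zip(weights[1:], counters))


# Hypothetical training data: (IPC, cache misses per kilo-cycle) -> watts,
# generated from a made-up ground truth so the fit can be checked.
samples = [(0.5, 2.0), (1.0, 1.0), (1.5, 4.0), (2.0, 3.0), (0.8, 5.0)]
power = [1.0 + 2.0 * ipc + 0.3 * miss for ipc, miss in samples]
weights = fit_power_proxy(samples, power)
```

With the weights in hand, `estimate_power(weights, (1.2, 2.5))` gives a power estimate without any sensor reading. On real hardware the training data would come from measurements, and the fit would no longer be exact.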

Contact

mark.sagi@tum.de

Supervisor:

------

Multi-Agent Reinforcement Learning for Multicore Processors

Keywords:
machine learning, multi-agent reinforcement learning, multicore, power, temperature

Description

Reducing the power consumption of multicore processors is an ongoing challenge for industry and academia alike. Many different machine learning approaches (reinforcement learning (RL), supervised learning, unsupervised learning) have been proposed to manage the power and performance of multicore processors. However, the interaction of multiple reinforcement learning algorithms running in parallel is an open research area. In this seminar, you will identify state-of-the-art reinforcement learning algorithms for multicore power management and investigate whether any methods for multi-agent reinforcement learning are in use today.

Contact

mark.sagi@tum.de

Supervisor:

------

Learning Control for Predictable Latency and Low Energy

Description

Many modern computing systems must provide reliable latency with minimal energy. Two central challenges arise when allocating system resources to meet these conflicting goals: (1) complexity, since modern hardware exposes diverse resources with complicated interactions, and (2) dynamics, since latency must be maintained despite unpredictable changes in the operating environment or input. Machine learning accurately models the latency of complex, interacting resources, but does not address system dynamics; control theory adjusts to dynamic changes, but struggles with complex resource interaction. We therefore propose CALOREE, a resource manager that learns key control parameters to meet latency requirements with minimal energy in complex, dynamic environments. CALOREE breaks resource allocation into two sub-tasks: learning how interacting resources affect speedup, and controlling speedup to meet latency requirements with minimal energy. CALOREE defines a general control system, whose parameters are customized by a learning framework, while maintaining control-theoretic formal guarantees that the latency goal will be met. We test CALOREE's ability to deliver reliable latency on heterogeneous ARM big.LITTLE architectures in both single and multi-application scenarios. Compared to the best prior learning and control solutions, CALOREE reduces deadline misses by 60% and energy consumption by 13%.

Supervisor:

------


Multilayer Resource Controllers to Maximize Efficiency

Description

Since computers increasingly execute in constrained environments, they are being equipped with controllers for resource management. However, the operation of modern computer systems is structured in multiple layers, such as the hardware, OS, and networking layers, each with its own resources. Managing such a system scalably and portably requires that we have a controller in each layer, and that the different controllers coordinate their operation. In addition, such controllers should not rely on heuristics, but be based on formal control theory. This paper presents a new approach to build coordinated multilayer formal controllers for computers. The approach uses Structured Singular Value (SSV) controllers from Robust Control Theory. Such controllers are especially suited for multilayer computer system control. Indeed, SSV controllers can read signals from other controllers to coordinate multilayer operation. In addition, they allow designers to specify the discrete values allowed in each input, and the desired bounds on output value deviations. Finally, they accept uncertainty guardbands, which incorporate the effects of interference between the controllers. We call this approach Yukta. To assess its effectiveness, we prototype it on an 8-core big.LITTLE board. We build a two-layer SSV controller, and show that it is very effective. Yukta reduces the E×D and the execution time of a set of applications by an average of 50% and 38%, respectively, over advanced heuristic-based coordinated controllers.

Supervisor:

------


Synchronization via SpaceWire

Description

The Wide Field Imager (WFI) is an instrument of the ESA ATHENA satellite. Individual modules of the WFI subsystems (Instrument Control and Power Distribution Unit, Detector Electronics) are connected to each other via a SpaceWire router. To synchronize the submodules with each other, the time-code of the SpaceWire protocol can be used. The goal of this thesis is to research the synchronization options offered by SpaceWire and to detail and characterize one or more possible concepts. Possible limitations of this method are to be identified and, where applicable, demonstrated using an existing test setup.
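
For orientation, here is a minimal sketch of how a node might process incoming time-codes. It assumes the behaviour commonly described for the SpaceWire standard (ECSS-E-ST-50-12C): a time-code carries a six-bit time value, a received value counts as valid only if it is exactly one more than the local counter (modulo 64), and an unexpected value updates the counter without producing a tick. The class and attribute names are hypothetical, and the two control-flag bits are ignored here.

```python
class TimeCodeReceiver:
    """Model of a node's 6-bit time counter updated by received time-codes."""

    def __init__(self) -> None:
        self.time = 0   # 6-bit time counter (0..63)
        self.ticks = 0  # number of valid ticks passed on to the host

    def receive(self, time_code: int) -> bool:
        """Process an 8-bit time-code; return True if it produced a tick."""
        value = time_code & 0x3F            # lower six bits: time value
        expected = (self.time + 1) % 64     # counter wraps around after 63
        self.time = value                   # counter follows the received value
        if value == expected:
            self.ticks += 1
            return True
        return False                        # out of sequence: resynchronize silently
```

In this model a lost time-code shows up as a missing tick: the node resynchronizes on the next value and ticks again only when the following one arrives in sequence. That is one of the limitations a characterization could quantify.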

Contact

m.plattner@tum.de

Supervisor:

------

Optimization methods applied to embedded systems design

Description

Embedded systems design includes several complex sub-problems (e.g. placement, routing, partitioning, application mapping, power/thermal management) that are often classified as NP-complete or NP-hard. To solve these, several optimization methods (e.g. metaheuristics) from the field of operations research can be used.

The goal of this seminar is to survey the sub-problems encountered in microelectronics design and to show how they can be analyzed and modeled and what kinds of optimization methods can be used to solve them.

Contact

Anh Vu Doan
Room N2116
anhvu.doan@tum.de

Supervisor:

----

Programming FPGAs in C/C++ (at Missing Link Electronics)

Description

In microelectronics, it is becoming clear that Moore's Law and Dennard scaling will not deliver further performance gains. CPU clock frequencies, an important criterion for computing performance, are tending to decrease, among other reasons to reduce power dissipation. Likewise, multicore CPUs no longer bring further performance gains.

The trend is therefore towards using other computer architectures with FPGAs in so-called heterogeneous computer architectures. However, FPGA programming in VHDL or Verilog is not as easy as programming in C/C++. At the Institut für Mikroelektronik, we address this with two approaches: (1) training in "classical" FPGA design with VHDL/Verilog, and (2) the use of C/C++/SystemC in high-level synthesis (HLS) for FPGA design. We offer several Bachelor's/Master's theses on various topics in this area:

  • Design and analysis of FPGA-based image processing based on OpenCV/C++
  • Design and analysis of deep convolutional neural networks in the FPGA in C/C++
  • Comparison of C/C++ with SystemC when used in high-level synthesis

In cooperation with our technology partners IBM, Xilinx and MLE, these and similar topics can also be worked on as part of an industrial internship.

We offer: insight into state-of-the-art FPGA technologies with 32-bit and 64-bit multi-core ARM CPUs, an introduction to modern tools and methods for designing FPGA-based systems-on-chip, and direct training and supervision.

We expect: programming skills in C, C++ and/or SystemC; basic knowledge of FPGAs and digital circuits, Linux, Ethernet, TCP/IP; organized work in small teams.

Contact

Dr. Endric Schubert
Missing Link Electronics GmbH
Industriestraße 10
89231 Neu-Ulm

endric@missinglinkelectronics.com
Tel: +49 (731) 141149-14
jobs@mlecorp.com
www.MLEcorp.com

Supervisor:

----

Cache-Coherent HW Accelerators for IBM Power8 / Power9 with CAPI (at Missing Link Electronics)

Description

How do you make a fast CPU even faster? You attach an FPGA as a cache-coherent coprocessor!
IBM has implemented this approach in the OpenPOWER architecture in the form of CAPI (Coherent Accelerator Processor Interface) and, to simplify development, offers the open-source environment SNAP (Storage Networking Analytics Platform).

Using the IBM Power8/Power9 machines and associated FPGA cards made available to us, we want to analyze the effects of CAPI qualitatively and quantitatively and, within several Bachelor's/Master's theses, compare it theoretically and practically with other approaches such as CCIX or PCIe.
Thesis topics include, among others:

  • Benchmarking IBM Power8 and Power9 against x86-based machines
  • Implementation of a hardware accelerator for key-value stores
  • Investigation of latency behavior by instrumenting the FPGA designs
  • Implementation of the Linux Crypto API on OpenPOWER with CAPI

In cooperation with our technology partners IBM, Xilinx and MLE, these and similar topics can also be worked on as part of an industrial internship.

We offer: insight into state-of-the-art FPGA technologies with 32-bit and 64-bit multi-core ARM CPUs, an introduction to modern tools and methods for designing FPGA-based systems-on-chip, and direct training and supervision.

We expect: programming skills in C, C++ and/or SystemC; basic knowledge of FPGAs and digital circuits, Linux, Ethernet, TCP/IP; organized work in small teams.

Contact

Dr. Endric Schubert
Missing Link Electronics GmbH
Industriestraße 10
89231 Neu-Ulm

endric@missinglinkelectronics.com
Tel: +49 (731) 141149-14
jobs@mlecorp.com
www.MLEcorp.com

Supervisor:

----

FPGA Acceleration of Internet Protocols for 100G Ethernet (at Missing Link Electronics)

Description

Engineers at Google, Facebook and other companies are driving a new standard for 25/50/100 Gigabit Ethernet for faster networking. At these data rates, even fast server CPUs are overloaded when protocol processing is done entirely in software. Therefore, TCP Offload Engines (TOE) are used, which accelerate compute-intensive operations in hardware.
Other approaches use so-called full accelerators, in which protocol and application are processed 100% in hardware but nevertheless remain programmable because they run in an FPGA. Earlier, successful Bachelor's and Master's theses have already demonstrated the enormous computing power of an FPGA.
Based on a fully functional FPGA implementation of a TCP/IP and UDP stack, various aspects are to be investigated in several independent Bachelor's/Master's theses:

  • Hardware/software co-processing on a 64-bit ARM MPSoC FPGA under Linux
  • "Wireshark in the FPGA": deep packet inspection (DPI) and TCP/IP packet capture
  • Network performance analyses based on Netperf/Netserv in the FPGA
  • Inline processing for efficient encryption and compression, e.g. for IPsec or MACsec

In cooperation with our technology partners IBM, Xilinx and MLE, these or similar topics can also be worked on as part of an industrial internship.

We offer: insight into state-of-the-art FPGA technologies with 32-bit and 64-bit multi-core ARM CPUs, an introduction to modern tools and methods for designing FPGA-based systems-on-chip, and direct training and supervision.

We expect: programming skills in C, C++ and/or SystemC; basic knowledge of FPGAs and digital circuits, Linux, Ethernet, TCP/IP; organized work in small teams.

Contact

Dr. Endric Schubert
Missing Link Electronics GmbH
Industriestraße 10
89231 Neu-Ulm

endric@missinglinkelectronics.com
Tel: +49 (731) 141149-14
jobs@mlecorp.com
www.MLEcorp.com

Supervisor:

----

High-Performance SSDs Connected to Heterogeneous MPSoC FPGAs (at Missing Link Electronics)

Description

We all know flash memory, also called non-volatile memory (NVM), as a very fast (and by now also affordable) alternative to hard disks. With the use of SSDs instead of HDDs, however, inefficiencies in the software stack (e.g. in the Linux kernel) also become visible: 70% of the latency and over 80% of the energy consumption are attributable to software. Since large storage systems of 1 petabyte and more consist of networked servers, further inefficiencies in software and communication are added.
Research is therefore under way on new architectures that abandon established conventions. The NVM Express (NVMe) protocol and its NVMe over Fabrics variants offer great potential here, especially when accelerated by so-called heterogeneous system-on-chip FPGAs.
Building on previous, successful work on FPGA acceleration, such heterogeneous MPSoC FPGA architectures are to be examined more closely in several Bachelor's/Master's theses. This work includes, among other things, the design and analysis of:

  • Use of the "Storage Performance Development Kit" (SPDK) in the MPSoC FPGA
  • Implementation of an NVMe/PCIe Transaction Layer Packet (TLP) sniffer
  • Acceleration of key-value stores / NoSQL databases

In cooperation with our technology partners IBM, Xilinx and MLE, these and similar topics can also be worked on as part of an industrial internship.

We offer: insight into state-of-the-art FPGA technologies with 32-bit and 64-bit multi-core ARM CPUs, an introduction to modern tools and methods for designing FPGA-based systems-on-chip, and direct training and supervision.

We expect: programming skills in C, C++ and/or SystemC; basic knowledge of FPGAs and digital circuits, Linux, Ethernet, TCP/IP; organized work in small teams.

Contact

Dr. Endric Schubert
Missing Link Electronics GmbH
Industriestraße 10
89231 Neu-Ulm

endric@missinglinkelectronics.com
Tel: +49 (731) 141149-14
jobs@mlecorp.com
www.MLEcorp.com

Supervisor:

------

Design and Implementation of a Network Interface for a Fault-Tolerant Time-Division Multiplexed Network on Chip

Design and Implementation of a Network Interface for a Fault-Tolerant Time-Division Multiplexed Network on Chip

Beschreibung

Enabled by ever-decreasing structure sizes, modern Systems on Chip (SoCs) integrate a large number of different processing elements, making them Multi-Processor Systems on Chip (MPSoCs). These processing elements require a communication infrastructure to exchange data with each other and with shared resources such as memory and I/O ports. The limited scalability of bus-based solutions has led to a paradigm shift towards Networks on Chip (NoCs), which allow multiple data streams between different nodes to be exchanged in parallel.
To implement a safety-critical real-time application on such an MPSoC, the NoC must fulfill certain requirements: no critical data may be lost, all critical data must be delivered within a certain deadline, and other applications must not interfere with the critical application. All of this must be guaranteed even if a fault occurs in the NoC.
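
The non-interference requirement can be illustrated with a toy TDM schedule. This is only a minimal sketch under stated assumptions, not the NoC used at the chair: the slot-table format, the link names and the function `find_conflicts` are hypothetical. Each connection owns a start slot and a path of links, and a flit injected in slot s is assumed to occupy the k-th link of its path in slot (s + k) mod period. If no link is claimed twice in the same slot, the streams cannot delay each other.

```python
from collections import defaultdict

def find_conflicts(connections, period):
    """Return (link, slot, owners) for every link claimed twice in one slot.

    connections: dict name -> (start_slot, [link, ...]); a hypothetical format.
    A flit injected in start_slot uses the k-th link of its path in slot
    (start_slot + k) % period, i.e. it advances one hop per slot.
    """
    usage = defaultdict(list)  # (link, slot) -> names of connections using it
    for name, (start_slot, path) in connections.items():
        for hop, link in enumerate(path):
            usage[(link, (start_slot + hop) % period)].append(name)
    return [(link, slot, owners)
            for (link, slot), owners in sorted(usage.items())
            if len(owners) > 1]

# Two streams share the link "R1->R2" in a schedule with 4 slots.
good_schedule = {
    "critical": (0, ["NI0->R0", "R0->R1", "R1->R2"]),
    "best_effort": (2, ["NI3->R1", "R1->R2"]),
}
# Same paths, but the second stream starts one slot earlier and collides.
bad_schedule = {
    "critical": (0, ["NI0->R0", "R0->R1", "R1->R2"]),
    "best_effort": (1, ["NI3->R1", "R1->R2"]),
}

print(find_conflicts(good_schedule, period=4))
print(find_conflicts(bad_schedule, period=4))
```

A check of this kind would typically run when the schedule is computed, so that the hardware only ever receives contention-free slot tables.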

Goal

The goal of this thesis is to implement a Network Interface for a Time-Division Multiplexed NoC that meets the criteria described above and create tests to validate the behavior of the implemented hardware.

Prerequisites

To successfully complete this project, you should already have the following skills and experience:

  • Very good programming skills in a hardware description language, e.g. VHDL or (System)Verilog
  • Solid Python programming skills
  • At least basic knowledge of the functionality of NoCs
  • Self-motivated and structured work style

Learning Objectives

By completing this project, you will be able to

  • understand the concept of TDM NoCs
  • design and implement a complex hardware module in SystemVerilog
  • create tests to validate hardware modules
  • document your work in the form of a scientific report and a presentation

Contact

Max Koenen
Room N2118
Tel. 089 289 23084
max.koenen@tum.de

Supervisor:

------

Extensions & Performance Benchmarks of a CAPI-based Network Interface Card

Description

With ever-increasing network data rates, the data transfer between network interface card (NIC) and the host system has a decisive impact on the achievable application performance. To fully exploit the host system’s CPU capacity for application processing, it is important to minimize I/O processing overheads. In this project, we want to extend the implementation and optimize the performance of an FPGA-based NIC that is connected to the host system with the Coherent Accelerator Processor Interface (CAPI) [1] for IBM POWER8 Systems.

In a previous project an initial implementation of the CAPI-based NIC was developed using the CAPI Storage, Network and Analytics Programming (SNAP) framework [2]. The goal of this project is to integrate the physical network interfaces in the design, as well as to identify and mitigate performance bottlenecks.

[1] https://developer.ibm.com/linuxonpower/capi/

[2] https://openpowerfoundation.org/blogs/capi-snap-simple-developers

Towards this goal you will complete the following tasks:

  • Analyze source code and working principles of the existing NIC implementation
  • Get familiar with CAPI and the CAPI SNAP framework
  • Integrate an Ethernet Media Access Controller (MAC) IP core into the FPGA design
  • Benchmark throughput and latency of FPGA-to-host communication through simulations and measurements
  • Identify performance bottlenecks, propose and implement improvements
  • Extend the design to make use of multiple RX/TX queues for multi-core processing

Prerequisites

To successfully complete this project, you should already have several of the following skills and experiences:

  • Knowledge of a hardware description language such as Verilog and/or VHDL
  • Hands-on FPGA development experience
  • Solid C programming skills
  • Proficiency using Linux
  • Self-motivated and structured work style

Learning Objectives

By completing this project, you will be able to

  • understand the basic working principles of NICs, as well as FPGA-host communication mechanisms
  • apply your theoretical knowledge to an implementation consisting of both hard- and software parts
  • document work in a scientific report form and in a presentation

Contact

Andreas Oeldemann
Room N2137
Tel. 089 289 22962
andreas.oeldemann@tum.de

The thesis is carried out in cooperation with

Power Systems Acceleration Department
IBM Systems – HW Development Böblingen
IBM Deutschland R&D GmbH

Supervisor:

-----

Design and Implementation of a Hardware Managed Queue

Description

Queues are a central element of operating systems and of application control flow in general.

This project is part of a hardware-software co-design effort.

Goal

The goal of this project is to develop a hardware-managed queue for a NoC-based multiprocessor platform.
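
As a rough behavioural reference for such a queue, the following sketch models a fixed-depth FIFO with head and tail pointers, the structure a hardware queue manager would typically keep in registers or SRAM. The class name `RingQueue` and its interface are assumptions made for illustration and do not describe the actual design of this project.

```python
class RingQueue:
    """Behavioural model of a fixed-depth FIFO with head/tail pointers."""

    def __init__(self, depth):
        self.slots = [None] * depth
        self.head = 0   # index of the next element to dequeue
        self.tail = 0   # index of the next free slot
        self.count = 0  # number of stored elements

    def enqueue(self, item):
        """Store item; return False (back-pressure) when the queue is full."""
        if self.count == len(self.slots):
            return False
        self.slots[self.tail] = item
        self.tail = (self.tail + 1) % len(self.slots)
        self.count += 1
        return True

    def dequeue(self):
        """Return the oldest item, or None when the queue is empty."""
        if self.count == 0:
            return None
        item = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return item


queue = RingQueue(depth=2)
queue.enqueue("task A")
queue.enqueue("task B")
print(queue.enqueue("task C"))  # the queue is full, so this is rejected
print(queue.dequeue())          # the oldest entry leaves first
```

A software model like this can later serve as a golden reference when testing the hardware implementation.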

Prerequisites

To successfully complete this project, you should already have the following skills and experience:

  • Very good programming skills in VHDL
  • Good comprehension of complex systems
  • Good knowledge of hardware development
  • Very good knowledge of digital circuit design

Contact

Sven Rheindt, Room: N2140, Phone +49.89.289.28387, sven.rheindt@tum.de

Supervisor:

----

Multi-core Interference Channel Analysis (at GE Aviation)

Description

This thesis is offered by General Electric Aviation and supervised at TUM LIS.

About GE Aviation

GE Aviation Munich is an R&D center of excellence located in the heart of southern Germany, on the Garching campus of the Technical University of Munich. This gives our engineers a unique blend: a university setting combined with research and development in a world-class industrial environment dedicated to bringing innovative technologies to market. Within the R&D community, the center maintains close partnerships with numerous universities, research institutions and technology companies in Germany and abroad.

Role summary

The role of the student will be to conduct an interference channel analysis to address potential safety challenges of modern multi-core architectures.

Responsibilities / Goals

GE Aviation is investigating the use of modern multi-core architectures. You will characterize the interference channels of two different multi-core architectures (NXP T1040 and Xilinx Zynq UltraScale+). The former is a quad-core PowerPC built around the e5500 core, the latter a quad-core ARM built around the Cortex-A53 core.
In your role you will:

  • Investigate domain-specific literature (CAST-32A), which will give you a guideline and direction
  • Identify interference channels by using the specifications of both architectures mentioned above
  • Perform a state-of-the-art search of existing test suites that help to exercise and identify interference channels
  • Characterize the interference-channel analysis capability and the result granularity of each identified test suite
  • Implement a test suite based on the existing ones and successfully run it on both architectures in the lab

Expected Qualifications

  • Good C/C++ skills
  • Good understanding of real-time operating systems (e.g. RTLinux, FreeRTOS, Wind River VxWorks) and MPSoCs
  • Fluency in German and English
  • Experience in the use of real-time operating systems is a plus
  • Self-motivated, structured work style and good communication skills

Contact

Supervisor at GE Aviation: Alexander Walsch

Online application form

Supervisor:

------

Static WCET Analysis for Multi-Core Systems

Description

Knowing the worst-case execution time (WCET) is indispensable when developing real-time systems. Several methods exist to bound the WCET on single-core platforms. When multiple tasks run simultaneously on a multi-core platform, however, these methods no longer provide a reliable estimate. The goal of this seminar is to summarize the major problems that arise when analysing multi-core applications and some methods to solve them.
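
To make the single-core case concrete, one common textbook formulation bounds the WCET by the longest path through an acyclic control-flow graph in which every basic block carries an upper bound on its execution time. The sketch below is only an illustration under stated assumptions: the graph, the cycle counts and the 4-cycle bus penalty are hypothetical, and loops are assumed to be already unrolled to their bounds.

```python
from functools import lru_cache

def wcet_bound(cfg, cost, entry):
    """Longest-path bound over an acyclic control-flow graph.

    cfg:  dict block -> list of successor blocks (must contain no cycles,
          i.e. loops are assumed to be already unrolled to their bounds)
    cost: dict block -> upper bound on the block's execution time in cycles
    """
    @lru_cache(maxsize=None)
    def longest(block):
        successors = cfg.get(block, [])
        return cost[block] + max((longest(s) for s in successors), default=0)

    return longest(entry)

# Hypothetical if/else program with per-block cycle bounds.
cfg = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
cost = {"entry": 5, "then": 20, "else": 8, "exit": 3}
single_core = wcet_bound(cfg, cost, "entry")

# Hypothetical multi-core effect: every block may stall up to 4 extra
# cycles on a shared bus, so each per-block bound has to be inflated.
inflated = {block: cycles + 4 for block, cycles in cost.items()}
multi_core = wcet_bound(cfg, inflated, "entry")

print(single_core, multi_core)
```

The graph search itself stays the same on a multi-core platform. What becomes hard is justifying the per-block bounds, because they now depend on what the other cores are doing.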

Contact

Dirk Gabriel
Room N2117
Tel. 089 289 28578
dirk.gabriel@tum.de

Supervisor:

---

Application Profiling for Near Memory Computing

Description

* Image Source: http://www.layer7.co.za/app_profiling.html

Hitting a wall is not a pleasant thing. Computer systems have faced many walls in the last decades. Having broken through the memory wall in the mid-90s and the power wall in 2004, they now face the next crucial barrier to scalability. Although systems can be scaled to hundreds or thousands of cores through NoCs, performance does not scale because of data-to-task dislocality. We now face the locality wall.

The newest trend to tackle this issue is data-task migration and processing in or near memory.

Goal

The goal of this project is to profile applications in the context of Near Memory Computing and to identify useful functions or primitives that could be accelerated.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences.

  • Very good programming skills in C/C++
  • Good programming skills in SystemC
  • Very good analytical thinking and understanding of complex problems
  • Good knowledge about digital circuit design
  • Very good knowledge in the field of Near Memory Computing

Contact

Sven Rheindt, Room: N2140, Phone +49.89.289.28387, sven.rheindt@tum.de

Supervisor:

---

FPGA Prototyping a Bus Front-End for Near Memory Accelerators

Description

Hitting a wall is not a pleasant thing. Computer systems have faced many walls in the last decades. Having broken through the memory wall in the mid-90s and the power wall in 2004, they now face the next crucial barrier to scalability. Although systems can be scaled to hundreds or thousands of cores through NoCs, performance does not scale because of data-to-task dislocality. We now face the locality wall.

The newest trend to tackle this issue is data-task migration and processing in or near memory.

Goal

The goal of this project is to develop a bus front-end for near memory operations on an FPGA prototype.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences.

  • Very good programming skills in VHDL
  • Good comprehension of complex systems
  • Good knowledge of hardware development
  • Very good knowledge of digital circuit design

Contact

Sven Rheindt, Room: N2140, Phone +49.89.289.28387, sven.rheindt@tum.de

Supervisor:

---

FPGA Prototyping a Memory Back-End for Near Memory Accelerators

Description

Hitting a wall is not a pleasant thing. Computer systems have faced many walls in the last decades. Having broken through the memory wall in the mid-90s and the power wall in 2004, they now face the next crucial barrier to scalability. Although systems can be scaled to hundreds or thousands of cores through NoCs, performance does not scale because of data-to-task dislocality. We now face the locality wall.

The newest trend to tackle this issue is data-task migration and processing in or near memory.

Goal

The goal of this project is to develop a memory back-end for near memory operations on an FPGA prototype.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences.

  • Very good programming skills in VHDL
  • Good comprehension of complex systems
  • Good knowledge of hardware development
  • Very good knowledge of digital circuit design

Contact

Sven Rheindt, Room: N2140, Phone +49.89.289.28387, sven.rheindt@tum.de

Supervisor:

------

Frequency Optimization of an FPGA Prototype

Description

Our NoC-based many-core design is implemented on multiple Xilinx Virtex-7 FPGAs. Its clock frequency is currently limited by individual components.

Goal

The goal of this work is to optimize the overall frequency of an FPGA design.

This work includes:

  • Identification of the critical paths of the design
  • Pipelining the design to reach higher frequencies
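
The relationship between the two steps above can be sketched with simple arithmetic: the slowest register-to-register path sets the maximum clock frequency, and splitting that path with a pipeline register shortens it at the cost of one extra cycle of latency. The path names, delays and register overhead below are hypothetical values for illustration, not measurements of the actual design.

```python
def fmax_mhz(path_delays_ns):
    """Maximum clock frequency in MHz, limited by the slowest path."""
    return 1000.0 / max(path_delays_ns.values())

def pipelined_delay_ns(delay_ns, stages, reg_overhead_ns):
    """Per-stage delay if a path is split into equally long stages.

    reg_overhead_ns models the setup and clock-to-output time of the
    inserted register; perfectly balanced stages are an idealization.
    """
    return delay_ns / stages + reg_overhead_ns

# Hypothetical register-to-register delays in nanoseconds.
paths = {"router.crossbar": 4.0, "ni.packetizer": 10.0, "core.alu": 5.0}

critical = max(paths, key=paths.get)
before = fmax_mhz(paths)

# Split the critical path into two stages with 0.5 ns register overhead.
paths[critical] = pipelined_delay_ns(paths[critical], stages=2, reg_overhead_ns=0.5)
after = fmax_mhz(paths)

print(critical, before, round(after, 1))
```

After each pipelining step a different path may become critical, so in practice the two steps are repeated using the timing reports of the synthesis tool.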

Prerequisites

For this challenging task, several prerequisites should be met:

  • Very good knowledge of VHDL
  • Very good knowledge of the Xilinx Vivado Synthesis Tool
  • Very good experience with FPGA design
  • Very good knowledge about digital circuit design

Application

If you are interested, send me an email with your CV, your transcript of records and a summary of your experience attached.

Contact

Sven Rheindt

Room: N2140

Tel. 089 289 28387

sven.rheindt@tum.de

Supervisor:

---

Simulator Support for Dynamic Task Migration

Description

Hitting a wall is not a pleasant thing. Computer systems have faced many walls in the last decades. Having broken through the memory wall in the mid-90s and the power wall in 2004, they now face the next crucial barrier to scalability. Although systems can be scaled to hundreds or thousands of cores through NoCs, performance does not scale because of data-to-task dislocality. We now face the locality wall.

The newest trend to tackle this issue is data-task migration and processing in or near memory.

Goal

The goal of this project is to implement dynamic data migration into a trace-based simulator and to evaluate its potential.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences.

  • Very good programming skills in C++ or SystemC
  • Good comprehension of a complex system
  • Very good knowledge of hardware development

Contact

Sven Rheindt, Room: N2140, Phone +49.89.289.28387, sven.rheindt@tum.de

Supervisor:

------

Further Development of a Linux Client Deployment System with Puppet

Description

The Department of Electrical and Computer Engineering (Fakultät für Elektro- und Informationstechnik) provides a large number of Linux PCs for students and staff. A configuration management system ensures that all PCs have a uniform configuration that is kept up to date at all times. For these tasks we use the open-source tools Foreman (for the base installation) and Puppet (for configuration management). Your task as a working student is to support the adaptation and further development of this system. The specific tasks are assigned as needed; one task currently planned is, for example, the implementation of an automated test environment.
This job gives you the unique opportunity to sit at the "control lever of automation" as it is common in today's cloud environments. Your work will influence and improve the installation of hundreds of PCs. To carry out the task successfully, the following prerequisites are required:

  • very good Linux skills
  • practiced use of open-source tools such as git, scripting languages, etc.
  • interest in long-term employment
  • independent working style and the desire to familiarize yourself with new topics


Please briefly explain in your application why you are interested in the topic and what relevant prior knowledge you already have.

Supervisor:

-----

Hardware accelerated Image Fusion

Description

Automated driving systems require reliable information on the current environment in order to make proper decisions. Different sensor systems such as cameras, LIDAR and radar contribute to this information. To minimize the possibility of incorrect recognitions or undetected objects, the data provided by the different sensors must be exhaustively analyzed and compared with each other.

Such comparisons are only possible if the full surroundings are observed by each sensor system. As a single camera has a limited viewing angle, multiple cameras are placed at different positions around the vehicle to provide the required visual input.

Additionally, the processing time of the sensor inputs and the data fusion must stay within tight bounds to ensure low end-to-end reaction times. For the camera systems, this calls for a hardware-accelerated implementation in order to achieve the required processing time.

Goal

The major goal of the thesis is the selection and implementation of a suitable algorithm to combine multiple images provided by different cameras into one image. While the algorithm can be evaluated with a pure software version, e.g. using OpenCV, the final version should run on a Xilinx Zynq with suitable hardware accelerators implemented in the FPGA part.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

  • Knowledge of a hardware description language e.g. VHDL
  • Solid C programming skills
  • Hands-on FPGA development experience, preferably using Xilinx Vivado
  • Self-motivated and structured work style

Contact

Dirk Gabriel
Room N2117
Tel. 089 289 28578
dirk.gabriel@tum.de

Supervisor:

Ongoing Theses

Bachelor's Theses

Implementation and Evaluation of a UART Controller

Description

UART is still the dominant way to interact with an embedded system. To reduce the number of I/O pins, it is beneficial to tunnel UART over the existing debug interface.

Goal

The goal of this thesis is to implement the well-known 16550 UART controller in hardware as part of the Open SoC Debug project.
Towards this goal you’ll complete the following tasks:

  • Learn about the standard 16550 UART chip and its interfaces.
  • Implement the interfaces in hardware running on an FPGA.
  • Implement a virtual tty interface as part of the Open SoC Debug software implementation.
  • Document your work in a written report and present it in a presentation.
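
One detail of the 16550 register interface that such an implementation has to reproduce is the baud rate generator: the chip samples each bit 16 times, and the 16-bit divisor is programmed through the two divisor latch registers DLL (low byte) and DLM (high byte). The sketch below computes these values. The function names are hypothetical, and the 1.8432 MHz input clock is the classic PC value used here only as an example.

```python
def divisor_latch(clock_hz, baud):
    """Return (DLM, DLL) for a 16550-style baud rate generator.

    The 16550 samples each bit 16 times, so divisor = clock / (16 * baud).
    """
    divisor = round(clock_hz / (16 * baud))
    if not 1 <= divisor <= 0xFFFF:
        raise ValueError("baud rate not reachable with a 16-bit divisor")
    return divisor >> 8, divisor & 0xFF

def actual_baud(clock_hz, dlm, dll):
    """Baud rate that results from the programmed divisor latch values."""
    return clock_hz / (16 * ((dlm << 8) | dll))

CLOCK_HZ = 1_843_200  # example input clock, an assumption for this sketch

dlm, dll = divisor_latch(CLOCK_HZ, 9600)
print(dlm, dll, actual_baud(CLOCK_HZ, dlm, dll))
```

With other input clocks the divisor is usually not an exact integer, so comparing `actual_baud` against the requested rate shows how large the resulting baud rate error is.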

 

Prerequisites

To successfully complete this project, you should already have the following skills and experience:

  • Experience with a hardware description language (VHDL or Verilog)
  • Ideally, experience with hardware design and I/O interfaces
  • Self-motivated and structured work style

Supervisor:

Master's Theses

Design and Implementation of a Fault-Tolerant Low-Throughput Broadcast Control & Management Network for System on Chip

Description

Enabled by ever-decreasing structure sizes, modern Systems on Chip (SoCs) integrate a large number of different processing elements, making them Multi-Processor Systems on Chip (MPSoCs). These processing elements require a communication infrastructure to exchange data with each other and with shared resources such as memory and I/O ports. The limited scalability of bus-based solutions has led to a paradigm shift towards Networks on Chip (NoCs), which allow multiple data streams between different nodes to be exchanged in parallel.
One way of organizing access to such a NoC is Time-Division Multiplexing (TDM), which makes it possible to give service guarantees. However, such a TDM NoC must be configured before it can be used, which requires a reliable configuration network.

Goal

The goal of this thesis is to implement a reliable broadcast configuration network that can be used to configure the routers and network interfaces of a TDM NoC and to create tests to validate the implemented hardware.

Prerequisites

To successfully complete this project, you should already have the following skills and experience:

  • Good programming skills in a hardware description language, e.g. VHDL or (System)Verilog
  • Good knowledge of on-chip communication
  • Solid Python programming skills
  • At least basic knowledge of the functionality of NoCs
  • Self-motivated and structured work style

Learning Objectives

By completing this project, you will be able to

  • understand the concept of TDM NoCs
  • create and extend hardware modules in SystemVerilog
  • create tests to validate hardware modules
  • document your work in the form of a scientific report and a presentation

Contact

Max Koenen
Room N2118
Tel. 089 289 23084
max.koenen@tum.de

Supervisor:

Research Internship (Forschungspraxis) or MSCE Research Internship

Implementation of Fault-Injection & Fault-Detection Mechanisms in a Time-Division Multiplexed Network on Chip

Description

Enabled by ever-decreasing structure sizes, modern Systems on Chip (SoCs) integrate a large number of different processing elements, making them Multi-Processor Systems on Chip (MPSoCs). These processing elements require a communication infrastructure to exchange data with each other and with shared resources such as memory and I/O ports. The limited scalability of bus-based solutions has led to a paradigm shift towards Networks on Chip (NoCs), which allow multiple data streams between different nodes to be exchanged in parallel.
To implement safety-critical real-time applications on such an MPSoC, the NoC must be fault-tolerant. To fulfill this requirement, a fault in the system must first be detected. Furthermore, to test this requirement, it must be possible to inject errors into the system at random times and places.
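
The interplay of injection and detection can be sketched in a few lines. The example below flips one randomly chosen bit of a data word and checks whether a single parity bit reveals the corruption. Parity is used here only as the simplest possible detector and is an assumption of this sketch, as are the function names and the 8-bit word. It says nothing about which mechanisms the thesis will actually use.

```python
import random

def parity(word):
    """Even-parity bit of a word: 1 if the number of set bits is odd."""
    return bin(word).count("1") & 1

def inject_bit_flip(word, width, rng):
    """Flip one randomly chosen bit of a width-bit word."""
    return word ^ (1 << rng.randrange(width))

rng = random.Random(42)  # fixed seed keeps the experiment repeatable
flit = 0b1011_0010       # hypothetical 8-bit payload
sent_parity = parity(flit)

corrupted = inject_bit_flip(flit, width=8, rng=rng)
fault_detected = parity(corrupted) != sent_parity
print(fault_detected)

# Two flipped bits cancel out in the parity and go unnoticed.
double_fault = flit ^ 0b0000_0011
print(parity(double_fault) != sent_parity)
```

A single flipped bit always changes the parity, whereas an even number of flips does not. This is one reason why injection campaigns vary the number, time and location of faults.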

Goal

The goal of this thesis is to implement a fault-injection and a fault-detection mechanism in a Time-Division Multiplexed (TDM) NoC and to create tests to validate the behavior of the hardware models.

Prerequisites

To successfully complete this project, you should already have the following skills and experience:

  • At least basic programming skills in a hardware description language, e.g. VHDL or (System)Verilog
  • Solid Python programming skills
  • At least basic knowledge of the functionality of NoCs
  • Self-motivated and structured work style

Learning Objectives

By completing this project, you will be able to

  • understand the concept of TDM NoCs
  • understand the concept of fault-detection in hardware
  • create and extend hardware modules in SystemVerilog
  • create tests to validate hardware modules
  • document your work in the form of a scientific report and a presentation

Contact

Max Koenen
Room N2118
Tel. 089 289 23084
max.koenen@tum.de

Supervisor: