Master's Theses

Compressed Representations of Convolutional Neural Networks for Data Movement Optimization

Description

Convolutional Neural Networks (CNNs) have become the state of the art in image classification and other computer vision tasks. This has led to a substantial effort from industry and academia to bring such neural networks to edge devices. With tight area, power, and latency constraints, this challenge presents many optimization opportunities.

Data movement optimization has been of high interest in the field of CNN accelerator design. Because data movement accounts for a significant portion of the system's total power consumption, researchers have used loop blocking methods and tailored dataflows to maximize the reuse of every piece of data read from a memory higher up in the hierarchy.

Goals

Smaller chunks of an entire neural network can be compressed to lower dimensionality. This representation can serve the purpose of reducing the size of transactions from off-chip memory to on-chip memory. Accessing off-chip memory can cost orders of magnitude more energy than on-chip memory requests.

The goal is to find the sweet spot between the loss in accuracy due to dimensionality reduction and the reduction in power consumption achieved by moving less data between levels of the memory hierarchy.
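
To make the trade-off concrete, the following minimal sketch estimates how much off-chip traffic a compressed layer could save. It assumes low-rank factorization as the compression method, which is only one possible choice and is not prescribed by this project. The function names and the default energy-per-byte value are hypothetical placeholders, and the model ignores both the on-chip cost of expanding the weights and the resulting loss in accuracy.

```python
def dense_params(rows, cols):
    """Number of values in an uncompressed rows x cols weight matrix."""
    return rows * cols


def low_rank_params(rows, cols, rank):
    """Number of values in a rank-`rank` factorization (rows x rank and rank x cols)."""
    return rank * (rows + cols)


def offchip_energy(num_values, bytes_per_value=4, energy_per_byte=1.0):
    """Energy to fetch the values from off-chip memory.

    energy_per_byte is a placeholder in arbitrary units, not a measured figure.
    """
    return num_values * bytes_per_value * energy_per_byte


def energy_saving(rows, cols, rank):
    """Fraction of off-chip transfer energy saved by fetching the factors instead."""
    full = offchip_energy(dense_params(rows, cols))
    compressed = offchip_energy(low_rank_params(rows, cols, rank))
    return 1.0 - compressed / full


if __name__ == "__main__":
    for rank in (8, 32, 128):
        print(rank, energy_saving(512, 512, rank))
```

For a 512 x 512 layer at rank 32, the two factors hold one eighth of the original values, so this simple model predicts 87.5% less off-chip transfer energy. Whether the accuracy at that rank is acceptable is exactly the question the thesis would have to answer.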

Prerequisites

To successfully complete this project, you should have the following skills and experiences:

  • Very good programming skills in Python and TensorFlow
  • Good knowledge of neural networks, particularly convolutional neural networks

The student is expected to be highly motivated and independent.

Learning Objectives

By completing this project, you will be able to:

  • Find accurate, compressed representations of neural networks
  • Analyze the effects of data movement on energy efficiency
  • Test and evaluate compression and expansion methods
  • Present your work in the form of a scientific report

 

Contact

Nael Fasfous
Department of Electrical and Computer Engineering
Chair of Integrated Systems
Arcisstr. 21
80333 Munich
Germany

Phone: +49.89.289.23858
Building: N1 (Theresienstr. 90)
Room: N2116
Email: nael.fasfous@tum.de

This project is in cooperation with BMW AG.

Supervisor:

Nael Al-Fasfous

Design of a RISC-V MPSoC with Autonomic Layer (IPF)

Keywords:
RISC-V, MPSoC, LCT, Machine Learning, VHDL

Description

Today's Multi-Processor Systems-on-Chip (MPSoCs) are becoming increasingly complex due to the growing number of cores and accelerators. As a result, it is no longer possible to set runtime parameters such as frequency and task distribution optimally at design time. Future controllers therefore try to use machine learning that is aware of the system's current state (self-awareness).

Information Processing Factory (IPF) is a global project that aims to show self-awareness across multiple abstraction levels. It represents a paradigm shift in platform design by envisioning the move towards a consistent platform-centric design in which the combination of self-organized learning and formal reactive methods guarantees the applicability of such cyber-physical systems in safety-critical and high-availability applications.

At TUM, we explore the application and implementation of machine learning algorithms in hardware to optimize the mode of operation of MPSoCs at runtime. 

Currently we use a Leon3 setup, but more and more recent research on SoC design, including CPUs, is done using the open RISC-V instruction set. As a result, the community and the number of available tools are constantly growing, which is a significant advantage over other platforms such as SPARC V8.

For this reason we want to port our current Leon3 design to this new platform. 

Towards this goal, the following tasks need to be completed in this master's thesis:

1. Several open-source implementations of RISC-V cores exist; these need to be compared in a first step.
2. Afterwards, a first MPSoC needs to be synthesized using the most promising implementation. A first goal on this MPSoC is to run a simple ‘Hello World’ using all cores.
3. The final step is to attach our autonomic machine learning layer to this MPSoC in order to obtain a self-aware system that performs DVFS and task migration autonomously.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

• Good VHDL and Verilog skills
• Basic knowledge of RISC-V specifics
• Good understanding of MPSoCs
• Self-motivated and structured work style
• Optional: basic knowledge of machine learning

Contact

Florian Maurer
Chair of Integrated Systems 
Arcisstrasse 21,
80333 Munich
Germany

Tel. +49 89 289 23870 
flo.maurer@tum.de

www.lis.ei.tum.de

Supervisor:

Florian Maurer

Development of a Concept to Enable Access to Heavily Shared Resources in an MPSoC Featuring a Hybrid NoC

Description

Enabled by ever-decreasing structure sizes, modern Systems on Chip (SoCs) integrate a large number of different processing elements, making them Multi-Processor Systems on Chip (MPSoCs). These processing elements require a communication infrastructure to exchange data with each other and with shared resources such as memory and I/O ports. The limited scalability of bus-based solutions has led to a paradigm shift towards Networks on Chip (NoCs), which allow multiple data streams between different nodes to be exchanged in parallel.
One way of organizing access to such a NoC is Time-Division Multiplexing (TDM), which makes it possible to give service guarantees. However, in a TDM NoC the number of parallel accesses to a resource is limited, which is problematic for heavily shared resources such as memory and I/O.
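
The limitation can be illustrated with a minimal sketch, assuming a simplified model in which the link into a shared resource has a fixed number of TDM slots per period and each connection reserves whole slots. The slot-table layout, the `allocate` helper, and the period of four slots are hypothetical and do not describe the actual NoC.

```python
def allocate(slot_table, conn_id, slots_needed=1):
    """Reserve free slots on one link for a connection.

    slot_table is a list with one entry per TDM slot; None marks a free slot.
    Returns True if the reservation succeeded, False if too few slots are free.
    """
    free = [i for i, owner in enumerate(slot_table) if owner is None]
    if len(free) < slots_needed:
        return False
    for i in free[:slots_needed]:
        slot_table[i] = conn_id
    return True


PERIOD = 4  # hypothetical number of slots per TDM period
memory_link = [None] * PERIOD  # slot table of the link into the shared memory

# Six cores each request one guaranteed slot towards the same memory.
results = [allocate(memory_link, f"core{i}") for i in range(6)]

print(results)      # the first four requests succeed, the last two are rejected
print(memory_link)  # every slot is now owned by one connection
```

In this model the number of guaranteed connections into a resource can never exceed the number of slots in the period, so a heavily shared memory runs out of slots long before the rest of the network does.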

Goal

The goal of this thesis is to develop a concept to enable access to heavily shared resources for critical applications using TDM traffic in a hybrid TDM and packet-switched NoC.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

  • Good programming skills in a hardware description language, e.g. (System)Verilog or VHDL
  • Good knowledge of on-chip communication
  • Solid Python programming skills
  • At least basic knowledge of the functionality of NoCs
  • Self-motivated and structured work style

Learning Objectives

By completing this project, you will be able to

  • Understand the concept of TDM NoCs
  • Create and extend hardware modules in SystemVerilog
  • Create tests to validate hardware modules
  • Document your work in the form of a scientific report and a presentation

 

 

Contact

Max Koenen
Room N2118
Tel. 089 289 23084
max.koenen@tum.de

Supervisor:

Extending Region Based Cache Coherence to Global (DDR) Memory for Distributed Shared MPSoCs on an FPGA Prototype

Keywords:
Cache Coherence, Distributed Shared Memory MPSoCs

Short description:
The goal of this project is to extend RBCC to global memory with distributed directories.

Description

Providing hardware coherence for modern tile-based MPSoCs requires additional area. As a result, this does not scale with increasing tile counts. As part of the Invasive Computing project, we introduced Region Based Cache Coherence (RBCC), which is a scalable approach that provides on-demand coherence. RBCC enables users to dynamically create/destroy coherency regions based on application requirements. Currently, RBCC has been developed for the distributed tile-local memories of our system. The next step is to extend RBCC to the global memory, so as to fully utilize the memory capacity of our heterogeneous multicore architecture.

Towards this goal you’ll complete the following tasks:

  • Investigate existing distributed directory based cache coherence schemes

  • Extend RBCC to global DDR memory

  • Verify the design on an FPGA-based hardware platform

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

  • Very Good VHDL Skills

  • Good C/C++ Skills

  • Good understanding of MPSoCs and Cache Coherence Schemes

  • Self-motivated and structured work style

 

 

Contact

Akshay Srivatsa
Chair of Integrated Systems
Arcisstraße 21, 80333 Munich
Tel. +49 89 289 22963
srivatsa.akshay@tum.de
www.lis.ei.tum.de

Supervisor:

Assigned Topics

Master's Theses

Exploring the Dynamicity of Region Based Cache Coherence for Distributed Shared Memory MPSoCs on an FPGA Prototype

Keywords:
Cache Coherence, Distributed Shared Memory MPSoCs

Short description:
The goal of this project is to explore the dynamicity of RBCC and minimize the context switching penalties.

Description

Providing hardware coherence for modern tile-based MPSoCs requires additional area. As a result, this does not scale with increasing tile counts. As part of the Invasive Computing project, we introduced Region Based Cache Coherence (RBCC) which is a scalable approach that provides on-demand coherence. RBCC enables users to dynamically create/destroy coherency regions based on application requirements. With such dynamicity, the associated context switching overheads like cache flushing, directory flushing, coherency region reconfigurations, etc. need to be investigated and optimized.

Towards this goal you’ll complete the following tasks:
• Investigate existing directory based cache coherence schemes
• Implement/Modify a dynamic framework for RBCC
• Verify the design on an FPGA-based hardware platform

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:
• Very Good VHDL Skills
• Good C/C++ Skills
• Good understanding of MPSoCs and Cache Coherence Schemes
• Self-motivated and structured work style

Contact

Akshay Srivatsa
Chair of Integrated Systems
Arcisstraße 21, 80333 Munich
Tel. +49 89 289 22963
srivatsa.akshay@tum.de
www.lis.ei.tum.de

Supervisor:

Design and Implementation of a Fault-Tolerant Low-Throughput Broadcast Control & Management Network for System on Chip

Description

Enabled by ever-decreasing structure sizes, modern Systems on Chip (SoCs) integrate a large number of different processing elements, making them Multi-Processor Systems on Chip (MPSoCs). These processing elements require a communication infrastructure to exchange data with each other and with shared resources such as memory and I/O ports. The limited scalability of bus-based solutions has led to a paradigm shift towards Networks on Chip (NoCs), which allow multiple data streams between different nodes to be exchanged in parallel.
One way of organizing access to such a NoC is Time-Division Multiplexing (TDM), which makes it possible to give service guarantees. However, such a TDM NoC must be configured before it can be used, which requires a reliable configuration network.

Goal

The goal of this thesis is to implement a reliable broadcast configuration network that can be used to configure the routers and network interfaces of a TDM NoC and to create tests to validate the implemented hardware.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

  • Good programming skills in a hardware description language, e.g. VHDL or (System)Verilog
  • Good knowledge of on-chip communication
  • Solid Python programming skills
  • At least basic knowledge of the functionality of NoCs
  • Self-motivated and structured work style

Learning Objectives

By completing this project, you will be able to

  • understand the concept of TDM NoCs
  • create and extend hardware modules in SystemVerilog
  • create tests to validate hardware modules
  • document your work in the form of a scientific report and a presentation

 

Contact

Max Koenen
Room N2118
Tel. 089 289 23084
max.koenen@tum.de

Supervisor:

Design and Implementation of a Network Interface for a Fault-Tolerant Hybrid Network on Chip

Description

Enabled by ever-decreasing structure sizes, modern Systems on Chip (SoCs) integrate a large number of different processing elements, making them Multi-Processor Systems on Chip (MPSoCs). These processing elements require a communication infrastructure to exchange data with each other and with shared resources such as memory and I/O ports. The limited scalability of bus-based solutions has led to a paradigm shift towards Networks on Chip (NoCs), which allow multiple data streams between different nodes to be exchanged in parallel.
In order to implement a safety-critical real-time application on such an MPSoC, the NoC must fulfill certain requirements: it must ensure that no critical data gets lost, that all critical data is delivered within a certain deadline, and that other applications cannot interfere with the critical application. All of this must be guaranteed even if a fault occurs in the NoC.

Goal

The goal of this thesis is to implement a Network Interface for a hybrid Time-Division Multiplexed (TDM) and packet-switched NoC that provides protection switching for critical traffic and to create tests to validate the behavior of the implemented hardware.

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

  • Very good programming skills in a hardware description language, e.g. VHDL or (System)Verilog
  • Solid Python programming skills
  • At least basic knowledge of the functionality of NoCs
  • Self-motivated and structured work style

Learning Objectives

By completing this project, you will be able to

  • Understand the concept of TDM NoCs
  • Design and implement a complex hardware module in SystemVerilog
  • Create tests to validate hardware modules
  • Document your work in the form of a scientific report and a presentation

 

 

Contact

Max Koenen
Room N2118
Tel. 089 289 23084
max.koenen@tum.de

Supervisor:

Interference Channel Analysis (at GE Aviation)

Description

This position is offered by General Electric Aviation and supervised at TUM LIS.

About GE Aviation

GE Aviation Munich is an R&D center of excellence in the heart of southern Germany, on the Garching campus of the Technical University of Munich. This gives our engineers a unique blend: a university setting combined with research and development in a world-class industrial environment that is dedicated to bringing innovative technologies to market. Within the R&D community, the center maintains close partnerships with numerous universities, research institutions, and technology companies in Germany and abroad.

Role summary

GE Aviation is investigating the use of modern multi-core architectures. You will characterize the interference channels of two different multi-core architectures (NXP T1040 and Xilinx Zynq UltraScale+). The former is a quad-core PowerPC built around the e5500 core, the latter a quad-core ARM built around the Cortex-A53 core. This work can be done either as a student job or as your master's thesis.

Responsibilities

  • Enhance an existing bare-metal test suite
  • Develop a test plan
  • Characterize interference channels by investigating performance and determinism
  • Develop and implement mitigation concepts

Expected Qualifications

  • Good C/C++ Skills
  • Good understanding of MPSoCs and CPU architectures
  • Experience with embedded software development
  • Self-motivated, structured work style and good communication skills
  • Fluency in English
  • Good academic track record

Contact

Supervisor at GE Aviation: Alexander Walsch

Online application form

Supervisor: