Sebastian joined the CALGO Lab team as a research assistant in 2020. He graduated with a Master’s degree in Data Science from the Beuth University of Applied Sciences in Berlin in 2020. During his studies, he worked as a Software and Data Engineer and was responsible for, among other things, implementing a Kubernetes-based data science platform and a metadata management system for quality assessment and evaluation of 3D mass spectroscopy data. In his master’s thesis, he evaluated compression techniques for BERT that reduce its size while maintaining performance.

Currently, Sebastian is part of the GCA project, funded by the Federal Ministry for the Environment, Nature Conservation, and Nuclear Safety based on a decision of the German Bundestag. The goal of the GCA project is to implement an online assistant that encourages and helps users to choose more sustainable options while shopping online. This requires a database that contains sustainability information on a product-by-product basis, known as GreenDB. This research database is publicly available on Zenodo.

Sebastian’s research focuses on data quality problems and how to fix them in order to increase the efficiency of downstream machine learning applications. To this end, he envisions a data cleaning system that automatically detects and fixes data errors such as missing values, outliers, and other inconsistencies.



A list of Sebastian’s publications is available on Google Scholar.


Presentations and Articles

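  • Skalieren Von Deep Learning Frameworks Mit Hilfe Von Cloudinfrastrukturen Und Kubernetes (Scaling Deep Learning Frameworks Using Cloud Infrastructure and Kubernetes)
  • Machine Learning Im Kubernetes Cluster (Machine Learning in the Kubernetes Cluster)
  • Mit Metadatenmanagement hin zu reproduzierbaren und flexiblen Data-Science-Workflows auf Kubernetes (Using Metadata Management Toward Reproducible and Flexible Data Science Workflows on Kubernetes)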



E-Mail: sebastian.jaeger (at)

Twitter: @se_jaeger