Mathematics and Machine Learning Internship (Summer 2025)

  • Organizer: Diaaeldin Taha (taha [at] mis [dot] mpg [dot] de)
  • Time and Location: Two internship-wide organizational meetings will take place at MPI MIS, room A3 01, Thursdays 11:15 - 12:45. Instructions for accessing A3 01 are available here; the doors open automatically 15 minutes before each meeting. Subsequent meetings will be arranged individually with each project mentor.
  • Moodle: TBA
  • Module Description: Available here.
  • Course Plan Entry: MPI.MaML
  • Study Programs:
    • B.Sc. Informatik 6. Semester [Kernmodul]
    • B.Sc. Mathematik [Projektpraktikum]
    • M.Sc. Data Science 2. Semester [Wahlpflichtbereich Datenanalyse]
    • M.Sc. Informatik 2. Semester [Kernmodul]
    • Diplom Mathematik [Seminarschein]

This is the first iteration of the “Mathematics and Machine Learning Praktikum,” organized jointly by the Max Planck Institute for Mathematics in the Sciences (MPI MIS), the Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), and Leipzig University. The internship involves designing, analyzing, and implementing algorithms and models at the intersection of mathematics and machine learning. Several projects are offered, each worked on by a small group of 1–5 participants.

The deliverables include:

  • A written report detailing the research work.
  • A complete, working, and well-documented code repository (or equivalent artifact) that demonstrates the hands-on work performed during the internship.
  • Two 10-minute group presentations: one mid-semester to provide an update on progress and outline next steps, and one at the end of the semester to present final results. Specific dates for these presentations will be arranged together with the participants.

For more details, refer to the module description.

  • Organizational meeting 1/2: MPI MIS A3 01, Thu 10.04.2025, 11:15 - 12:45 (tentative)
  • Organizational meeting 2/2: MPI MIS A3 01, Thu 17.04.2025, 11:15 - 12:45 (tentative)
  • Mid-semester presentations: TBA
  • End-of-semester presentations: TBA
  • Deadline for submitting deliverables: TBA

This list may be updated with more projects before the organizational meeting. Participants who want to propose projects that fit the scope of the internship can contact the organizer no later than the first organizational meeting.

Project 1: AI-Assisted Mathematical Discovery

Mentor: Diaaeldin Taha

Members: TBA

Description: Mathematics, as a fundamentally creative human endeavor, involves formulating conjectures, testing them experimentally through computation, and gaining insights into patterns, potential proofs, or counterexamples. Inspired by recent advancements in AI-assisted mathematical discovery, this project investigates how artificial intelligence techniques, such as reinforcement learning, generative models, and symbolic computation, can systematically support mathematicians in identifying new conjectures, verifying mathematical patterns, or finding explicit counterexamples. Participants will learn how to translate mathematical problems into computational tasks, implement suitable AI methods, interpret results, and potentially contribute original results to open mathematical questions.

Prerequisites:

  • Basic familiarity with machine learning or symbolic computation is beneficial but not mandatory.
  • Interest in mathematical reasoning and/or experimental mathematics.

References:

  • Davies, Alex, et al. “Advancing mathematics by guiding human intuition with AI.” Nature 600.7887 (2021): 70-74.
  • Wagner, Adam Zsolt. “Constructions in combinatorics via neural network.” arXiv preprint arXiv:2104.14516 (2021).

Project 2: Causal Deep Learning

Mentor: Diaaeldin Taha

Members: TBA

Description: Whereas correlation is concerned with patterns between variables in data, causation is concerned with how changes in one variable influence another. While correlations can be learned directly from data, uncovering causal structure often requires subtle assumptions and careful reasoning. Causal deep learning aims to offer tools to navigate these challenges by combining data-driven deep models with causal discovery and effect estimation. In this project, participants will become familiar with the basics of causal deep learning, implement selected methods from recent literature, and gain hands-on experience reasoning about cause and effect in data.

Prerequisites:

  • Familiarity with machine learning or deep learning fundamentals is helpful but not mandatory.
  • Interest in learning about causality and its application in deep learning.

References:

  • Berrevoets, Jeroen, et al. “Causal deep learning.” arXiv preprint arXiv:2303.02186 (2023).
  • Kaddour, Jean, et al. “Causal machine learning: A survey and open problems.” arXiv preprint arXiv:2206.15475 (2022).

Project 3: The Geometry of Loss Landscapes

Mentor: Diaaeldin Taha

Members: TBA

Description: Modern machine learning models are trained by optimizing highly non-convex loss functions, yet in practice, simple gradient-based methods often work remarkably well. This project investigates the geometry of these optimization landscapes: how structure, symmetry, and overparameterization shape the behavior of gradient descent. Participants will explore recent theoretical and empirical work connecting optimization dynamics to generalization and model performance. The goal is to implement simple model families, visualize their loss surfaces, and analyze how different training regimes (e.g., width, initialization, learning rate) interact with the landscape geometry.

Prerequisites:

  • Familiarity with gradient descent and basic optimization theory.
  • Curiosity about the interplay between learning dynamics and mathematical structure.

References:

  • Li, Hao, et al. “Visualizing the Loss Landscape of Neural Nets.” NeurIPS (2018).
  • Sagun, Levent, et al. “Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond.” arXiv preprint arXiv:1611.07476 (2016).
  • Chizat, Lénaïc, and Francis Bach. “On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport.” NeurIPS (2018).
  • Fort, Stanislav, et al. “Deep Learning versus Kernel Learning: an Empirical Study of Loss Landscape Geometry and the Time Evolution of the Neural Tangent Kernel.” NeurIPS (2020).

Project 4: Machine Learning on Non-Euclidean Spaces

Mentor: Diaaeldin Taha

Members: TBA

Description: In many real-world applications, data lie on non-Euclidean spaces, such as spheres, tori, or hyperbolic surfaces, rather than in flat, high-dimensional vector spaces. This project explores how to model data that live on such spaces. Participants will learn how to implement models that respect or exploit the underlying structure (e.g., Riemannian gradient descent, manifold-aware neural networks, or geodesic convolution). Depending on interest, the project can lean toward visualization or toward other machine learning tasks.

Prerequisites:

  • Some exposure to differential geometry or calculus of curves and surfaces is a bonus.
  • Interest in geometry, optimization, or non-Euclidean ML.

References:

  • Bécigneul, Gary, and Octavian-Eugen Ganea. “Riemannian adaptive optimization methods.” ICLR (2019).
  • Bronstein, Michael M., et al. “Geometric deep learning: Grids, groups, graphs, geodesics, and gauges.” arXiv preprint arXiv:2104.13478 (2021).
  • Sanborn, Sophia, et al. “Beyond Euclid: An illustrated guide to modern machine learning with geometric, topological, and algebraic structures.” arXiv preprint arXiv:2407.09468 (2024).