
## Past Events

### AY 2021/2022: Lunch Seminars

October 20, 2021
Steven Brunton [University of Washington]
▦ Machine Learning for Scientific Discovery, with Examples in Fluid Mechanics ▦

This work describes how machine learning may be used to develop accurate and efficient nonlinear dynamical systems models for complex natural and engineered systems. We explore the sparse identification of nonlinear dynamics (SINDy) algorithm, which identifies a minimal dynamical system model that balances model complexity with accuracy, avoiding overfitting. This approach tends to promote models that are interpretable and generalizable, capturing the essential "physics" of the system. We also discuss the importance of learning effective coordinate systems in which the dynamics may be expected to be sparse. This sparse modeling approach will be demonstrated on a range of challenging modeling problems in fluid dynamics, and we will discuss how to incorporate these models into existing model-based control efforts. Because fluid dynamics is central to transportation, health, and defense systems, we will emphasize the importance of machine learning solutions that are interpretable, explainable, generalizable, and that respect known physics.

View Recorded Video

October 27, 2021
Thomas Yizhao Hou [Caltech]
▦ Potential singularity of 3D incompressible Euler equations and the nearly singular behavior of 3D Navier-Stokes equations ▦

Whether the 3D incompressible Euler and Navier-Stokes equations can develop a finite time singularity from smooth initial data is one of the most challenging problems in nonlinear PDEs. In an effort to provide a rigorous proof of the potential Euler singularity revealed by Luo-Hou's computation, we develop a novel method of analysis and prove that the original De Gregorio model and the Hou-Luo model develop a finite time singularity from smooth initial data. Using this framework and some techniques from Elgindi's recent work on the Euler singularity, we prove the finite time blowup of the 2D Boussinesq and 3D Euler equations with $C^{1,\alpha}$ initial velocity and boundary. Further, we present some new numerical evidence that the 3D incompressible Euler equations with smooth initial data develop a potential finite time singularity at the origin, which is quite different from the Luo-Hou scenario. Our study also shows that the 3D Navier-Stokes equations develop nearly singular solutions with maximum vorticity increasing by a factor of $10^7$. However, the viscous effect eventually dominates vortex stretching and the 3D Navier-Stokes equations narrowly escape finite time blowup. Finally, we present strong numerical evidence that the 3D Navier-Stokes equations with slowly decaying time-dependent viscosity develop a finite time singularity.

View Recorded Video

December 8, 2021
Arnulf Jentzen [University of Münster]
▦ On neural network approximations for partial differential equations and convergence analyses for gradient descent optimization methods ▦

In the first part of this talk we show that artificial neural networks (ANNs) with rectified linear unit (ReLU) activation have the fundamental capacity to overcome the curse of dimensionality in the numerical approximation of semilinear heat partial differential equations with Lipschitz continuous nonlinearities. In the second part of this talk we present recent convergence analysis results for gradient descent (GD) optimization methods in the training of ANNs with ReLU activation. Despite the great success of GD type optimization methods in numerical simulations for the training of ANNs with ReLU activation, it remains -- even in the simplest situation of the plain vanilla GD optimization method with random initializations -- an open problem to prove (or disprove) the conjecture that the risk of the GD optimization method converges in the training of ANNs with ReLU activation to zero as the width/depth of the ANNs, the number of independent random initializations, and the number of GD steps increase to infinity. In the second part of this talk we, in particular, present the affirmative answer to this conjecture in the special situation where the probability distribution of the input data is absolutely continuous with respect to the continuous uniform distribution on a compact interval and where the target function under consideration is piecewise linear.

View Recorded Video

January 19, 2022
Alexandria Volkening [Purdue University]
▦ Modeling and topological data analysis of zebrafish-skin patterns ▦

Many natural and social phenomena involve individual agents coming together to create group dynamics, whether the agents are drivers in a traffic jam, cells in a tissue, or locusts in a swarm. Here I will focus on the specific example of skin pattern formation in zebrafish. Wild-type zebrafish are named for their dark and light stripes, but mutant zebrafish feature variable skin patterns, including spots and labyrinth curves. All of these patterns form as the fish grow due to the interactions of tens of thousands of pigment cells in the skin. This leads to the question: how do cell interactions change to create mutant patterns? The long-term motivation for my work is to help shed light on this question and better link genes, cell behavior, and visible animal characteristics. Toward this goal, I develop agent-based models to describe cell behavior in growing 2D domains. However, my models are stochastic and have many parameters, and comparing simulated patterns and fish images is often a qualitative process. In this talk, I will give an overview of our models and discuss how methods from topological data analysis can be used to quantitatively describe cell-based patterns and compare in vivo and in silico images.

View Recorded Video

January 26, 2022
Talea Mayo [Emory University]
▦ Climate change impacts on hurricane storm surge risk ▦

It is widely accepted that climate change will cause global mean sea level rise, increasing coastal flood risk in many places. However, climate change also has significant implications for tropical cyclone climatology. Specifically, hurricane intensity, size, and translation speed are all expected to intensify in the future, and each of these influences storm surge generation and propagation. Numerical simulation plays a vital role in understanding the resulting changes to storm surge risk, as there is not sufficient historical data for statistical analysis. In this seminar, I'll discuss two numerical modeling approaches we've taken to more comprehensively understanding what climate change means for storm surge risk. In the first approach, we use a statistical/deterministic hurricane model with the numerical hydrodynamic model, SLOSH, to simulate synthetic storm surges for coastal communities along the U.S. North Atlantic. We use extreme value analysis to determine probability distributions of storm tide, and integrate probability distributions of local sea level rise to understand the present day flood risk and how it will change over the next century. We find that for most of the observed regions flood risk can be expected to increase by a factor of 10. In the second approach, we use the convection permitting regional climate model, WRF, and the high fidelity numerical storm surge model, ADCIRC, to simulate historical storm surges that impacted the Gulf of Mexico and Atlantic Coasts of the continental United States from 2000-2013. We then simulate the same storm surges under projected end of century climate conditions to assess the impact of climate change on storm surge inundation. We find that the volume of inundation increases for over half of the simulated storms and the average change for all storms is +36%, with notable increases in inundation occurring near Texas, Mississippi, the Gulf Coast of Florida, the Carolinas, Virginia, and New York.

View Recorded Video

February 16, 2022
Ron Buckmire [Occidental College]

Annenberg 105 & Zoom
12:00pm

▦ Different Differences ▦

This talk is organized around different examples of the word "difference". First, I will summarize some of my work in the area of non-standard finite differences, which are numerical techniques used to generate approximate solutions of differential equations. Second, I will discuss how my differences from the average mathematician have caused/allowed/encouraged me to follow a different academic trajectory than the norm. Lastly, I will present some comments on how the mathematics community treats "difference" and provide "a new hope" for how the future for different mathematicians can be different from the past.

View Recorded Video

February 23, 2022
▦ Mathematical and Computational Approaches to Social Justice ▦

Civil rights leader, educator, and investigative journalist Ida B. Wells said that "the way to right wrongs is to shine the light of truth upon them." This talk will demonstrate how quantitative and computational approaches can shine a light on social injustices and help build solutions to remedy them. We will present quantitative social justice projects on topics ranging from diversity in art museums to inclusion in higher education to equity in criminal sentencing and more. The tools engaged include crowdsourcing, data cleaning, clustering, hypothesis testing, statistical modeling, Markov chains, and data visualization, to name a few. I hope that this talk leaves you informed about the breadth of social justice applications that one can tackle using mathematical and data science tools in careful collaboration with other scholars and activists.

March 30, 2022
Peter Schröder [Caltech]
▦ Constrained Willmore Surfaces ▦

Surfaces which minimize a squared curvature bending energy are fundamental in the theory of smooth surfaces, as well as in geometric and physical modeling. The canonical representative of such energies is the Willmore energy, measuring the total squared mean curvature of a surface. The associated Euler-Lagrange equation is a non-linear 4th order PDE which presents significant numerical challenges. In this talk I will introduce some of the tools we developed towards effective algorithms for finding minimizers of the Willmore energy in a given conformal class, i.e., allowing only conformal deformations. Such (conformally) constrained Willmore surfaces can be understood as generalizations of non-linear splines from the univariate to the bivariate setting. Physically these model isotropic auxetic materials. Our algorithms also serve as tools for experimental mathematics in the study of extrinsic surface shape as a function of the metric and genus.

Joint work with Yousuf Soliman (Caltech), Olga Diamanti (UGraz), Albert Chern (UCSD), Felix Knöppel (TU Berlin), Ulrich Pinkall (TU Berlin)

April 20, 2022
Rebecca Willett [University of Chicago]
▦ Machine Learning in Data Assimilation and Inverse Problems ▦

Machine learning is emerging as an essential tool in many science and engineering domains, fueled by extraordinarily powerful computers as well as advanced instruments capable of collecting high-resolution and high-dimensional experimental data. However, using off-the-shelf machine learning methods for analyzing scientific and engineering data fails to leverage our vast, collective (albeit partial) understanding of the underlying physical phenomenon or models of sensor systems. Reconstructing physical phenomena from indirect scientific observations is at the heart of scientific measurement and discovery, and so a pervasive challenge is to develop new methodologies capable of combining such physical models with training data to yield more rapid, accurate inferences. We will explore these ideas in the context of inverse problems and data assimilation challenges; examples include climate forecasting, uncovering material structure and properties, and medical image reconstruction. Classical approaches to such inverse problems and data assimilation approaches have relied upon insights from optimization, signal processing, and the careful exploitation of forward models. In this talk, we will see how these insights and tools can be integrated into machine learning systems to yield novel methods with significant accuracy and computational advantages over naïve applications of machine learning.

View Recorded Video

May 25, 2022
Carlos Fernandez-Granda [New York University]
▦ Deep denoising for scientific discovery ▦

Deep-learning models for image denoising achieve impressive results when trained on standard natural-image datasets in a supervised fashion. However, unleashing their potential in practice will require developing unsupervised or semi-supervised approaches capable of learning from real data, as well as understanding the strategies learned by these models. In this talk, we will describe advances in this direction motivated by a real-world scientific application: determining the 3D atomic structure of catalytic nanoparticles from extremely noisy electron-microscope data.

View Recorded Video

### AY 2021/2022: Student/Postdoc Seminars

October 8, 2021
• Internal CMX Seminar •

Zoom
1:00pm

Jiajie Chen
▦ On the competition between advection and vortex stretching ▦

Whether the 3D incompressible Euler equations can develop a finite-time singularity from smooth initial data is an outstanding open problem. The presence of vortex stretching is the primary source of a potential finite-time singularity. However, the effect of advection is one of the obstacles to obtaining a singularity. In this talk, we will first discuss some examples from incompressible fluids of the competition between advection and vortex stretching. Then we will study the De Gregorio (DG) model and the generalized Constantin-Lax-Majda (gCLM) model, which model this competition, and several conjectures on these models. In an effort to establish singularity formation in incompressible fluids, we develop a novel framework based on dynamic rescaling. Using this framework, we construct finite time singularities of the DG model and gCLM model if the advection is "weaker" than the vortex stretching. On the other hand, for initial data with the same sign and symmetry properties as the blowup solution, if the advection is "stronger", we show that the solution to the DG model exists globally.

October 22, 2021
• Internal CMX Seminar •

Zoom
2:00pm

Nicholas Nelsen
▦ Operator regression for forward and inverse problems ▦

Operator learning has emerged as a key enabler for accelerating the computation of existing scientific models and for discovering new models from data when no model exists. In the first part of this talk, I will describe a fully data-driven methodology based on random features to regress nonlinear operators between infinite-dimensional spaces of functions. Generalizing traditional random feature methods operating in Euclidean spaces, this approach may be viewed as a random parametric (operator-valued) kernel method that enjoys several computational advantages over its nonparametric counterparts. The algorithm is deployed in practice to regress solution operators of parametric partial differential equations (PDEs), and I will also use the learned surrogate model to rapidly solve a PDE-based Bayesian inverse problem. The second part of my talk concerns recent theoretical results on the learnability of compact, bounded, and unbounded linear operators that define forward and inverse problems. Bayesian and learning-theoretic estimators for an unknown linear operator on an infinite-dimensional Hilbert space are derived given noisy input-output data, and under some imposed assumptions, convergence rates of the operator estimators are established in the infinite data limit. I will conclude with numerical results on learning differential (unbounded), identity (bounded), and inverse differential (compact) operators that exhibit excellent agreement with the theory and beyond.

View Recorded Video

November 5, 2021
• Internal CMX Seminar •

Zoom
1:00pm

Daniel Leibovici
▦ An FC-based shock-dynamics solver with neural-network localized artificial-viscosity assignment ▦

I will present a spectral scheme for the numerical solution of shock-wave problems in general non-periodic domains. The approach utilizes the Fourier Continuation (FC) method for spectral representation of non-periodic functions in general domains in conjunction with smooth localized artificial viscosity assignments produced by means of a Shock-Detecting Neural Network (SDNN). The minimally invasive neural net-induced viscous term eliminates Gibbs ringing while enabling spectral dispersionless flows, and, unlike most other approaches, it does not suffer from unphysical spurious oscillations over smooth flow regions. The FC-SDNN algorithm, which relies on a Mach number proxy for neural-network analysis of the solution’s regularity, generally provides accurate resolution of discontinuities, as well as significantly smoother profiles away from jump discontinuities than those produced by other methods, including ENO/WENO solvers, Godunov schemes and other finite volume and artificial viscosity approaches. The character of the method will be demonstrated by means of applications to a number of important test cases, including a Mach 3 wind-tunnel step problem, a Double Mach ramp reflection of a shock, a shock-vortex interaction, and a Blast wave problem, among others.

View Recorded Video

November 19, 2021
• Internal CMX Seminar •

Zoom
1:00pm

▦ Gaussian process modeling of groundwater level and land subsidence in California's Central Valley ▦

Overdrafting of groundwater induced by erratic changes in climate has severely stressed California's Central Valley (CV) aquifer system, leading to environmental hazards such as land subsidence. Designing hazard-free groundwater management strategies requires spatio-temporally continuous predictions of how the groundwater table and land surface respond to water recharge and discharge. This is generally challenging in the CV due to the sparse and noisy nature of available well head data, missing information on groundwater pumping rates, and hydrogeological heterogeneity. To address these challenges, we propose a machine learning (ML) approach to estimating CV groundwater levels and land subsidence continuously across space and time. Our preliminary investigations consist of employing Gaussian process (GP) regression on available well head data and InSAR remote sensing measurements of surface deformation in the southern CV. We propose a linear model to capture the seasonal and long-term temporal trends observed in the raw data. Spatial continuity of model parameters is imposed with multi-output GPs. We discuss the linear model of co-regionalization for building permissible covariance kernels in the multivariate setting. We demonstrate the applicability of the proposed GP modeling approach to real data in the CV, along with a discussion of its advantages and limitations.

January 7, 2022
• Internal CMX Seminar •

Zoom
1:00pm

Eliza O'Reilly
▦ Stochastic and Convex Geometry for Complex Data Analysis ▦

Many modern problems in data science aim to efficiently and accurately extract important features and make predictions from high dimensional and large data sets. Naturally occurring structure in the data underpins the success of many contemporary approaches, but large gaps between theory and practice remain. In this talk, I will present recent progress on two different methods for nonparametric regression that can be viewed as the projection of a lifted formulation of the problem with a simple stochastic or convex geometric description. In particular, I will first describe how the theory of stationary random tessellations in stochastic geometry addresses the computational and theoretical challenges of random decision forests with non-axis-aligned splits. Second, I will present a new approach to convex regression that returns non-polyhedral convex estimators compatible with semidefinite programming. These works open new questions at the intersection of stochastic and convex geometry, machine learning, and optimization.

January 21, 2022
• Internal CMX Seminar •

Zoom
1:00pm

Daniel Zhengyu Huang
▦ Building Next-Generation Mathematical Models and Methods for Real-World Multiphysics Simulations ▦

Multiphysics modeling and simulation are rapidly becoming indispensable for modern engineering and science. However, remarkable gaps still exist between state-of-the-art multiphysics simulations and reality, which can result in catastrophic loss of life or property (e.g., Europe’s Schiaparelli Mars lander crash and the record-breaking heatwave in the western United States). Core challenges for such simulations include the systems’ great range of scales in space and time, strong coupling effects among subsystems, and huge uncertainties in reality. To address these challenges, my overarching research goal is developing robust, high-fidelity, and intelligent partial differential equation (PDE) solvers, which judiciously combine theory, high-performance computing, data, and machine learning. The fundamental laws governing multiphysics systems are known; however, brute-force computing still cannot resolve all relevant scales. Data-driven models have undeniable potential for harnessing the exponentially growing volume of data. My focus on data-driven approaches includes 1) how to make the hybrid solvers, combining data-driven closure models with traditional PDE solvers, more robust; 2) how to practically leverage indirect data, which generally do not provide direct information about small-scale processes. These ideas, including physics-based neural networks and large-scale Bayesian inference, will be demonstrated through two projects: Mars landing parachute inflation simulation and climate modeling.

View Recorded Video

February 4, 2022
• Internal CMX Seminar •

Zoom
1:00pm

Haoxuan (Steve) Chen
▦ Machine Learning for Numerical PDEs: Fast Rate, Scaling Law and Minimax Optimality ▦

Despite the empirical success of adopting machine learning (ML) models for solving high-dimensional partial differential equations (PDEs), the following question remains poorly answered: For a given PDE and a data-driven approximation architecture, how large a sample size and how complex a model are needed to reach a prescribed performance level? In this talk, we will discuss the statistical limits of some ML-based methods for solving elliptic PDEs from random samples, including the Deep Ritz Method (DRM) and Physics-Informed Neural Networks (PINNs). Firstly, we will talk about how to establish information-theoretic lower bounds for both methods via Fano’s inequality. Secondly, we will prove upper bounds for DRM and PINN by using a fast rate generalization bound. We discover that the local Rademacher complexity of a gradient term is hard to bound, which causes the current version of DRM to be sub-optimal. Based on this discovery, we propose a modified version of DRM that samples more data points for the gradient term. We also prove that PINN and the modified version of DRM can achieve minimax optimal bounds over Sobolev spaces. Finally, we will exhibit results of some computational experiments that agree with the convergence rates proved in our theory.

View Recorded Video

February 18, 2022
• Internal CMX Seminar •

Annenberg 104 & Zoom
1:00pm

Hamed Hamze Bajgiran
▦ Aggregation of Pareto optimal models ▦

In statistical decision theory, a model is said to be Pareto optimal (or admissible) if no other model carries less risk for at least one state of nature while presenting no more risk for others. How can you rationally aggregate/combine a finite set of Pareto optimal models while preserving Pareto efficiency? This question is nontrivial because weighted model averaging does not, in general, preserve Pareto efficiency. I will present an answer in the form of a generalization of hierarchical Bayesian modeling. Following our main result, I will present applications to kernel smoothing, time-depreciating models, and voting mechanisms. This is joint work with Houman Owhadi.

March 4, 2022
• Internal CMX Seminar •

Annenberg 104 & Zoom
1:00pm

Matthew Levine
▦ A Framework for Machine Learning of Model Error in Dynamical Systems ▦

The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. Here, we present a unifying framework for blending mechanistic and machine-learning approaches for identifying dynamical systems from data. This framework is agnostic to the chosen machine learning model parameterization, and casts the problem in both continuous and discrete time. We will also show recent developments that allow these methods to learn from noisy, partial observations. We first study model error from the learning theory perspective, defining the excess risk and generalization error. For a linear model of the error used to learn about ergodic dynamical systems, both excess risk and generalization error are bounded by terms that diminish with the square root of T (the length of the training trajectory data). In our numerical examples, we first study an idealized, fully-observed Lorenz system with model error, and demonstrate that hybrid methods substantially outperform solely data-driven and solely mechanistic approaches. Then, we present recent results for modeling partially observed Lorenz dynamics that leverage both data assimilation and neural differential equations.

View Recorded Video

April 8, 2022
• Internal CMX Seminar •

Annenberg 104 & Zoom
1:00pm

Dani Kiyasseh
▦ TBD ▦

TBD

April 22, 2022
• Internal CMX Seminar •

Annenberg 104 & Zoom
1:00pm

Dmitry Burov
▦ Connections Between Two Kernel Regression Methods ▦

Gaussian process regression (GPR) is a well-studied and commonly used kernel regression method. Among its advantages are a profound theoretical basis, the ability to perform well with relatively small amounts of data, and built-in uncertainty quantification. However, it suffers from high computational complexity, limited expressiveness and some other shortcomings, such as poor performance in high dimensions. In this talk, I will give a brief introduction to Kernel Analog Forecasting (KAF), another kernel regression method that was developed independently in the last six years, and discuss connections between GPR and KAF, and how it is possible to overcome the first two drawbacks of GPR. I will also give a brief comparison of the two different uncertainty quantification approaches that these methods use.

View Recorded Video

May 6, 2022
• Internal CMX Seminar •

Annenberg 104 & Zoom
1:00pm

Oliver R. A. Dunbar
▦ Pandemic vs Pingdemic: Network Data Assimilation for Epidemic Tracking and Control ▦

Testing, contact tracing, and isolation (TTI) for epidemic management is difficult to implement at scale. One recent attempt, in the form of exposure notification apps, automates notification of neighbours in a contact network created from Bluetooth technology. We design a new framework that can act as a backend for current exposure notification apps, using data assimilation in conjunction with an epidemiological model over the contact network to learn about individual risks of infection. Network DA exploits diverse sources of health data together with proximity data from mobile devices. In COVID-19 simulations of a New York-style city with a population of 100,000, network DA identifies up to a factor of 2 more infections than exposure notification app-based contact tracing. The framework can also be used to feed back to artificial users, and targeting contact interventions with network DA can reduce deaths by up to a factor of 4 relative to TTI, at relatively moderate test rates, provided high user compliance.

View Recorded Video

May 20, 2022
• Internal CMX Seminar •

Annenberg 104 & Zoom
1:00pm

Oscar Leong
▦ Leveraging Common Structure for Prior-Free Image Reconstruction ▦

We consider solving ill-posed imaging inverse problems under a generic forward model. Due to the ill-posedness present in such problems, prior models that encourage certain image-based structure are required to reduce the space of possible images when solving for a solution. Common approaches utilize hand-crafted prior models with parameters tuned through trial and error, which can be time-intensive and prone to human bias. Other approaches based on machine learning try to learn the underlying image generation model given samples from the data distribution of interest and use this to solve a constrained inverse problem; however, in many applications ground-truth images may be unavailable. In contrast, we propose to either select or learn an image generation model from the noisy measurements alone, without incorporating prior constraints on image structure. We first show how, given a number of candidate models, the Evidence Lower Bound (ELBO) of a Variational distribution can be used to select an appropriate prior. Then, we showcase how, in the absence of available priors, we are able to directly learn the underlying model from a set of noisy measurements using the ELBO. We crucially assume that the ground-truth images share a common structure by being drawn from the same underlying distribution. The learned model leverages this structure in its architecture, which consists of a shared generator with a compressed latent space where each measurement posterior is learned variationally. This allows the model to learn global properties of the data distribution from noisy observations without overfitting. We illustrate our framework on a variety of inverse problems, ranging from denoising to compressed sensing problems inspired by black hole imaging.

View Recorded Video

### AY 2021/2022: Other Seminars

October 22, 2021
• CMX Special Seminar •

Annenberg 104
4:00pm

Jason Altschuler [MIT]
▦ Computing Wasserstein barycenters: easy or hard? ▦

A major effort in modern data science is interpreting and extracting geometric information from data. In this talk, I will focus on my recent work on the core algorithmic task of averaging data distributions. Wasserstein barycenters (aka Optimal Transport barycenters) provide a natural approach for this problem and are central to diverse applications in machine learning, statistics, and computer graphics. Despite considerable attention, it remained unknown whether Wasserstein barycenters can be computed in polynomial time. Our recent work provides a complete answer to this question and reveals that the answer depends subtly on the dimension due to the continuous nature of the problem.

View Recorded Video

March 10, 2022
• CMX Special Seminar •

Annenberg 105 & Zoom
4:00pm

Mathieu Desbrun [Caltech]
▦ Going against the flow in fluid animation ▦

While Computer Graphics (CG) has often been inspired by Computational Fluid Dynamics (CFD), its most common algorithmic solutions to fluid animation remain limited in scope (they cannot handle high Reynolds numbers and/or high density ratios when simulating water-air interaction) and scalability. As a consequence, they have found little to no industrial application aside from special effects in movies. In this talk, I will discuss recent progress in Lattice Boltzmann solvers, which offer a nice, massively-parallel way to bridge the gap between CG and CFD for both incompressible single-phase and multi-phase fluid simulation using an atypical discretization of phase space. If time allows, I will also discuss recent progress in Machine Learning that offers plausible space-time upsampling of coarse simulations at low computational cost.

View Recorded Video

May 20, 2022
• CMX Special Seminar •

ANB 105 & Zoom
4:00pm

Roman Vershynin [University of California, Irvine]
▦ The Mathematics of Privacy and Synthetic Data ▦

In a world where artificial intelligence and data science become omnipresent, data sharing is increasingly locking horns with data-privacy concerns. Among the main data privacy concepts that have emerged are anonymization and differential privacy. Today, another solution is gaining traction: synthetic data. The goal of synthetic data is to create an as-realistic-as-possible dataset, one that not only maintains the nuances of the original data, but does so without risk of exposing sensitive information. The combination of differential privacy with synthetic data has been suggested as a best-of-both-worlds solution. However, the road to privacy is paved with NP-hard problems. The speaker will present three recent mathematical breakthroughs in the NP-hard challenge of creating synthetic data that come with provable privacy and utility guarantees and doing so computationally efficiently. These efforts draw from a wide range of mathematical concepts, particularly random processes. This is joint work with March Boedihardjo and Thomas Strohmer.

View Recorded Video

May 26, 2022
• CMX Special Seminar •

Zoom
4:00pm

Liliana Borcea [University of Michigan]
▦ A data driven reduced order model for the wave operator and its application to velocity estimation ▦

This talk is concerned with the following inverse problem for the wave equation: Determine the variable wave speed from data gathered by a collection of sensors, which emit probing signals and measure the generated backscattered waves. Inverse backscattering is an interdisciplinary field driven by applications in geophysical exploration, radar imaging, non-destructive evaluation of materials, etc. There are two types of methods: (1) Qualitative (imaging) methods, which address the simpler problem of locating reflective structures in a known host medium. (2) Quantitative methods, also known as velocity estimation. Typically, velocity estimation is formulated as a PDE constrained optimization, where the data are fit in the least squares sense by the wave computed at the search wave speed. The increase in computing power has led to growing interest in this approach, but there is a fundamental impediment, which manifests especially for high frequency data: The objective function is not convex and has numerous local minima even in the absence of noise. The main goal of the talk is to introduce a novel approach to velocity estimation, based on a reduced order model (ROM) of the wave operator. The ROM is called data driven because it is obtained from the measurements made at the sensors. The mapping between these measurements and the ROM is nonlinear, and yet the ROM can be computed efficiently using methods from numerical linear algebra. More importantly, the ROM can be used to define a better objective function for velocity estimation, so that gradient based optimization can succeed even for a poor initial guess.

View Recorded Video

August 1, 2022
• CMX Special Seminar •

ANB 105 & Zoom
11:00am

Peter Park [Harvard Medical School]
▦ Mutational Signature Analysis and its Applications ▦

Whole-genome sequencing of a large number of individuals generates hundreds of terabytes of data, requiring efficient computational methods for in-depth analysis. Mutational signature analysis is a recent computational approach for interpreting somatic mutations identified from sequencing data. To discover "signatures" (96-dimensional vectors representing possible mutation types and their nucleotide context), a conventional approach utilizes non-negative matrix factorization; to match a signature to a known catalog of signatures, non-negative least squares is typically used. I will describe some of the shortcomings of the current approaches and our solutions. Applications include detection of homologous recombination deficiency in tumor samples and prediction of response to immunotherapy.