The Colloquium on Digital Transformation is a series of weekly online talks featuring top scientists from academia and industry on how artificial intelligence, machine learning, and big data can lead to scientific breakthroughs with large-scale societal benefit. Register for the fall Zoom webinar series here. See videos of all DTI talks at

Fall 2022 Series

December 1, 2022, 1 pm PT/3 pm CT

Improved Adversarial Attacks and Certified Defenses via Nonconvex Relaxations

Richard Y. Zhang, Assistant Professor of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign

After training a machine learning model to be resilient towards adversarial attacks, one often desires a mathematical proof or “certification” that the model is rigorously robust against further attacks. Typically, certification is performed by taking the nonconvex set of all possible attacks, and relaxing it into a larger, convex set of attacks, some (or almost all) of which may not be physically realizable. Unfortunately, such certifications are often extremely conservative, as they tend to be overly cautious towards fictitious attacks that cannot be physically realized. In the end, there remains a large “convex relaxation barrier” between our ability to train ostensibly resilient models, and our ability to guarantee them as being rigorously robust. In this talk, we discuss nonconvex relaxations for the adversarial attack and certification problem. We argue that nonconvex relaxations are able to conform much better to the set of physically realizable attacks, as these are also naturally nonconvex. Our nonconvex relaxations are inspired by recent work on the Burer-Monteiro factorization for optimization on Riemannian manifolds. Our results show that the nonconvex relaxation can almost fully close the “convex relaxation barrier” that stymies the existing state-of-the-art. For safety-critical applications, the technique promises to guarantee that a model trained today will not become a security concern in the future: it will resist all attacks, including previously unknown attacks, and even if the attacker is given full knowledge of the model.


Richard Y. Zhang is an assistant professor in the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. He received the S.M. and Ph.D. degrees in EECS from MIT in 2012 and 2017 respectively, and was a postdoc at UC Berkeley. His research is in optimization and machine learning, with a particular focus on building structure-exploiting algorithms to solve real-world problems with provable guarantees on quality, speed, and safety. He received an NSF CAREER Award in 2021.

December 8, 2022, 1 pm PT/3 pm CT

The Power of Adaptivity in Representation Learning: From Meta-learning to Federated Learning

Sewoong Oh, Associate Professor of Computer Science and Engineering, University of Washington

A central problem in machine learning is as follows: How should we train models using data generated from a collection of heterogeneous tasks/environments, if we know that these models will be deployed in a new and unseen environment? In the setting of few-shot learning, a prominent approach is to develop a modeling framework that is “primed” to adapt, such as Model-Agnostic Meta-Learning (MAML), and then fine-tune the model for the deployment environment. We study this approach in the multi-task linear representation setting. We show that the reason behind the generalizability of the models in new environments is that the dynamics of training induce the models to evolve toward the common data representation among the various tasks. The structure of the bi-level update at each iteration (an inner and outer update with MAML) holds the key: the diversity among client data distributions is exploited via inner/local updates. This is the first result that formally shows representation learning, and derives exponentially fast convergence to the ground-truth representation. The talk concludes by making a connection between MAML and Federated Averaging (FedAvg) in the context of personalized federated learning, where the local and global updates of FedAvg exhibit the same representation learning. This is based on joint work with Liam Collins, Aryan Mokhtari, and Sanjay Shakkottai.


Sewoong Oh has been an associate professor at the Paul G. Allen School of Computer Science and Engineering at the University of Washington since 2019. Previously, he was an assistant professor in the Department of Industrial and Enterprise Systems Engineering at the University of Illinois at Urbana-Champaign, starting in 2012. He received his PhD from the Department of Electrical Engineering at Stanford University in 2011, under the supervision of Andrea Montanari. He was a postdoctoral researcher at the Laboratory for Information and Decision Systems (LIDS) at MIT, under the supervision of Devavrat Shah. Oh was co-awarded the ACM SIGMETRICS best paper award in 2015, NSF CAREER award in 2016, ACM SIGMETRICS Rising Star Award in 2017, and Google Faculty Research Awards in 2017 and 2020.

Past Talks

August 25, 2022, 1 pm PT/3 pm CT

The Many Facets of Robust Machine Learning: from Mathematical Guarantees to Real-world Shifts

Aditi Raghunathan, Assistant Professor of Computer Science, Carnegie Mellon University

Despite notable successes on several carefully controlled benchmarks, current machine learning (ML) systems are remarkably brittle, raising serious concerns about their deployment in safety-critical applications like self-driving cars and predictive healthcare. In this talk, we address robustness in ML via the robust optimization framework and develop principled approaches to address fundamental obstacles. We focus on two settings where standard ML models degrade substantially: adversarial attacks on test inputs, and the presence of spurious correlations like image backgrounds, and demonstrate the need to question common assumptions in ML. Next, we switch gears to discuss natural distribution shifts in the wild that are both hard to predict a priori and intractable to characterize mathematically. We discuss approaches to estimate and improve performance under such shifts, which complement approaches based on robust optimization.


Aditi Raghunathan is an assistant professor of computer science at Carnegie Mellon University. She is interested in building robust ML systems with guarantees for trustworthy real-world deployment. Previously, she was a postdoctoral researcher at Berkeley AI Research, and received her PhD from Stanford University in 2021. Her research has been recognized by the Arthur Samuel Best Thesis Award at Stanford, a Google PhD fellowship in machine learning, and an Open Philanthropy AI fellowship.

September 1, 2022, 1 pm PT/3 pm CT

Two Surprises When Optimization Meets Machine Learning

Suvrit Sra, Associate Professor of Electrical Engineering and Computer Science, Massachusetts Institute of Technology

It is well-known that there are large gaps between optimization theory and machine learning practice. However, there are two even more surprising gaps that have persisted at the fundamental level. The first one arises from ignoring the elephant in the room: non-differentiable non-convex optimization, e.g., when training a deep ReLU network. The second surprise is more disturbing: it uncovers a non-convergence phenomenon in the training of deep networks, and as a result it challenges existing convergence theory and training algorithms. Both these fundamental surprises open new directions of research, and we talk about some of our theoretical progress on these, as well as potential research questions.


Suvrit Sra is an associate professor in MIT’s EECS Department; a core faculty member of the Institute for Data, Systems, and Society (IDSS) and of the Laboratory for Information and Decision Systems (LIDS); and a member of the MIT-ML and Statistics groups. He earned his PhD in Computer Science from the University of Texas at Austin. Before moving to MIT, he was a Senior Research Scientist at the Max Planck Institute for Intelligent Systems, in Tübingen, Germany. His research lies at the intersection of machine learning with mathematics, spanning areas such as differential geometry, matrix analysis, convex analysis, probability theory, and optimization. He founded the Optimization for Machine Learning (OPT) series of workshops in 2008 (at NeurIPS). He is a co-founder and chief scientist of macro-eyes, a global OR + OPT + ML + healthcare startup.

September 8, 2022, 1 pm PT/3 pm CT

Trustworthy Machine Learning: Robustness, Privacy, Generalization, and their Interconnections

Bo Li, Assistant Professor of Computer Science, University of Illinois at Urbana-Champaign

Advances in machine learning have led to the rapid and widespread deployment of learning-based methods in safety-critical applications, such as autonomous driving and medical healthcare. Standard machine learning systems, however, assume that training and test data follow the same or similar distributions, without explicitly considering active adversaries manipulating either distribution. For instance, recent work demonstrates that motivated adversaries can circumvent anomaly detection or other machine learning models at test-time through evasion attacks, or can inject well-crafted malicious instances into training data to induce errors during inference through poisoning attacks. Such distribution shifts could also lead to other trustworthiness issues, such as generalization. In this talk, we describe different perspectives of trustworthy machine learning, such as robustness, privacy, generalization, and their underlying interconnections. We focus on a certifiably robust learning approach based on statistical learning with logical reasoning as an example, and then discuss the principles towards designing and developing practical trustworthy machine learning systems with guarantees, by considering these trustworthiness perspectives holistically.


Bo Li is an assistant professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. She is the recipient of the IJCAI Computers and Thought Award, Alfred P. Sloan Research Fellowship, NSF CAREER Award, MIT Technology Review TR-35 Award, Dean’s Award for Excellence in Research, C.W. Gear Outstanding Junior Faculty Award, Intel Rising Star Award, Symantec Research Labs Fellowship, research awards from tech companies including Amazon, Facebook, Intel, and IBM, and best paper awards at several top machine learning and security conferences. Her research focuses on both theoretical and practical aspects of trustworthy machine learning, security, machine learning, privacy, and game theory. She has designed several scalable frameworks for trustworthy machine learning and privacy-preserving data publishing systems. Her work has been featured by major press outlets, including Nature, Wired, Fortune, and the New York Times.

September 15, 2022, 1 pm PT/3 pm CT

Federated Learning with Formal User-level Differential Privacy Guarantees

Brendan McMahan, Research Scientist, Google

Privacy for users is a central goal of cross-device federated learning. This talk begins with a quick overview of federated learning and key privacy principles. We then deep-dive into some recent advances in providing stronger anonymization properties for cross-device federated learning, including the DP-FTRL algorithm that was used to launch a production neural-language model trained with a user-level differential privacy guarantee.


Brendan McMahan is a research scientist at Google, where he leads efforts on decentralized and privacy-preserving machine learning. His team pioneered the concept of federated learning, and continues to push the boundaries of what is possible when working with decentralized data using privacy-preserving techniques. Previously, he worked in the fields of online learning, large-scale convex optimization, and reinforcement learning. McMahan received his Ph.D. in computer science from Carnegie Mellon University.

September 22, 2022, 1 pm PT/3 pm CT

New Approaches to Detecting and Adapting to Domain Shifts in Machine Learning

Zico Kolter, Associate Professor of Computer Science, Carnegie Mellon University

In virtually every deployment, machine learning systems encounter data from a qualitatively different distribution than the one they were trained on. Effectively dealing with this problem, known as domain shift, is thus perhaps the key challenge in deploying machine learning methods in practice. In this talk, we motivate some of these challenges in domain shift, and highlight some of our recent work on two topics. First, we present our work on determining if we can even evaluate the performance of machine learning models under distribution shift, without access to labeled data. Second, we present work on how to better adapt our classifiers to new data distributions, again assuming access only to unlabeled data in the new domain.


Zico Kolter is an associate professor in the Computer Science Department at Carnegie Mellon University, and also serves as chief scientist of AI research for the Bosch Center for Artificial Intelligence. His work spans the intersection of machine learning and optimization, with a focus on developing more robust and rigorous methods in deep learning. In addition, Kolter has worked in a number of application areas, highlighted by work on sustainability and smart energy systems. He is a recipient of the DARPA Young Faculty Award, a Sloan Fellowship, and best paper awards at NeurIPS, ICML (honorable mention), AISTATS (test of time), IJCAI, KDD, and PESGM.

October 6, 2022, 1 pm PT/3 pm CT

Machine Learning at All Levels: A Pathway to “Autonomous” AI

Eric Xing, Mohamed bin Zayed University of Artificial Intelligence, Carnegie Mellon University, Petuum Inc.

An integrative AI system is not a monolithic blackbox, but a modular, standardizable, and certifiable assembly of building blocks at all levels: data, model, algorithm, computing, and infrastructure. In this talk, we summarize our work on developing principled and “white-box” approaches, including formal representations, optimization formalisms, intra- and inter-level mapping strategies, theoretical analysis, and production platforms, for optimal and potentially automatic creation and configuration of AI solutions at all levels, namely, data harmonization, model composition, learning to learn, scalable computing, and infrastructure orchestration. We argue that traditional benchmark/leaderboard-driven bespoke approaches or the massive end-to-end “AGI” models in the Machine Learning community are not suited to meet the highly demanding industrial standards beyond algorithmic performance, such as cost-effectiveness, safety, scalability, and automatability, typically expected in production systems; and there is a need to work on ML-at-All-Levels as a necessary step toward industrializing AI that can be considered transparent, trustworthy, cost-effective, and potentially autonomous.


Eric Xing is a professor of Computer Science at Carnegie Mellon University, President of the Mohamed bin Zayed University of Artificial Intelligence, and Founder and Chairman of Petuum Inc., a 2018 World Economic Forum Technology Pioneer company that builds standardized AI development platforms and operating systems for industrial AI applications. He completed his PhD in Computer Science at UC Berkeley. His research interests are the development of machine learning and statistical methodology; and composable, automatic, and scalable computational systems for solving problems involving automated learning, reasoning, and decision-making in artificial, biological, and social systems. Xing currently serves or has served as associate editor of the Journal of the American Statistical Association, Annals of Applied Statistics, and IEEE Transactions on Pattern Analysis and Machine Intelligence; and as action editor of the Machine Learning Journal and Journal of Machine Learning Research. He is a board member of the International Machine Learning Society.

October 13, 2022, 1 pm PT/3 pm CT

Adversarial Machine Learning from a Privacy Perspective

Tom Goldstein, Perotto Associate Professor of Computer Science, University of Maryland

In this talk, we discuss ways that adversarial machine learning can be used to protect or infringe upon the privacy of users. This includes methods for disincentivizing data scraping by creating “unlearnable” data that cannot be used for model training, and methods for manipulating federated learning systems to extract private data.


Tom Goldstein is the Perotto Associate Professor of Computer Science at the University of Maryland. His research lies at the intersection of machine learning and optimization and targets applications in computer vision and signal processing. Before joining the faculty at Maryland, Goldstein completed his PhD in Mathematics at UCLA, and was a research scientist at Rice University and Stanford University. Among other awards, Goldstein has received SIAM’s DiPrima Prize, a DARPA Young Faculty Award, a JP Morgan Faculty award, and a Sloan Fellowship.

October 20, 2022, 1 pm PT/3 pm CT

Improving Communication for Differential Privacy: Insight from Human Behavior

Rachel Cummings, Assistant Professor of Industrial Engineering and Operations Research, Columbia University

Differential privacy (DP) is widely regarded as a gold standard for privacy-preserving computation over users’ data. A key challenge is that the privacy guarantees are difficult to communicate to users, leaving them uncertain about how and whether they are protected. Despite recent widespread deployment of DP, relatively little is known about user perceptions and how to effectively communicate DP’s practical privacy guarantees. This talk will cover a series of user studies aimed at measuring and improving communication with non-technical end users about DP. The first set explores users’ privacy expectations related to DP and measures the efficacy of existing methods for communicating the privacy guarantees of DP systems. We find that the ways in which DP is described in the wild largely set users’ privacy expectations haphazardly, which can be misleading depending on the deployment. Motivated by these findings, the second set develops and evaluates prototype descriptions designed to help end users understand DP guarantees. These descriptions target two important technical details in DP deployments that are often poorly communicated to end users: the privacy parameter epsilon (which governs the level of privacy protections) and distinctions between the local and central models of DP (which govern who can access exact user data).


Rachel Cummings is an assistant professor of Industrial Engineering and Operations Research at Columbia University. Before joining Columbia, she was an assistant professor at the Georgia Institute of Technology. Her research interests lie primarily in data privacy, with connections to machine learning, algorithmic economics, optimization, statistics, and public policy. She is the recipient of an NSF CAREER Award, a DARPA Young Faculty Award, an Apple Privacy-Preserving Machine Learning Award, a JP Morgan Chase Faculty Award, a Simons-Google Research Fellowship, a Mozilla Research Grant, and multiple best paper awards.

October 27, 2022, 1 pm PT/3 pm CT

AI Model Inspector: Towards Holistic Adversarial Robustness for Deep Learning

Pin-Yu Chen, Principal Research Staff Member, Trusted AI Group, IBM Institute

In this talk, Chen shares his research journey toward building an AI model inspector for evaluating, improving, and exploiting adversarial robustness for deep learning, starting with an overview of research topics concerning adversarial robustness and machine learning, including attacks, defenses, verification, and novel applications. For each, Chen summarizes key research findings, including: 1) practical optimization-based attacks and their applications to explainability and scientific discovery; 2) plug-and-play defenses for model repairing and patching; 3) attack-agnostic robustness assessment; and 4) data-efficient transfer learning via model reprogramming. The talk concludes with his vision of preparing deep learning for the real world and the research methodology of learning with an adversary.


Pin-Yu Chen is a principal research scientist of the Trusted AI Group and PI of the MIT-IBM Watson AI Lab at the IBM Thomas J. Watson Research Center. He is also Chief Scientist of the RPI-IBM AI Research Collaboration program. His recent research focus has been on adversarial machine learning and robustness of neural networks, and more broadly, making machine learning trustworthy. His research contributes to IBM Adversarial Robustness Toolbox, AI Explainability 360, AI Factsheets 360, and Watson Studio. Chen received his Ph.D. in electrical engineering and computer science and his M.A. in Statistics from the University of Michigan at Ann Arbor.

November 10, 2022, 1 pm PT/3 pm CT

Underspecified Foundation Models Considered Harmful

Nicholas Carlini, Research Scientist, Google Brain

Instead of training neural networks to solve any one particular task, it is now common to train neural networks to behave as a “foundation” upon which future models can be built. Because these models train on unlabeled and uncurated datasets, their objective functions are necessarily underspecified and not easily controlled. In this talk, I argue that while training underspecified models at scale may benefit accuracy, it comes at a cost to their security. As evidence, I present two case studies in the domains of semi- and self-supervised learning, where an adversary can poison the unlabeled training dataset to perform various attacks. Addressing these challenges will require new categories of defenses to simultaneously allow models to train on large datasets while also being robust to adversarial training data.


As a research scientist at Google Brain, Nicholas Carlini studies the security and privacy of machine learning. For this he has received best paper awards at ICML, USENIX Security, and IEEE S&P. Carlini earned his PhD from the University of California, Berkeley in 2018.

November 17, 2022, 1 pm PT/3 pm CT

Tackling Computational and Data Heterogeneity in Federated Learning

Gauri Joshi, Associate Professor of Electrical and Computer Engineering, Carnegie Mellon University

The future of machine learning lies in moving both data collection and model training to the edge. The emerging area of federated learning seeks to achieve this goal by orchestrating distributed model training using a large number of resource-constrained mobile devices that collect data from their environment. Due to limited communication capabilities as well as privacy concerns, the data collected by these devices cannot be sent to the cloud for centralized processing. Instead, the nodes perform local training updates and only send the resulting model to the cloud. A key aspect that sets federated learning apart from data-center-based distributed training is the inherent heterogeneity in data and local computation at the edge clients. In this talk, Joshi presents recent work on algorithms for tackling computational and data heterogeneity in federated optimization.


Gauri Joshi is an associate professor in the ECE department at Carnegie Mellon University. Joshi completed her Ph.D. at MIT EECS and her undergraduate degree in Electrical Engineering at IIT Bombay. Her current research is on designing algorithms for federated learning, distributed optimization, and parallel computing. Her awards and honors include being named one of MIT Technology Review’s 35 Innovators under 35 (2022), the NSF CAREER Award (2021), the ACM SIGMETRICS Best Paper Award (2020), Best Thesis Prize in Computer Science at MIT (2012), and Institute Gold Medal of IIT Bombay (2010).