# Theoretical CS Reading Group

 The reading group meets once a week usually, Tuesday 12:00pm-1:00pm to discuss a particular paper or topic. Please join our mailing list to get notifications about the weekly meetings and discussions. You might also be interested in the Theory Seminar. Spring 2019 Fall 2018 Spring 2018 Fall 2017 Spring 2017 Fall 2016 Spring & Summer 2016 Fall 2015 Spring & Summer 2015 Fall 2014 Spring 2014 Summer & Fall 2013

# Spring 2019

Monday, 4th February 2019 — Euiwoong Lee, New York University
Towards a unified theory of approximate optimization

Click to view Abstract

The theory of approximate optimization faces new challenges and opportunities thanks to its increasing interaction with other fields of optimization. Such growth and diversity call for a unified theory of approximate optimization, where various algorithms and complexity results can be connected to one another and explained by a few general principles. My research aims to contribute to such a theory, by building a unified theory for general classes of optimization problems and building connections between such classes. This talk will showcase results in both directions. 1. For the class of graph cut problems, I will present a general framework for proving optimal hardness results for many directed cut problems. I will also show my recent algorithmic results that improve long-standing barriers for the k-cut problem in various settings. 2. Finally, I will introduce some recently discovered connections between continuous and discrete optimization. These connections involve well-studied problems in the respective fields including low-rank approximation, operator norms, and small set expansion. Bio: Euiwoong Lee is a postdoc at the Computer Science Department of New York University hosted by Oded Regev and Subhash Khot. He received his PhD in Computer Science from Carnegie Mellon University under the supervision of Venkat Guruswami. His research interests include topics in optimization algorithms and complexity theory, such as approximation algorithms, hardness of approximation, sum-of-squares hierarchies, and parameterized algorithms. He is a recipient of Edmund M. Clarke Doctoral Dissertation Award, ICALP Best Student Paper Award, and Simons Award for Graduate Students in Theoretical Computer Science.

Wednesday, 6th February 2019 — Mrinalkanti Ghosh, TTIC
Matrix p to q Norms: Generalized Krivine Rounding and Hypercontractive Hardness

Click to view Abstract

We study the problem of computing the p to q operator norm of a matrix A in R mxn , deﬁned as sup x∈R^n \{0} ||Ax|| q /||x|| p . This problem generalizes the spectral norm of a matrix (p = q = 2) and the Grothendieck problem (p = ∞, q = 1), and has been widely studied in various regimes. When p ≥ q, the problem exhibits a dichotomy: constant factor approximation algorithms are known if 2 is in [q, p], and the problem is hard to approximate within almost polynomial factors when 2 is not in [q,p]. For the case when 2 is in [q, p] we prove almost matching approximation and NP-hardness results. The regime when p < q, known as hypercontractive norms, is particularly signiﬁcant for various applications but much less well understood. The case with p = 2 and q > 2 was studied by [Barak et. al., STOC’12] who gave sub-exponential algorithms for a promise version of the problem (which captures small-set expansion) and also proved hardness of approximation results based on the Exponential Time Hypothesis. However, no NP-hardness of approximation is known for these problems for any p < q. We prove the ﬁrst NP-hardness result for approximating hypercontractive norms. We show that for any 1 < p < q < ∞ with 2 not in [p, q], ||A|| p→q is hard 1−ε to approximate within 2 O(log n) assuming NP is not O(1) contained in BPTIME(2 log n ). Bio: Mrinalkanti Ghosh is a PhD candidate at Toyota Technological Institute at Chicago working with Prof. Madhur Tulsiani. Before joining his PhD he completed his masters at Indian Institute of Technology at Kanpur where he worked on intersections of Ergodic theory and Computability. Currently he is interested in discrete and continuous optimization problems, approximation algorithms and hardness.

Thursday, 14th February 2019 — Qin Zhang, Indiana University
Communication-Efficient Computation on Distributed Data: Theory and Practice

Click to view Abstract

Abstract: Through the massive use of mobile devices, data clouds, and the rise of the Internet of Things, large amounts of data have been generated, digitized, and analyzed for the benefit of society at large. As data are often collected and maintained at different sites, communication has become necessary for nearly every computational task. Moreover, decision makers naturally want to maintain a centralized view of all the data in a timely manner, which requires frequent queries on the distributed data. The communication cost, which contributes to most of the data/bandwidth usage, energy consumption and response time, becomes the bottleneck for many tasks. Today I will talk about my recent work on understanding communication-efficient distributed computation for fundamental algorithmic problems, from both the theoretical perspective (i.e., design algorithms with performance guarantees and explore the theoretical limits) and the practical perspective (i.e., make algorithms work well in real-world datasets). Bio: Qin Zhang is an Assistant Professor in the Department of Computer Science at the Indiana University Bloomington. He received his Ph.D. from Hong Kong University of Science and Technology. Prior to IUB, he spent a couple of years as a postdoc at Theory Group, IBM Almaden Research Center, and Center for Massive Data Algorithmics, Aarhus University. He is interested in algorithms for big data, in particular, streaming/sketching algorithms, algorithms for distributed data, and fundamental problems in databases, machine learning, and data mining. He is a recipient of several NSF grants (including a CAREER Award) and Best Paper Award at SPAA 2017

Monday, 18th February 2019 — Rad Niazadeh, Stanford University
Markets with Data: Challenges and Solutions

Click to view Abstract

The application of sophisticated algorithms to vast amounts of user data is already transforming our economy and stimulating the growth of innovative industries such as cloud computing. Algorithm designers working on these marketplaces face significant challenges stemming from the need to make real-time decisions under uncertainty and the need to account for users' strategic behavior. In this talk, I study these limitations and show that despite their existence, there are market algorithms that perform almost as well, or require almost as few resources, as if those limitations had not existed. In the first part of the talk I focus on dynamic pricing and auction design, inspired by the problem of designing robust pricing algorithms for cloud computing spot markets. I will present an algorithm that learns a sequence of pricings/auctions through real-time interaction with bidders, whose average revenue converges to that of the optimal pricing in hindsight even if the bid distribution is heavy-tailed. Furthermore, the number of rounds required for this convergence matches the number of i.i.d. samples required for an offline learning algorithm to attain the same near-optimality guarantee. A consequence is that the constraint of real-time decision making, surprisingly, does not increase the sample complexity of learning near-optimal pricings and auctions. In the second part of the talk I will briefly explain my work on transforming any given algorithm for (approximate) social welfare optimization into a Bayesian incentive compatible mechanism, via a computationally efficient black-box reduction. A consequence is that, in principle, engineers designing mechanisms to maximize users' aggregate welfare can ignore strategic behavior and only focus on the optimization aspect of the problem. I will conclude the talk by surveying my current and future research on further applications of machine learning and optimization to market design, algorithmic tools for exploratory data analysis, and applying game-theoretic techniques to solve a class of non-convex optimization problems arising in machine learning and computer vision. Bio: Rad Niazadeh is a Motwani postdoctoral fellow at Stanford University, Department of Computer Science, where he is hosted by Tim Roughgarden, Amin Saberi and Moses Charikar. Prior to Stanford, he obtained his Ph.D. in Computer Science from Cornell University under Bobby Kleinberg. During his graduate studies, he was a research intern at Microsoft Research (Redmond), Microsoft Research (New England) and Yahoo! Research. He also has been awarded the Google PhD fellowship (in market algorithms), INFORMS Revenue Management and Pricing Dissertation Award (honorable mention), Stanford Motwani fellowship and Cornell Irwin Jacobs fellowship. His research interests are broadly at the intersection of algorithms, game theory and optimization, with a focus on applications in market design, machine learning, and operations research.

Thursday, 21st February 2019 — Alex Psomas, CMU
Theory and Practice of Fair Resource Allocation

Click to view Abstract

The Internet and the vast increase in the availability of data have transformed algorithm design, as well as computer science in general. Designing algorithms that, given an input, quickly produce an output is no longer sufficient. In a myriad of modern applications, considerations like fairness and users' incentives must be taken into account, complicating how success is defined and achieved. In this talk I'll demonstrate how to tackle such issues in the context of food waste. I'll present a full stack of results on fair allocation, from theoretical results in formal models, all the way down to an improved allocation of food. In the first part of the talk we study, from a purely theoretical angle, the fundamental problem of allocating a set of indivisible goods that arrive over time. Specifically, we will design algorithms that make allocation decisions in a way that is fair, under a formal definition of fairness. In the second part of the talk, we adopt and further develop an emerging paradigm called virtual democracy. Virtual democracy is an approach to automating decisions, by learning models of the preferences of individual people from data, and, at runtime, aggregating the predicted preferences of those people on the dilemma at hand. We will take the virtual democracy approach all the way to practice, in a collaboration with a Pittsburgh-based food bank, 412 Food Rescue, that provides on-demand food donation distribution services. I will present my work on designing and deploying an algorithm that automatically makes the decisions they most frequently face: given an incoming food donation, which recipient organization (such as a housing authority or food pantry) should receive it? I will also discuss challenges and solutions we faced in the data collection, learning and aggregation steps of virtual democracy, as well as this work's implications to algorithmic fairness in general. I will conclude the talk by surveying some of my current and future research in algorithmic economics in general. Bio: Christos-Alexandros (Alex) Psomas is a postdoctoral researcher in the Computer Science Department at Carnegie Mellon University, hosted by Ariel Procaccia. He received his PhD in 2017 from the Department of Electrical Engineering and Computer Sciences at UC Berkeley, under the supervision of Christos Papadimitriou. During his PhD he spent two summers as a research intern at the International Computer Science Institute with Eric Friedman, a summer at Microsoft Research Redmond with Nikhil Devanur, and a summer as an instructor for the Discrete Mathematics and Probability Theory course at UC Berkeley. He is broadly interested in algorithmic economics, including topics such as algorithmic mechanism design, artificial intelligence, computational social choice, fair division and auction theory.

Monday, 25th February 2019 — Kent Quanrud, UIUC
From discrete to continuous and back: modern algorithmic techniques for some classical problems in optimization

Click to view Abstract

Larger datasets and modern computing systems have shifted the focus in algorithm design to faster and/or distributed algorithms that still retain strong theoretical guarantees. A recurring theme in this line of work is the interplay of techniques from continuous and discrete optimization to design more scalable algorithms. We examine two classical problems from my own experience through this modern algorithmic lens. First, we consider metric TSP, a central problem in combinatorial optimization that has historically inspired many techniques. We show how to approximate the underlying linear program in nearly-linear time in the input size. We then show how to leverage the LP solution and compute an approximate tour an order of magnitude faster than previously known. (The latter running time is also nearly linear for explicit metrics.) We will also discuss a more general framework for approximating other LPs of interest in nearly linear time, by a careful combination of discrete and continuous techniques. In the second part of the talk, we will discuss parallel algorithms for constrained submodular maximization. Submodular maximization generalizes many problems in combinatorial optimization and has applications in machine learning and data mining. Scalable algorithms are increasingly important for these applications. We derive parallel algorithms with only a polylogarithmic number of parallel rounds for a variety of constraints including packing constraints and matroid constraints. These algorithms were developed by first taking a certain continuous viewpoint of submodular functions that I plan to discuss, which leads to clean and intuitive algorithms. Bio: Kent Quanrud is a PhD candidate in computer science at the University of Illinois at Urbana-Champaign, advised by Chandra Chekuri and Sariel Har-Peled. His research is in the design and analysis of algorithms, with interests ranging from classical topics such as combinatorial optimization and approximation algorithms to modern paradigms such as nearly linear time approximation algorithms and the interplay between discrete and continuous optimization.

Thursday, 28th February 2019 — Saeed Seddighin, University of Maryland
Campaigning via LPs: Solving Blotto and Beyond

Click to view Abstract

The competition between the Republican and the Democrat nominees in the U.S presidential election is known as Colonel Blotto in game theory. In the classical Colonel Blotto game -- introduced by Borel in 1921 -- two colonels simultaneously distribute their troops across multiple battlefields. The outcome of each battlefield is determined by a winner-take-all rule, independently of other battlefields. In the original formulation, the goal of each colonel is to win as many battlefields as possible. The Colonel Blotto game and its extensions have been used in a wide range of applications from political campaigns (exemplified by the U.S presidential election) to marketing campaigns, from (innovative) technology competitions, to sports competitions. For almost a century, there have been persistent efforts for finding the optimal strategies of the Colonel Blotto game, however it was left unanswered whether the optimal strategies are polynomially tractable. In this talk, I will present several algorithms for solving Blotto games in polynomial time and will talk about their applications in practice. Bio: Saeed Seddighin is a PhD candidate in computer science at the University of Maryland working under the supervision of MohammadTaghi Hajiaghayi. During his PhD, he spent a summer as a research intern at the Toyota Technological Institute and several semesters at the Simons Institute for the Theory of Computing as a visiting student. His research focuses on approximation algorithms and algorithmic game theory. His research has been supported by Ann. G. Wylie Fellowship, Algorithms Award, and the University of Maryland's Dean's Fellowship.

Tuesday, 5th March 2019 — Mahdi Cheraghchi, Imperial College London
Coding Theory in the Big Data Era

Click to view Abstract

The pioneering work of Shannon in 1948 introduced information and coding theory as a mathematical tool for studying communications systems. However, in recent decades, the theory has evolved into an indispensable tool in theoretical computer science and for studying virtually all aspects of information processing. In this talk, I will highlight recent progress in both combinatorial and algorithmic aspects. The deletion channel is a classical model of point-to-point communication in which a random subsequence of a transmitted sequence is delivered to the receiver. This model has been a subject of study in information theory since the 1960s, originally for the design of communications systems in the presence of synchronization errors. However, over the past decade, the problem has garnered major interest within the theoretical computer science community due to its rich combinatorial structure (in particular, its connection with the Edit Distance) and significance in emerging DNA storage applications. On the other hand, determining the information capacity of the deletion channel is generally regarded as one of the most challenging open problems in information theory. Until recently, the known nontrivial bounds on the capacity for positive deletion probabilities relied on extensive computer search that reveals little about the mathematical structure of the problem. I will give an overview of the recent successful and fully analytic attempts (starting [C., STOC'18, J. ACM]) towards an eventual characterization of the deletion capacity. Rather unexpectedly, this line of work suggests that understanding the deletion channel may require a better understanding of basic models for fiber optics communication, also among the most challenging open problems in information theory since the 1960s. In fact, the exact same techniques have been used to significantly advance the state of the art on both fronts. This showcases an interplay between techniques from different areas. I will continue the discussion of how the role of coding theory has evolved in the big data era by giving an example in algorithm design. I will present a recent algorithm for deterministically estimating the Fourier transform of high-dimensional data up to exponentially faster than the classical FFT method [C. and Indyk, SODA'16, T. ALG]. This demonstrates how error-correcting codes crucially work "under the hood" as a main technical tool for the design of algorithms for big data. This talk is aimed at a general audience; no technical background in information and coding theory will be assumed. Bio. Mahdi Cheraghchi is a Lecturer (US equivalent: Assistant Professor) of Computing at Imperial College London, UK. Before joining Imperial in 2015, he held post-doctoral appointments at UC Berkeley (the Simons Institute for the Theory of Computing), MIT (with Piotr Indyk), CMU (with Venkatesan Guruswami), the University of Texas at Austin (with David Zuckerman), as well as a visiting engineer position at Qualcomm (with Michael Luby). He obtained his Ph.D. in 2010 from EPFL (with Amin Shokrollahi), where he received the Patrick Denantes Memorial Prize for outstanding doctoral thesis in the School of Computer and Communication Sciences. Mahdi is broadly interested in all theoretical aspects of computer science, especially the role of information and coding theory in cryptography, complexity, algorithms, and high-dimensional geometry. He is a senior member of the ACM and IEEE.

Friday, 8th March 2019 — Alexander Block, Purdue University
On the Local Leakage Resilience of Linear Secret Sharing Schemes

Click to view Abstract

Recently in CRYPTO 2018, Benhamouda, Degwekar, Ishai, and Rabin introduce a new leakage model for cryptography: local leakage resilience. Inspired by a result of Guruswami and Wooters (IEEE Trans. Info. Theory), local leakage resilience answers the following question. If an adversary is allowed to obtain an arbitrary function of bounded output length from the secret state of every party participating in a secret sharing scheme, but obtains no other joint information about the secret states, does the secret remain hidden? In this talk, we discuss the motivating result of Guruswami and Wooters and the results of Benhamouda et al. Namely, for (large) prime p, both Additive Secret Sharing and high-threshold Shamir Secret Sharing are local leakage resilient. We further give perspective and intuition behind why small characteristic fields are susceptible to small local leakage resilience attacks, as is the case demonstrated by the result of Guruswami and Wooters for secret sharing over GF[2^k].

Wednesday, 20th March 2019 — Karthik Chandrasekaran, UIUC
Improving the smoothed complexity of FLIP for max cut problems

Click to view Abstract

Finding a locally optimal solution for max-cut is a PLS-complete problem with applications in game theory. An instinctive approach to finding a locally optimum max-cut is the FLIP algorithm. Even though the FLIP algorithm requires exponential time in worst-case instances, it tends to terminate quickly in practical instances. To explain this discrepancy, the run-time of the FLIP algorithm has been studied in the smoothed complexity framework. Etscheid and Roglin (SODA, 2014) showed that the smoothed complexity of the FLIP algorithm for max-cut in arbitrary graphs is quasi-polynomial. Angel, Bubeck, Peres, and Wei (STOC, 2017) showed that the smoothed complexity of the FLIP algorithm for max-cut in complete graphs is O(φ^5 n^15.1), where φ is an upper bound on the random edge-weight density and n is the number of vertices in the graph. In this work, we make substantial progress towards improving the run-time bound. We prove that the smoothed complexity of the FLIP algorithm in complete graphs is O(φn^7.83). Our techniques provide a general framework for analyzing the FLIP algorithm in the smoothed setting. We illustrate this general framework by showing that the smoothed complexity of the FLIP algorithm for max-3-cut in complete graphs is polynomial and for max-k-cut in arbitrary graphs is quasi-polynomial. Based on joint work with Ali Bibak and Charles Carlson and appeared at SODA 2019.

Tuesday, 26th March 2019 — Minshen Zhu, Purdue University
Markov Chain Mixing Times: Techniques and Applications

Click to view Abstract

Abstract: Markov chain Monte Carlo is a classic method for designing approximate samplers according to given target distributions. Mixing time is a measure of how fast a Markov chain converges to its stationary distribution, and it dictates the efficiency of our sampling algorithm. In this talk, I will introduce two main techniques for bounding the mixing time: (1) coupling and (2) canonical paths. I will showcase how to use the coupling technique to prove that Glauber Dynamics on q-colorings mixes rapidly (when q is not too small); and how to use canonical paths to show that Jerrum-Sinclair chain for sampling matchings mixes rapidly.

Wednesday, 27th March 2019 — Vasilis Gkatzelis, Drexel University
Cost-Sharing Methods for Scheduling Games under Uncertainty

Click to view Abstract

We study the performance of cost-sharing protocols in a selfish scheduling setting with load-dependent cost functions. Previous work on selfish scheduling protocols has focused on two extreme models: omnipotent protocols that are aware of every machine and every job that is active at any given time, and oblivious protocols that are aware of nothing beyond the machine they control. The main focus of this talk is on a well-motivated middle-ground model of "resource-aware" protocols, which are aware of the set of machines that the system comprises, but unaware of what jobs are active at any given time. Apart from considering budget-balanced protocols, to which previous work was restricted, we augment the design space by also studying the extent to which overcharging can lead to improved performance. Joint with George Christodoulou and Alkmini Sgouritsa.

Tuesday, 9th April 2019 — Akash Kumar, Purdue University
Random walks and forbidden minors II: A \poly(d\eps−1)-query tester for minor-closed properties of bounded degree graphs

Click to view Abstract

Let $G$ be a graph with $n$ vertices and maximum degree $d$. Fix some minor-closed property $\mathcal{P}$ (such as planarity). We say that $G$ is $\varepsilon$-far from $\mathcal{P}$ if one has to remove $\varepsilon dn$ edges to make it have $\mathcal{P}$. The problem of property testing $\mathcal{P}$ was introduced in the seminal work of Benjamini-Schramm-Shapira (STOC 2008) that gave a tester with query complexity triply exponential in $\varepsilon^{-1}$. Levi-Ron (TALG 2015) have given the best tester to date, with a quasipolynomial (in $\varepsilon^{-1}$) query complexity. It is an open problem to get property testers whose query complexity is $\poly(d\varepsilon^{-1})$, even for planarity. In this paper, we resolve this open question. For any minor-closed property, we give a tester with query complexity $d\cdot \poly(\varepsilon^{-1})$. The previous line of work on (independent of $n$, two-sided) testers is primarily combinatorial. Our work, on the other hand, employs techniques from spectral graph theory. This paper is a continuation of recent work of the authors (FOCS 2018) analyzing random walk algorithms that find forbidden minors.

Friday, 15th April 2019 — Lev Reyzin, UIC
Sampling Without Compromising Accuracy in Adaptive Data Analysis

Click to view Abstract

In this talk, I will discuss how to use sampling to speed up mechanisms for answering adaptive queries into datasets without reducing the accuracy of those mechanisms. In particular, I will describe a mechanism that provides a polynomial speed-up per query over previous mechanisms, without needing to increase the total amount of data required to maintain the same generalization error as before. This speed-up holds for arbitrary statistical queries. I will also give an even faster method for achieving statistically-meaningful responses wherein the mechanism is only allowed to see a constant number of samples from the data per query. Finally, I will discuss how these general results also yield a simple, fast, and unified approach for adaptively optimizing convex and strongly convex functions over a dataset. This work is joint with Benjamin Fish and Benjamin I. P. Rubinstein.

26th-27th April 2019Midwest Theory Day

Monday, 29th April 2019 — Samson Zhou, Indiana University
Numerical Linear Algebra in the Sliding Window Model

Click to view Abstract

We initiate the study of numerical linear algebra in the sliding window model, where only the most recent W updates in the data stream form the underlying set. Although most existing work in the sliding window model uses the smooth histogram framework, most interesting linear-algebraic problems are not smooth; we show that the spectral norm, vector induced matrix norms, generalized regression, and low-rank approximation are not amenable for the smooth histogram framework. To overcome this challenge, we first give a deterministic algorithm that achieves spectral approximation in the sliding window model that can be viewed as a generalization of smooth histograms, using the Loewner ordering of positive semidefinite matrices. We then give algorithms for both spectral approximation and low-rank approximation that are space-optimal up to polylogarithmic factors. Our algorithms are based on a new notion of "reverse online" leverage scores that account for both how unique and how recent a row is. We show that by sampling rows based on their reverse online leverage scores and repeatedly downsampling each time a new row arrives, we can both oversample rows with respect to their true leverage scores, and also bound the total number of rows stored. The downsampling procedure can be streamlined so that both our spectral approximation algorithm and our low-rank approximation algorithm run in input sparsity runtime, up to lower order factors. We show that our techniques have a number of applications to linear-algebraic problems in other settings. Specifically, we show that our analysis immediately implies an algorithm for low-rank approximation in the online setting that is space-optimal up to logarithmic factors, as well as nearly input sparsity time. We then show our deterministic spectral approximation algorithm can be used to handle L_1 spectral approximation in the sliding window model under a certain assumption on the bit complexity of the entries. Finally, we show that our downsampling framework can be applied to the problem of approximate matrix multiplication and provide upper and lower bounds that are tight up to log log W factors. Joint work with Vladimir Braverman, Petros Drineas, Jalaj Upadhyay, and David P. Woodruff.