Andrew McGregor

Associate Professor

Welcome to the homepage for CMPSCI 891M - Theory Seminar. The theory seminar is a weekly meeting in which topics of interest in the theory of computation - broadly construed - are presented. This is sometimes new research by visitors or local people. It is sometimes work in progress, and it is sometimes recent material of others that some of us present in order to learn and share. May be repeated for credit up to 6 times. 1 credit. Please email me if you'd like to get involved.

This semester the seminar will be held on Tuesdays from 12:45pm to 1:45pm in CMPS 140. The schedule of talks is below. The schedules for other semesters can be found here, here, and here.


Spring 2015

Date Speaker Title
Tue 20 Jan [Organization Meeting]
Tue 27 Jan [Snow Day]
Tue 3 Feb David Tench Vertex Connectivity in the Streaming Model

While research into streaming algorithms for graphs has been fruitful for problems such as edge connectivity, the related problem of vertex connectivity seems significantly more difficult and few useful results have been found. We discuss the nature of some of these difficulties and present two exciting results: a streaming algorithm to construct a reasonably-sized k-vertex connectivity certificate, and a streaming algorithm which provides a (1 + \epsilon) approximation of vertex connectivity.

Joint work with Sudipto Guha and Andrew McGregor.

Tue 10 Feb
Tue 17 Feb [Monday Schedule]
Tue 24 Feb
Thu 26 Feb (Room 151) Rong Ge, Microsoft Research Towards Provable and Practical Machine Learning

Many problems --- especially machine learning problems like sparse coding or topic modeling --- are hard in the worst case, but are nevertheless solved in practice by algorithms whose convergence properties are not understood. In this talk I will show how we can identify natural properties of "real-life" instances that allow us to design scalable algorithms for a host of well-known machine learning problems. Most of the talk will be focused on the sparse coding problem: a basic task in many fields including signal processing, neuroscience and machine learning where the goal is to learn a basis that enables a sparse representation of a given set of data, if one exists. Here we give a general framework for understanding alternating minimization which we leverage to analyze existing heuristics and to design new ones, also with provable guarantees.

Wed 4 Mar (12pm, CS140) Manish Purohit, U. Maryland Designing Robust Virtual Backbones via Partial Connected Dominating Sets

Over the past two decades, connected dominating sets have found wide applicability as virtual backbones in ad-hoc wireless networks. A connected dominating set is a good theoretical model for a virtual backbone as it satisfies both the required properties of (i) Accessibility: every node of the network is adjacent to a node in the backbone, and (ii) Connectivity: The backbone is connected.

However, in the presence of outliers in the network, a connected dominating set may lead to a prohibitively large backbone. In such a scenario, it is preferable to have a small backbone that is connected and provides service (accessibility) to a large fraction of the nodes in the network. In other words, we treat a small (user-defined) fraction of the nodes as "outliers" and do not care about connecting them to the backbone. This scenario leads to the following generalization of the classical connected dominating set problem that we call Partial Connected Dominating Set (PCDS): Given an unweighted graph G=(V,E) and a target quota q, find a minimum size subset of vertices S such that G[S] is connected and S dominates at least q vertices.
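The PCDS feasibility condition above can be made concrete with a small checker. This is only an illustrative sketch, not the algorithm from the talk: the function name, the adjacency-dictionary representation, and the example graph are all hypothetical, and it assumes the usual convention that a vertex is dominated if it lies in S or is adjacent to a vertex of S.

```python
from collections import deque


def is_partial_connected_dominating_set(adj, s, q):
    """Return True if G[s] is connected and s dominates at least q vertices.

    adj maps each vertex to the set of its neighbours (undirected graph).
    Assumption: a vertex is dominated if it is in s or adjacent to a vertex in s.
    """
    s = set(s)
    if not s:
        return q <= 0

    # Connectivity of the induced subgraph G[s], via BFS restricted to s.
    start = next(iter(s))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in s and v not in seen:
                seen.add(v)
                queue.append(v)
    if seen != s:
        return False

    # Count dominated vertices: members of s plus all their neighbours.
    dominated = set(s)
    for u in s:
        dominated |= adj[u]
    return len(dominated) >= q


# Hypothetical example: the path 0-1-2-3-4 plus an isolated "outlier" vertex 5.
path_with_outlier = {
    0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}, 5: set(),
}
```

On this example, the backbone {1, 2, 3} is connected and dominates the five path vertices, so it is feasible for quota q = 5 but not for q = 6. The checker only verifies a candidate solution; finding a minimum one is the hard part.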

In this talk, we will discuss an O(log n)-approximation algorithm for the PCDS problem via a novel "local" analysis of the greedy algorithm for set cover. We will also discuss a constant-factor approximation for a budgeted variant of the connected dominating set problem.

Bio: Manish Purohit is currently a PhD student in Computer Science at the University of Maryland, College Park, advised by Samir Khuller. His primary research interests lie in developing approximation algorithms for NP-hard optimization problems on networks.

Mon 9 Mar Michael Kapralov, IBM Sublinear Algorithms for Modern Graph Analysis

Graphs are a common abstraction for representing large social and information networks, and a powerful set of algorithmic primitives has been developed for their analysis. As the sizes of modern datasets grow, however, many classical polynomial time (and sometimes even linear time) solutions become prohibitively expensive. This calls for sublinear algorithms, i.e. algorithms whose resource requirements are substantially smaller than the size of the input that they operate on.

In this talk, we describe a new approach to the problem of approximating the size of maximum matching in a large graph given as a stream of edge updates using sublinear space. The matching problem is one of the most well-studied questions in combinatorial optimization, and has important applications in modern big data analysis (e.g. online advertising). We obtain a polylogarithmic approximation to maximum matching size using space sublinear in the number of *vertices* in the graph and exponentially smaller than previously known. This is the first algorithm for a graph problem that achieves truly sublinear space in the streaming setting, suggesting the possibility of a new class of more efficient graph analysis primitives.

Tue 10 Mar
Thu 12 Mar Ruta Mehta, Georgia Tech Games, Equilibria, and Evolution

The tremendous growth of online markets, ad auctions, and social network communities, where agents interact to achieve their own, often selfish, goals, has created a need to apply game theoretic solution concepts more than ever before. In this talk I will discuss the computational and application aspects of one of the most important solution concepts in game theory, namely Nash equilibrium. Recently a remarkable connection was discovered between evolution under sexual reproduction and coordination games. Proceeding along these lines I will show some new insights on genetic diversity.

Towards efficient computation, finding a Nash equilibrium in a two-player normal-form game (2-Nash) is one of the most extensively studied problems. Such a game can be represented by two payoff matrices A and B, one for each player. 2-Nash is PPAD-complete in general, while in the case of zero-sum games (B=-A) the problem reduces to LP and hence is in P. Extending the notion of zero-sum, in 2005, Kannan and Theobald defined the rank of a game (A, B) as rank(A+B), e.g., rank-0 games are zero-sum games. They asked for an efficient algorithm for constant rank games, where the primary difficulty was the disconnected solution set, even in rank-1 games. I will answer this question affirmatively for rank-1 games, and negatively for games with rank three or more (unless PPAD=P); the status of rank-2 games remains unresolved. In the process I obtain a number of other results, including a simpler proof of PPAD-hardness for 2-Nash.
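The definition rank(A+B) is easy to compute directly. The sketch below is illustrative only: the helper names and the two example games are hypothetical, and exact rational Gaussian elimination is just one convenient way to get the rank without floating-point issues.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def game_rank(a, b):
    """Rank of the bimatrix game (A, B), defined as rank(A + B)."""
    total = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return matrix_rank(total)


# Matching pennies is zero-sum (B = -A), so A + B is the zero matrix.
pennies_a = [[1, -1], [-1, 1]]
pennies_b = [[-1, 1], [1, -1]]

# Hypothetical rank-1 game: A + B = [[1, 3], [2, 6]], whose rows are proportional.
rank1_a = [[1, 0], [0, 1]]
rank1_b = [[0, 3], [2, 5]]
```

Here `game_rank(pennies_a, pennies_b)` is 0 and `game_rank(rank1_a, rank1_b)` is 1. Computing the rank is the easy step; the talk concerns the complexity of finding equilibria once the rank is fixed.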

Tue 17 Mar [Spring Break]
Mon 23 Mar Shi Li, TTI Chicago Towards better approximation algorithms for classic combinatorial optimization problems

Over the past three decades, approximation algorithm techniques have been successful in analyzing the approximability of many problems. However, many other classic problems are still poorly understood in terms of approximability. Long-standing gaps between approximation ratios and hardness of approximation results present challenges to our algorithmic techniques. I will present two very different examples from my work to illustrate how various techniques were used to tackle long-standing gaps.

The first example deals with the problem of connecting pairs of senders and receivers in a communication network with limited resources. A simple model is the edge-disjoint paths problem, in which we need to connect as many pairs as possible using edge-disjoint paths. Despite many years of study, the best approximation ratio for this problem is only O(sqrt(n)). I will show how our work circumvented barriers for this problem by allowing congestion 2 for the connections.

The second example deals with the classic k-median problem. Our work improved the previous decade-old (3+epsilon)-approximation algorithm. In particular, we show how allowing k+O(1) medians overcomes natural barriers, resulting in our improved approximation ratio.

Tue 24 Mar Neil Immerman Dynamic Reasoning

Reasoning about reachability -- can we get to $b$ from $a$ by following a sequence of pointers -- is crucial for proving that programs meet their specifications. However, reasoning about reachability in general is undecidable. A related problem concerns reachability in dynamic graphs (a dynamic graph is a graph in which edges are added or deleted over time). It would be useful to revise reachability relations with minimal recomputation, and indeed, in undirected graphs and functional graphs, this can be done efficiently.

The problems of reasoning about reachability in programs and computing reachability in dynamic graphs are connected. For some graphs we can keep track of reachability after each small change in a quantifier-free language. This leads to an automatic way to check the correctness of programs whose data structures are appropriately simple. In this talk, I discuss progress and open questions concerning where such simple dynamic reasoning is possible and where it is not.

Tue 31 Mar Sofya Vorotnikova
Tue 7 Apr Kevin Spiteri Near-Optimal Algorithms for Bitrate Selection for Online Videos

Modern video players employ complex algorithms to choose the bitrate of the video that is shown to the user. Bitrate selection for video content delivery requires a tradeoff between reducing the probability that the video freezes and enhancing the quality of the video shown to the user. Too high a bitrate leads to frequent video freezes (i.e. rebuffering) while too low a bitrate leads to poor video quality. Providers segment video into short chunks and encode each chunk at multiple bitrates. The video player selects a bitrate for each chunk, possibly choosing different bitrates for successive chunks. Current state-of-the-art video player implementations use ad-hoc algorithms to select the bitrates. We propose an algorithm that uses the Lyapunov optimization paradigm to provide a guarantee of optimality. The objective function is designed to minimize rebuffering and maximize video quality, and the algorithm achieves a time-average utility that is within an additive term O(1/V) of the optimal value for some constant V that is related to the video buffer size. Our work has immediate implications for how real-world video players are designed and for the evolving new DASH standard for video transmission.

Joint work with Ramesh Sitaraman (UMass) and Rahul Urgaonkar (IBM).

Tue 14 Apr Hoa Vu Finding densest sub-graph via graph stream

Finding dense components is an important problem in large graph analysis. For massive and dynamic graphs, it is natural to study this problem in the streaming setting.

For any subgraph $S$ of $G$, the density of $S$ is given by $f(S) = |E(S)| / |S|$. The task is to compute the maximum density of a graph $G$, i.e., $f(G) = \max_{S \subseteq V} f(S)$, over a stream of edge insertions and deletions using only $O(n \, \mathrm{polylog}\, n)$ bits of space.

Bahmani et al. (VLDB '12) gave an $O(\epsilon^{-2} \log n)$-pass algorithm that computes $(2 \pm \epsilon) f(G)$ and proved a lower bound of $\Omega(n)$ on the space required. Very recently, in STOC '15, Bhattacharya et al. provided a single-pass algorithm that achieves the same result.
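To make the density objective $f(S) = |E(S)| / |S|$ concrete, here is a small offline sketch. It is not any of the streaming algorithms mentioned above: it shows an exponential-time brute force for $f(G)$ and the classical offline greedy peeling procedure, which is known to return a density of at least $f(G)/2$. All function names and the example graph are hypothetical, and the code assumes a simple graph with integer vertex labels and no self-loops.

```python
from itertools import combinations


def density(edges, s):
    """f(S) = |E(S)| / |S|, where E(S) is the set of edges with both ends in S."""
    s = set(s)
    inside = sum(1 for u, v in edges if u in s and v in s)
    return inside / len(s)


def max_density_brute_force(vertices, edges):
    """Exact f(G) by trying every nonempty vertex subset (exponential time)."""
    vertices = list(vertices)
    return max(
        density(edges, subset)
        for k in range(1, len(vertices) + 1)
        for subset in combinations(vertices, k)
    )


def greedy_peeling(vertices, edges):
    """Repeatedly delete a minimum-degree vertex; return the best density seen."""
    remaining = set(vertices)
    live_edges = list(edges)
    best = 0.0
    while remaining:
        best = max(best, len(live_edges) / len(remaining))
        degree = {v: 0 for v in remaining}
        for u, v in live_edges:
            degree[u] += 1
            degree[v] += 1
        # Ties are broken by vertex label so the run is deterministic.
        victim = min(remaining, key=lambda v: (degree[v], v))
        remaining.remove(victim)
        live_edges = [(u, v) for u, v in live_edges if victim not in (u, v)]
    return best


# Hypothetical example: a 4-clique on {0, 1, 2, 3} with a pendant path 3-4-5.
example_vertices = range(6)
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
```

On this example the densest subgraph is the 4-clique, with density 6/4 = 1.5, and greedy peeling happens to find it exactly after removing the two path vertices.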

Tue 21 Apr [Faculty Candidate Talk]
Tue 28 Apr Cibele Freire A Characterization of the Complexity of Resilience and Responsibility for Conjunctive Queries

The concept of the responsibility that a tuple $\vec t$ bears for the answer of a query $q$ on a database $D$ was defined in [1]. It was argued there that responsibility is key in determining why or why not a given query answer occurs. It was also stated that there is a dichotomy in the complexity of computing responsibility: for any self-join free conjunctive query (sj-free CQ), $q$, the complexity of computing the responsibility of $q$ is either in PTIME or is NP-complete, depending on the structure of $q$. Originally, the goal of this work was to extend that result to show that the dichotomy still holds in the presence of functional dependencies (FDs).

In attempting to prove this theorem, we found that responsibility is a more subtle concept than we previously thought. In particular we found some errors in the proof of the dichotomy of the complexity of responsibility in [1]. In attempting to repair the errors, we defined a simpler and more robust concept which we call resilience. We prove that the complexity of resilience enjoys a dichotomy: for any sj-free CQ, $q$, the resilience of $q$ is either NP-complete or in PTIME. We show that this dichotomy continues to hold in the presence of FDs.

Next, we used our insights from the work on resilience to prove that the complexity of responsibility is strictly greater than that of resilience (assuming P $\ne$ NP). We proved that a dichotomy does indeed hold for responsibility. Finally, we achieved our original goal, showing that the dichotomy for responsibility still holds in the presence of FDs.

Joint work with Neil Immerman, Alexandra Meliou and Wolfgang Gatterbauer (CMU).

[1]: A. Meliou, W. Gatterbauer, K. F. Moore, and D. Suciu. The complexity of causality and responsibility for query answers and non-answers. PVLDB, 4(1):34-45, 2010.

Thu 30 Apr Ted Leone Conclusions and Approaches regarding Subproblems of Grid Graph Reachability

The problem of Layered Grid Graph Reachability, Grid Graph Reachability restricted to only east and south edges, has interesting equivalent formulations in terms of database operations on a list. Join Ted Leone for a presentation outlining these formulations, properties of the symmetric group, as well as some conclusions about complexity as outlined in his undergraduate honors thesis.
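As a point of reference for the problem statement, reachability in a grid graph with only east and south edges can be decided by a single row-major sweep. This is only a sketch of the problem itself, not of the formulations or complexity results in the thesis: the function name and the edge-set encoding are hypothetical, and the sweep uses space linear in the grid size.

```python
def layered_grid_reachable(rows, cols, east, south, source, target):
    """Decide reachability in a grid graph whose only edges point east or south.

    east  contains (r, c) when the edge (r, c) -> (r, c + 1) is present.
    south contains (r, c) when the edge (r, c) -> (r + 1, c) is present.
    """
    reach = [[False] * cols for _ in range(rows)]
    reach[source[0]][source[1]] = True
    # Row-major order is a topological order: every edge goes east or south.
    for r in range(rows):
        for c in range(cols):
            if not reach[r][c]:
                continue
            if c + 1 < cols and (r, c) in east:
                reach[r][c + 1] = True
            if r + 1 < rows and (r, c) in south:
                reach[r + 1][c] = True
    return reach[target[0]][target[1]]


# Hypothetical 3x3 instance with the staircase path
# (0,0) -> (0,1) -> (1,1) -> (1,2) -> (2,2).
example_east = {(0, 0), (1, 1)}
example_south = {(0, 1), (1, 2)}
```

In this instance (2, 2) is reachable from (0, 0), while (2, 0) is not, since no south edge leaves the first column.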


Spring 2013

Date Speaker Title Material
Tue 22 Jan [Organization Meeting]
Tue 29 Jan Michael Crouch Error-correcting codes as pseudorandom generators Abstract
Tue 5 Feb Daniel Stubbs Bulldozers and Teleporters: Sketching Earth Mover Distance on Graph Metrics Abstract
Th 7 Feb Tasos Sidiropoulos Simplification of metric spaces and its algorithmic applications Abstract
Tue 12 Feb Marco Carmosino Relating Closure Properties of #L to the Complexity of Linear Algebra Abstract
Th 21 Feb Moritz Hardt When Machines Learn About Humans Abstract
Mon 25 Feb Alina Ene Submodular cost allocation: applications, algorithms and hardness results
Th 28 Feb Michael Dinitz Approximating Spanners via Convex Relaxations Abstract
Th 7 Mar Raghu Meka Structure and Geometry of Randomness Abstract
Mon 11 Mar Jacob Abernethy Learning in an Adversarial World, with Connections to Pricing, Hedging, and Routing Abstract
Tue 26 Mar Kurt Rohloff, BBN Technologies Enabling Secure Computing through Fully Homomorphic Encryption Abstract
Tue 9 Apr Cibele Freire On the complexity of computing responsibility for database queries Abstract

Fall 2012

Tue., Sep. 11 Andrew McGregor Massive Graph Synopses and Dimensionality Reduction
Room 151 with lunch provided as part of the Faculty Research Seminar series
Tue., Sep. 18 [Organizational Meeting]
Tue., Oct. 2 Michael Crouch Communication Complexity Lower Bounds for Space/Sampling Tradeoffs
Tue., Oct. 16 Nicholas Schachter Analysis of Rank-Deficient Matrix Inversion Algorithms
Tue., Oct. 23 Catherine McGeoch Optimal Auctions with Correlated Bids
Wed., Oct. 24 Daniel Spielman Algorithms, Graph Theory, and the Solution of Laplacian Linear Equations, 4pm, Room 151
Tue., Nov. 13 David Mix Barrington Approximate Majority, Certifying Polynomials, and AND-OR-XOR Circuits
Tue., Nov. 20 Graham Cormode Room 151: Data-driven concerns in Private Data Release
Tue., Nov. 27 Conor Power Some Fun Results in Arithmetic Circuit Complexity
Tue., Nov. 27, 4 p.m. Catherine McGeoch Department Colloquium: Tuning Algorithms
Tue., Dec. 4 Marco Carmosino On the Complexity of the Determinant

Spring 2012

Date Speaker Title Material
24 January [Organizational Meeting]
Feb 7 Boulat Bash When Can You Hide a Message in the Noise?
Feb 14 Marco Carmosino Logspace Counting Classes
Feb 21 David Mix Barrington Arithmetic Circuit Classes (.key) (.pdf)
2pm 27 Feb Jelani Nelson (Room 151)
Mar 6 Michael Crouch Streaming Sketches in the Sliding Windows Model

Fall 2011

Date Speaker Title Material
Sept 6 Organization Meeting
Sept 13 Skip Jordan
Sept 20 Michael Crouch
Oct 4 Andrew McGregor
Oct 18 Ramesh Sitaraman
Nov 1 Marco Carmosino
Nov 8 Mikkel Thorup
Nov 15 Cibele Freire
Nov 29 Vinay Shah
Dec 6 John Bowers

Spring 2011

Date Speaker Title Material
Tue 18 Jan (Snow Day)
Tue 25 Jan Brandon McPhail Submodular functions and energy efficient task scheduling Abstract
Tue 1 Feb (Faculty Candidate)
Tue 8 Feb David Mix Barrington Williams' Lower Bound for Non-Uniform ACC Abstract
Tue 15 Feb Neil Immerman P versus NP: Approaches, Rebuttals, and Does It Matter? Abstract, Slides
Tue 22 Feb (UMass Monday)
Tue 1 Mar (Faculty Candidate)
Tue 8 Mar Philipp Weis Expressiveness and Succinctness of First-Order Logic on Finite Words Abstract
Tue 15 Mar (Spring Break)
Tue 22 Mar Mark McCartin-Lim Models of Parallel Computation Abstract
Tue 29 Mar Md. Ashraful Alam Shape analysis of points in the streaming model Abstract
Tue 5 Apr Ramesh Sitaraman How to stream live on the Internet Abstract
Tue 12 Apr Michael Crouch Streaming algorithms for context-free language recognition Abstract
Tue 19 Apr Andrew McGregor Some Recent Results on Graph Streams
Thu 28 Apr Marco Carmosino Uniform Circuit Lower Bounds Using Logical Pebble Games Abstract
Tue 3 May Jian Tian Cuckoo Hashing
Tue 10 May Shin-ichi Tanigawa Constant-time algorithms for sparsity matroids

Spring 2010

Date Speaker Title Material
Tue 19 Jan No seminar (SODA conference)
Tue 26 Jan Scheduling meeting
Tue 2 Feb Joshua Brody, Dartmouth Better Gap Hamming lower bounds via Better Round Elimination Abstract, Paper
Tue 9 Feb Daniel S. Menasche, UMass Reciprocity and Barter in Peer-to-Peer Systems Abstract, Paper
Tue 16 Feb No seminar (Monday schedule)
Tue 23 Feb Marco Carmosino, Hampshire College SAT Solvers & Proof Complexity Abstract
Tue 2 Mar
Tue 9 Mar Nicolas Scarrci, UMass The Asymmetric p-Center Problem Paper
Tue 16 Mar No seminar (Spring Break)
Tue 23 Mar No seminar (Faculty Candidate Talk)
Tue 30 Mar No seminar
Thu 1 Apr Kamalika Chaudhuri, UCSD (Faculty Candidate) Algorithmic Challenges in Machine Learning
Tue 6 Apr Brandon McPhail, UMass Optimal Stopping and the Odds Algorithm Abstract
Tue 13 Apr Adam Smith, PSU
Tue 20 Apr Creidieki Crouch, UMass Ongoing research in data streams
Tue 27 Apr Mark McCartin-Lim, UMass Ongoing research in data streams
Tue 4 May Philipp Weis, UMass Succinctness of First-Order Logics Abstract