ℓ_{1}-Heavy Hitters in Insertion Streams and Related Problems

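We give the first optimal bounds for returning the ℓ_{1}-heavy hitters in a data stream of insertions, together with their approximate frequencies, closing a long line of work on this problem. For a stream of m items in {1, 2, …, n} and parameters 0 < ε < φ ⩽ 1, let f_{i} denote the frequency of item i, i.e., the number of times item i occurs in the stream. With arbitrarily large constant probability, our algorithm returns all items i for which f_{i} ⩾ φm, returns no items j for which f_{j} ⩽ (φ − ε)m, and returns approximations f̃_{i} with |f̃_{i} − f_{i}| ⩽ εm for each item i that it returns. Our algorithm uses O(ε^{−1} log φ^{−1} + φ^{−1} log n + log log m) bits of space, processes each stream update in O(1) worst-case time, and can report its output in time linear in the output size. We also prove a lower bound, which implies that our algorithm is optimal up to a constant factor in its space complexity. A modification of our algorithm can be used to estimate the maximum frequency up to an additive εm error in the above amount of space, resolving Question 3 in the IITK 2006 Workshop on Algorithms for Data Streams for the case of ℓ_{1}-heavy hitters. We also introduce several variants of the heavy hitters and maximum frequency problems, inspired by rank aggregation and voting schemes, and show how our techniques can be applied in such settings. Unlike the traditional heavy hitters problem, some of these variants look at comparisons between items rather than numerical values to determine the frequency of an item.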
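To make the (ε, φ)-heavy hitters guarantee concrete, the following minimal sketch uses the classical Misra-Gries summary as a stand-in. It is not the algorithm of this paper and does not achieve its space bound: it stores about 1/ε counters, each holding an item identifier and a count. It does illustrate the output contract, reporting every item whose frequency is at least φm, no item whose frequency is at most (φ − ε)m, and estimates within εm of the true frequency. The function names `misra_gries` and `heavy_hitters` are hypothetical, chosen for illustration.

```python
import math


def misra_gries(stream, eps):
    """Misra-Gries summary with k = ceil(1/eps) counters (illustrative baseline).

    Returns (counters, m), where m is the stream length and every estimate
    satisfies f_i - m/(k+1) <= counters.get(i, 0) <= f_i.
    """
    k = math.ceil(1 / eps)
    counters = {}
    m = 0
    for item in stream:
        m += 1
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # Table is full: decrement every counter and drop those at zero.
            # The arriving item is discarded together with k other occurrences.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters, m


def heavy_hitters(stream, eps, phi):
    """Report items whose estimate exceeds (phi - eps) * m.

    Since estimates never overcount, no item with f_j <= (phi - eps) * m is
    reported; since they undercount by less than eps * m, every item with
    f_i >= phi * m is reported.
    """
    counters, m = misra_gries(stream, eps)
    threshold = (phi - eps) * m
    return {item: est for item, est in counters.items() if est > threshold}
```

For example, calling `heavy_hitters(stream, eps=0.1, phi=0.3)` on a stream of 100 items in which one item occurs 50 times and another 30 times returns exactly those two items, each with an estimate that undercounts by at most 10.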