“I Know What You Bought At Chipotle for $9.81 by Solving A Linear Inverse Problem”, Michael Fleder, Devavrat Shah, 2020-12-01:

We consider the question of identifying which set of products are purchased and at what prices in a given transaction by observing only the total amount spent in the transaction, and nothing more. The ability to solve such an inverse problem can lead to refined information about consumer spending by simply observing anonymized credit card transactions data.

Indeed, when considered in isolation, it is impossible to identify the products purchased and their prices from a given transaction just based on the transaction total. However, given a large number of transactions, there may be a hope. As the main contribution of this work, we provide a robust estimation algorithm for decomposing transaction totals into the underlying, individual product(s) purchased by using a large corpus of transactions.

Our method recovers a (product prices) vector p ∈ ℝ_{>0}^N of unknown dimension (number of products) N as well as matrix A ∈ ℤ_{≥0}^{M×N} simply from M observations (transaction totals) y ∈ ℝ_{>0}^M such that y = Ap + η with η ∈ ℝ^M representing noise (taxes, discounts, etc). We formally establish that our algorithm identifies N, A precisely and p approximately, as long as each product is purchased individually at least once, i.e., M ≥ N and A has rank N. Computationally, the algorithm runs in polynomial time (with respect to problem parameters), and thus we provide a computationally efficient and statistically robust method for solving such inverse problems.
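
To make the model y = Ap + η concrete, the following minimal Python sketch covers only the easiest sub-problem: recovering a single row of A (the item counts in one transaction) when the prices p are already known. It is not the authors' algorithm, which must also estimate N and p from the totals alone. The function name `decompose`, the menu prices, the per-item cap `max_qty`, and the tolerance `tol_cents` (a crude stand-in for the noise η) are all illustrative assumptions.

```python
from itertools import product


def decompose(total_cents, prices_cents, max_qty=3, tol_cents=0):
    """Brute-force search for one row of A, given known prices p.

    Returns every quantity vector a (each entry in 0..max_qty) whose
    priced total a . p lies within tol_cents of the observed total.
    Working in integer cents avoids floating-point comparison issues.
    """
    matches = []
    for qty in product(range(max_qty + 1), repeat=len(prices_cents)):
        priced = sum(q * p for q, p in zip(qty, prices_cents))
        if abs(priced - total_cents) <= tol_cents:
            matches.append(qty)
    return matches


# Hypothetical prices in cents (not taken from the paper).
prices = [725, 256, 150]
print(decompose(981, prices))       # [(1, 1, 0)]

# A single total can be ambiguous: two different baskets cost 450 cents.
print(decompose(450, [150, 300]))   # [(1, 1), (3, 0)]
```

The second call shows why one transaction in isolation is not enough: several baskets can produce the same total. The search is also exponential in the number of products, so it serves only as an illustration of the model, not as a substitute for the polynomial-time method described above.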

We apply the algorithm to a large corpus of anonymized consumer credit card transactions in the period 2016–2019, with data obtained from a commercial data vendor. The transactions are associated with spending at Apple, Chipotle, Netflix, and Spotify. From just transactions data, our algorithm identifies (1) key price points (without access to the listed prices), (2) products purchased within a transaction, (3) product launches, and (4) evidence of a new ‘secret’ product from Netflix, rumored to be in limited release.

[Keywords: blind compressed sensing, alternative data, finance, consumer credit card transactions]