“Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior”, Adam Block, Ali Jadbabaie, Daniel Pfrommer, Max Simchowitz, Russ Tedrake (2023-07-27):

We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling. Our framework invokes low-level controllers—either learned or implicit in position-command control—to stabilize imitation around expert demonstrations.

We show that with (1) a suitable low-level stability guarantee and (2) a powerful enough generative model as our imitation learner, pure supervised behavior cloning can generate trajectories matching the per-time step distribution of essentially arbitrary expert trajectories in an optimal transport cost. Our analysis relies on a stochastic continuity property of the learned policy we call total variation continuity (TVC).
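Informally, TVC is a Lipschitz-type condition in total variation distance: nearby states must induce nearby action distributions. A schematic rendering (the paper's formal definition is stated over composite states, and the modulus \(L\) here is a hypothetical constant for illustration) is:

```latex
\mathrm{TV}\big(\pi(\cdot \mid x),\, \pi(\cdot \mid x')\big) \;\le\; L\,\|x - x'\| \qquad \text{for all states } x,\, x'.
```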

We then show that TVC can be ensured with minimal degradation of accuracy by combining a popular data-augmentation regimen with a novel algorithmic trick: adding augmentation noise at execution time. We instantiate our guarantees for policies parameterized by diffusion models and prove that if the learner accurately estimates the score of the (noise-augmented) expert policy, then the distribution of imitator trajectories is close to the demonstrator distribution in a natural optimal transport distance.
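The data-augmentation regimen and the execution-time trick can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the Gaussian noise model, and the noise scale `SIGMA` are all assumptions made for the sketch. The key idea is that the same smoothing noise is applied both to training inputs and to states at execution time, so the policy is only ever queried on the noise-smoothed distribution it was trained on.

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 0.1  # augmentation noise scale (hypothetical value)

def augment_dataset(states, actions, sigma=SIGMA, rng=rng):
    """Training-time augmentation: perturb observed states with Gaussian
    noise while keeping the expert action as the regression target."""
    noisy_states = states + sigma * rng.standard_normal(states.shape)
    return noisy_states, actions

def execute(policy, state, sigma=SIGMA, rng=rng):
    """Execution-time trick: inject noise from the same distribution into
    the state before querying the policy, matching test-time inputs to
    the noise-smoothed training distribution."""
    noisy_state = state + sigma * rng.standard_normal(state.shape)
    return policy(noisy_state)

# Toy usage with an identity "policy".
states = np.zeros((5, 3))
actions = np.ones((5, 2))
noisy_states, targets = augment_dataset(states, actions)
action = execute(lambda s: s, np.zeros(3))
```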

Our analysis constructs intricate couplings between noise-augmented trajectories, a technique that may be of independent interest.
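The per-time-step optimal transport closeness in the guarantee above can be probed empirically: compare the marginal distribution of imitator states against expert states at each time step. A toy illustration using SciPy's one-dimensional Wasserstein-1 distance (the sampled Gaussians and sample sizes here are invented for the sketch, not from the paper's experiments):

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)

# Toy per-time-step marginals: 1-D positions of expert vs. imitator
# rollouts, 500 sampled trajectories over a horizon of T steps.
T = 10
expert = rng.normal(loc=0.0, scale=1.0, size=(T, 500))
imitator = rng.normal(loc=0.05, scale=1.0, size=(T, 500))

# Wasserstein-1 (optimal transport) cost between the marginals
# at each time step.
per_step_cost = [wasserstein_distance(expert[t], imitator[t]) for t in range(T)]
max_cost = max(per_step_cost)
```

For multi-dimensional states one would use a full OT solver instead of the 1-D closed form, but the per-time-step structure of the comparison is the same.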

We conclude by empirically validating our algorithmic recommendations and discussing implications for future research on better behavior cloning with generative modeling.