On the Vulnerability of Concept Erasure in Diffusion Models

In this ongoing work, we have developed an adversarial attack algorithm that reliably recovers erased concepts from diffusion models that have been “unlearned” using a wide range of concept-erasure techniques.