“Mitigating Lies in Vision-Language Models”, Junbo Li, Xianhang Li, Cihang Xie, 2023-05-05:

In this work, we bring new insights into the honesty of vision-language models, particularly in visual question answering (VQA). After a thorough revisiting of the existing ‘lie’ behavior in pure language models, our work extends the study of ‘lies’ to vision-language models for the first time.

The results indicate that lie prefixes have a stronger misleading effect on vision-language models than on language models. We also propose a novel visual prefix and show that a consistent vision-language prefix is more threatening to vision-language models.

To defend the models against the stated ‘lies’, we put forward an unsupervised framework based on Gaussian mixture modeling, obtaining improvements of +3% against the language prefix and +12% against the vision-language prefix.
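To illustrate the kind of unsupervised defense described here, the sketch below fits a two-component Gaussian mixture with EM to per-answer scores and flags the low-scoring component as suspect. This is a hypothetical stand-in, not the paper's actual method: the score definition, initialization, and thresholding are all assumptions made for the example.

```python
import math
import random


def em_gmm_1d(xs, iters=50):
    """Fit a two-component 1-D Gaussian mixture via EM.

    Hypothetical stand-in for the paper's unsupervised detector:
    xs would be per-answer scores (e.g. answer-consistency values),
    and the lower-mean component is treated as the 'lie' cluster.
    """
    xs_sorted = sorted(xs)
    n = len(xs)
    # Initialize means at the lower and upper quartiles (an assumption).
    mu = [xs_sorted[n // 4], xs_sorted[3 * n // 4]]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [
                pi[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            s = sum(dens) or 1e-12
            resp.append([d / s for d in dens])
        # M-step: re-estimate mixture weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            pi[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk,
                1e-6,
            )
    return mu, var, pi


def flag_lies(xs):
    """Flag each score as True when the low-mean ('lie') component is more likely."""
    mu, var, pi = em_gmm_1d(xs)
    lo = 0 if mu[0] < mu[1] else 1
    flags = []
    for x in xs:
        dens = [
            pi[k]
            * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
            / math.sqrt(2 * math.pi * var[k])
            for k in range(2)
        ]
        flags.append(dens[lo] > dens[1 - lo])
    return flags


# Synthetic demo: well-separated 'honest' vs. 'misled' score distributions.
random.seed(0)
honest = [random.gauss(0.9, 0.05) for _ in range(50)]
misled = [random.gauss(0.3, 0.05) for _ in range(50)]
flags = flag_lies(honest + misled)
```

With clearly separated score clusters, EM recovers the two modes without any labels, which is what makes such a defense unsupervised.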