On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective

Published in arXiv, 2023

In this article, we systematically evaluate the adversarial robustness and out-of-distribution generalization of ChatGPT and other large language models. We find that, although ChatGPT improves significantly on previous language models in both adversarial robustness and out-of-distribution generalization, it still has some distance to go before it can be considered securely deployable.

Download paper here