Yep, you read that right – at the end of 2023, a chatbot using GPT-4 committed insider trading and then lied about it.
This one, thankfully, happened during a simulation and had no real world consequences. The AI (named “Alpha”) was informed about a “surprise merger announcement” for a fictional company, and then explicitly warned that this information was confidential and should not be used.
However, when Alpha was then instructed to prevent the company’s financial decline, it independently concluded that averting an organizational setback was more important than adhering to the rules, and leveraged the insider information to make a trade. Furthermore, when asked it whether it had used confidential tips, the AI bot dissembled and falsely claimed to have only used publicly available information.
Researchers state that it took a lot of prompting to get Alpha to use the confidential information, which is somewhat comforting. But the greater lesson from this gaffe is one that anyone who uses AI can learn from.
As researcher Marius Hobbhahn put it in an interview with the BBC:
The model isn’t plotting or trying to mislead you in many different ways. It’s more of an accident. Helpfulness, I think, is much easier to train into the model than honesty.
AI models are designed to be accommodating – they are created to supply answers, and will do so even if those answers are not always accurate (or are felonies!). Human oversight through techniques like reinforcement learning and reviewing processes can help catch these moments of over-helpfulness before they cause a disaster.
We wrote about how to train your AI agents – here are some tips to make sure they’re learning the right lessons: https://pickaxe.ai/2025/03/04/how-we-trained-your-ai-agent-part-2-reinforcement-learning/