AgentTape
OpenAI: o3: AgentScore, benchmarks and signals