https://arxiv.org/abs/2401.04536

Evaluating Language Model Agency through Negotiations

We introduce an approach to evaluate language model (LM) agency using negotiation games. This approach better reflects real-world use cases and addresses some of the shortcomings of alternative LM benchmarks. Negotiation games enable us to study multi-turn ...

Oh, this is exactly what I had in mind: how should an agent be evaluated in a way that suits multi-turn settings, benchmark ..