FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments
Submitted to EMNLP 2026
The first execution-grounded security benchmark for LLM-based financial agents, comprising 31 regulatory sandbox scenarios, 107 real-world vulnerabilities, and 963 test cases.
Recommended citation: Zhi Yang, Runguo Li, Qiqi Qiang, Jiashun Wang, Fangqi Lou, et al. (2026). "FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments." arXiv preprint arXiv:2601.07853. Submitted to EMNLP 2026.
