FinQA: A Dataset of Numerical Reasoning over Financial Data

The sheer volume of financial statements makes it difficult for humans to access and analyze a business's financials. Robust numerical reasoning likewise faces unique challenges in this domain. In this work, we focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents. In contrast to existing tasks in the general domain, the finance domain involves complex numerical reasoning and an understanding of heterogeneous representations. To facilitate analytical progress, we propose a new large-scale dataset, FinQA, with Question-Answering pairs over Financial reports, written by financial experts. We also annotate the gold reasoning programs to ensure full explainability. We further introduce baselines and conduct comprehensive experiments on our dataset. The results demonstrate that popular, large, pre-trained models fall far short of expert humans...
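To make the idea of an annotated reasoning program concrete, the following is a minimal sketch of how such a program might be executed step by step. The step syntax (`op(arg, arg)` with `#k` referring to the result of step `k`), the four operation names, and the `run_program` helper are all assumptions made for illustration; the abstract does not specify the annotation format, and the numbers are invented.

```python
import re
from typing import List

# Hypothetical operation set (an assumption, not taken from the abstract).
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}

# Matches one step such as "subtract(120, 100)".
STEP = re.compile(r"(\w+)\(([^()]*)\)")


def run_program(program: str) -> float:
    """Execute a comma-separated list of steps and return the last result.

    An argument of the form "#k" is replaced by the result of step k
    (0-indexed); any other argument is parsed as a number.
    """
    results: List[float] = []
    for op_name, arg_text in STEP.findall(program):
        args = []
        for token in arg_text.split(","):
            token = token.strip()
            if token.startswith("#"):
                args.append(results[int(token[1:])])  # earlier step result
            else:
                args.append(float(token))  # numeric literal
        results.append(OPS[op_name](*args))
    return results[-1]


# Invented example: relative change from 100 to 120.
growth = run_program("subtract(120, 100), divide(#0, 100)")
print(growth)
```

Because every intermediate result is kept in `results`, each step of the answer can be inspected, which is the sense in which a program annotation makes the final number explainable.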