Draft README for WebSRC · dadwadw233/lmms-eval@829612c

@@ -1 +1,51 @@
1 -# WebSRC
1 +# WebSRC
2 +
3 +## Paper
4 +
5 +Title: WebSRC: A Dataset for Web-Based Structural Reading Comprehension
6 +
7 +Abstract: https://arxiv.org/abs/2101.09465
8 +
9 +Homepage: https://x-lance.github.io/WebSRC/#
10 +
11 +WebSRC is a dataset for web-based structural reading comprehension.
12 +Its full train/dev/test split contains over 400k questions across 6.4k webpages.
13 +This version of the dataset does not contain OCR or the original HTML; it simply treats WebSRC as an image-and-text-based multimodal Q&A benchmark on webpage screenshots.
14 +
15 +## Citation
16 +
17 +```bibtex
18 +@inproceedings{chen2021websrc,
19 + title={WebSRC: A Dataset for Web-Based Structural Reading Comprehension},
20 + author={Chen, Xingyu and Zhao, Zihan and Chen, Lu and Ji, Jiabao and Zhang, Danyang and Luo, Ao and Xiong, Yuxuan and Yu, Kai},
21 + booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
22 + pages={4173--4185},
23 + year={2021}
24 +}
25 +```
26 +
27 +## Groups & Tasks
28 +
29 +### Groups
30 +
31 +- `websrc`: Evaluates `websrc-val` and generates a submission file for `websrc-test`.
32 +
33 +### Tasks
34 +
35 +- `websrc-val`: Given a question and a web page, predict the answer.
36 +- `websrc-test`: Given a question and a web page, predict the answer. Ground truth is not provided for this task.
37 +
38 +## Metrics
39 +
40 +This task uses SQuAD-style evaluation and reports only the token-level F1 score.
41 +The original paper also reports Exact Match (EM), but EM is not implemented here because it is better suited to encoder-only extractive models.
42 +
43 +### F1 Score
44 +
45 +F1 Score is the harmonic mean of precision and recall.
46 +We calculate precision and recall at the token level, then compute F1 from those values as `2 * precision * recall / (precision + recall)`.
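
The token-level F1 described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: it assumes SQuAD v1.1-style answer normalization (lowercasing, stripping punctuation and English articles), and the helper names `normalize_answer` and `token_f1` are hypothetical. The task's actual implementation may normalize differently.

```python
import re
import string
from collections import Counter
from typing import List


def normalize_answer(text: str) -> List[str]:
    """Lowercase, drop punctuation and English articles, then split on whitespace.

    Assumption: this mirrors SQuAD v1.1-style normalization.
    """
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction)
    ref_tokens = normalize_answer(reference)

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("blue car", "red car")` shares one of two tokens on each side, so precision and recall are both 0.5 and the function returns 0.5.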
47 +
48 +### Test Submission
49 +
50 +When evaluating on the test split, a prediction JSON file is compiled instead of computing metrics.
51 +Instructions for submission are available on the [WebSRC homepage](https://x-lance.github.io/WebSRC/#) and in the [original GitHub repo](https://github.com/X-LANCE/WebSRC-Baseline#obtain-test-result).