# WebSRC

## Paper

Title: WebSRC: A Dataset for Web-Based Structural Reading Comprehension

Abstract: https://arxiv.org/abs/2101.09465

Homepage: https://x-lance.github.io/WebSRC/#

WebSRC is a dataset for web-based structural reading comprehension.
Its full train/dev/test split contains over 400k questions across 6.4k webpages.
This version of the dataset does not contain OCR output or the original HTML; it simply treats WebSRC as an image-and-text multimodal Q&A benchmark on webpage screenshots.

## Citation

```bibtex
@inproceedings{chen2021websrc,
    title={WebSRC: A Dataset for Web-Based Structural Reading Comprehension},
    author={Chen, Xingyu and Zhao, Zihan and Chen, Lu and Ji, Jiabao and Zhang, Danyang and Luo, Ao and Xiong, Yuxuan and Yu, Kai},
    booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
    pages={4173--4185},
    year={2021}
}
```

## Groups & Tasks

### Groups

- `websrc`: Evaluates `websrc-val` and generates a submission file for `websrc-test`.

### Tasks

- `websrc-val`: Given a question and a web page, predict the answer.
- `websrc-test`: Given a question and a web page, predict the answer. Ground truth is not provided for this task.

## Metrics

This task uses SQuAD-style evaluation metrics; of these, token-level F1 is used.
The original paper also reports an Exact Match (EM) score, but it is not implemented here, as that metric is better suited to encoder-only extractive models.

### F1 Score

F1 score is the harmonic mean of precision and recall.
We calculate precision and recall at the token level, then compute the F1 score from these values in the usual way.
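
As a rough illustration, here is a minimal sketch of this computation, assuming SQuAD-style answer normalization (lowercasing, stripping punctuation and articles); the exact normalization used by the task implementation may differ slightly:

```python
# Sketch of SQuAD-style token-level F1, where F1 = 2 * P * R / (P + R).
# The normalization below follows the common SQuAD convention and is an
# assumption, not necessarily the exact code used by this task.
import re
import string
from collections import Counter


def normalize_answer(s: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def token_f1(prediction: str, ground_truth: str) -> float:
    """F1 over the multiset of normalized tokens shared by prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty after normalization, F1 is 1 only when both are empty.
        return float(pred_tokens == gold_tokens)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(token_f1("The red button", "red button"))  # 1.0 after normalization
```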

### Test Submission

When evaluating on the test split, a prediction JSON is compiled instead of computing metrics.
Instructions for submission are available on the [WebSRC homepage](https://x-lance.github.io/WebSRC/#) and in the original [GitHub repo](https://github.com/X-LANCE/WebSRC-Baseline#obtain-test-result).
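
The authoritative submission schema is defined at the links above; purely as an illustration, assuming the prediction file maps each question ID to its predicted answer string (the IDs, answers, and file name below are hypothetical), the compiled JSON could be written like this:

```python
# Illustrative only: the real submission format is documented on the WebSRC
# homepage and baseline repo. The mapping and file name here are assumptions.
import json

predictions = {
    "example-question-id-0": "predicted answer text",
    "example-question-id-1": "another predicted answer",
}

with open("websrc_test_predictions.json", "w", encoding="utf-8") as f:
    json.dump(predictions, f, ensure_ascii=False, indent=2)
```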