# WebSRC

## Paper

Title: WebSRC: A Dataset for Web-Based Structural Reading Comprehension

Abstract: https://arxiv.org/abs/2101.09465

Homepage: https://x-lance.github.io/WebSRC/#

WebSRC is a dataset for web-based structural reading comprehension.
Its full train/dev/test split contains over 400k questions across 6.4k webpages.
This version of the dataset does not contain OCR output or the original HTML; it simply treats WebSRC as an image-and-text multimodal Q&A benchmark on webpage screenshots.

## Citation

```bibtex
@inproceedings{chen2021websrc,
    title={WebSRC: A Dataset for Web-Based Structural Reading Comprehension},
    author={Chen, Xingyu and Zhao, Zihan and Chen, Lu and Ji, Jiabao and Zhang, Danyang and Luo, Ao and Xiong, Yuxuan and Yu, Kai},
    booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
    pages={4173--4185},
    year={2021}
}
```

## Groups & Tasks

### Groups

- `websrc`: Evaluates `websrc-val` and generates a submission file for `websrc-test`.

### Tasks

- `websrc-val`: Given a question and a web page, predict the answer.
- `websrc-test`: Given a question and a web page, predict the answer. Ground truth is not provided for this task.

## Metrics

This task uses SQuAD-style evaluation metrics; of these, token-level F1 is used.
The original paper also reports an Exact Match (EM) score, but it is not implemented here, as that metric is better suited to encoder-only extractive models.

### F1 Score

F1 score is the harmonic mean of precision and recall.
We calculate precision and recall at the token level, then compute the F1 score from these values in the usual way.
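
As a rough illustration, here is a minimal sketch of this computation, assuming SQuAD-style answer normalization (lowercasing, stripping punctuation and articles); the exact normalization used by the task implementation may differ slightly:

```python
# Sketch of SQuAD-style token-level F1, where F1 = 2 * P * R / (P + R).
# The normalization below follows the common SQuAD convention and is an
# assumption, not necessarily the exact code used by this task.
import re
import string
from collections import Counter


def normalize_answer(s: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def token_f1(prediction: str, ground_truth: str) -> float:
    """F1 over the multiset of normalized tokens shared by prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty after normalization, F1 is 1 only when both are empty.
        return float(pred_tokens == gold_tokens)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(token_f1("The red button", "red button"))  # 1.0 after normalization
```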

### Test Submission

When evaluating on the test split, a prediction JSON is compiled instead of computing metrics.
Instructions for submission are available on the [WebSRC homepage](https://x-lance.github.io/WebSRC/#) and in the original [GitHub repo](https://github.com/X-LANCE/WebSRC-Baseline#obtain-test-result).
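
The authoritative submission schema is defined at the links above; purely as an illustration, assuming the prediction file maps each question ID to its predicted answer string (the IDs, answers, and file name below are hypothetical), the compiled JSON could be written like this:

```python
# Illustrative only: the real submission format is documented on the WebSRC
# homepage and baseline repo. The mapping and file name here are assumptions.
import json

predictions = {
    "example-question-id-0": "predicted answer text",
    "example-question-id-1": "another predicted answer",
}

with open("websrc_test_predictions.json", "w", encoding="utf-8") as f:
    json.dump(predictions, f, ensure_ascii=False, indent=2)
```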