merged readme.md · EvolvingLMMs-Lab/lmms-eval@46a88d8
# Current Tasks

The name in parentheses is the task name in lmms_eval. The task name is also used to specify the dataset in the configuration file.
The following is manually updated documentation. You can use `lmms_eval task --list` to list all supported tasks and their task names.

- AI2D (ai2d)
- ChartQA (chartqa)
- CMMMU (cmmmu)
  - CMMMU Validation (cmmmu_val)
  - CMMMU Test (cmmmu_test)
- COCO Caption (coco_cap)
  - COCO 2014 Caption (coco2014_cap)
    - COCO 2014 Caption Validation (coco2014_cap_val)
    - COCO 2014 Caption Test (coco2014_cap_test)
  - COCO 2017 Caption (coco2017_cap)
    - COCO 2017 Caption MiniVal (coco2017_cap_val)
    - COCO 2017 Caption MiniTest (coco2017_cap_test)
- ConBench (conbench)
- DocVQA (docvqa)
  - DocVQA Validation (docvqa_val)
  - DocVQA Test (docvqa_test)
- Ferret (ferret)
- Flickr30K (flickr30k)
- Ferret Test (ferret_test)
- GQA (gqa)
- HallusionBenchmark (hallusion_bench_image)
- Infographic VQA (info_vqa)
  - Infographic VQA Validation (info_vqa_val)
  - Infographic VQA Test (info_vqa_test)
- LLaVA-Bench (llava_in_the_wild)
- LLaVA-Bench-COCO (llava_bench_coco)
- MathVerse (mathverse)
  - MathVerse Text Dominant (mathverse_testmini_text_dominant)
  - MathVerse Text Only (mathverse_testmini_text_only)
  - MathVerse Text Lite (mathverse_testmini_text_lite)
  - MathVerse Vision Dominant (mathverse_testmini_vision_dominant)
  - MathVerse Vision Intensive (mathverse_testmini_vision_intensive)
  - MathVerse Vision Only (mathverse_testmini_vision_only)
- MathVista (mathvista)
  - MathVista Validation (mathvista_testmini)
  - MathVista Test (mathvista_test)
- MMBench (mmbench)
  - MMBench English (mmbench_en)
    - MMBench English Dev (mmbench_en_dev)
    - MMBench English Test (mmbench_en_test)
  - MMBench Chinese (mmbench_cn)
    - MMBench Chinese Dev (mmbench_cn_dev)
    - MMBench Chinese Test (mmbench_cn_test)
- MME (mme)
- MMMU (mmmu)
  - MMMU Validation (mmmu_val)
  - MMMU Test (mmmu_test)
- MMUPD (mmupd)
  - MMUPD Base (mmupd_base)
    - MMAAD Base (mmaad_base)
    - MMIASD Base (mmiasd_base)
    - MMIVQD Base (mmivqd_base)
  - MMUPD Option (mmupd_option)
    - MMAAD Option (mmaad_option)
    - MMIASD Option (mmiasd_option)
    - MMIVQD Option (mmivqd_option)
  - MMUPD Instruction (mmupd_instruction)
    - MMAAD Instruction (mmaad_instruction)
    - MMIASD Instruction (mmiasd_instruction)
    - MMIVQD Instruction (mmivqd_instruction)
- MMVet (mmvet)
- Multi-DocVQA (multidocvqa)
  - Multi-DocVQA Validation (multidocvqa_val)
  - Multi-DocVQA Test (multidocvqa_test)
- NoCaps (nocaps)
  - NoCaps Validation (nocaps_val)
  - NoCaps Test (nocaps_test)
- OKVQA (ok_vqa)
  - OKVQA Validation 2014 (ok_vqa_val2014)
- POPE (pope)
- RefCOCO (refcoco)
  - refcoco_seg_test
  - refcoco_seg_val
  - refcoco_seg_testA
  - refcoco_seg_testB
  - refcoco_bbox_test
  - refcoco_bbox_val
  - refcoco_bbox_testA
  - refcoco_bbox_testB
- RefCOCO+ (refcoco+)
  - refcoco+_seg
  - refcoco+_seg_val
  - refcoco+_seg_testA
  - refcoco+_seg_testB
  - refcoco+_bbox
  - refcoco+_bbox_val
  - refcoco+_bbox_testA
  - refcoco+_bbox_testB
- RefCOCOg (refcocog)
  - refcocog_seg_test
  - refcocog_seg_val
  - refcocog_bbox_test
  - refcocog_bbox_val
- ScienceQA (scienceqa_full)
  - ScienceQA Full (scienceqa)
  - ScienceQA IMG (scienceqa_img)
- ScreenSpot (screenspot)
  - ScreenSpot REC / Grounding (screenspot_rec)
  - ScreenSpot REG / Instruction Generation (screenspot_reg)
- SeedBench (seedbench)
- SeedBench 2 (seedbench_2)
- ST-VQA (stvqa)
- TextCaps (textcaps)
  - TextCaps Validation (textcaps_val)
  - TextCaps Test (textcaps_test)
- TextVQA (textvqa)
  - TextVQA Validation (textvqa_val)
  - TextVQA Test (textvqa_test)
- VizWizVQA (vizwiz_vqa)
  - VizWizVQA Validation (vizwiz_vqa_val)
  - VizWizVQA Test (vizwiz_vqa_test)
- VQAv2 (vqav2)
  - VQAv2 Validation (vqav2_val)
  - VQAv2 Test (vqav2_test)
- WebSRC (websrc)
  - WebSRC Validation (websrc_val)
  - WebSRC Test (websrc_test)
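
Because each entry pairs a display name with its task name in parentheses (or gives the bare task name, as in the RefCOCO entries), the task names can be pulled out of this list mechanically. The following is a minimal sketch, assuming the entries are markdown bullets; `extract_task_names` and the `ENTRY` pattern are hypothetical helpers written for illustration and are not part of lmms_eval.

```python
import re

# Matches either "- Display Name (task_name)" or a bare "- task_name" bullet.
# Leading whitespace is allowed so indented sub-entries are handled too.
ENTRY = re.compile(
    r"^\s*-\s+(?:(?P<label>.+?)\s+\((?P<name>[^()\s]+)\)|(?P<bare>\S+))\s*$"
)


def extract_task_names(markdown: str) -> list[str]:
    """Return the task names found in a markdown task list, in order."""
    names = []
    for line in markdown.splitlines():
        match = ENTRY.match(line)
        if match:
            # Prefer the parenthesised name; fall back to the bare entry.
            names.append(match.group("name") or match.group("bare"))
    return names


# A few entries copied from the list above.
sample = """\
- AI2D (ai2d)
- RefCOCO+ (refcoco+)
  - refcoco+_seg_val
- ScreenSpot REC / Grounding (screenspot_rec)
"""

task_names = extract_task_names(sample)
print(task_names)  # ['ai2d', 'refcoco+', 'refcoco+_seg_val', 'screenspot_rec']
```

Since this list is maintained by hand, names extracted this way may lag behind the tasks actually registered in the package.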