Remove unnecessary lines since use batched visuals now in llava · EvolvingLMMs-Lab/lmms-eval@12f4806

```diff
@@ -358,18 +358,6 @@ def _collate(x):
                 prompt_question = conv.get_prompt()
                 question_input.append(prompt_question)
 
-            # The above for loop has bugs. When there is no visuals, e.g. pure text,
-            # there will be no for loop execute resulting in an empty question_input (because no visuals)
-            # Scenario 1 won't even be execute
-            if len(flattened_visuals) == 0:
-                for context in contexts:
-                    question = context
-                    conv = conv_templates[self.conv_template].copy()
-                    conv.append_message(conv.roles[0], question)
-                    conv.append_message(conv.roles[1], None)
-                    prompt_question = conv.get_prompt()
-                    question_input.append(prompt_question)
-
             # input_ids = tokenizer_image_token(prompt, self.tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt").unsqueeze(0).to(self.device)
             # preconfigure gen_kwargs with defaults
             gen_kwargs["image_sizes"] = [flattened_visuals[idx].size for idx in range(len(flattened_visuals))]
```
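The deleted fallback existed because the original prompt-building loop iterated over the visuals: with a pure-text batch there were zero visuals, the loop body never ran, and `question_input` came back empty. Once prompts are built per context (with visuals batched alongside), that dead path is unreachable. A minimal standalone sketch of the two patterns, using hypothetical helper names and a simplified prompt template (not the actual lmms-eval code):

```python
# Sketch of the bug the removed fallback worked around.
# Helper names and the "USER:/ASSISTANT:" template are illustrative only.

def build_prompts_over_visuals(contexts, visuals):
    # Buggy pattern: iterating over visuals means a text-only batch
    # (zero visuals) yields zero iterations and an empty prompt list.
    return [f"USER: <image>\n{c} ASSISTANT:" for c, _ in zip(contexts, visuals)]

def build_prompts_over_contexts(contexts, visuals):
    # Batched-visuals pattern: one prompt per context, whether or not
    # an image accompanies it, so no special-case fallback is needed.
    prompts = []
    for i, context in enumerate(contexts):
        image_tag = "<image>\n" if i < len(visuals) else ""
        prompts.append(f"USER: {image_tag}{context} ASSISTANT:")
    return prompts

contexts = ["Describe the scene.", "What is 2 + 2?"]
print(build_prompts_over_visuals(contexts, []))   # [] -- text-only batch dropped
print(build_prompts_over_contexts(contexts, []))  # one prompt per context
```

Because the per-context loop already covers the zero-visuals case, the `if len(flattened_visuals) == 0:` block in the diff above became dead code and could be deleted outright.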