
Multimodal

MARINE

MARINE object-hallucination evaluation: per-item outputs for LLaVA-v1.5-7B with and without the MARINE image-grounded guidance intervention, on POPE (binary object probing) and CHAIR (caption hallucination) over MSCOCO images.

3,500 items
2 subjects
100% observed
unknown license
general domain
text modality
image modality

Response matrix

[Figure: MARINE response matrix, AI models (rows) against items (columns). Legend: Correct (1) · Incorrect (0) · Unobserved]

Scale: 1 = correct · 0 = incorrect
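
To make the matrix and its per-item solve rates concrete, here is a minimal Python sketch. It assumes a solve rate is the fraction of subjects marked correct on an item, which is consistent with the 0%, 50%, and 100% values shown under Sample items. The `responses` dictionary and both function names are illustrative, and the rows are hand-copied from the twelve sample items on this page, not from the full 3,500-item matrix.

```python
# Rows are subjects, columns are the twelve sample items shown on this page
# (1 = correct, 0 = incorrect). Hand-copied for illustration only.
responses = {
    "LLaVA-v1.5-7B":          [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "LLaVA-v1.5-7B + MARINE": [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
}

def solve_rates(matrix):
    """Fraction of subjects that answered each item correctly (assumed definition)."""
    rows = list(matrix.values())
    n_items = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(n_items)]

def subject_means(matrix):
    """Mean correctness per subject over the items in the matrix."""
    return {name: sum(row) / len(row) for name, row in matrix.items()}

rates = solve_rates(responses)
print([f"{r:.0%}" for r in rates[:4]])  # ['0%', '0%', '50%', '100%']
```

Twelve items are far too few to reproduce the values in the Subjects list; the sketch only shows how a binary matrix of this shape is typically summarized.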

Sample items

Item 1 · 0% solve rate · answer: no

[Image: COCO_val2014_000000278890.jpg] Is there a handbag in the image?

How subjects answered

  • LLaVA-v1.5-7B incorrect

    Yes, there is a handbag in the image, and a person is holding it.

  • LLaVA-v1.5-7B + MARINE incorrect

    Yes, there is a handbag in the image, and a person is holding it.

Item 2 · 0% solve rate · answer: no

[Image: COCO_val2014_000000320350.jpg] Is there a backpack in the image?

How subjects answered

  • LLaVA-v1.5-7B incorrect

    Yes, there is a backpack in the image, and the man and woman are wearing it while skiing.

  • LLaVA-v1.5-7B + MARINE incorrect

    Yes, there is a backpack in the image, and it is being worn by the man.

Item 3 · 50% solve rate · answer: yes

[Image: COCO_val2014_000000110449.jpg] Is there a tv in the image?

How subjects answered

  • LLaVA-v1.5-7B correct

    Yes, there is a TV in the image, and the two men are standing in front of it.

  • LLaVA-v1.5-7B + MARINE incorrect

    No, there is no TV in the image. The image only features a person wearing headphones and a hat.

Item 4 · 100% solve rate · answer: no

[Image: COCO_val2014_000000046919.jpg] Is there a car in the image?

How subjects answered

  • LLaVA-v1.5-7B correct

    No, there is no car in the image. The image features a person skiing on a snow-covered slope.

  • LLaVA-v1.5-7B + MARINE correct

    No, there is no car in the image. The image only shows a person on skis and a snowy mountain.

Item 5 · 100% solve rate · answer: yes

[Image: COCO_val2014_000000067042.jpg] Is there a person in the image?

How subjects answered

  • LLaVA-v1.5-7B correct

    Yes, there is a man in the image.

  • LLaVA-v1.5-7B + MARINE correct

    Yes, there is a person in the image.

Item 6 · 100% solve rate

[Image: COCO_val2014_000000477016.jpg] Generate a short caption of the image.

How subjects answered

  • LLaVA-v1.5-7B correct

    A little girl is standing in front of a toy oven, looking at the fish inside.

  • LLaVA-v1.5-7B + MARINE correct

    A little girl is standing in front of a toy oven, which has a fish tank inside.

Item 7 · 100% solve rate · answer: yes

[Image: COCO_val2014_000000456784.jpg] Is there a knife in the image?

How subjects answered

  • LLaVA-v1.5-7B correct

    Yes, there is a knife in the image.

  • LLaVA-v1.5-7B + MARINE correct

    Yes, there is a knife in the image.

Item 8 · 100% solve rate · answer: yes

[Image: COCO_val2014_000000411564.jpg] Is there a person in the image?

How subjects answered

  • LLaVA-v1.5-7B correct

    Yes, there is a person in the image, a man who is sitting on a table or counter.

  • LLaVA-v1.5-7B + MARINE correct

    Yes, there is a person in the image.

Item 9 · 100% solve rate · answer: yes

[Image: COCO_val2014_000000268293.jpg] Is there a person in the image?

How subjects answered

  • LLaVA-v1.5-7B correct

    Yes, there is a man in the image who is playing tennis.

  • LLaVA-v1.5-7B + MARINE correct

    Yes, there is a person in the image.

Item 10 · 100% solve rate

[Image: COCO_val2014_000000358086.jpg] Generate a short caption of the image.

How subjects answered

  • LLaVA-v1.5-7B correct

    A woman in a green shirt and shorts is jumping in the air to catch a frisbee.

  • LLaVA-v1.5-7B + MARINE correct

    A woman in a green shirt and shorts is jumping in the air to catch a frisbee.

Item 11 · 100% solve rate · answer: no

[Image: COCO_val2014_000000250540.jpg] Is there a truck in the image?

How subjects answered

  • LLaVA-v1.5-7B correct

    No, there is no truck in the image. The image features a group of people on motorcycles.

  • LLaVA-v1.5-7B + MARINE correct

    No, there is no truck in the image. The image only contains a person and a motorcycle.

Item 12 · 100% solve rate · answer: no

[Image: COCO_val2014_000000523175.jpg] Is there a person in the image?

How subjects answered

  • LLaVA-v1.5-7B correct

    No, there is no person in the image. It is a close-up of a bowl of soup with a spoon in it.

  • LLaVA-v1.5-7B + MARINE correct

    No
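
The POPE items above pair a free-form response with a yes/no reference answer. This page does not say how responses were mapped to correct or incorrect, so the following Python sketch is only one plausible approach: it takes the leading "yes" or "no" of the response and compares it with the reference. The helper names `pope_label` and `score` are hypothetical, and the caption items (which list no reference answer) are not covered.

```python
import re

def pope_label(response):
    """Return 'yes' or 'no' from the leading word of a response, else None.

    Assumption: a first-word heuristic; the dataset's actual scoring rule is not stated.
    """
    match = re.match(r"\s*(yes|no)\b", response, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None

def score(response, answer):
    """1 if the extracted label equals the reference answer, otherwise 0."""
    return int(pope_label(response) == answer)

# Responses and reference answers copied from the sample items on this page.
examples = [
    ("Yes, there is a handbag in the image, and a person is holding it.", "no"),
    ("No, there is no TV in the image. The image only features a person "
     "wearing headphones and a hat.", "yes"),
    ("No", "no"),
    ("Yes, there is a man in the image.", "yes"),
]
scores = [score(response, answer) for response, answer in examples]
print(scores)  # [0, 0, 1, 1]
```

These four scores agree with the correct/incorrect labels shown for the corresponding sample responses, though agreement on a handful of items does not confirm that the dataset was scored this way.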

Subjects

  1. LLaVA-v1.5-7B + MARINE: 0.8643
  2. LLaVA-v1.5-7B: 0.8111