


I-HallA via PaliGemma ✨

This is an unofficial implementation of the paper Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering.

Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering,
Youngsun Lim, Hojun Choi, Pin-Yu Chen, Hyunjung Shim [Paper] [Supp] [Project page (TBD)] [BibTeX]


Installation

This project is based on PaliGemma.

It requires the following packages:

    python==3.10.0
    transformers==4.45.1
    numpy==1.26.3
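Before running the evaluation, it can help to confirm that the local environment matches the pins above. The following is a minimal sketch using only the standard library; the helper names `check_environment` and `python_matches` are hypothetical and not part of this repository.

```python
import sys
from importlib import metadata

# Pins copied from the requirements list above.
REQUIRED_PYTHON = (3, 10)
REQUIRED_PACKAGES = {"transformers": "4.45.1", "numpy": "1.26.3"}


def check_environment(required=REQUIRED_PACKAGES):
    """Return {name: (installed_or_None, pinned)} for every mismatched package."""
    mismatches = {}
    for name, pinned in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != pinned:
            mismatches[name] = (installed, pinned)
    return mismatches


def python_matches(version_info=sys.version_info, required=REQUIRED_PYTHON):
    """True when the interpreter's major.minor equals the pinned one."""
    return tuple(version_info[:2]) == required


if __name__ == "__main__":
    print("Python 3.10:", python_matches())
    for name, (installed, pinned) in check_environment().items():
        print(f"{name}: installed={installed}, pinned={pinned}")
```

An empty result from `check_environment()` means both pinned packages are installed at the listed versions.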

Testing

PaliGemma

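The PaliGemma-based implementation produces results comparable to the GPT-4o results reported in the paper: in the tables below, each PaliGemma score is 0.100 lower than the corresponding GPT-4o mean, and the ranking of the models is unchanged.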

| Models | I-HallA Score (Science) | I-HallA Score (History) | I-HallA Score† (Science) | I-HallA Score† (History) |
|---|---|---|---|---|
| SD v1.4 | 0.253 | 0.435 | - | - |
| SD v1.5 | 0.209 | 0.433 | - | - |
| SD v2.0 | 0.236 | 0.440 | - | - |
| SD XL | 0.298 | 0.479 | - | - |
| DallE-3 | 0.561 | 0.566 | - | - |
| Factual | 0.756 | 0.773 | - | - |
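As a rough illustration of how scores like those above could be produced with a question-answering model, here is a sketch under stated assumptions rather than this repository's actual code. It assumes that the I-HallA Score is the fraction of questions about an image that the model answers correctly, and that the † variant counts an image only when every question is answered correctly. It also assumes the `google/paligemma-3b-mix-224` checkpoint and the `answer en` prompt prefix. All function names are hypothetical.

```python
def normalize(answer):
    """Lowercase, trim whitespace and trailing periods so 'Yes.' equals 'yes'."""
    return answer.strip().lower().rstrip(".")


def score_image(predicted, expected):
    """Return (fraction_correct, all_correct) for one image's question set."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("need exactly one prediction per question")
    hits = [normalize(p) == normalize(e) for p, e in zip(predicted, expected)]
    return sum(hits) / len(hits), float(all(hits))


def dataset_scores(per_image):
    """Average both per-image scores over a list of (predicted, expected) pairs."""
    pairs = [score_image(p, e) for p, e in per_image]
    n = len(pairs)
    return sum(s for s, _ in pairs) / n, sum(t for _, t in pairs) / n


def load_paligemma(model_id="google/paligemma-3b-mix-224"):
    """Load model and processor; the checkpoint name is an assumption."""
    # Imported lazily so the scoring helpers work without transformers installed.
    from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
    return model, processor


def ask_paligemma(model, processor, image, question, max_new_tokens=10):
    """Ask one question about a PIL image and return the decoded answer text."""
    prompt = f"answer en {question}"  # assumed VQA prompt prefix
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[-1]
    generated = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens and keep only the newly generated answer.
    return processor.decode(generated[0][prompt_len:], skip_special_tokens=True)
```

The scoring helpers are kept separate from the model calls so they can be checked without downloading any weights, for example `score_image(["Yes.", "no"], ["yes", "yes"])` returns `(0.5, 0.0)`.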

GPT-4o

| Models | I-HallA Score (Science) | I-HallA Score (History) | I-HallA Score† (Science) | I-HallA Score† (History) |
|---|---|---|---|---|
| SD v1.4 | 0.353 ± 0.002 | 0.535 ± 0.013 | 0.033 ± 0.012 | 0.110 ± 0.010 |
| SD v1.5 | 0.309 ± 0.011 | 0.533 ± 0.004 | 0.030 ± 0.017 | 0.117 ± 0.021 |
| SD v2.0 | 0.336 ± 0.006 | 0.540 ± 0.014 | 0.027 ± 0.021 | 0.120 ± 0.010 |
| SD XL | 0.398 ± 0.015 | 0.579 ± 0.012 | 0.077 ± 0.050 | 0.110 ± 0.066 |
| DallE-3 | 0.661 ± 0.020 | 0.666 ± 0.003 | 0.227 ± 0.029 | 0.133 ± 0.031 |
| Factual | 0.856 ± 0.002 | 0.873 ± 0.006 | 0.517 ± 0.038 | 0.533 ± 0.015 |
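To compare the two tables above directly, the short script below computes the per-model gap between the GPT-4o means and the PaliGemma scores and checks whether both evaluators rank the models in the same order. The numbers are copied from the tables; the helper names `gaps` and `ranking` are illustrative only.

```python
# (Science, History) I-HallA Scores copied from the two tables above.
# GPT-4o entries are the reported means, without the ± spread.
PALIGEMMA = {
    "SD v1.4": (0.253, 0.435),
    "SD v1.5": (0.209, 0.433),
    "SD v2.0": (0.236, 0.440),
    "SD XL": (0.298, 0.479),
    "DallE-3": (0.561, 0.566),
    "Factual": (0.756, 0.773),
}
GPT4O = {
    "SD v1.4": (0.353, 0.535),
    "SD v1.5": (0.309, 0.533),
    "SD v2.0": (0.336, 0.540),
    "SD XL": (0.398, 0.579),
    "DallE-3": (0.661, 0.666),
    "Factual": (0.856, 0.873),
}


def gaps(a, b):
    """Per-model (Science, History) difference b - a, rounded to 3 decimals."""
    return {m: tuple(round(y - x, 3) for x, y in zip(a[m], b[m])) for m in a}


def ranking(table, column):
    """Model names sorted from lowest to highest score in one column."""
    return sorted(table, key=lambda m: table[m][column])


if __name__ == "__main__":
    for model, diff in gaps(PALIGEMMA, GPT4O).items():
        print(model, diff)
    for col, name in enumerate(("Science", "History")):
        same = ranking(PALIGEMMA, col) == ranking(GPT4O, col)
        print(name, "ranking identical:", same)
```

With the values listed, every gap comes out as 0.1 and both rankings match, so the two evaluators agree on the ordering of the models even though their absolute scores differ.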

Citation

@inproceedings{ihalla,
    title={Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering},
    author={Youngsun Lim and Hojun Choi and Pin-Yu Chen and Hyunjung Shim},
    year={2024},
    booktitle={arXiv},
}
