Evaluating Image Hallucination in TTI Generative Models with I-HallA via PaliGemma

I-HallA via PaliGemma ✨

This is an unofficial code release for the paper Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering.

Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering,
Youngsun Lim, Hojun Choi, Pin-Yu Chen, Hyunjung Shim [Paper][Supp][Project page (TBD)][BibTeX]

Installation

This project is built on PaliGemma.

It requires the following Python and package versions:

    python==3.10.0
    transformers==4.45.1
    numpy==1.26.3
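
A quick way to confirm that an environment matches the pins above is to compare them against the installed package metadata. The following is only a minimal sketch: the helper name `find_mismatches` and its injectable `lookup` parameter are assumptions made for illustration and are not part of this project.

```python
from importlib import metadata

# Pins copied from the requirements list above (Python itself is printed separately).
REQUIRED = {"transformers": "4.45.1", "numpy": "1.26.3"}


def find_mismatches(required, lookup=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied.

    `lookup` is injectable so the check can be exercised without the
    packages being installed; `found` is None when a package is missing.
    """
    mismatches = {}
    for name, wanted in required.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    import sys

    print("python", sys.version.split()[0], "(3.10.0 expected)")
    for name, (wanted, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result from `find_mismatches(REQUIRED)` means both pinned packages are installed at exactly the listed versions.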

Testing

PaliGemma

The PaliGemma-based implementation scores about 0.1 below the GPT-4o results reported in the paper in every row, and it preserves the same ranking of the models.

Models	I-HallA Score (Science)	I-HallA Score (History)	I-HallA Score† (Science)	I-HallA Score† (History)
SD v1.4	0.253	0.435	-	-
SD v1.5	0.209	0.433	-	-
SD v2.0	0.236	0.440	-	-
SD XL	0.298	0.479	-	-
DallE-3	0.561	0.566	-	-
Factual	0.756	0.773	-	-
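
For readers who want a feel for how a PaliGemma-based evaluator might pose a question about a generated image, here is a minimal sketch using the Hugging Face `transformers` API. It is not this project's actual code. The checkpoint name `google/paligemma-3b-mix-224`, the `answer en` task prefix, the exact-match rule in `is_correct`, and the extra dependencies `torch` and `Pillow` are all assumptions that go beyond what is stated above.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the "answer en ..." prefix used by PaliGemma mix checkpoints."""
    return f"answer en {question.strip()}"


def is_correct(prediction: str, reference: str) -> bool:
    """Case- and whitespace-insensitive exact match (an assumed, simple matching rule)."""
    return prediction.strip().lower() == reference.strip().lower()


def answer_question(image_path: str, question: str,
                    model_id: str = "google/paligemma-3b-mix-224") -> str:
    """Ask PaliGemma one question about one image and return its decoded answer."""
    # Heavy imports stay local so the helpers above work without torch installed.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=build_prompt(question), images=image, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[-1]

    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=20, do_sample=False)

    # generate() returns prompt + continuation; keep only the newly generated tokens.
    return processor.decode(output[0][prompt_len:], skip_special_tokens=True).strip()
```

In practice the model and processor would be loaded once and reused across all images; they are loaded inside the function here only to keep the sketch short.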

GPT-4o

Models	I-HallA Score (Science)	I-HallA Score (History)	I-HallA Score† (Science)	I-HallA Score† (History)
SD v1.4	0.353 ± 0.002	0.535 ± 0.013	0.033 ± 0.012	0.110 ± 0.010
SD v1.5	0.309 ± 0.011	0.533 ± 0.004	0.030 ± 0.017	0.117 ± 0.021
SD v2.0	0.336 ± 0.006	0.540 ± 0.014	0.027 ± 0.021	0.120 ± 0.010
SD XL	0.398 ± 0.015	0.579 ± 0.012	0.077 ± 0.050	0.110 ± 0.066
DallE-3	0.661 ± 0.020	0.666 ± 0.003	0.227 ± 0.029	0.133 ± 0.031
Factual	0.856 ± 0.002	0.873 ± 0.006	0.517 ± 0.038	0.533 ± 0.015
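
The sketch below shows one way such scores could be aggregated from per-question results. The definitions are assumptions, since they are not spelled out here: the plain score is taken as the mean fraction of correctly answered questions per image, the † variant as the fraction of images for which every question is answered correctly, and the ± values as the sample standard deviation over repeated runs. All function names are hypothetical.

```python
from statistics import mean, stdev


def ihalla_score(results):
    """Mean fraction of correctly answered questions per image.

    `results` holds one entry per image; each entry is a non-empty list of
    booleans, one per question (True = answered correctly).
    """
    return mean(sum(r) / len(r) for r in results)


def ihalla_score_strict(results):
    """Assumed dagger variant: an image counts only if every question is correct."""
    return mean(1.0 if all(r) else 0.0 for r in results)


def mean_and_std(scores):
    """Summarise repeated runs as (mean, sample standard deviation)."""
    return mean(scores), (stdev(scores) if len(scores) > 1 else 0.0)


if __name__ == "__main__":
    # Made-up example: two images with five questions each.
    demo = [[True, True, False, True, True], [True] * 5]
    print(ihalla_score(demo), ihalla_score_strict(demo))
```

Under these assumed definitions the strict score can never exceed the plain score, which is consistent with the † columns in the table being lower in every row.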

Citation

@inproceedings{ihalla,
    title={Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering},
    author={Youngsun Lim and Hojun Choi and Pin-Yu Chen and Hyunjung Shim},
    year={2024},
    booktitle={arXiv},
}
