ImageCLEF Fusion: Late Fusion Dataset 2022-2023

Dataset: ImageCLEFfusion


The ImageCLEFfusion dataset contains two sets: (i) the media interestingness set (ImageCLEFfusion-int) and (ii) the result diversification set (ImageCLEFfusion-div).

ImageCLEFfusion-int. The data is extracted from the Interestingness10k dataset [2]. It consists of the output data from 33 inducers, with 1826 samples for the development set, and 609 samples for the testing set.

ImageCLEFfusion-div. The data is extracted from the Retrieving Diverse Social Images Task dataset [3]. It consists of the output data from 117 inducers, with 104 queries for the development set and 35 queries for the testing set.
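
As a rough illustration of what late fusion over inducer outputs can look like, the sketch below combines per-inducer scores into one fused score per sample by (optionally weighted) averaging. The file layout of the dataset is not described here, so the toy score matrix, the function name late_fusion_mean and its weights parameter are assumptions made for illustration only. Averaging is just one simple fusion strategy, not the method used by the task organizers or participants.

```python
def late_fusion_mean(inducer_scores, weights=None):
    """Fuse per-inducer scores into a single score per sample.

    inducer_scores: list of rows, one row per sample, one column per inducer.
    weights: optional per-inducer weights (hypothetical, e.g. derived from
             each inducer's performance on the development set).
    """
    if not inducer_scores:
        return []
    n_inducers = len(inducer_scores[0])
    if weights is None:
        weights = [1.0] * n_inducers  # plain (unweighted) average
    if len(weights) != n_inducers:
        raise ValueError("expected one weight per inducer")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    fused = []
    for row in inducer_scores:
        if len(row) != n_inducers:
            raise ValueError("every sample needs one score per inducer")
        fused.append(sum(w * s for w, s in zip(weights, row)) / total)
    return fused


# Toy data (NOT taken from the dataset): 3 samples scored by 3 inducers.
scores = [
    [0.2, 0.4, 0.6],
    [0.9, 0.7, 0.8],
    [0.1, 0.1, 0.4],
]

fused = late_fusion_mean(scores)

# Rank samples from highest to lowest fused score.
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
```

With the real data, the same idea would apply to a matrix with 33 columns for ImageCLEFfusion-int or 117 columns for ImageCLEFfusion-div, with any weights fitted on the development set only.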

The dataset was validated during the 2022 and 2023 ImageCLEFfusion tasks.

To download the data, please follow this link.

Using the dataset:

If you plan to make use of the ImageCLEF Fusion dataset, or refer to its results, please acknowledge the work of the authors by citing the following papers:

Ştefan, L. D., Constantin, M. G., Dogariu, M., and Ionescu, B. (2023, September). Overview of ImageCLEFfusion 2023 task - testing ensembling methods in diverse scenarios. In Experimental IR Meets Multilinguality, Multimodality, and Interaction, CEUR Workshop Proceedings, CEUR-WS.org, Thessaloniki, Greece (pp. 18-21).
Ştefan, L. D., Constantin, M. G., Dogariu, M., and Ionescu, B. (2022). Overview of ImageCLEFfusion 2022 task - ensembling methods for media interestingness prediction and result diversification. In CLEF2022 Working Notes, CEUR Workshop Proceedings, CEUR-WS.org, Bologna, Italy.
Constantin, M. G., Ştefan, L. D., Ionescu, B., Duong, N. Q., Demarty, C. H., and Sjöberg, M. (2021). Visual Interestingness Prediction: A Benchmark Framework and Literature Review. International Journal of Computer Vision, 1-25.
Ionescu, B., Rohm, M., Boteanu, B., Gînscă, A. L., Lupu, M., and Müller, H. (2020). Benchmarking Image Retrieval Diversification Techniques for Social Media. IEEE Transactions on Multimedia, 23, 677-691.

Acknowledgements:

This work was supported under project AI4Media, A European Excellence Centre for Media, Society and Democracy, H2020 ICT-48-2020, grant #951911.