Predicting Media Memorability Dataset

This dataset is intended for evaluating systems that predict how memorable a video will be. It is composed of 10,000 short soundless videos shared under a license that allows their use and redistribution. The videos are split into a development set of 8,000 videos and a test set of 2,000 videos. They were extracted from raw footage used by professionals when creating content. Each video lasts 7 seconds, and together they cover a wide variety of scene types. Each video also comes with its original title. These titles can often be interpreted as a list of tags (textual metadata) that might be useful for inferring the memorability of the videos.
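As a rough illustration of reading a title as a list of tags, the sketch below splits a title on non-alphanumeric characters. The function name `title_to_tags` and the example title are hypothetical, and the assumption that titles are word sequences joined by hyphens, underscores, or spaces (possibly with a file extension) is not taken from the dataset documentation.

```python
from __future__ import annotations

import re


def title_to_tags(title: str) -> list[str]:
    """Split a video title into lowercase word tags.

    Assumption: titles are words joined by hyphens, underscores or
    spaces, optionally followed by a file extension.
    """
    # Drop a trailing extension such as ".mp4" or ".webm", if present.
    stem = re.sub(r"\.[A-Za-z0-9]{2,4}$", "", title)
    # Split on any run of characters that are not letters or digits.
    tokens = re.split(r"[^A-Za-z0-9]+", stem)
    # Discard empty strings and purely numeric tokens (e.g. clip counters).
    return [t.lower() for t in tokens if t and not t.isdigit()]


# Hypothetical title, not taken from the dataset.
print(title_to_tags("aerial_view-of-city 2.mp4"))
```

The resulting tags could then be fed to any text representation (for example, bag-of-words) alongside visual features.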

The dataset was validated during the 2018 and 2019 Predicting Media Memorability Tasks at the MediaEval Benchmarking Initiative for Multimedia Evaluation.

For more details see:
  1. R. Cohendet, C.-H. Demarty, Ngoc Q.K. Duong, M. Sjöberg, B. Ionescu, T.-T. Do, “MediaEval 2018: Predicting Media Memorability”, MediaEval Benchmarking Initiative for Multimedia Evaluation, vol. 2283, CEUR-WS.org, ISSN: 1613-0073, 2018 (task overview paper describing the dataset and the task).
  2. M.G. Constantin, B. Ionescu, C.-H. Demarty, Ngoc Q.K. Duong, X. Alameda-Pineda, M. Sjöberg, “The Predicting Media Memorability Task at MediaEval 2019”, MediaEval Benchmarking Initiative for Multimedia Evaluation, 2019 (task overview paper describing the dataset and the task).

Acknowledgements:

If you use the Predicting Media Memorability Dataset (PMMD) or refer to its results, please acknowledge the work of the authors by citing the papers listed above.

We acknowledge the valuable contribution of Technicolor France, which made this dataset possible.