Enriching Metadata by Extracting Entities from Photographs

Background / Challenge

As Open Data initiatives multiply, the number of Open Images (i.e., images accompanied by an appropriate open license) on the Web has grown. However, such images sometimes lack descriptive information (metadata), which hampers their discoverability and interoperability. At the Bern University of Applied Sciences, we aim to create an online service/platform to enrich the metadata of Open Images, especially those from heritage institutions.


A machine learning algorithm is used to extract entities from image files and to link them to an existing knowledge graph (Wikidata). To improve the quality of the matching, humans are kept in the loop through an expert-sourcing or crowdsourcing approach.
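
As a rough illustration of the linking step, the sketch below looks up candidate Wikidata items for a label produced by an image classifier, using the public `wbsearchentities` action of the Wikidata API. The function names, the `limit` default, and the User-Agent string are assumptions made for illustration, not the project's actual implementation. The candidates are returned as a list so that a human reviewer can confirm or reject each one.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(label, language="en", limit=5):
    """Build a wbsearchentities query URL for a classifier label."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def parse_candidates(payload):
    """Extract (QID, label, description) tuples from an API response dict."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in payload.get("search", [])
    ]


def search_wikidata(label, language="en", limit=5):
    """Query Wikidata over the network and return candidate items."""
    request = urllib.request.Request(
        build_search_url(label, language, limit),
        # Hypothetical User-Agent; replace with one identifying your tool.
        headers={"User-Agent": "metadata-enrichment-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_candidates(json.load(response))
```

Calling `search_wikidata("steam locomotive")` would return a short list of candidate items that a reviewer can then accept or reject.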

Achieved so far

In a first step, the fundamentals of deep learning were studied, and existing Visual Recognition Services (VRSs) were explored and compared with each other. The goal of this step was to analyze the efficacy of the existing VRSs and to determine whether they can be used productively. Wikimedia Commons served as the data source (i.e., the image collections); the metadata of its items was extracted and enriched by linking it to Wikidata items.

The findings suggested that a combination of existing VRSs can deliver substantial results, although there is room for improvement. They further suggested training an algorithm to link directly to Wikidata items instead of using the VRSs' pre-defined tags as an intermediate step.
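
One simple way to combine several services is to pool their tags and keep those on which enough services agree. The sketch below assumes each service returns (tag, confidence) pairs; the function name and the `min_votes` threshold are hypothetical and do not reflect the configuration used in the study.

```python
from collections import defaultdict


def combine_tags(results_by_service, min_votes=2):
    """Merge tags from several services.

    results_by_service maps a service name to a list of (tag, confidence)
    pairs. A tag is kept if at least min_votes services returned it; its
    score is the mean confidence over those services.
    """
    scores = defaultdict(dict)
    for service, tags in results_by_service.items():
        for tag, confidence in tags:
            key = tag.strip().lower()  # normalize spelling variants
            # Keep the highest confidence if a service repeats a tag.
            scores[key][service] = max(confidence, scores[key].get(service, 0.0))

    combined = [
        (tag, len(by_service), sum(by_service.values()) / len(by_service))
        for tag, by_service in scores.items()
        if len(by_service) >= min_votes
    ]
    # Most agreed-upon tags first, then by mean confidence.
    return sorted(combined, key=lambda item: (-item[1], -item[2], item[0]))
```

Each result is a `(tag, votes, mean_confidence)` tuple. Tags returned by a single service are dropped by default, which trades some recall for fewer spurious labels.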

[Figure: Metadata extraction]

Currently in progress

In the ongoing second step, the goal is twofold: to customize one of the existing VRSs, from IBM Watson, to improve its predictive power (i.e., accuracy), and to automate the extraction and enrichment of metadata on Wikimedia Commons as far as possible. The customized VRS will be benchmarked against the computer-aided tagging (CAT) tool provided in the context of Wikimedia Commons in order to assess its added value. A possible way to accomplish the second goal would be to couple the CAT tool with the ISA tool, which is already in use on Wikimedia Commons to enrich image collections with metadata through crowdsourcing. The coupling would reduce the effort required of GLAM institutions, encourage them to publish more, and potentially increase the re-use of images thanks to the improved data quality.
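
A benchmark of this kind could be scored by comparing the Wikidata items each tool suggests for an image with the items confirmed by human reviewers. The following sketch computes precision, recall, and F1 under that assumption; the function name and data layout are placeholders, and the project's actual evaluation protocol is not specified here.

```python
def score_suggestions(suggested, confirmed):
    """Return (precision, recall, f1) for one tool.

    Both arguments map an image identifier to a set of Wikidata QIDs:
    `suggested` comes from the tool, `confirmed` from human review.
    """
    true_pos = false_pos = false_neg = 0
    for image_id in set(suggested) | set(confirmed):
        predicted = suggested.get(image_id, set())
        truth = confirmed.get(image_id, set())
        true_pos += len(predicted & truth)
        false_pos += len(predicted - truth)
        false_neg += len(truth - predicted)

    # Guard against division by zero when a tool suggests nothing
    # or no items were confirmed.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Running the function once per tool on the same set of reviewed images gives directly comparable scores.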


Logical future steps include:

  • Evaluating and improving the tool in cooperation with heritage institutions
  • Running crowdsourcing campaigns to tag photograph collections on Wikimedia Commons; improving the crowdsourcing process
  • Making the service available for images outside Wikimedia Commons, so that it can also be used on images that cannot be released under a free license because of copyright restrictions