Wikidata turns eight – An interview with Lydia Pintscher
Happy Birthday, Wikidata! The open knowledge base is turning eight. Since it came into being, the project developed by Wikimedia Deutschland in intensive collaboration with an international community has made enormous progress. In the following interview, Lydia Pintscher, Wikidata product manager, talks about Wikidata’s beginnings, milestones and plans for the future.
Why is this birthday so important to you personally?
What makes and shapes Wikidata is the community, which works day after day to provide the world with high-quality, complete and machine-readable data. While working on all these different projects, it is easy to lose sight of the bigger picture and of everything that we have achieved so far. It is always good to pause for a moment and take the time to reflect and celebrate successes.
We also use the birthday as an opportunity to come together as a community, for example during the WikidataCon in Berlin in 2017 and 2019, or online in 2018 and 2020.
How are you celebrating the birthday?
We are celebrating online this year. Community members all over the world have organized many events. There will be a 24-hour livestream with fantastic new features every hour, as well as the WikiCite conference. And no birthday is complete without presents. At Wikidata, these will be programs, documentation or games, but also songs and cakes. Opening the presents will be part of the fun! You can find all the details about the party here.
Please tell us about the beginnings of Wikidata! How was the idea conceived at the time?
The idea behind Wikidata was conceived around 2005. Denny Vrandečić and Markus Krötzsch wanted to improve Wikipedia and make it machine-readable. Wikipedia’s texts are fantastic, but unfortunately not ideal in terms of machine readability. This makes it very difficult to enhance them by adding new applications or visualizations.
In April 2012, we started to develop Wikidata at Wikimedia Deutschland. On 29 October 2012, the wiki was released under wikidata.org for the first time. We started with the maintenance of the links between the different language versions of the Wikimedia projects. Previously, it had been quite time-consuming to maintain them manually for each article and language version, whereas now, they are all managed centrally at Wikidata.
A short while later, it became possible to store the actual data in statements and also integrate them into the Wikimedia projects. And during this whole time, we have had the support of many volunteers, by now more than 12,500 people, who continue to fill Wikidata with life and data.
Over the past eight years, so much has happened that it is very difficult to pick out some highlights. But I’ll try anyway: A major milestone from the development teams' point of view was certainly releasing the wiki after many months of laying the foundations. Other important steps came when editors and bots began to fill Wikidata with concepts from Wikipedia and when these data were then used in other Wikimedia projects.
This was followed by diversification to include other types of data, supported by media data on Wikimedia Commons and lexicographical data on Wikidata. These data enabled us to compile a machine-readable, multilingual dictionary.
Over the past few years, we have also focused on the Wikibase ecosystem, which aims to enable an increasing number of organizations to build a knowledge base modelled on Wikidata and to link it closely with Wikidata.
Wikidata has received several awards along the way – with the most important for us being the Open Data Award presented by Sir Tim Berners-Lee and Sir Nigel Shadbolt.
What are the challenges?
Wikidata is growing, and growth comes with challenges, both technical and from a social point of view. In terms of technology, our infrastructure has to cope with more data and an increased demand for access to these data, because the data in Wikidata are more and more frequently used by more and more applications within and outside of Wikimedia. But that is a good challenge that we are happy to face.
In terms of social coherence, it is important not to lose sight of the human factor of the project, which can easily happen in a community as large as Wikidata's. That is why events like this birthday and the WikidataCon every two years matter so much to us.
Finally, I would like to mention data quality. The more data Wikidata makes available to the world, the more maintenance and updating these data require, particularly as Wikidata’s data are used in more and more places that really matter.
What are the plans for next year?
Next year, we will explore how to increase language and cultural diversification, how to continue to improve the infrastructure and scale it and how to provide even better quality-assurance support for the editors. And I am looking forward to another year with a great community and an amazing project.