Wikidata is an open knowledge base that collects facts (statements) about pieces of knowledge (items). It is run by the Wikimedia Foundation, developed by a team led by Wikimedia Deutschland, and tended by a global community of volunteers. Unlike Wikipedia, which contains knowledge that volunteers have written up in free form, Wikidata is machine-readable, and its pieces of knowledge can be queried in relation to each other.
Dozens of applications already use the knowledge base. One particularly cool way to access knowledge in Wikidata is through queries in the SPARQL query language. Just a little knowledge of SPARQL goes a long way when querying for facts and relationships – opening new horizons and rearranging knowledge in a totally new way.
With SPARQL, the possibilities are virtually endless. Follow along for 10 cool queries:
#1 The best cocktail recipes according to Wikidata
No longer sure about the ingredients for your Mojito on a hard day’s night? Wikidata to the rescue! There’s even a picture to go with the list of ingredients. This query is especially impressive as it magically generates a recipe in English. It was conjured up by a Wikidata volunteer known on Twitter as WikidataFacts.
#2 The Internet loves cats. So does Wikidata.
Cats are kind of a big deal on the Internet. But what famous and well-known cats are there in the world? Find out with just one click on Wikidata. We are definitely fans of Humphrey and Larry, both of them bearing the very British title of Chief Mouser to the Cabinet Office.
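For readers who would rather script it than click, here is a minimal sketch of what a cat query might look like and how it could be sent to the Wikidata Query Service. The query text is our own guess at a simple version, not necessarily the one behind the link; it assumes that cats are modelled as "instance of" (P31) "house cat" (Q146). The helper name build_request_url is made up for this illustration.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed query: every item that is an instance of (P31) house cat (Q146),
# together with its English label.
CAT_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""

def build_request_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for results as JSON.

    Hypothetical helper for this post; it only builds the URL and
    does not contact the service.
    """
    params = {"query": query.strip(), "format": "json"}
    return ENDPOINT + "?" + urlencode(params)

url = build_request_url(CAT_QUERY)
```

Opening that URL in a browser, or fetching it with any HTTP client, should return the result rows as JSON. You can also paste the query text straight into the query service's web interface.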
#3 German settlements ending with “-ow” or “-itz”
Where in Germany are the towns, villages, and cities whose names end in “-ow” or “-itz”? Geography and cultural science buffs may know the answer, but Wikidata will show it to you on a map: It’s basically East Germany with a few notable exceptions in the North (Olpenitz near Kappeln) and South (Flanitz, a part of Frauenau).
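The heart of such a query is a simple suffix test on the place name. As a rough sketch of that idea in Python (the actual query does the filtering in SPARQL, and its exact pattern may differ), a regular expression anchored at the end of the name is enough. The function name and the small sample list are ours.

```python
import re

# Matches names whose final letters are "ow" or "itz".
SUFFIX = re.compile(r"(ow|itz)$")

def ends_in_ow_or_itz(name: str) -> bool:
    """Return True if the place name ends in "-ow" or "-itz"."""
    return SUFFIX.search(name) is not None

# Sample names taken from the paragraph above.
names = ["Olpenitz", "Kappeln", "Flanitz", "Frauenau"]
matches = [n for n in names if ends_in_ow_or_itz(n)]
```

Here matches ends up holding Olpenitz and Flanitz, while Kappeln and Frauenau are filtered out.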
#4 The world’s most common surnames
What are the world’s most common surnames? A fun fact that may surprise you. Of course Smith, Miller, and similar (former) names of occupations are in that list, but don’t forget Lee, Liu, and Zhan!
#5 Things named after French presidents
Many countries have the custom of naming institutions and buildings after former public officials – France is no exception to that rule. Did you ever want to know what kinds of things have been named after former French presidents?
#6 Awesomely assorted alliterations…
…always acquire accolades. A SPARQL query in Wikidata can be used to print out all the titles of works containing an alliteration. It’s everything from All About Anna to Wild Wild West.
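What counts as an alliteration is a matter of definition. The sketch below assumes one plausible rule, that every word in the title starts with the same letter, which may not be exactly what the linked query checks. The function name and the min_words parameter are ours. Python's regular expressions treat non-ASCII letters as word characters by default, so a title like “Es esmu šeit” is split into three proper words.

```python
import re

def is_alliteration(title: str, min_words: int = 2) -> bool:
    """Return True if all words in the title share the same first letter.

    Assumed definition for illustration only. Words are runs of
    Unicode word characters (letters, digits, underscore).
    """
    words = re.findall(r"\w+", title)
    if len(words) < min_words:
        return False
    # Compare first letters case-insensitively.
    initials = {word[0].casefold() for word in words}
    return len(initials) == 1

examples = ["All About Anna", "Wild Wild West", "Es esmu šeit"]
results = {title: is_alliteration(title) for title in examples}
```

With this rule the first two titles qualify and the Latvian one does not, because its last word starts with “š” rather than “e”.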
#7 Horses are dangerous. Especially if you happen to be of noble birth and fall down.
Wikidata can also tell you what noble people most often died of. Apparently, horses are dangerous, as the sixth most common cause of death is “horse fall”.
#8 Average gestation period of genera, color-coded by order
Wikidata isn’t only for queries, but also for data visualizations. This biology SPARQL query impressively demonstrates this with a bubble chart.
#9 The data are alive with the sound of music. But which key is used most often?
A simple query will tell: The most common key is C, then D, then E♭ and B♭. The first minor key comes in seventh place (D minor).
#10 Wikidata isn’t only fun and games, but can also be an inspiration for new Wikipedia articles.
Many of the queries we showed may be curious, funny or surprising and yet you may say “What’s in it for me?” If you’re a hardcore Wikipedian who has run out of ideas for new articles, just ask Wikidata for missing articles. One example: Which women born in Suriname are lacking an article in English Wikipedia? Wikidata will tell you!
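To give an idea of how such a "missing article" query can be put together, here is a sketch that builds the SPARQL text in Python. It is our own reconstruction under a few assumptions, not necessarily the query behind the link: people are humans (P31 = Q5) marked as female (P21 = Q6581072) whose place of birth (P19) lies in the given country (P17), and "lacking an article" means no sitelink to that language's Wikipedia. The function name and its parameters are made up for this post; Q730 is Suriname.

```python
def missing_article_query(country_qid: str = "Q730", lang: str = "en") -> str:
    """Build a SPARQL query for women born in a country who have
    no article on the given language edition of Wikipedia.

    Hypothetical helper; the modelling assumptions are described in
    the SPARQL comments below.
    """
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P31 wd:Q5 ;            # instance of: human
          wdt:P21 wd:Q6581072 ;      # sex or gender: female
          wdt:P19 ?birthplace .      # place of birth
  ?birthplace wdt:P17 wd:{country_qid} .  # birthplace is in this country
  FILTER NOT EXISTS {{
    # a sitelink to the chosen Wikipedia would look like this
    ?article schema:about ?person ;
             schema:isPartOf <https://{lang}.wikipedia.org/> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}" . }}
}}
""".strip()

suriname_en = missing_article_query()           # Suriname, English Wikipedia
usa_de = missing_article_query("Q30", "de")     # USA, German Wikipedia
```

Swapping the two arguments is all it takes to ask the same question about another country or another language edition.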
Where to go from here?
There is much more to discover regarding SPARQL. Wikidata has extensive documentation on the topic. If you want, you can turn to the community to request a query. And every Sunday there’s the #SundayQuery hashtag on Twitter for all your SPARQL questions.
Wonderful blog. Excellent.
Comment by Anthere on 2. November 2016 at 20:48
@Vladimir Alexiev: Each bubble is for one genus, but the gestation period statements are on individual species of the genus, and the bubble size corresponds to the average of those. For example, here are all the subtaxa of the White Rhino (including itself), with gestation period if stated: http://tinyurl.com/zl6poef
@Papuass: Yes, the regular expression is supposed to account for such characters, but it looks like BlazeGraph, the software that runs the Wikidata Query Service, doesn’t do proper Unicode handling in regular expressions :/ in this case, “š” is not taken to be a “word character” (\w), even though it should be: http://tinyurl.com/zxdo7ys
Comment by WikidataFacts on 2. November 2016 at 16:28
I love the Cocktail example. Great work!
Comment by YMS on 2. November 2016 at 12:57
#6 returns “Es esmu šeit” (Latvian movie). Is something wrong with regex?
Comment by Papuass on 31. October 2016 at 22:44
#9 returns bubbles like “year”, “mo”.
And one of the biggest bubbles is White Rhino https://www.wikidata.org/wiki/Q5762446 but that item doesn’t have prop “gestation period”??
Comment by Vladimir Alexiev on 31. October 2016 at 07:22
Very cool display of the power of SPARQL (finally!). But for #9: this is not all music, but only those (very few) scores Wikidata knows about. And for #10: change Surinam to the USA (Q30, just remove the 7 from all the 730s) and change en -> de. There’s work to do on the German Wikipedia!!
Comment by WiseWoman on 30. October 2016 at 17:36