Data Partnerships with Wikidata: beaTunes

An interview with Hendrik Schreiber of beaTunes about how to use Wikidata in a commercial software product.

Jens Ohlig

25 July 2017

This post is also available in German.

Wikidata data partnerships happen in many ways. Data donations are one way for institutions, organisations or individuals to contribute content to the free knowledge base. Re-use of data in applications is just as interesting, because it contributes to the ecosystem of Free Knowledge and thus gives more people more access to more knowledge.

beaTunes is an application for the Mac that lets you build playlists for your music collection. It uses Wikidata in various ways to enrich the application.

The free license for Wikidata allows commercial re-use as well. We talked with Hendrik Schreiber of beaTunes about how to use Wikidata in a commercial software product.

Can you introduce yourself?

Hendrik Schreiber. Photo: Pascal Nordmann, CC BY-SA 4.0

I’m an independent software developer with almost 20 years of professional experience. I live and work in Cologne and my main interest is Music Information Retrieval (MIR).

What is beaTunes? How does it work and what problems does it solve?

I started coding beaTunes more than ten years ago as a just-for-fun project. At the time I had spent multiple years working as a contractor and felt a little burned out from the daily grind. I just wanted to create something fun. So the idea of a smart music library management and audio analysis tool was born. The first version was in English only and would run exclusively on OS X. The language of choice was Java. Back then I still believed Steve Jobs’s famous promise that the Mac would be the best Java platform ever. Well, we all know how that turned out.

Over the years, beaTunes became more than just a hobby and is now my main occupation.

So what is beaTunes good for?

It can help fix textual metadata (artist, album, title, etc.) using consistency checks and a reference database. That database is powered by a mixture of user submissions and data from third parties like MusicBrainz and Discogs (both MusicBrainz and Discogs IDs can be found in Wikidata). But textual metadata is only half the story. Other important musical features are tonal key, tempo (BPM), timbre, loudness, etc. These are properties that beaTunes can extract straight from the audio signal. They come in handy when you want to build playlists of similar-sounding songs. And that’s essentially the third main functionality: using comprehensive metadata to create great playlists.

This means that beaTunes appeals to quite a diverse group of users. Of course they all love music. But some really just want to fix their metadata, while others need good key and tempo detection for their next DJ gig, because they use beatmatching and harmonic mixing. And yet another group likes to work out to music and wants to find tracks that match their running or cycling pace.

What kind of problems does Wikidata solve for you? How did you discover Wikidata?

beaTunes uses Wikidata in multiple ways.

When a user manually edits song metadata, alternative spellings and additional data are fetched from different sources. beaTunes essentially acts like a smart spellchecker specialized in music metadata. Wikidata is one of the reference sources. The fact that Wikidata is searchable via MusicBrainz and Discogs IDs makes this very straightforward. Because Wikidata makes it so easy to access DBpedia or Wikipedia data, it is also used when looking up additional information on albums, TV shows or movies: beaTunes simply displays the first couple of sentences of the corresponding Wikipedia article. The fact that information on Wikidata is typed is very helpful here, as it aids disambiguation.
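To make the lookup by external ID concrete, here is a minimal sketch of how such a request could be assembled. It is not beaTunes code. It assumes the public Wikidata Query Service endpoint and assumes that property P434 is Wikidata's "MusicBrainz artist ID"; the function names are made up for illustration. The sketch only builds the query and the request URL and does not send anything over the network.

```python
import uuid
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumption: P434 is Wikidata's "MusicBrainz artist ID" property.
MUSICBRAINZ_ARTIST_ID = "P434"


def build_lookup_query(mbid: str) -> str:
    """Return a SPARQL query for the item carrying this MusicBrainz artist ID."""
    # Parsing as a UUID rejects malformed input before it reaches the query text
    # and normalizes the ID to lowercase, hyphenated form.
    canonical = str(uuid.UUID(mbid))
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f'  ?item wdt:{MUSICBRAINZ_ARTIST_ID} "{canonical}" .\n'
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        "LIMIT 1"
    )


def build_request_url(mbid: str) -> str:
    """Return a GET URL that asks the query service for JSON results."""
    params = {"query": build_lookup_query(mbid), "format": "json"}
    return f"{WDQS_ENDPOINT}?{urlencode(params)}"
```

A lookup by Discogs ID would presumably follow the same pattern with a different property. A real client should also send a descriptive User-Agent header and handle the case where no item matches.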

Besides using plain textual data from the Wiki universe, beaTunes also exploits relationships. It can use Wikidata to look up similar artists by searching for artists who come from a given region, produce music in a certain genre and were active during a particular time. Additional information like band membership, influences, etc. can also be taken into account. For the final ranking of artists found this way, beaTunes uses a simple machine learning approach.
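The candidate search described here (same region, same genre, same period) could be expressed as a single SPARQL query. The following is only a sketch under stated assumptions, not the query beaTunes uses: it assumes P136 is "genre", P495 is "country of origin" and P571 is "inception", and the function name is hypothetical. Inception dates mostly exist for bands, so solo artists would need a different property. The ranking step is not covered, since the interview does not say which model is used.

```python
import re

# Assumptions: P136 = genre, P495 = country of origin, P571 = inception.
GENRE, COUNTRY, INCEPTION = "P136", "P495", "P571"

# Wikidata item IDs look like "Q" followed by a positive integer.
_QID = re.compile(r"Q[1-9][0-9]*")


def build_candidate_query(genre_qid: str, country_qid: str,
                          first_year: int, last_year: int,
                          limit: int = 50) -> str:
    """Return a SPARQL query for artists sharing genre, country and period."""
    for qid in (genre_qid, country_qid):
        if not _QID.fullmatch(qid):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    if first_year > last_year:
        raise ValueError("first_year must not exceed last_year")
    return (
        "SELECT DISTINCT ?artist ?artistLabel WHERE {\n"
        f"  ?artist wdt:{GENRE} wd:{genre_qid} ;\n"
        f"          wdt:{COUNTRY} wd:{country_qid} ;\n"
        f"          wdt:{INCEPTION} ?start .\n"
        f"  FILTER(YEAR(?start) >= {first_year} && YEAR(?start) <= {last_year})\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {limit}"
    )
```

The result of such a query is just a list of candidates; features like shared band members or influences would then feed whatever ranking is applied afterwards.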

Last but not least: beaTunes uses Wikidata to answer questions about genre relationships. Because of the subgenre relationship between Wikidata genres, it is easy to reason that Hard Rock is a subgenre of Rock, but Calypso is not. And genre relationships are what helped me discover Wikidata. I was trying to learn genre relationship graphs from another database and needed an “objective” reference graph. In my evaluation I used both Wikidata and DBpedia (see http://www.tagtraum.com/learned_ontologies.html). Wikidata’s clear structure was very appealing to me.
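The kind of reasoning mentioned here boils down to reachability in a graph of "is a subgenre of" edges. The sketch below uses a tiny hand-made graph, which is hypothetical toy data and not an extract from Wikidata, to show the idea. On Wikidata itself a similar check could presumably be written as a property path over "subclass of" (assumed to be P279), but that is an assumption about how the genre items are modelled.

```python
from collections import deque

# Hypothetical toy data: each genre maps to its direct parent genres.
PARENTS = {
    "hard rock": {"rock"},
    "rock": {"popular music"},
    "calypso": {"caribbean music"},
}


def is_subgenre(genre, ancestor, parents=PARENTS):
    """Return True if `ancestor` is reachable from `genre` via parent links."""
    seen = set()
    queue = deque(parents.get(genre, ()))
    while queue:
        current = queue.popleft()
        if current == ancestor:
            return True
        if current in seen:
            continue  # already expanded; also guards against cycles
        seen.add(current)
        queue.extend(parents.get(current, ()))
    return False


print(is_subgenre("hard rock", "rock"))  # True
print(is_subgenre("calypso", "rock"))    # False
```

The check is strict, so a genre does not count as its own subgenre, and it follows indirect links as well: in the toy graph, hard rock is also a subgenre of popular music.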

Are you happy with the API, the data we provide, the documentation? Do you have a request for a feature or an improvement we could work on?

I would love to see more semantic data about music. For example, all Beatles songs have been completely annotated with key, chords, tempo, etc. (see http://isophonics.net/content/reference-annotations-beatles). The data exists, it’s just not reachable from Wikidata yet.

Free knowledge and commercial products — it’s not a contradiction, but not what many people may have in mind. How does working with material under a free license work for you? How can both sides benefit from each other?

First of all, I’m extremely grateful for all the free data out there. And not just Wikidata, but also projects like MusicBrainz and AcousticBrainz. They are amazing treasure troves of knowledge waiting to be used. And that’s the thing: without someone who actually uses the data, it’s worthless. So the more interesting applications we have out there, the more attention we get and the better the knowledge ecosystem works. Publicity, be it from commercial or free products, is important for free data to succeed.

By the way, using free data in your product can benefit the knowledge base. I have just released a little plugin for beaTunes that makes it extremely easy for users to contribute to AcousticBrainz (see http://blog.beatunes.com/2017/06/acousticbrainz-plugin-available-now.html). And for Wikidata, beaTunes has an “Open in…” function that lets you open the Wikidata page for the selected song or artist. If someone wants to contribute, that’s a perfect starting point. So the data does not only flow in one direction. Even a commercial application like beaTunes can contribute to building open knowledge databases.
