Posts Tagged ‘english’



From Damascus to Berlin: A very special internship at Wikimedia Deutschland

Summary: Alaa Mustafa has just completed a six-week internship in software development at Wikimedia Deutschland. The Syrian IT specialist fled the civil war in Syria to Germany last summer. While waiting for the mills of bureaucracy to grind, he applied for an internship with the Wikidata development team. At the end of his internship we asked him about his experience. The interview was conducted in English, the language Alaa also used in his day-to-day work at Wikimedia Deutschland.

Alaa Mustafa just finished six weeks of an internship at the software development department at Wikimedia Deutschland. The Syrian IT specialist came to Germany last summer, fleeing from the war. While he was stuck in bureaucracy, he applied for an internship to become part of the Wikidata team. We asked him about his experience at Wikimedia Deutschland in a short interview as his internship came to an end.

Can you tell me something about your background?

My name is Alaa Mustafa. I was born in Damascus and I am 28 years old. First, I studied in an institute for computer engineering for two years. Then I moved to university and studied for four years, with a major in information technology. Actually, after graduation, I didn’t work in that field. Rather, I was working in a company which sold consumer electronics – pretty much like Media Markt or Saturn here. There I worked in sales, in marketing, and in the business development team.

And then you came to Germany?

Yes, I came to Germany last summer, one year ago.

And what made you apply for an internship at Wikimedia Deutschland?

I’m a newcomer here, so I was looking for ways to integrate – Germany is a new country for me. I searched on websites for jobs in English and there I came across Wikimedia Deutschland. I reached out, got an interview, and then Lydia (product manager of Wikidata) accepted me.

Right now, you’re still waiting for the bureaucracy to sort out everything. Are you allowed to work now?

I may work, but it took a long time to get an approval from the Ausländerbehörde. The Agentur für Arbeit supports me, but right now I’m not allowed to make money through my work.

All in all, did you like the few weeks that you spent with us?

I liked it very much. Back in Syria, I had already heard of Wikimedia, a big organization and a great source for knowledge. I only worked here a bit over a month, for 45 days, but I feel really proud that I was part of this organization.

Let’s talk a bit about what you did here as an intern. I understand that you mostly helped Lydia?

Actually, I was working as an assistant for Lydia. There are many things on the Wikidata pages that take a lot of time and that Lydia doesn’t always have time to do herself – things like updates on events or new features – so I did that.

But I was also asked about the website from the point of view of a user – not as a developer, but as an ordinary user: what does the website look like when someone opens Wikidata for the first time? We talked about possible improvements to the interface. Our UX team at Wikimedia Deutschland is currently working on the user experience and I was able to support them.

Last week you invited your colleagues for Arabic food for dinner. How did that go?

I wanted to have an opportunity to talk to everyone personally. Here, in the office, we always talk about work, but having dinner together gave us a chance to get to know each other personally. It was a very friendly dinner and that evening made me very happy.

I cooked something called Hummus (حُمُّص‎‎) and some rice with chicken. Typical Arabic food – two kinds of Hummus and chicken rice. It’s delicious! But you need to learn how to eat it correctly: with your hands, using bread to scoop it up.

So many people from Syria are now in Germany that I think we’ll soon see high-quality Syrian food over here. You can already find good Hummus around Hermannplatz, so it’s a start.

Would you say that there are huge cultural differences regarding the work you did in Damascus and the work here? Or is work in IT the same all over the world?

The management side is definitely different. In Syria, even if your manager is wrong, you should go with him.

Here I feel that everyone can discuss everything freely and is listened to. We have a daily standup meeting where everyone has a chance to say something. I was only here for a little more than a month and I could give an honest critique about aspects of the product and Lydia never told me that I shouldn’t criticize things – rather, she appreciated it and took it as input.

Yes, things are different here. Discussion is very much valued. That makes the work very motivating. Everyone can discuss everything with everyone and it’s a very friendly atmosphere.

Anything else you would like to say? What comes next for you?

I now feel that these 45 days were… how should I say… I’d call it my “golden days” in Germany.

Within the next 10 days, I will start learning German at a school. I already studied German on my own with books, so I’m in a good position, but I really should go to school and take a proper course. That will take about 6 months, 4 hours every day.

After that I will search for work. Let’s see if Wikimedia Deutschland will have openings. But in any case, I’m proud that I was part of this organization and I will always try to keep in touch with you.


Writing a bachelor’s thesis at Wikimedia Deutschland e.V.

In this guest post in English, Charlie Kritschmar tells us what it is like to write a bachelor’s thesis in the software development department at Wikimedia Deutschland. Charlie describes how she found her topic in the area of user interface design, what the working atmosphere was like, and how all of this turned into a thesis about editing Wikidata from Wikipedia. The thesis will also be published on Wikimedia Commons in the next few days. As in many good stories, there is a happy ending at the close of Charlie’s account. Welcome, Charlie!

Q920285 – how I found my place in the Wikimedia universe

It’s April 2015 and it’s about time for me to organise a topic for my bachelor’s thesis. I study Internationale Medieninformatik at the HTW Berlin. I have always been fascinated by the intersection of humans and computers, especially the psychological component of the subject, so I started specialising in this direction towards the end of my studies. It was therefore pretty clear to me that my thesis should fit within this scope. But the initial question remained, and I still had no topic.

A fellow student and friend, Lucie (who coincidentally works at Wikimedia), pointed me to Lydia, Wikidata’s product manager. Until then I had only been a consumer of the Wikimedia projects and had never contributed to any of them, let alone knew what Wikidata was and what it does. It was about time to change that. It turned out that Lydia has loads of topics with many different focuses for students writing their theses, and we quickly decided on one that would benefit both of us.



Teaching machines to make your life easier – quality work on Wikidata

Summary: ORES is an artificial intelligence that can make suggestions for fighting vandalism. Having already been used successfully on several wikis, it now also helps improve quality on Wikidata. Amir Sarabadani and Aaron Halfaker describe the development and deployment of ORES in a guest post in English.


 

A post by Amir Sarabadani and Aaron Halfaker

Today we want to talk about a new web service for supporting quality control work in Wikidata. The Objective Revision Evaluation Service (ORES) is an artificial intelligence web service that will help Wikidata editors perform basic quality control work more efficiently. ORES predicts which edits will need to be reverted. This service is used on other wikis to support quality control work. Now, Wikidata editors will get to reap the benefits as well.
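
For readers who want to try this out, here is a minimal sketch of how ORES can be queried over HTTP from a script. The endpoint path, the "damaging" model name and the revision ID are assumptions for illustration; check the ORES documentation for the models actually available for wikidatawiki.

```
# Minimal sketch of querying the ORES web service for a Wikidata revision.
# The "damaging" model name and the revision ID below are assumptions made
# purely for illustration.
import requests

ORES_URL = "https://ores.wikimedia.org/v3/scores/wikidatawiki"

def score_revision(rev_id, model="damaging"):
    """Ask ORES how likely a single Wikidata revision is to need reverting."""
    response = requests.get(ORES_URL, params={"models": model, "revids": rev_id})
    response.raise_for_status()
    data = response.json()
    # Navigate to the score for the requested revision (the exact structure
    # may differ slightly between ORES versions).
    return data["wikidatawiki"]["scores"][str(rev_id)][model]["score"]

if __name__ == "__main__":
    score = score_revision(123456789)  # hypothetical revision ID
    print(score["prediction"], score["probability"])
```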


Q167545: Wikidata celebrated its third birthday

 Wikidata celebrated its third birthday on October 29th. The project went online in 2012 and a lot has happened ever since.

Coincidentally, the birthday coincided with the project being awarded a prize by Land der Ideen, so a proper party for volunteers and everyone involved with the project was in order.

There was cake and silly birthday hats, but above all this was an occasion to look at the past, present, and future.

Denny Vrandečić and Erik Möller used a video message to talk about the genesis and development of Wikidata.

Community members Magnus Manske and Maarten Dammers talked about their work for Wikidata in GLAM and science. And Lydia Pintscher not only looked back on a successful year, but also gave us a peek into the future that lies ahead for the project.

To experience Wikidata hands-on, there was a little exhibition of projects that use it: from Histropedia, which visualizes timelines, to Ask Platypus, a project that answers natural-language questions using the knowledge in Wikidata.

No birthday would be complete without presents. The software developers in particular had worked hard to improve parts of Wikidata for this special date. To give you just two examples:

  • https://www.wikidata.org/wiki/Special:Nearby shows nearby items in Wikidata and invites you to improve structured data knowledge in your neighborhood
  • A machine learning model called ORES helps to identify vandalism with artificial intelligence and can be used as a tool for administrators

These are only two of the new features released for the birthday. There is much, much more to come for the Wikidata project next year, and we’ll talk about it at length in another post.

Wikidata has data in its name. However, as was more than obvious at the birthday party, it is about more than just cold numbers. As in all collaborative projects, people are at the core of it all. Those behind or around Wikidata have love in their hearts for something that may at first sound as abstract as “structured data for Wikimedia projects and beyond”.

Upon exiting the party, guests could add themselves to a board and leave a tiny love letter to Wikidata. “I love Wikidata because… with machine-readable data, machines can do the heavy lifting for me,” one guest wrote. The last three years were all about building a foundation for machine-readable data. Let the heavy lifting begin in all the years to come. Q167545!


Internship at Wikimedia Deutschland e. V.

In this guest post in English, Andrew Pekarek-Kostka tells us what it is like to be an intern in the software development department at Wikimedia Deutschland. Andrew writes about his expectations before the internship, the working atmosphere, how he adapted to our agile software development processes, and which wiki projects he worked on.

First Impression

Sandra Müllrick, „WMDE Softwareentwicklung2“, CC BY-SA 4.0

Before starting at Wikimedia Deutschland e. V., being a 17-year-old with no previous work experience, I was under the impression that working for such a big organization would be similar to the way a typical work day was portrayed in the famed 90s movie “Office Space”: boring, dull, and uninteresting. However, when I walked into the office for the first time, the contrast between the rather dark and grim weather in Berlin and the ecstatic attitude of the office was huge. The overall working atmosphere is welcoming and spirited, with a perfect balance of fun and professionalism. It was the positive opposite of what I was expecting.

 

My Welcome

My first few days at Wikimedia were spent going through a process called “onboarding”, which got me ready for the following seven weeks. This consisted of prepping my machine and my software and getting all my accounts set up, along with a quick tour that gave me the opportunity to briefly meet most of my colleagues from all departments. If I had any questions or needs, I always knew I could ask one of my colleagues for assistance. The setup was followed by a series of meetings through which I gained valuable insight into the various departments and met the many interesting people who, together, make Wikimedia Deutschland possible.



Visualizing history with automated event maps

Summary: Fred Johansen has built a website that, based on data from Wikidata, makes it easy to place historical events in time and space. Here he talks about the site and his work on it.


The following post is a guest blog by Fred Johansen about EventZoom.

Just as today’s online maps are being continually updated, historical maps can be automatically generated and updated to reflect our ever-evolving knowledge about the past. As an example, please allow me to tell you about a project that I’m working on. Recently I implemented an event visualization site which accepts geolocation data combined with info about the time spans of events, and renders the input as points on a map zoomable in time and space. Each such point is an object with a title, description, latitude/longitude and a time, as well as a reference back to its source. But what source should be used to fill this framework with data? Even though this is a tool born outside of the Wikimedia world, the best content I’ve found for it so far is Wikidata – more specifically, the Wikidata API. I import data about events that are part of larger events, all defined in Wikidata, with the restriction that they contain a start or end date as well as a location – and that is all the data needed for representation in this kind of dynamic historical map.

Extracting data from the Wikidata API works like a charm. Sometimes, of course, some data might be missing from Wikidata. For example, an event may contain an end date but no start date. What’s fantastic about Wikidata is that it’s easy to simply add the missing fact. Besides increasing the data in Wikidata, this also improves the overall possibilities for visualization.

This activity creates a positive feedback loop: the visualization of, for example, the events of a war on a map makes errors or omissions quite obvious, which serves as an incentive to update Wikidata and, finally, to trigger the re-generation of the map.

The site I’m referring to here is EventZoom.net – currently in Beta and so far containing 82 major event maps and growing. You can extend it yourself by triggering the visualization of new maps: When you do a search for an event, for example a war, and the Search page reports it as missing, you can add it directly. All you need is its Q-ID from Wikidata. Paste this ID into the given input field, and the event will be automatically imported from the Wikidata API, and a map automatically generated – with the restriction that there must exist some ‘smaller’ events that contain time & location data and are part (P361) of the major event. Those smaller events become the points on our map, with automatic links back to their sources. As for the import itself, for the time being, it also depends on wdq.wmflabs.org, but I expect that will change in the future.
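
As an illustration of the kind of import described above (not EventZoom’s actual code, which uses the Wikidata API and wdq.wmflabs.org), similar data can be fetched from the Wikidata SPARQL endpoint: everything that is part of (P361) a given event and has a start time (P580) and a coordinate location (P625).

```
# Illustrative sketch only: fetch the "smaller" events that are part of (P361)
# a major event and carry a start time (P580) and a coordinate location (P625),
# i.e. the points that would appear on the map.
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?event ?eventLabel ?start ?coord WHERE {
  ?event wdt:P361 wd:%s ;      # part of the major event
         wdt:P580 ?start ;     # start time
         wdt:P625 ?coord .     # coordinate location
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""

def sub_events(major_event_qid):
    response = requests.get(
        SPARQL_ENDPOINT,
        params={"query": QUERY % major_event_qid, "format": "json"},
        headers={"User-Agent": "event-map-sketch/0.1"},
    )
    response.raise_for_status()
    for row in response.json()["results"]["bindings"]:
        yield row["eventLabel"]["value"], row["start"]["value"], row["coord"]["value"]

if __name__ == "__main__":
    for label, start, coord in sub_events("Q362"):  # Q362: World War II
        print(label, start, coord)
```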

Although you can always click Import to get the latest info from Wikidata, an automatic update is also in the pipeline, to trigger a re-import whenever the event or any of its constituent parts have changed in Wikidata. As for other plans, at the very least our scope should encompass all the major events of history. Here, wars represent a practical starting point, in so far as they consist of events that are mostly bounded by very definite time spans and locations, and so can be defined by those characteristics. The next step would be to extend the map visualization to other kinds of events – as for Wikidata, it could be interesting to visualize all kinds of items that can be presented with a combination of geolocations and temporal data, and that can be grouped together in meaningful ways.


Using Wikidata to Improve the Medical Content on Wikipedia

Summary: A few days ago, a scientific paper was published that examines how Wikipedia articles on medical topics can be improved with the help of Wikidata. Here the authors present the publication and their results.

 

This is a guest post by Alexander Pfundner, Tobias Schönberg, John Horn, Richard D. Boyce and Matthias Samwald. They have published a paper about how medical articles on Wikipedia can be improved using Wikidata.

An example of an infobox that shows drug-drug-interactions from Wikidata. Including this information could be of significant benefit to patients around the world.

The week before last, a study was published in the Journal of Medical Internet Research that investigates how Wikidata can help to improve medical information on Wikipedia. The researchers from the Medical University of Vienna, the University of Washington and the University of Pittsburgh who carried out the study are active members of the Wikidata community.

The study focuses on how potential drug-drug interactions are represented in Wikipedia entries for pharmaceutical drugs. Exposure to these potential interactions can severely diminish the safety and effectiveness of therapies. Given that many patients and professionals rely on Wikipedia to read up on a medical subject, improving the quality, completeness and relevance of this information can significantly improve the situation of patients around the world.

In the course of the study, a set of high-priority potential drug-drug interactions was added to the Wikidata items of common pharmaceutical drugs (e.g. Ramelteon). The data was then compared to the existing information on the English Wikipedia, revealing that many critical interactions were not explicitly mentioned. The situation is probably worse for many other languages. Wikidata could play a major role in alleviating this: not only does a single edit benefit all 288 language editions of Wikipedia, but the tools for adding and checking data are much easier to handle. In addition, adding qualifiers (property-value pairs that further describe the statement, e.g. the severity of the interaction) and sources to each statement puts the data in context and makes cross-checking easier. The study found Wikidata to be capable of acting as a repository for this data.
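
To give a rough idea of how such data can be read back out of Wikidata, here is a small sketch using the public MediaWiki API. It assumes the interactions are stored with the property P769 (“significant drug interaction”); the property ID and the lookup-by-label step are illustrative and not taken from the paper.

```
# Sketch: look up a drug by name and list the Wikidata items recorded as its
# potential interaction partners. P769 is assumed to be the property used.
import requests

API = "https://www.wikidata.org/w/api.php"

def find_item(label):
    """Resolve a drug name to a Wikidata item ID using wbsearchentities."""
    params = {"action": "wbsearchentities", "search": label,
              "language": "en", "format": "json"}
    hits = requests.get(API, params=params).json()["search"]
    return hits[0]["id"] if hits else None

def interaction_partners(qid, prop="P769"):
    """Yield the item IDs listed as interaction partners of the given drug."""
    params = {"action": "wbgetentities", "ids": qid,
              "props": "claims", "format": "json"}
    claims = requests.get(API, params=params).json()["entities"][qid]["claims"]
    for claim in claims.get(prop, []):
        snak = claim["mainsnak"]
        if snak["snaktype"] == "value":
            yield snak["datavalue"]["value"]["id"]

if __name__ == "__main__":
    drug = find_item("Ramelteon")
    if drug:
        print(drug, list(interaction_partners(drug)))
```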

The next part of the study investigated how potential drug-drug interaction information in Wikipedia could be automatically written and maintained (i.e. in the form of infoboxes or within a paragraph). Working with the current API and modules, the investigators found that the interface between Wikidata and Wikipedia is already quite capable, but that large datasets still require better mechanisms to intelligently filter and format the data. If the data is displayed in an infobox, further constraints come from the different conventions on how much information can be shown there and whether large datasets can be put in tabs or collapsible cells.

Overall, the study comes to the conclusion that, the current technical limitations aside, Wikidata is capable of improving the reliability and quality of medical information in all language editions of Wikipedia.

The authors of the study would like to thank the Wikidata and Wikipedia communities for all their help, and additionally the Austrian Science Fund and the United States National Library of Medicine for funding the study.


Improving data quality on Wikidata – checking what we have

Summary: A team of students from the Hasso Plattner Institute in Potsdam is currently working with Wikimedia Deutschland on tools to improve and safeguard data quality on Wikidata. In this post they present their two projects: checking Wikidata’s data for consistency with itself, and checking Wikidata’s data against other databases.

 

Hello, we are the Wikidata Quality Team, a team of students from the Hasso Plattner Institute in Potsdam, Germany. For our bachelor project we are working together with the Wikidata development team to ensure high quality of the data on Wikidata.

Wikidata provides a lot of structured data open to everyone. Quite a lot, actually: an enormous amount of data approaching the mark of 13.5 million items, each of which has numerous statements. The data was put into the system by diligent people and by bots, and neither people nor bots are known for infallibility. Errors are made, and somehow we have to find and correct them. Besides erroneous data, incomplete data is another problem. Imagine you are a resident of Berlin and want to improve the Wikidata item about the city. You go ahead and add its highest point (Müggelberge), its sister cities (Los Angeles, Madrid, Istanbul, Warsaw and 21 others) and its new head of government (Michael Müller). As you do it the correct way, you use qualifiers and references. Good job, but did you think of adding Berlin as a sister city to each of those 25 cities? Although the data you entered is correct, it is incomplete, and you have unwillingly and unknowingly introduced an inconsistency. And that is only assuming you used the correct items and properties and did not make a typo while entering a statement. Thirdly, things change: population numbers vary, organizations are dissolved and artists release new albums. Wikidata has the huge advantage that such a change only has to be made in one place, but still: someone has to do it and, even more importantly, someone has to become aware of it.

Facing the problems mentioned above, two projects have emerged. People using Wikidata are adding identifiers of external databases like GND, MusicBrainz and many more. So why not make use of them? We are developing a tool that scans an item for those identifiers and then searches the linked databases for data against which it compares the item’s statements. This not only helps us verify Wikidata’s content and find mismatches that could indicate errors, but also makes us aware of changes. MusicBrainz specialises in artists and composers, GND in data related to people, and these specialists’ data is likely to be up to date. By using their databases to cross-check, we hope to have the latest data in all fields represented in Wikidata.
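
The following sketch only illustrates the idea of cross-checking against an external database; it is not the team’s tool. It reads an item’s MusicBrainz artist ID (P434) from Wikidata and compares the artist name returned by the MusicBrainz web service with the item’s English label.

```
# Illustration of cross-checking a Wikidata item against MusicBrainz:
# compare the item's English label with the name MusicBrainz records for
# the artist ID (P434) stored on the item.
import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
MUSICBRAINZ_API = "https://musicbrainz.org/ws/2/artist/"

def wikidata_entity(qid):
    params = {"action": "wbgetentities", "ids": qid,
              "props": "claims|labels", "format": "json"}
    return requests.get(WIKIDATA_API, params=params).json()["entities"][qid]

def musicbrainz_name(mbid):
    response = requests.get(MUSICBRAINZ_API + mbid, params={"fmt": "json"},
                            headers={"User-Agent": "wd-crosscheck-sketch/0.1"})
    return response.json()["name"]

def cross_check(qid):
    entity = wikidata_entity(qid)
    label = entity["labels"]["en"]["value"]
    for claim in entity["claims"].get("P434", []):   # MusicBrainz artist ID
        if claim["mainsnak"]["snaktype"] != "value":
            continue
        external = musicbrainz_name(claim["mainsnak"]["datavalue"]["value"])
        print("match" if external == label else "mismatch", label, "/", external)

if __name__ == "__main__":
    cross_check("Q1299")  # Q1299: The Beatles
```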

The second project focuses on using constraints on properties. Here are some examples to illustrate what this means:

  • Items that have the property “date of death” should also have “date of birth”, and their respective values should not be more than 150 years apart
  • Properties like “sister city” are symmetric, so items referenced by this statement should also have a statement “sister city” linking back to the original item
  • Analogously, properties like “has part” and “part of” are inverse and should be used on both items in a lot of cases
  • Identifiers for IMDb, ISBN, GND, MusicBrainz etc. always follow a specific pattern that we can verify
  • And so on…

Checking these constraints and indicating issues when someone visits an item’s page helps identify which statements should be treated with caution and encourages editors to fix errors. We are also planning to provide ways to fix issues (semi-)automatically (e.g. by adding the missing sister city statement once it is confirmed that the relationship really holds in both directions). We also want to check these constraints when someone saves a new statement, which will hopefully prevent errors from getting into the system in the first place.
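
To make the idea more concrete, here is a simplified, self-contained illustration of one such constraint check (not the team’s actual implementation): verifying that an item’s date of birth (P569) and date of death (P570) are not more than 150 years apart.

```
# Simplified illustration of a single property constraint: date of birth
# (P569) and date of death (P570) should not lie more than 150 years apart.
import requests

API = "https://www.wikidata.org/w/api.php"

def year_of(claims, prop):
    """Extract the year of the first value of a time property, if present."""
    for claim in claims.get(prop, []):
        snak = claim["mainsnak"]
        if snak["snaktype"] == "value":
            # Wikibase time values look like "+1879-03-14T00:00:00Z"
            return int(snak["datavalue"]["value"]["time"][1:5])
    return None

def check_lifespan(qid, max_years=150):
    params = {"action": "wbgetentities", "ids": qid,
              "props": "claims", "format": "json"}
    claims = requests.get(API, params=params).json()["entities"][qid]["claims"]
    born, died = year_of(claims, "P569"), year_of(claims, "P570")
    if born is None or died is None:
        return "constraint not applicable"
    return "ok" if died - born <= max_years else "violation"

if __name__ == "__main__":
    print(check_lifespan("Q937"))  # Q937: Albert Einstein
```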

That’s about it – to keep up with the news visit our project page. We hope you are fond of our project and we appreciate your feedback! Contact information can also be found on the project page.


Platypus, a speaking interface for Wikidata

PPP (Projet Pensées Profondes) is a student project aiming to build an open question answering platform. Its demo, Platypus (http://askplatyp.us), relies heavily on Wikidata content.

At the École normale supérieure de Lyon we have to do a programming project during the first part of our master’s degree curriculum. Some of us were very interested in working on natural language processing and others on knowledge bases. So we tried to find a project that would allow us to work on both sides, and quickly the idea of an open source question answering tool came up.
This tool has to answer a lot of different questions, so one of the requirements of the project was to use a huge generalist knowledge base in order to have a usable tool quickly. As one of us was already a Wikidata contributor, and inspired by the example of the very nice but ephemeral Wiri tool by Magnus Manske, we quickly chose Wikidata as our primary data source.
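
The actual Platypus pipeline is more involved, but its very first step can be sketched with the Wikidata API alone: mapping words from a question to candidate items and properties using wbsearchentities, before any further parsing or querying happens.

```
# Sketch of the entity-lookup step of question answering over Wikidata:
# map words from a question to candidate items and properties via
# wbsearchentities. This is an illustration, not Platypus itself.
import requests

API = "https://www.wikidata.org/w/api.php"

def search(text, entity_type="item"):
    params = {"action": "wbsearchentities", "search": text, "language": "en",
              "type": entity_type, "format": "json"}
    hits = requests.get(API, params=params).json()["search"]
    return [(h["id"], h.get("description", "")) for h in hits[:3]]

if __name__ == "__main__":
    # For a question like "Who is the head of state of France?"
    print(search("France"))                     # candidate items
    print(search("head of state", "property"))  # candidate properties
```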



Asking Ever Bigger Questions with Wikidata

Summary: Maximilian Klein uses Wikidata as a source of data for statistical analyses of the world’s knowledge. In his article he describes how he searches Wikidata for answers to the big questions.


Guest post by Maximilian Klein

A New Era

Simultaneous discovery can sometimes be considered an indication of a paradigm shift in knowledge, and last month Magnus Manske and I seem to have had a very similar idea at the same time. Our idea was to look at gender statistics in Wikidata and to slice them up by date of birth, citizenship, and language (Magnus’ blog post, and my own). At first this seems like quite an elementary and naïve analysis, especially 14 years into Wikipedia, but only within the last year has this type of research become feasible. Like a baby taking its first steps, Wikidata and its tools ecosystem are maturing. That challenges us to creatively use the data in front of us.

Describing five stages of Wikidata, Markus Krötzsch foresaw this analysis in his presentation at Wikimania 2014. The stages, which range from Know to Understand, are: Read, Browse, Query, Display, and Analyse (see image). Most likely you have read Wikidata, and perhaps even browsed with Reasonator, queried with Autolist, or displayed with Histropedia. I want to focus on Analyse – the most understand-y of the stages. In fact, the example given for Analyse was my first exploration of gender and language, where I analysed the ratio of female biographies by Wikipedia language: English and German are around 15%, while Japanese, Chinese and Korean are each closer to 25%.
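
The original numbers were computed from dumps and tools like Autolist, but a similar ratio can be sketched against the Wikidata SPARQL endpoint (which may time out on the full set of biographies, so treat this purely as an illustration): count items for humans with an article in a given Wikipedia, grouped by sex or gender (P21).

```
# Rough sketch of counting biographies in a given Wikipedia by sex or gender
# (P21) via the Wikidata SPARQL endpoint. Counting over all humans is heavy
# and may time out; this is an illustration, not the original analysis.
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?gender (COUNT(?person) AS ?count) WHERE {
  ?person wdt:P31 wd:Q5 ;        # instance of human
          wdt:P21 ?gender .      # sex or gender
  ?article schema:about ?person ;
           schema:isPartOf <https://%s.wikipedia.org/> .
}
GROUP BY ?gender
"""

def gender_counts(lang="en"):
    response = requests.get(
        SPARQL_ENDPOINT,
        params={"query": QUERY % lang, "format": "json"},
        headers={"User-Agent": "gender-ratio-sketch/0.1"},
    )
    response.raise_for_status()
    return {row["gender"]["value"]: int(row["count"]["value"])
            for row in response.json()["results"]["bindings"]}

if __name__ == "__main__":
    print(gender_counts("en"))
```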

