[…] Denny Vrandečić (ever keen to show just how much he deserved his PhD in the subject!) argued recently, strong classification of the kind that categories lead to seems to hit a nerve in the human […]
Last night I came across [[:fr:Statue du Christ-Roi]] with a list of giant statues of Christ the King and the corresponding Wikidata item 'monumental statue of Christ the King'. I went and tagged all those statues on Wikidata with 'instance of:monumental statue of Christ the King'.
Sleeping on it, I realised this morning that 'monumental statue of Christ the King', when used as a class, is trying to do two things at the same time. Today I am going to change all of the items for those statues to 'instance of:colossal statue' and 'depicts:Christ the King'.
Similarly, the important property for classifying humans is probably going to be 'occupation', even if they are all tagged with 'instance of:human' as well.
'instance of' will be important but it is not the only important property.
Classes, defined using the 'subclass of' property to link specific classes to more general items, seem to have a place defining what values are acceptable with various properties. The 'occupation' property will mostly link to items which are a subclass of the 'occupation' item. 'instance of' should, in general, not link to items which are a subclass of the 'occupation' item. Bots can be used to highlight exceptions to these guidelines for review by humans. The human can then change the property, mark the value item as a 'subclass of:occupation' or accept it as an exception.
This is, I now believe, the appropriate compromise between rules and exceptions and is not something I would ever have come up with without the nudging built into the software by Denny.
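The bot check described in this comment could look roughly like the sketch below. It is only an illustration: the item data is a made-up in-memory dictionary rather than real Wikidata content, and the names `ITEMS`, `Flag`, `is_subclass_of` and `find_exceptions` are hypothetical. It treats both guidelines as soft constraints, so it only reports exceptions for a human to review and never rejects anything.

```python
from collections import namedtuple

# Hypothetical stand-in for Wikidata: item -> {property: [values]}.
ITEMS = {
    "occupation": {},
    "sculptor": {"subclass of": ["occupation"]},
    "artist": {"subclass of": ["occupation"]},
    "human": {},
    "beekeeper": {},  # not (yet) marked as a subclass of occupation
    "Alice": {"instance of": ["human"], "occupation": ["sculptor"]},
    "Bob": {"instance of": ["artist"]},  # an occupation used as a class
    "Carol": {"instance of": ["human"], "occupation": ["beekeeper"]},
}

Flag = namedtuple("Flag", "item prop value reason")


def is_subclass_of(item, target, items, seen=None):
    """Return True if `item` reaches `target` by following 'subclass of' links."""
    seen = set() if seen is None else seen
    if item in seen:  # guard against cycles in the class graph
        return False
    seen.add(item)
    for parent in items.get(item, {}).get("subclass of", []):
        if parent == target or is_subclass_of(parent, target, items, seen):
            return True
    return False


def find_exceptions(items):
    """Collect statements that break either guideline, for human review."""
    flags = []
    for name, statements in items.items():
        # Guideline 1: 'occupation' values should be subclasses of 'occupation'.
        for value in statements.get("occupation", []):
            if not is_subclass_of(value, "occupation", items):
                flags.append(Flag(name, "occupation", value,
                                  "value is not a subclass of occupation"))
        # Guideline 2: 'instance of' should not point at an occupation.
        for value in statements.get("instance of", []):
            if is_subclass_of(value, "occupation", items):
                flags.append(Flag(name, "instance of", value,
                                  "value is a subclass of occupation"))
    return flags


flags = find_exceptions(ITEMS)
for flag in flags:
    print(f"{flag.item}: {flag.prop} = {flag.value} ({flag.reason})")
```

With this sample data the report lists Bob (an occupation used with 'instance of') and Carol (an 'occupation' value that is not yet classified). The reviewer then has the three options from the comment: change the property, mark the value as 'subclass of:occupation', or accept the exception.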
Hi. I think the biggest problem with having no hardwired classification scheme is not a philosophical or theoretical one but a practical one: it makes it quite difficult to suggest properties to the user, as the system has no knowledge of what the item is.
It puts a lot into the hands of the community: building many specialised, domain-specific tools with more or less hard-wired properties, producing periodic reports, and building tools for expressing constraints or patterns (with wiki syntax, like DBpedia's mappings at http://mappings.dbpedia.org/index.php/Main_Page? It can be made useful, but I can't help thinking it's a bit of a suboptimal hack :) ).
On the other hand, a type or class system in Wikidata could be implemented on the same principles as in your posts: annotated (qualified) classification and soft constraints which serve more as patterns than as limits, with additional benefits such as immediate reports to the user. It's a matter of choice, but I think it tends to be hard for the community to understand all these problems and make (another level of) choices. Maybe this would help the community at no cost in expressiveness, make things go faster, and help users understand the project a little better?
Thank you, thank you, thank you, Denny! All of the textbooks on semantic modelling seem to strongly believe in the ability to naturally classify everything into a neat hierarchy. Computer scientists love to have things fit nicely into disjoint categories. Except that as human beings, we are messy and don't fit. There are always exceptions, and especially exceptions over time. With the man recently giving birth in Neukölln we now have a jillion or so standard father-mother-children examples shot to hell ;)
Please keep strong classification OUT of WikiData. Otherwise we will end up having to fudge our way around the problems that occur, and since we don't have good means of representing inference, that will make it immensely difficult to figure out what went wrong. Tagging will permit multiple and overlapping "classification" and more closely fit the Real World (tm), imho.
I think you should be more explicit about what you mean by "strong classification". Does it just mean knowing automatically the set of all the superclasses of an item? It seems that we can do that through other means anyway (by recursively querying the value of the "instance of" property or, if we want to make it more efficient, by having bots compile lists of subclasses/superclasses for major items).
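As a rough sketch of the recursive lookup suggested here: the code assumes the walk starts from an item's 'instance of' values and then follows 'subclass of' links upward, which is one reading of the comment rather than something it states. The data and the names `ITEMS` and `all_classes` are invented for illustration.

```python
from collections import deque

# Hypothetical data: each item maps a property name to a list of values.
ITEMS = {
    "Statue A": {"instance of": ["colossal statue"]},
    "colossal statue": {"subclass of": ["statue"]},
    "statue": {"subclass of": ["sculpture"]},
    "sculpture": {"subclass of": ["work of art"]},
    "work of art": {},
}


def all_classes(item, items):
    """List every class of `item`: its direct classes plus all their superclasses."""
    queue = deque(items.get(item, {}).get("instance of", []))
    seen = set()
    found = []
    while queue:
        cls = queue.popleft()
        if cls in seen:  # skip repeats, which also makes cycles harmless
            continue
        seen.add(cls)
        found.append(cls)
        queue.extend(items.get(cls, {}).get("subclass of", []))
    return found


classes = all_classes("Statue A", ITEMS)
print(classes)
```

A bot could run the same traversal ahead of time for major items and store the result, which is the more efficient variant mentioned in the comment.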
I think you might like the approach presented recently by Stefan Decker, which prefers prototypes to classes when modelling data. See Stefan's slides (http://www.slideshare.net/stefandecker1/stefan-decker-keynote-at-cshals/28) or the somewhat quiet Google+ group on this topic (https://plus.google.com/communities/102405508518643959546).