DCMI OpenWEMI#

(This text: 2024-06-25, with a small update 2024-07-02 at the very end of the text)

Wonderful news (2023-11-29): DCMI goes WEMI!

_images/github.com_dcmi_openwemi_2024-06-21.png

Fig. 2 Source: dcmi/openwemi#

Discussion#

Multiple rdfs:range e.g. for openwemi:instantiates?#

The semantics of a single open arrow is specified in dcmi/openwemi in normal language with domain and range, and concretized in the turtle file with rdfs:domain and rdfs:range. This is not a problem if no more than one domain or range is specified per property.

Question: Is it wise to specify several arrows for a property such as openwemi:instantiates in the informal mapping? What is the concrete semantics of OR in the normal language explanation, and how is this implemented in RDF(S)?

Let’s take a look at the ttl code.

openwemi:instantiates
  a rdf:Property ;
  rdfs:label "instantiates"@en ;
  rdfs:comment "An Endeavor that instantiates a Manifestation, an Expression or a Work."@en ;
  rdfs:isDefinedBy openwemi: ;
  rdfs:subPropertyOf dct:relation ;
  rdfs:domain openwemi:Item ;
  dct:description "A relationship asserted from an Item to a Manifestation, an Expression, or a Work."@en ;
  rdfs:range [
    a owl:Class ;
    owl:unionOf (
      openwemi:Work
      openwemi:Expression
      openwemi:Manifestation
    )
  ] .

The rdfs:comment for the property openwemi:instantiates explains that an openwemi:Item instantiates “a manifestation, an expression or a work”. (A layperson would read XOR here, but as logicians we of course read a non-exclusive OR.) Together with the semantics of rdfs:range, the rdfs:comment suggests that one or more of the classes openwemi:Work, openwemi:Expression and openwemi:Manifestation are instantiated. Explicitly: “one or more classes from …” here means, more exactly, “one or more elements of the class that consists of the classes …”.

However, something else is actually specified in the ttl file: the rdfs:range of openwemi:instantiates is specified as an anonymous class consisting of the union of openwemi:Work, openwemi:Expression and openwemi:Manifestation. Explicitly: “consisting of the union of …” here means “all elements which are contained in at least one of the classes …”.

The consequence: rdfs:range can no longer be used to derive a specific class from openwemi:Work, openwemi:Expression and openwemi:Manifestation.

Say we have the triple :myItem_123 openwemi:instantiates :myManifestation_123:

  • Using the above domain information, it is in fact possible to infer that :myItem_123 a openwemi:Item;

  • however, using the above range information, it is not possible to infer that :myManifestation_123 a openwemi:Manifestation.
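The reason can be illustrated with ordinary Python sets standing in for class extensions. This is only a toy sketch: `model_a`, `model_b` and `in_union` are hypothetical names, and the two "models" are made-up interpretations, not OpenWEMI data. Both interpretations satisfy the range axiom, but they disagree about which member class the object belongs to, so no specific class is entailed.

```python
# Two hypothetical interpretations (class name -> set of instances).
model_a = {"Work": set(), "Expression": set(),
           "Manifestation": {":myManifestation_123"}}
model_b = {"Work": {":myManifestation_123"}, "Expression": set(),
           "Manifestation": set()}

def in_union(model, node):
    """True if node lies in the union of Work, Expression and Manifestation."""
    return node in (model["Work"] | model["Expression"] | model["Manifestation"])

node = ":myManifestation_123"
models = [model_a, model_b]

# The range axiom only demands membership in the union ...
satisfies_range = all(in_union(m, node) for m in models)
# ... a statement is entailed only if it holds in every admissible model.
entailed_manifestation = all(node in m["Manifestation"] for m in models)

print(satisfies_range, entailed_manifestation)  # True False
```

Despite its name, the node may just as well be a Work as far as the range axiom is concerned.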

Possible solutions#

(1) good solution, mentioned in dcmi/openwemi#43: Do it like FaBiO … and discuss the term “disadvantage”:

The obvious disadvantages are: […] people need to think about when creating their metadata (“is this an item of a manifestation or an item of an expression?”) (dcmi/openwemi#43)

IMHO this is an advantage

(Philosophical discussion: IMHO the WEMI classes are strongly disjoint from an ontological point of view. Thus an item may only be connected to a manifestation, but not to an expression or even a work. “Item instantiates work” fits neither the original FRBR WEMI nor RDA or CIDOC. But that’s not the point of discussion here.)

(2)

what if the object is a combination work/expression à la Bibframe?

If we go to WEMI, we need to disambiguate work/expression combinations into two different nodes anyway. Long discussion: https://www.jbusse.de/lovs/semantische-dekomposition-konglomerat.html#anwendung-buch-nach-wemi-dcat-nach-wemi, https://www.jbusse.de/logd/dcat2frbr (in German, but machine translation may help).

(3)

Another solution: Create OpenWEMI as a radical prune of CIDOC. RDA was merged with CIDOC, see https://www.cidoc-crm.org/frbroo/fm_releases > https://www.cidoc-crm.org/frbroo/sites/default/files/LRMoo_V1.0.pdf; there see Figure: 4.2. Overview of the Model: Illustration 1, page 6 in LRMoo_V1.0.pdf

Wouldn’t it be nice to align OpenWEMI with CIDOC LRMoo, at least in principle? In any case, one should avoid introducing axioms that are not compatible with CIDOC. Some (including me) consider CIDOC and LRMoo to be “rather fat”: If OpenWEMI is intended to provide a lightweight alternative, then OpenWEMI could be created as a minimal lightweight version of the respective CIDOC classes?

(4)

IMHO best solution: Do not model domain and range at all, at least not with rdfs:domain and rdfs:range.

This is because these language elements from RDFS (and OWL) have semantics that are misunderstood by many people. NEW 2024-07-02: for some explanation, see the section “Semantics of rdfs: Entailment, not validation or constraint” below.

experiment with rdflib and owlrl#

We show the domain and range inferencing with a minimalistic example, based on the well-known Python libraries rdflib and owlrl:

import rdflib
import owlrl

Example from above, plus a few triples about example instances:

wemi_ttl = """
@prefix : <http://example.org/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix openwemi: <https://dcmi.github.io/openwemi/ns#> .

openwemi:instantiates
  a rdf:Property ;
  rdfs:domain openwemi:Item ;
  rdfs:range [
    a owl:Class ;
    owl:unionOf (
      openwemi:Work
      openwemi:Expression
      openwemi:Manifestation
    )
  ] .

:myItem_123
   a <urn:someSubjectClass> .
<urn:someSubjectClass> a owl:Class .

:myManifestation_123 
    a <urn:someObjectClass> .
<urn:someObjectClass> a owl:Class .

:myItem_123
    openwemi:instantiates :myManifestation_123 .
"""

Graph g1 is the graph before inferencing:

g1 = rdflib.Graph().parse(data=wemi_ttl, format="turtle")
print(f"Initially g1 has {len(g1)} triples")
Initially g1 has 16 triples
print(g1.serialize())
@prefix : <http://example.org/ns#> .
@prefix openwemi: <https://dcmi.github.io/openwemi/ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:myItem_123 a <urn:someSubjectClass> ;
    openwemi:instantiates :myManifestation_123 .

openwemi:instantiates a rdf:Property ;
    rdfs:domain openwemi:Item ;
    rdfs:range [ a owl:Class ;
            owl:unionOf ( openwemi:Work openwemi:Expression openwemi:Manifestation ) ] .

:myManifestation_123 a <urn:someObjectClass> .

<urn:someObjectClass> a owl:Class .

<urn:someSubjectClass> a owl:Class .

We also allocate g2. It will be modified by owlrl.DeductiveClosure().

g2 = rdflib.Graph().parse(data=wemi_ttl, format="turtle")
print(f"Initially g2 has {len(g2)} triples")
Initially g2 has 16 triples
owlrl.DeductiveClosure(owlrl.OWLRL_Semantics,
    axiomatic_triples = False).expand(g2)
print(f"After inferencing g2 has {len(g2)} triples")
After inferencing g2 has 169 triples
g2_ttl = g2.serialize(format="ttl")
# print(g2_ttl)

What’s the domain of our exemplar?#

q_domain = """
PREFIX : <http://example.org/ns#> 
PREFIX openwemi: <https://dcmi.github.io/openwemi/ns#> 
PREFIX owl: <http://www.w3.org/2002/07/owl#> 
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 

SELECT ?S ?SClass
WHERE {
  ?S a ?SClass .
  ?S openwemi:instantiates ?O .
  }
"""

Graph g1 reflects the situation before inferencing:

for row in g1.query(q_domain):
    print(row)
(rdflib.term.URIRef('http://example.org/ns#myItem_123'), rdflib.term.URIRef('urn:someSubjectClass'))

Graph g2 reflects the situation after inferencing:

for row in g2.query(q_domain):
    print(row)
(rdflib.term.URIRef('http://example.org/ns#myItem_123'), rdflib.term.URIRef('urn:someSubjectClass'))
(rdflib.term.URIRef('http://example.org/ns#myItem_123'), rdflib.term.URIRef('https://dcmi.github.io/openwemi/ns#Item'))
(rdflib.term.URIRef('http://example.org/ns#myItem_123'), rdflib.term.URIRef('http://www.w3.org/2002/07/owl#Thing'))

As we can see, our example item :myItem_123 is now an instance of openwemi:Item.

What’s the range of our exemplar?#

q_range = """
PREFIX : <http://example.org/ns#> 
PREFIX openwemi: <https://dcmi.github.io/openwemi/ns#> 
PREFIX owl: <http://www.w3.org/2002/07/owl#> 
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 

SELECT ?O ?OClass
WHERE {
  ?O a ?OClass .
  ?S openwemi:instantiates ?O .
  }
"""

Graph g1 reflects the situation before inferencing:

for row in g1.query(q_range):
    print(row)
(rdflib.term.URIRef('http://example.org/ns#myManifestation_123'), rdflib.term.URIRef('urn:someObjectClass'))

Graph g2 reflects the situation after inferencing:

for row in g2.query(q_range):
    print(row)
(rdflib.term.URIRef('http://example.org/ns#myManifestation_123'), rdflib.term.URIRef('urn:someObjectClass'))
(rdflib.term.URIRef('http://example.org/ns#myManifestation_123'), rdflib.term.BNode('n2fa38e3a01824b92985d162259e6175cb1'))
(rdflib.term.URIRef('http://example.org/ns#myManifestation_123'), rdflib.term.URIRef('http://www.w3.org/2002/07/owl#Thing'))

As we can see, our example manifestation :myManifestation_123 is NOT inferred to be an instance of openwemi:Manifestation. Instead it is an instance of a blank node (BNode) with an internal ID, i.e. of the anonymous union class.

The full set of triples of g2 after inferencing can be seen here:

print(g2.serialize())
@prefix : <http://example.org/ns#> .
@prefix openwemi: <https://dcmi.github.io/openwemi/ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:myItem_123 a owl:Thing,
        openwemi:Item,
        <urn:someSubjectClass> ;
    owl:sameAs :myItem_123 ;
    openwemi:instantiates :myManifestation_123 .

rdf:HTML a rdfs:Datatype ;
    owl:sameAs rdf:HTML .

rdf:PlainLiteral a rdfs:Datatype ;
    owl:sameAs rdf:PlainLiteral .

rdf:XMLLiteral a rdfs:Datatype ;
    owl:sameAs rdf:XMLLiteral .

rdf:first owl:sameAs rdf:first .

rdf:langString a rdfs:Datatype ;
    owl:sameAs rdf:langString .

rdf:rest owl:sameAs rdf:rest .

rdf:type owl:sameAs rdf:type .

rdfs:Literal a rdfs:Datatype ;
    owl:sameAs rdfs:Literal .

rdfs:comment a owl:AnnotationProperty ;
    owl:sameAs rdfs:comment .

rdfs:domain owl:sameAs rdfs:domain .

rdfs:isDefinedBy a owl:AnnotationProperty ;
    owl:sameAs rdfs:isDefinedBy .

rdfs:label a owl:AnnotationProperty ;
    owl:sameAs rdfs:label .

rdfs:range owl:sameAs rdfs:range .

rdfs:seeAlso a owl:AnnotationProperty ;
    owl:sameAs rdfs:seeAlso .

rdfs:subClassOf owl:sameAs rdfs:subClassOf .

rdfs:subPropertyOf owl:sameAs rdfs:subPropertyOf .

xsd:NCName a rdfs:Datatype ;
    owl:sameAs xsd:NCName .

xsd:NMTOKEN a rdfs:Datatype ;
    owl:sameAs xsd:NMTOKEN .

xsd:Name a rdfs:Datatype ;
    owl:sameAs xsd:Name .

xsd:anyURI a rdfs:Datatype ;
    owl:sameAs xsd:anyURI .

xsd:base64Binary a rdfs:Datatype ;
    owl:sameAs xsd:base64Binary .

xsd:boolean a rdfs:Datatype ;
    owl:sameAs xsd:boolean .

xsd:byte a rdfs:Datatype ;
    owl:sameAs xsd:byte .

xsd:date a rdfs:Datatype ;
    owl:sameAs xsd:date .

xsd:dateTime a rdfs:Datatype ;
    owl:sameAs xsd:dateTime .

xsd:dateTimeStamp a rdfs:Datatype ;
    owl:sameAs xsd:dateTimeStamp .

xsd:decimal a rdfs:Datatype ;
    owl:sameAs xsd:decimal .

xsd:double a rdfs:Datatype ;
    owl:sameAs xsd:double .

xsd:float a rdfs:Datatype ;
    owl:sameAs xsd:float .

xsd:hexBinary a rdfs:Datatype ;
    owl:sameAs xsd:hexBinary .

xsd:int a rdfs:Datatype ;
    owl:sameAs xsd:int .

xsd:integer a rdfs:Datatype ;
    owl:sameAs xsd:integer .

xsd:language a rdfs:Datatype ;
    owl:sameAs xsd:language .

xsd:long a rdfs:Datatype ;
    owl:sameAs xsd:long .

xsd:negativeInteger a rdfs:Datatype ;
    owl:sameAs xsd:negativeInteger .

xsd:nonNegativeInteger a rdfs:Datatype ;
    owl:sameAs xsd:nonNegativeInteger .

xsd:nonPositiveInteger a rdfs:Datatype ;
    owl:sameAs xsd:nonPositiveInteger .

xsd:normalizedString a rdfs:Datatype ;
    owl:sameAs xsd:normalizedString .

xsd:positiveInteger a rdfs:Datatype ;
    owl:sameAs xsd:positiveInteger .

xsd:short a rdfs:Datatype ;
    owl:sameAs xsd:short .

xsd:string a rdfs:Datatype ;
    owl:sameAs xsd:string .

xsd:time a rdfs:Datatype ;
    owl:sameAs xsd:time .

xsd:token a rdfs:Datatype ;
    owl:sameAs xsd:token .

xsd:unsignedByte a rdfs:Datatype ;
    owl:sameAs xsd:unsignedByte .

xsd:unsignedInt a rdfs:Datatype ;
    owl:sameAs xsd:unsignedInt .

xsd:unsignedLong a rdfs:Datatype ;
    owl:sameAs xsd:unsignedLong .

xsd:unsignedShort a rdfs:Datatype ;
    owl:sameAs xsd:unsignedShort .

owl:backwardCompatibleWith a owl:AnnotationProperty ;
    owl:sameAs owl:backwardCompatibleWith .

owl:deprecated a owl:AnnotationProperty ;
    owl:sameAs owl:deprecated .

owl:equivalentClass owl:sameAs owl:equivalentClass .

owl:equivalentProperty owl:sameAs owl:equivalentProperty .

owl:incompatibleWith a owl:AnnotationProperty ;
    owl:sameAs owl:incompatibleWith .

owl:priorVersion a owl:AnnotationProperty ;
    owl:sameAs owl:priorVersion .

owl:sameAs owl:sameAs owl:sameAs .

owl:unionOf owl:sameAs owl:unionOf .

owl:versionInfo a owl:AnnotationProperty ;
    owl:sameAs owl:versionInfo .

:myManifestation_123 a _:n2fa38e3a01824b92985d162259e6175cb1,
        owl:Thing,
        <urn:someObjectClass> ;
    owl:sameAs :myManifestation_123 .

rdf:Property owl:sameAs rdf:Property .

() owl:sameAs () .

openwemi:Expression rdfs:subClassOf _:n2fa38e3a01824b92985d162259e6175cb1,
        owl:Thing ;
    owl:sameAs openwemi:Expression .

openwemi:Manifestation rdfs:subClassOf _:n2fa38e3a01824b92985d162259e6175cb1,
        owl:Thing ;
    owl:sameAs openwemi:Manifestation .

openwemi:Work rdfs:subClassOf _:n2fa38e3a01824b92985d162259e6175cb1,
        owl:Thing ;
    owl:sameAs openwemi:Work .

owl:Nothing a owl:Class ;
    rdfs:subClassOf _:n2fa38e3a01824b92985d162259e6175cb1,
        owl:Nothing,
        owl:Thing,
        <urn:someObjectClass>,
        <urn:someSubjectClass> ;
    owl:equivalentClass owl:Nothing ;
    owl:sameAs owl:Nothing .

openwemi:Item owl:sameAs openwemi:Item .

openwemi:instantiates a rdf:Property ;
    rdfs:domain openwemi:Item ;
    rdfs:range _:n2fa38e3a01824b92985d162259e6175cb1,
        owl:Thing ;
    rdfs:subPropertyOf openwemi:instantiates ;
    owl:equivalentProperty openwemi:instantiates ;
    owl:sameAs openwemi:instantiates .

<urn:someObjectClass> a owl:Class ;
    rdfs:subClassOf owl:Thing,
        <urn:someObjectClass> ;
    owl:equivalentClass <urn:someObjectClass> ;
    owl:sameAs <urn:someObjectClass> .

<urn:someSubjectClass> a owl:Class ;
    rdfs:subClassOf owl:Thing,
        <urn:someSubjectClass> ;
    owl:equivalentClass <urn:someSubjectClass> ;
    owl:sameAs <urn:someSubjectClass> .

owl:Class owl:sameAs owl:Class .

owl:AnnotationProperty owl:sameAs owl:AnnotationProperty .

owl:Thing a owl:Class ;
    rdfs:subClassOf owl:Thing ;
    owl:equivalentClass owl:Thing ;
    owl:sameAs owl:Thing .

rdfs:Datatype owl:sameAs rdfs:Datatype .

_:n2fa38e3a01824b92985d162259e6175cb2 rdf:first openwemi:Work ;
    rdf:rest _:n2fa38e3a01824b92985d162259e6175cb3 ;
    owl:sameAs _:n2fa38e3a01824b92985d162259e6175cb2 .

_:n2fa38e3a01824b92985d162259e6175cb3 rdf:first openwemi:Expression ;
    rdf:rest _:n2fa38e3a01824b92985d162259e6175cb4 ;
    owl:sameAs _:n2fa38e3a01824b92985d162259e6175cb3 .

_:n2fa38e3a01824b92985d162259e6175cb4 rdf:first openwemi:Manifestation ;
    rdf:rest () ;
    owl:sameAs _:n2fa38e3a01824b92985d162259e6175cb4 .

_:n2fa38e3a01824b92985d162259e6175cb1 a owl:Class ;
    rdfs:subClassOf _:n2fa38e3a01824b92985d162259e6175cb1,
        owl:Thing ;
    owl:equivalentClass _:n2fa38e3a01824b92985d162259e6175cb1 ;
    owl:sameAs _:n2fa38e3a01824b92985d162259e6175cb1 ;
    owl:unionOf _:n2fa38e3a01824b92985d162259e6175cb2 .

Semantics of rdfs: Entailment, not validation or constraint#

update 2024-07-02

dcmi/openwemi#94 states:

We want to make that a validation point …

it should be possible to detect that as an inconsistency …

(1) There is a misunderstanding that is as common as it is profound about RDF(S): rdfs:domain and rdfs:range definitely do not help you to validate an RDF graph or detect inconsistencies in an RDF graph.

https://lists.w3.org/Archives/Public/semantic-web//2006May/0118.html cites a DCMI text:

  1. Using domains and ranges: RDF supports using “domain” and “range” constraints on RDF properties, for limiting the kinds of resources that a property apply to, …

I would think that the above paragraph reveals a deep misunderstanding about the nature of rdfs:range and rdfs:domain … correct?

Correct! The semantics of rdfs:domain and rdfs:range are given in https://www.w3.org/TR/rdf11-mt/#rdfs-entailment, rules rdfs2 and rdfs3. It is clearly stated there that from a triple using a property, together with the domain (or range) of that property, you can entail an additional rdf:type triple for the subject (or object), if it does not exist anyway. Nothing is ever rejected. (For a more detailed explanation cf. https://lists.w3.org/Archives/Public/semantic-web//2006May/0121.html)
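The effect of rdfs2 and rdfs3 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not an RDFS reasoner: triples are modelled as plain tuples of strings to keep the sketch dependency-free, `rdfs_domain_range_closure` is a hypothetical helper name, and `_:WorkOrExpressionOrManifestation` is a made-up label for the anonymous union class. The data deliberately uses a Work as the subject of openwemi:instantiates.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

def rdfs_domain_range_closure(triples):
    """Apply rdfs2 (domain) and rdfs3 (range) until nothing new is added."""
    closure = set(triples)
    while True:
        new = set()
        for (prop, kind, cls) in closure:
            if kind == RDFS_DOMAIN:   # rdfs2: type the subject
                new |= {(s, RDF_TYPE, cls) for (s, p, o) in closure if p == prop}
            elif kind == RDFS_RANGE:  # rdfs3: type the object
                new |= {(o, RDF_TYPE, cls) for (s, p, o) in closure if p == prop}
        if new <= closure:
            return closure
        closure |= new

data = {
    ("openwemi:instantiates", RDFS_DOMAIN, "openwemi:Item"),
    ("openwemi:instantiates", RDFS_RANGE, "_:WorkOrExpressionOrManifestation"),
    # Deliberately "wrong" data: the subject is declared to be a Work.
    (":myWork_1", RDF_TYPE, "openwemi:Work"),
    (":myWork_1", "openwemi:instantiates", ":myManifestation_123"),
}

inferred = rdfs_domain_range_closure(data) - data
for triple in sorted(inferred):
    print(triple)
```

No error is raised for the "wrong" subject: :myWork_1 simply becomes an openwemi:Item in addition to being an openwemi:Work. Without further axioms (such as class disjointness in OWL) this is entailment only, not a detected inconsistency.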

(2) However, there seems to be a backdoor to use rdfs:domain and rdfs:range as constraints, see https://www.w3.org/TR/rdf-schema/#ch_domainrange :

RDF Schema provides a mechanism for describing this information, but does not say whether or how an application should use it. For example, while an RDF vocabulary can assert that an author property is used to indicate resources that are instances of the class Person, it does not say whether or how an application should act in processing that range information. Different applications will use this information in different ways. For example, data checking tools might use this to help discover errors in some data set, an interactive editor might suggest appropriate values, and a reasoning application might use it to infer additional information from instance data.

I don’t think that a ttl model should use this backdoor. If you want to provide information about permitted domains and ranges for a validator that explicitly does not use RDFS semantics, https://schema.org/rangeIncludes and https://schema.org/domainIncludes would be more appropriate. Or one could use SHACL, but that’s also a rather complex technology.
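What such a check could look like is sketched below in plain Python. The assumptions are considerable: triples are again plain tuples of strings, `range_violations` is a hypothetical helper, and reading schema:rangeIncludes as a closed-world check ("the object must carry at least one listed type") is an application-level convention chosen for this sketch. Schema.org does not prescribe it. The instance data is made up.

```python
RDF_TYPE = "rdf:type"
RANGE_INCLUDES = "schema:rangeIncludes"

def range_violations(triples):
    """Report triples whose object carries none of the types
    listed for the property via schema:rangeIncludes."""
    allowed = {}  # property -> set of permitted object classes
    types = {}    # node -> set of asserted classes
    for s, p, o in triples:
        if p == RANGE_INCLUDES:
            allowed.setdefault(s, set()).add(o)
        elif p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    return sorted(
        (s, p, o)
        for s, p, o in triples
        if p in allowed and not (allowed[p] & types.get(o, set()))
    )

data = {
    ("openwemi:instantiates", RANGE_INCLUDES, "openwemi:Work"),
    ("openwemi:instantiates", RANGE_INCLUDES, "openwemi:Expression"),
    ("openwemi:instantiates", RANGE_INCLUDES, "openwemi:Manifestation"),
    (":myItem_123", "openwemi:instantiates", ":myManifestation_123"),
    (":myManifestation_123", RDF_TYPE, "openwemi:Manifestation"),
    # Hypothetical bad data: an Item that "instantiates" another Item.
    (":myItem_456", "openwemi:instantiates", ":myItem_789"),
    (":myItem_789", RDF_TYPE, "openwemi:Item"),
}

violations = range_violations(data)
print(violations)
```

Here the offending triple is reported instead of being silently "repaired" by an inferred type. Note that under this convention an object without any asserted type would be reported as well.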