Manifestation: subclass of Expression?
In 2017, the CIDOC CRM SIG discussed whether a manifestation is a subclass of expression. The current version, CIDOC CRM 7.1.3 (https://www.cidoc-crm.org/sites/default/files/cidoc_crm_version_7.1.3.pdf), does not incorporate this proposal. We trace the reasoning and show what the consequences would have been if the subclass relationship had prevailed.
https://www.cidoc-crm.org/frbroo/Issue/ID-360-lrmoo:
posted by Pat on 18/12/2017: Declaring F3 Manifestation as a subclass of F2 Expression will be very unexpected to our readers and will have to be carefully explained in the scope note, as in the E-R formulation of LRM we have declared Expression and Manifestation to be disjoint. But in LRM we do not explicitly acknowledge that all manifestations are one-to-one with publication expressions, and thus have an essential nature as an aggregate (where the publisher’s contribution is often minimized in cataloguing practice, but of course it must be there). Once we see it this way, then it does makes sense that the F3 Manifestation which is basically a specific kind of aggregating expression, is a subclass of F2.
Posted by Maja on 19/12/2017: I agree with all except the Manifestation as subclass of Expression. I am not sure that Manifestation is an aggregating expression. Manifestation is the aggregate, the result, and it also embodies the aggregating expression. We also need the distinction between the expression ( in case of text an abstract sequence of words) and the way it is presented in a publication/Manifestation (with layout design, particular font, colour etc) in order to cluster all publications embodying the same expression (text in this example)…
posted by Martin on 19/12/2017: Dear Maja, […] I think there some fundamental methodological question to clarify: […]
In 2022, Pat Riva published ([RvZumerA22], https://repository.ifla.org/bitstream/123456789/2217/1/144-riva-en-paper.pdf):
2.2 Manifestation as subtype of Expression: […] In LRMoo, F1 Work is defined as a subtype of CRM E89 Propositional Object, and F2 Expression as a subtype of CRM E73 Information Object. F3 Manifestation is defined as a subtype of F2 Expression (and so implicitly also a subtype of CRM E73 Information Object). The latter is a notable change from FRBRoo where Manifestation (called F3 ManifestationProduct Type) was defined as subclass of CRM E55 Type, which was neither intuitive nor convenient. The main reasoning for inheriting Expression, is that the substance of manifestations is both the content and the form of its presentation. […] Advantages of this solution is that it allows for convenient implementation of aggregates and other examples where the manifestation as a whole can be associated with a work, without introducing redundant additional expressions for the whole. This also aligns well with the direct relationship between manifestation and work that is implemented in some vocabularies for library data. […]
Status as of 2024
The document Volume A: Definition of the CIDOC Conceptual Reference Model. Produced by the CIDOC CRM Special Interest Group. Version 7.1.3, February 2024 (https://www.cidoc-crm.org/sites/default/files/cidoc_crm_version_7.1.3.pdf) does not define the terms Expression or Manifestation. Instead, the class E73 Information Object is the starting point for our discussion:
Hint
2024, https://www.cidoc-crm.org/sites/default/files/cidoc_crm_version_7.1.3.pdf
E73 Information Object
Subclass of:
E89 Propositional Object
E90 Symbolic Object
Superclass of:
E29 Design or Procedure
E31 Document
E33 Linguistic Object
E36 Visual Item
Scope note:
This class comprises identifiable immaterial items, such as poems, jokes, data sets, images, texts, multimedia objects, procedural prescriptions, computer program code, algorithm or mathematical formulae, that have an objectively recognizable structure and are documented as single units. The encoding structure known as a “named graph” also falls under this class, so that each “named graph” is an instance of E73 Information Object.
An instance of E73 Information Object does not depend on a specific physical carrier, which can include human memory, and it can exist on one or more carriers simultaneously. Instances of E73 Information Object of a linguistic nature should be declared as instances of the E33 Linguistic Object subclass. Instances of E73 Information Object of a documentary nature should be declared as instances of the E31 Document subclass. Conceptual items such as types and classes are not instances of E73 Information Object, nor are ideas without a reproducible expression.
The connection between FRBR and CIDOC CRM is established via LRMoo:
Hint
2022: [BDLBoeufR21], https://www.cidoc-crm.org/frbroo/sites/default/files/LRMoo_V0.7(draft 2021-06-29).pdf:
F3 Manifestation
Subclass of: F2 Expression
Scope note: This class comprises products rendering one or more Expressions. A Manifestation is defined by both the overall content, and the form of its presentation. The substance of F3 Manifestation is not only signs, but also the manner in which they are presented to be consumed by users, including the kind of media adopted. […]
2.5.2. LRMOO Class Hierarchy Aligned with (Part of) CIDOC CRM Class Hierarchy
E89 Propositional Object
    F1 Work
    E73 Information Object
        F2 Expression
            F3 Manifestation
E24 Physical Human-Made Thing
    F5 Item (p.13)
Hint
2024: https://cidoc-crm.org/frbroo/sites/default/files/LRMoo_V0.9.4(draft after WWR).pdf
F2 Expression
Subclass of: E73 Information Object
Scope note: This class comprises the intellectual or artistic realisations of Works in the form of identifiable immaterial objects, such as texts, poems, jokes, musical or choreographic notations, movement pattern, sound pattern, images, multimedia objects, or any combination of such forms. The substance of F2 Expression is signs.
F3 Manifestation
Subclass of: E73 Information Object
Scope note: This class comprises products rendering one or more Expressions. A Manifestation is defined by both the overall content and the form of its presentation. The substance of F3 Manifestation is not only signs, but also the manner in which they are presented to be consumed by users, including the kind of media adopted.
An instance of F3 Manifestation typically incorporates one or more instances of F2 Expression representing a distinct logical content and all additional input by a publisher such as text layout and cover design. Additionally an F3 Manifestation can be identified by the physical features for the medium of distribution, if applicable.
Contrasting FRBR and LRMoo 2022
(written in DE, translated to EN with https://www.deepl.com)
In 2017, a CIDOC working group proposed interpreting manifestations as a specialization of expressions; for the official publication in 2022, see [RvZumerA22]. This change was evidently not adopted in the current version of LRMoo 0.9.4 (2024), and thus not in CIDOC CRM 7.1.3 (2024) either, and rightly so.
The following comparison attempts to show what far-reaching consequences the subclass relationship from 2022 would have had. We assume that we live in an RDF(S) and OWL world, and that corresponding ontologies are also formulated in RDF(S) and/or OWL. This is particularly relevant for the interpretation of the subclass relationship.
Community 1: FRBR
In a first community, there are above all the two disjoint sets Expression (frbr:E, in the following E) and Manifestation (frbr:M, in the following M). Technically, there is also a set Thing with ID, or TWID for short, which includes all things in the world that could have a unique ID (i.e. pretty much everything).
It is common practice to describe elements x from the set E in particular by their language (DE, EN), and elements x from the set M in particular by their format (html, pdf). Say there is also a consensus that a length can be specified for elements of the sets E and M: for an x from the set E, let its length be the number of words; for an x from the set M, let its length be the number of pages. (A computer scientist will recognize that the function length is overloaded here, much as Python's built-in len means different things for different types.)
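The overloading of length can be sketched in Python as follows. This is only an illustration under assumptions: the classes `Expression` and `Manifestation`, their fields, and the sample values are hypothetical stand-ins for frbr:E and frbr:M, and `functools.singledispatch` is just one of several ways to dispatch on the type of an argument.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Expression:
    """Hypothetical stand-in for frbr:E: an abstract sequence of words."""
    text: str
    language: str


@dataclass
class Manifestation:
    """Hypothetical stand-in for frbr:M: a publication in a given format."""
    pages: int
    format: str


@singledispatch
def length(x) -> int:
    # No rule for arbitrary things: a bare TWID has no defined length.
    raise TypeError(f"length is not defined for {type(x).__name__}")


@length.register
def _(x: Expression) -> int:
    # For an Expression, length means the number of words.
    return len(x.text.split())


@length.register
def _(x: Manifestation) -> int:
    # For a Manifestation, length means the number of pages.
    return x.pages


xe = Expression(text="ein kurzer Eintrag", language="DE")
xm = Manifestation(pages=277, format="pdf")
print(length(xe), length(xm))  # 3 277
```

The point of the sketch is that the meaning of length is selected by the type of the argument; without knowing the type, the bare number cannot be interpreted.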
Language and format are characteristic attributes: if we know of an element x from the set TWID that its language is DE (resp. its format is pdf), then we can conclude that x is an element of the set E (resp. the set M).
Length, on the other hand, is not a characteristic attribute: as long as it is not known whether an unknown x is an element of the set E or of the set M, it is not possible to determine for a given length (e.g. 277) whether it is the number of words of an E (e.g. an encyclopedia entry) or the number of pages of an M (e.g. a pdf document).
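The difference between characteristic and non-characteristic attributes can be sketched as a small classification routine. The function names `classify` and `length_unit` and the dictionary-based descriptions are assumptions made for this illustration only; they are not part of FRBR or of any library.

```python
def classify(description: dict) -> set:
    """Infer set membership from characteristic attributes (FRBR reading)."""
    classes = {"TWID"}            # everything with an ID is a TWID
    if "language" in description:
        classes.add("E")          # language is characteristic of E
    if "format" in description:
        classes.add("M")          # format is characteristic of M
    if {"E", "M"} <= classes:
        # E and M are disjoint here, so having both attributes is an error.
        raise ValueError("E and M are disjoint in the FRBR reading")
    return classes


def length_unit(description: dict):
    """Return 'words', 'pages', or None if length cannot be interpreted."""
    classes = classify(description)
    if "E" in classes:
        return "words"
    if "M" in classes:
        return "pages"
    return None                   # length alone does not reveal the set


print(length_unit({"length": 277}))                    # None
print(length_unit({"length": 277, "language": "DE"}))  # words
print(length_unit({"length": 277, "format": "pdf"}))   # pages
```

A length of 277 becomes interpretable only once a characteristic attribute has placed the element in E or in M.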
E and M are declared as subsets of TWID, where subset is understood in the sense of naive set theory: every x that is an element of the set E is also an element of the set TWID.
Let us assume that we have an xe from the set E in front of us with the length 277. Then, based on the subset declaration, we can also conclude that our xe is not only an E, but also a TWID.
If we query a database with an inference component for all elements of the set TWID, we can expect that our xe is also contained in the result set. If the length of this element xe from the set TWID is also given, we must either submit further queries to our database to decide whether length here is the number of words or pages, or search for a characteristic attribute such as language or format to solve this problem.
The same applies to another element xm from the set M.
Community 2: LRMoo 2017
In another community there are also the two sets Expression (lrmoo:E, in the following E) and Manifestation (lrmoo:M, in the following M), as well as the set TWID.
Here, too, it is common practice to describe E in particular by the language (DE, EN), and M in particular by the format (html, pdf). And here, too, there is a consensus that a length can be specified for elements of the sets E and M: for an x from the set E, let its length be the number of words; for an x from the set M, let its length be the number of pages.
Language and format are also intended as characteristic attributes in LRMoo: if we know of an element x from the set TWID that its language is DE (resp. its format is pdf), then we can conclude that x is an element of the set E (resp. the set M).
Just as in FRBR, E and M are additionally declared as subsets of TWID.
But unlike in FRBR, E and M are not meant to be disjoint sets; instead, M is declared as a subset of E: M rdfs:subClassOf E.
Again we have an xe from the set E with the length 277 in front of us. Then, based on the subset declaration, we can also infer that our xe is also a TWID. If we query a database with an inference component for all elements of the set TWID, we can expect that our xe is also contained in the result set.
But the element xm behaves differently than in FRBR. This element will also appear in a query for all elements of TWID. But according to the subclass logic described above, it will also appear in the result of a query for all elements of the set E, because every thing that is an element of the set M is also an element of the set E due to M rdfs:subClassOf E.
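The effect of the additional subclass declaration can be made visible by materializing the inferred types under both readings. The sketch below is an assumption-laden toy: the names `materialize`, `members`, `FRBR` and `LRMOO_2017` are invented for this illustration, and the only inference rule applied is "if x is a C and C is a subclass of D, then x is a D".

```python
def materialize(asserted, subclass_of):
    """Forward-chain the subclass rule until no new type can be added."""
    inferred = {x: set(classes) for x, classes in asserted.items()}
    changed = True
    while changed:
        changed = False
        for classes in inferred.values():
            for cls in list(classes):
                new = subclass_of.get(cls, set()) - classes
                if new:
                    classes |= new
                    changed = True
    return inferred


def members(cls, inferred):
    """Sorted list of all individuals that carry the type cls."""
    return sorted(x for x, classes in inferred.items() if cls in classes)


ASSERTED = {"xe": {"E"}, "xm": {"M"}}
FRBR = {"E": {"TWID"}, "M": {"TWID"}}             # E and M stay apart
LRMOO_2017 = {"E": {"TWID"}, "M": {"TWID", "E"}}  # adds M rdfs:subClassOf E

print(members("E", materialize(ASSERTED, FRBR)))        # ['xe']
print(members("E", materialize(ASSERTED, LRMOO_2017)))  # ['xe', 'xm']
```

With identical assertions, the answer to "all elements of E" changes as soon as M is declared a subclass of E.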
This has far-reaching consequences in terms of modeling:
(1) It is true that language is still a characteristic attribute of E. But contrary to what we thought, format is no longer a characteristic attribute of M alone: although we can conclude that x is an M because x has a format, the subclass declaration immediately classifies our x as an element of the set E as well. The format information is therefore no longer suitable for distinguishing E from M.
(3) To distinguish E from M, a more complex query would have to be made: “pure” E are all elements of E that are not also elements of the set M. Such a query with negation is usually very expensive in terms of memory and computing time. The real problem, however, is that negation in the RDF(S) and OWL world with its Open World Assumption (OWA) behaves completely differently than in other worlds, in which a Closed World Assumption (CWA) is often tacitly assumed.
(4) Due to the inferencing, there are now elements of the set E that have a format (!), namely all elements of M. In the FRBR conceptualization of E, such a specification would simply be an error.
(5) Above all, the possibility of a type-based differentiation of the length (of an E, of an M) is lost: what is indicated by a length of e.g. 277 if one does not know whether an element x from E is actually a “pure” E or also an M after all?
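The role of negation in points (3) and (5) can be illustrated with a deliberately simplified sketch. All names here (`is_pure_e_cwa`, `is_pure_e_owa`, `length_unit`, `KNOWN`, and the `known_not_types` store of explicit negative statements) are hypothetical; in particular, OWL expresses negative knowledge through constructs such as disjointness or complement classes, not through such a dictionary.

```python
def is_pure_e_cwa(x, known_types):
    """Closed world: what is not stated is false, so E minus M is decidable."""
    types = known_types.get(x, set())
    return "E" in types and "M" not in types


def is_pure_e_owa(x, known_types, known_not_types):
    """Open world: True or False only if entailed, otherwise None (unknown)."""
    types = known_types.get(x, set())
    if "M" in types:
        return False   # x is known to be an M, hence not a "pure" E
    if "E" in types and "M" in known_not_types.get(x, set()):
        return True    # needs an explicit statement that x is not an M
    return None        # the mere absence of "x is an M" proves nothing


def length_unit(x, known_types, known_not_types):
    """Interpret a length only when the open-world facts allow it."""
    if "M" in known_types.get(x, set()):
        return "pages"
    if is_pure_e_owa(x, known_types, known_not_types):
        return "words"
    return None


# Inferred types under M rdfs:subClassOf E.
KNOWN = {"xe": {"E", "TWID"}, "xm": {"M", "E", "TWID"}}

print(is_pure_e_cwa("xe", KNOWN))                 # True
print(is_pure_e_owa("xe", KNOWN, {}))             # None
print(is_pure_e_owa("xe", KNOWN, {"xe": {"M"}}))  # True
print(length_unit("xe", KNOWN, {}))               # None
```

Under a closed-world reading xe counts as a "pure" E; under an open-world reading the same facts leave the question, and with it the meaning of its length, undecided until someone states that xe is not an M.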
Pragmatically, the declaration lrmoo:M rdfs:subClassOf lrmoo:E has a much more far-reaching consequence: lrmoo:M and lrmoo:E are no longer two different things that can be clearly distinguished. It makes little sense to look for characteristic attributes that distinguish them in an either-or fashion, as was possible for frbr:E and frbr:M. Where the FRBR community sees two different things, the CIDOC community essentially sees only one thing, lrmoo:E, which can be further differentiated into a special lrmoo:E, namely lrmoo:M, if required. The actual frbr:M is no longer contained in LRMoo.