Ontologies and schemas
Tier 0
Raw data collections (“tier 0”) in the production system do not adhere to a fixed schema or ontology. Instead, their schema stays very close to the raw data. Modifications to field names tend to be quite basic, such as lowercasing and replacing whitespace with a single underscore.
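As a rough illustration of the kind of basic field-name modification described above, here is a minimal sketch. The helper name normalise_field_name is hypothetical and not part of the production system, and stripping leading and trailing whitespace is an assumption made for the example.

```python
import re


def normalise_field_name(raw_name: str) -> str:
    """Hypothetical helper: lowercase a raw field name and replace
    each run of whitespace with a single underscore."""
    # Assumption: leading/trailing whitespace is dropped rather than
    # converted to underscores.
    return re.sub(r"\s+", "_", raw_name.strip().lower())


print(normalise_field_name("Project  Start Date"))  # project_start_date
```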
Tier 1
Processed data (“tier 1”) is intended for public consumption, using a common ontology. The convention we use is as follows:
- Field names are composed of up to three terms: a firstName, a middleName and a lastName.
- Each term (e.g. firstName) is written in lowerCamelCase.
- firstName terms correspond to a restricted set of basic quantities.
- middleName terms correspond to a restricted set of modifiers (e.g. adjectives) which add nuance to the firstName term. Note, the special middleName term “of” is reserved as the default value in case no middleName is specified.
- lastName terms correspond to a restricted set of entity types.
Valid examples are date_start_project and title_of_project, where the terms are joined by underscores.
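The naming convention can be sketched as a small helper that joins the terms and falls back to the default middleName. This is only an illustration: compose_field_name is a hypothetical function, and the three vocabularies below are placeholders built from the two examples above, since the real restricted sets are not listed in this section.

```python
from typing import Optional

# Illustrative vocabularies only, taken from the two examples above.
# The real restricted sets are not listed in this section.
FIRST_NAMES = {"date", "title"}
MIDDLE_NAMES = {"of", "start"}
LAST_NAMES = {"project"}

# The special middleName reserved as the default value.
DEFAULT_MIDDLE_NAME = "of"


def compose_field_name(first_name: str, last_name: str,
                       middle_name: Optional[str] = None) -> str:
    """Hypothetical helper: build a tier 1 field name from its terms."""
    middle = DEFAULT_MIDDLE_NAME if middle_name is None else middle_name
    for term, allowed in ((first_name, FIRST_NAMES),
                          (middle, MIDDLE_NAMES),
                          (last_name, LAST_NAMES)):
        if term not in allowed:
            raise ValueError(f"Unrecognised term: {term!r}")
    return "_".join((first_name, middle, last_name))


print(compose_field_name("title", "project"))                      # title_of_project
print(compose_field_name("date", "project", middle_name="start"))  # date_start_project
```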
Tier 0 fields are implicitly excluded from tier 1 if they are missing from the schema_transformation file. Tier 1 schema field names are applied via nesta.packages.decorator.schema_transform.
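The implicit exclusion can be pictured with the following sketch. It is not the implementation of nesta.packages.decorator.schema_transform, and the format of the schema_transformation file is not described here. The sketch simply assumes the file boils down to a mapping from tier 0 field names to tier 1 field names. The function name, the mapping and the example row are all hypothetical.

```python
from typing import Any, Dict


def apply_schema_transformation(row: Dict[str, Any],
                                field_map: Dict[str, str]) -> Dict[str, Any]:
    """Hypothetical sketch: rename mapped fields and drop the rest."""
    # Fields absent from field_map never reach the output, which is
    # what "implicitly excluded" amounts to in this sketch.
    return {field_map[name]: value
            for name, value in row.items()
            if name in field_map}


# Assumed mapping from tier 0 names to tier 1 names (illustrative only).
field_map = {"project_title": "title_of_project",
             "start_date": "date_start_project"}

# Made-up tier 0 row; "internal_id" is not in the mapping.
tier_0_row = {"project_title": "An example project",
              "start_date": "2019-01-01",
              "internal_id": 42}

tier_1_row = apply_schema_transformation(tier_0_row, field_map)
print(tier_1_row)
```

In this sketch the internal_id field is absent from the result because it has no entry in the mapping.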
Tier 2
Although not yet implemented, the tier 2 schema is reserved for future graph ontologies. Don’t expect any changes any time soon!