ValOS Paths ('VPaths') identify paths between valospace resources. VRIds (a subset of VPaths) identify valospace resources.
These VRIds are also affiliated with ValOS event logs which define their internal path semantics further.
VPaths are strings with restricted grammar so that they can be embedded into various URI component and list formats without additional encoding.
This document is part of the library workspace @valos/raem but is `NOT SUPPORTED NOR IMPLEMENTED` by it yet in any manner.
VPaths serve two notably different purposes, both as paths and as resource identifiers. A VPath which has a global valospace resource identifier (or 'vgrid') as its first segment is a valospace resource identifier (or *VRId*).
§ Main VPath rules
vpath = "@" vgrid-tail / verbs-tail
vgrid-tail = "$" vgrid "@" [ verbs-tail ]
verbs-tail = verb "@" [ verbs-tail ]
verb = verb-type params
Many valospace resources, so-called *structural sub-resources*, are identified by a fixed path from the global resource. This path is defined by the same verbs that define non-VRId VPaths. Thus while paths and identifiers are superficially different, it is useful to represent them both using the same VPath verb structure.
Both verb and vgrid params can also have context term references to an external lookup of URI prefixes and semantic definitions.
Two VPaths identify the same path and, in case they are VRIds, refer to the same resource iff their URN representations are urn-equivalent. In other words, two VPaths are equivalent if and only if they are lexically equivalent after case normalization of any percent-encoded characters.
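The lexical comparison described above can be sketched in Python as follows. The helper names are hypothetical and not part of @valos/raem. The sketch assumes that percent-encoding case normalization is the only normalization step and that everything else is compared character by character.

```python
import re

# A percent-encoded triplet: "%" followed by two hex digits.
_PCT_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_pct_case(vpath: str) -> str:
    """Upper-case the hex digits of every percent-encoded triplet."""
    return _PCT_TRIPLET.sub(lambda match: match.group(0).upper(), vpath)


def vpaths_equivalent(a: str, b: str) -> bool:
    """Compare two VPaths lexically after percent-encoding case normalization."""
    return normalize_pct_case(a) == normalize_pct_case(b)
```

Under this sketch `@.:my%2fProp@` and `@.:my%2FProp@` are equivalent, while `@.:myProp@` and `@.:myprop@` are not, because only the hex digits of percent-encoded triplets are case-normalized.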
In the general case the actual semantics of a VPath, and specifically of its context-terms, depend on the context in which it is used. VRIds have a fixed context which is established by the vgrid. This has implications for VRId equivalence.
A verb is a one-to-maybe-many relationship between resources. A verb can be as simple as a trivial predicate of a triple or it can represent something as complex as a fully parameterized computation or a function call.
§ Main verb rules
verbs-tail = verb "@" [ verbs-tail ]
verb = verb-type params
verb-type = 1*unencoded
params = context-tail / value-tail
context-tail = "$" [ context-term ] [ ":" param-value [ params ] / context-tail ]
value-tail = ":" param-value [ params ]
context-term = ALPHA *unreserved-nt
param-value = vpath / 1*( unencoded / pct-encoded )
A verb is made up of type and a parameter list. A parameter consists of an optional context-term and an optional value.
Note that while the grammars of verb-type and context-term are still relatively restricted, *param-value* allows both fully unencoded nesting of vpaths and encoding of all Unicode characters in percent-encoded form (as per encodeURIComponent).
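As an illustration of the percent-encoded alternative of *param-value*, the following Python sketch mimics the character set that encodeURIComponent leaves unencoded. The function names are hypothetical. The sketch covers only plain string values; nested vpaths are embedded as-is and are not passed through it.

```python
from urllib.parse import quote, unquote

# urllib's quote() always leaves ALPHA / DIGIT / "-" / "_" / "." / "~" as-is;
# these are the additional characters that encodeURIComponent does not encode.
_EXTRA_SAFE = "!*'()"


def encode_param_value(text: str) -> str:
    """Percent-encode a plain string so that it fits 1*( unencoded / pct-encoded )."""
    return quote(text, safe=_EXTRA_SAFE)


def decode_param_value(value: str) -> str:
    """Reverse encode_param_value (UTF-8 percent-decoding)."""
    return unquote(value)
```

For example, `encode_param_value("a:b$c")` yields `a%3Ab%24c`, so the VPath delimiters ":" and "$" never appear unencoded inside a plain value.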
*verb-type* specifies the relationship category between the segment host resource and sub-resource, a set of inferred triples as well as other possible constraints.
Verb for selecting the resource (typically a ScopeProperty) with the given name and which has the head as its scope.
§ Property selector example
Triple pattern `?s <urn:valos:.:myProp> ?o` matches like:
?o valos:scope ?s
  ; valos:name "myProp"
Mnemonic: '.' is traditional property accessor (ie. ScopeProperty).
Verb for selecting all resources (typically Relations) with the given name and which have the head as their source.
§ Sequence selector example
Triple pattern `?s <urn:valos:*:PERMISSIONS> ?o` matches like:
?o valos:source ?s
  ; valos:name "PERMISSIONS"
Mnemonic: '*' for many things as per regex/glob syntax (Relations are the only things that can have multiple instances with the same name).
Verb for selecting the resource (typically an Entity) with the given name and which has the head as their container.
§ Container selector example
Triple pattern `?s <urn:valos:-:Scripts> ?o` matches like:
?o valos:parent ?s
  ; valos:name "Scripts"
Mnemonic: "-" links a container to its parent.
Verb for selecting the Media with the given name which has the head as their folder.
§ Content selector example
Triple pattern `?s <urn:valos:':foo.vs> ?o` matches like:
?o valos:folder ?s
  ; valos:name "foo.vs"
Mnemonic: "'" for quoted content suggests text data.
Verb that is a synonym for predicate 'rdf:object'.
§ Property selector example
Triple pattern `?s <urn:valos:-> ?o` matches like:
?s rdf:object ?o
Mnemonic: follow line '-' to target.
Verb for selecting named subspaces and ghosts.
§ Language subspace selector example
Triple pattern `?s <urn:valos:.:myProp@_$lang:fi> ?o` matches like:
?_sp valos:scope ?s
  ; valos:name "myProp"
.
?o valos:subspacePrototype* ?_sp
  ; valos:language "fi"
Mnemonic: '_' is underscore is subscript is subspace.
If the verb name context term is an identifier term then the subspace denotes the ghost subspace of the identified resource inside the current resource.
§ Ghost subspace selector example
Triple pattern `?s <urn:valos:_$~u4:ba54> ?o` matches like:
?o valos:ghostHost ?s
  ; valos:ghostPrototype <urn:valos:$~u4:ba54>
Mnemonic: The '_$~' is a 'subspace of ghoStS'.
Verb representing the result of an eager evaluation. When a VPath is bound to a context all nested eager evaluator selectors are resolved depth-first, left to right. The resolution of a selector first evaluates the evaluator operation using the head and term lookups of the original context and then replaces the selector with the result of the evaluation.
The first parameter defines the evaluation operation. If this parameter has a trivial context-term (ie. no context-term or is a simple prefix term in the context term definition) then the operation is a path operation.
If the context-term is non-trivial then the context must have a definition for the operation.
§ Computation selector example
Triple pattern `?s <urn:valos:!$valk:add$number:10:@!:myVal@> ?o` matches like:
?_:0 valos:scope ?s
  ; valos:name "myVal"
  ; valos:value ?myVal
.
FILTER (?o = 10 + ?myVal)
Editorial Note: this section should be greatly improved. The purpose of computation verbs lies more on representing various conversions (as part of dynamic operations such as rest API route mapping) and less on clever SPARQL trickery. The illustration here uses (questionable) SPARQL primarily for consistency.
A verb (and vgrid via its format-term) can be contextual via the context-term's of its params. The context where the verb is used defines the exact meaning of these terms. The meaning for context-terms is recommended to be uniform across domains where possible. A verb is invalid in contexts which don't have a definition for its context-term. This gives different contexts a fine-grained mechanism for defining the vocabularies that are available.
An idiomatic example of such a context is the event log and its JSON-LD context structure, which defines both the URI namespace prefixes and the available semantics.
*params* is a sequence of param-value's, optionally prefixed with "$" and a context-term. The idiomatic param-value is a string. If present a context-term usually denotes a URI prefix in which case the param-value is a URI reference. However contexts are free to provide specific semantics for specific context-terms, such as interpreting them as the value type of the param-value etc.
A VRId is a vpath which has vgrid as its first production (via vgrid-tail).
§ Main vrid rules
vpath = "@" vgrid-tail / verbs-tail
vgrid-tail = "$" vgrid "@" [ verbs-tail ]
vgrid = format-term ":" param-value [ params ]
The vgrid uniquely identifies a *global resource*. If a VRId contains a vgrid and no verbs this global resource is also the *referenced resource* of the VRId itself.
§ Main vgrid rules
vgrid-tail = "$" vgrid "@" [ verbs-tail ]
vgrid = format-term ":" param-value [ params ]
format-term = "~" context-term
params = context-tail / value-tail
context-tail = "$" [ context-term ] [ ":" param-value [ params ] / context-tail ]
value-tail = ":" param-value [ params ]
context-term = ALPHA *unreserved-nt
param-value = vpath / 1*( unencoded / pct-encoded )
The format-term defines the global resource identifier schema as well as often some (or all) characteristics of the resource.
Some vgrid types restrict the param-value further, allowing only "$" in addition to the *unreserved* characters (as specified in the URI specification).
Note: when using base64 encoded values as vgrid param-value, use the url-and-filename-ready base64url characters.
The VRId can be directly used as the NSS part of an 'urn:valos:' prefixed URI.
Each valospace resource is identified by a VRId.
If a resource VRId has only vgrid part but no verbs the resource is called a global resource.
If a resource VRId has verbs then the verbs describe a structural path from the global resource of its initial vgrid part to the resource itself. The resource is called a *structural sub-resource* of that global resource.
Each resource is affiliated with an event log of its global resource.
All direct VRId context-terms are references to this event log JSON-LD context.
The resource identified by a VRId is always affiliated with an event log of its global resource. Because the VRId doesn't contain the locator information of this event log it must be discoverable from the context where the VRId is used.
All context-terms of the VGRId and VRId verb params are references to the event log JSON-LD context (this applies only to immediate but not to nested VPath params).
Global resources can be transferred between event logs. To maintain immutability across these transfers VGRIds must not contain partition or other non-identifying locator information. Similar to URNs, VRIds always rely on external structures and systems for carrying locator information.
Note: uuid v4 (format term `~u4`) is recommended for now, but eventually VGRId generation will be tied to the deterministic event id chain (format term `~cc`). This in turn should be seeded by some ValOS authority.
Two VRIds refer to the same resource iff their URN representations are urn-equivalent (i.e. if the two VRIds are equivalent after the percent-encoding case normalization defined as step 3 of the URN equivalence procedure in RFC 8141 section 3.1).
Maintaining the consistency between this lexical equivalence and the
semantic equivalence of a resource which has been transferred between
event logs without having to dereference VRIds is useful but has
implications.
Rule: When resources are transferred between event logs
the semantics of their context terms and body-parts must remain
equivalent.
A *simple equivalence* is that two simple prefix term definitions resolve to the same URI. An *extended equivalence* is when two extended term definitions in the source and target event logs are equivalent after normalization. These two equivalences are [will be] defined by this document.
More complex equivalences are outside the scope of this document but can be defined by specifications specifying segment types. These equivalences might take details of the particular verb-type into account and/or specify context definition additions which do not change the equivalence semantics.
In VRId context the verbs-tail that follows the VGRId specifies a structural path from the global resource to a *structural sub-resource* of the global resource. The triple constraints of each verb in that path are _inferred as triples_ for the particular resource that that verb affects.
Principle: a structural sub-resource using a particular
verbs-tail in its identifying VRId will always infer the triples that
are required to satisfy the same verbs-tail in a query context which
starts from the same global resource.
This fixed triple inference is the meat and bones of the structural sub-resources: they allow for protected, constrained semantics to be expressed in the valospace resources. This allows simplified semantics (eg. properties _cannot_ be renamed so the complex functionality doesn't need to be supported on fabric level), a more principled mechanism for partition crypto behaviours (permission relations are structural sub-resources, which simplifies security analysis but retains valospace convenience) and also a mechanism for expressing non-trivial resources such as hypertwin resources.
The sub-resources can be nested and form a tree with the global resource as the root. Typical verb sub-segments specify the edges in this tree (some verbs only specify the current node resource further without specifying a new edge). The global resource is the host resource for the first verb; the sub-resource of that segment is the host resource of the second verb and so on.
As the VRId identities of the sub-resources are structurally fixed to this tree the coupling between host and sub-resource must be static. The typical implementation for this is an ownership coupling.
VGRId context-term specifies the particular identifier format and possible semantics of the identified global resource. ValOS kernel reserves all context-terms matching '"i" 2( ALPHA / DIGIT )' for itself with currently defined formats exhaustively listed here.
An identifier for native valospace resource with an event log. This is insecure as there are no guarantees against resource id collisions by malicious event logs. These identifiers can thus only be used in trusted, protected environments.
An identifier of an immutable octet-stream, with the content hash in the param-value.
An identifier of an immutable, procedurally generated resource with its content inferred from the vpath embedded in the param-value. While of limited use in itself this is useful when used as the prototype of structural ghost sub-resources which are quite mutable.
An identifier of a native, secure valospace resource with an event log. This id is deterministically derived from the most recent hash-chain event log entry of the particular event which created it, the cryptographic secret of the creating identity and a salt, thus ensuring collision resistance and a mechanism for creator to prove their claim to the resource.
This is a legacy format for native ghost resources, with id created from the hash of the 'ghost path' of the resource.
VRId *verb-type* specifies the relationship category between the segment host resource and sub-resource, a set of inferred triples as well as other possible constraints.
Ghost sub-resources are products of ghost instantiation. All the ghosts of the directly _and indirectly_ owned resources of the instance prototype are flattened as _direct_ structural sub-resources of the instance itself. The instance is called *ghost host* of all such ghosts.
§ Structural ghost triple inference
`<urn:valos:$~u4:f00b@_$~u4:ba54>` reads as "inside the instance resource `f00b` the ghost of the $~u4 resource `ba54`" and infers triples:
<urn:valos:$~u4:f00b@_$~u4:ba54>
    valos:ghostHost <urn:valos:$~u4:f00b>
  ; valos:ghostPrototype <urn:valos:$~u4:ba54>
In case of deeper instantiation chains the outermost ghost segment provides inferences recursively to all of its sub-resources; nested ghost segments won't provide any further inferences.
§ Recursive ghost triple inference
`<urn:valos:$~u4:f00b@_$~u4:ba54@_$~u4:b7e4>` reads as "inside the instance resource `f00b` the ghost of `<urn:valos:$~u4:ba54@_$~u4:b7e4>`" and infers triples:
<urn:valos:$~u4:f00b@_$~u4:ba54@_$~u4:b7e4>
    valos:ghostHost <urn:valos:$~u4:f00b>
  ; valos:ghostPrototype <urn:valos:$~u4:ba54@_$~u4:b7e4>
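The two ghost examples above follow the same pattern: the outermost ghost segment splits the identifier into a ghost host and a ghost prototype. A minimal Python sketch of that split follows. `infer_ghost_triples` is a hypothetical helper that operates on the URN NSS exactly as written in the examples. It assumes that the first `@_$~` occurrence is the outermost ghost segment and that no nested vpath params precede it.

```python
from typing import Optional, Tuple

URN_PREFIX = "urn:valos:"


def infer_ghost_triples(nss: str) -> Optional[Tuple[str, str]]:
    """Return (ghostHost, ghostPrototype) URNs, or None if nss has no ghost segment."""
    # Split at the outermost (first) ghost subspace selector "@_$~".
    host, marker, rest = nss.partition("@_$~")
    if not marker:
        return None
    # Everything after the selector, with its "$~" restored, is the prototype VRId.
    return (URN_PREFIX + host, URN_PREFIX + "$~" + rest)
```

For `$~u4:f00b@_$~u4:ba54@_$~u4:b7e4` this returns the host `urn:valos:$~u4:f00b` and the prototype `urn:valos:$~u4:ba54@_$~u4:b7e4`. The nested ghost segment stays part of the prototype and contributes no inference of its own.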
Selects a variant resource value for a base resource within a structurally identified subspace. The variant resource provides inferred `subspacePrototype` fallbacks to an *inner* subspace and eventually to the non-variant base resource as well as to the homologous sub-resource of the host resource inheritancePrototype.
This means that no matter where a subspace variant is defined in the prototype chain or in the nested sub-structure its value will be found.
§ Structural subspace triple inference
`<urn:valos:$~u4:f00b@.:myProp@_$lang:fi>` is a lang fi variant of f00b myProp and infers triples:
<urn:valos:$~u4:f00b@.:myProp@_$lang:fi>
    a valos:ScopeProperty
  ; valos:subspacePrototype <urn:valos:$~u4:f00b@.:myProp>
      , <urn:valos:$~u4:f00b-b507-0763@.:myProp@_$lang:fi>
  ; valos:language "fi"
Subspace selectors can be used to access language variants, statically identified ghost variants within an instance, statically identified Relation's etc.
The verb segment-term can also specify triple inferences for *all* sub-resources in the subspace (not just for the immediate sub-resource of the selector segment).
§ Structural subspace recursive inference
`<urn:valos:$~u4:f00b@_$~u4:b453@_$lang:fi@_$~u4:b74e@.:myProp>` infers triples:
<urn:valos:$~u4:f00b@_$~u4:b453@_$lang:fi@_$~u4:b74e@.:myProp>
    a valos:ScopeProperty
  ; valos:ghostHost <urn:valos:$~u4:f00b>
  ; valos:ghostPrototype <urn:valos:$~u4:b453@_$lang:fi@_$~u4:b74e@.:myProp>
  ; valos:subspacePrototype <urn:valos:$~u4:f00b@_$~u4:b453@_$~u4:b74e@_$lang:fi@.:myProp>
  ; valos:language "fi"
Structural properties infer a type, fixed owner and name.
§ Structural scope property triple inference
`<urn:valos:$~u4:f00b@.:myProp>` is a resource with fixed name "myProp", dominant type ScopeProperty, $~u4 resource f00b as the owning scope and a structurally homologous prototype inside f00b-b507-0763 and thus infers triples:
<urn:valos:$~u4:f00b@.:myProp>
    a valos:ScopeProperty
  ; valos:scope <urn:valos:$~u4:f00b>
  ; valos:inheritancePrototype <urn:valos:$~u4:f00b-b507-0763@.:myProp>
  ; valos:name "myProp"
The verbs `.O.`, `.O*`, and `.O'` denote the properties `valos:value`, `valos:target`, and `valos:content` respectively. These are the primary rdf:object sub-properties of ScopeProperty, Entity and Media, respectively (the letter 'O' in the verbs stands for rdf:object). When given as a parameter to a primary resource they modify it with a fixed rdf:object triple. In addition `.S*` and `.O*` denote `valos:source` and `valos:target`, which are the rdf:subject and rdf:object properties of a Relation.
§ Structural rdf:object triple inference
`<urn:valos:$~u4:f00b@*:PERMISSIONS:@.*$~ih:8766>` is a PERMISSIONS relation with fixed ~ih target 8766 and infers triples:
<urn:valos:$~u4:f00b@*:PERMISSIONS:@.*$~ih:8766>
    a valos:Relation
  ; valos:connectedSource <urn:valos:$~u4:f00b>
  ; valos:prototype <urn:valos:$~u4:f00b-b507-0763@*:PERMISSIONS:@.*$~ih:8766>
  ; valos:name "PERMISSIONS"
  ; valos:target <urn:valos:$~u4:8766-src>
Mnemonic: these verbs are read right-to-left, eg. `.O*` -> 'Relation rdf:object property is valos:target'
Structural relations infer a type, fixed owner (connector), name and possibly source and target.
§ Structural relation triple inference
`<urn:valos:$~u4:f00b@*:PERMISSIONS@_:1>` is a resource with fixed name "PERMISSIONS", dominant type Relation, ~u4 f00b as the source, a structurally homologous prototype inside f00b-b507-0763 and thus infers triples:
<urn:valos:$~u4:f00b@*:PERMISSIONS>
    a valos:Relation
  ; valos:connectedSource <urn:valos:$~u4:f00b>
  ; valos:inheritancePrototype <urn:valos:$~u4:f00b-b507-0763@*:PERMISSIONS>
  ; valos:name "PERMISSIONS"
<urn:valos:$~u4:f00b@*:PERMISSIONS@_:1>
    a valos:Relation
  ; valos:subspacePrototype <urn:valos:$~u4:f00b@*:PERMISSIONS>
      , <urn:valos:$~u4:f00b-b507-0763@*:PERMISSIONS@_:1>
Structural entities infer a type, fixed owner (parent) and name.
§ Structural Entity triple inference
`<urn:valos:$~u4:f00b@-:Scripts>` has a fixed name "Scripts", dominant type Entity, $~u4 resource f00b as the owning container and a structurally homologous prototype inside f00b-b507-0763 and thus infers triples:
<urn:valos:$~u4:f00b@-:Scripts>
    a valos:Entity
  ; valos:parent <urn:valos:$~u4:f00b>
  ; valos:inheritancePrototype <urn:valos:$~u4:f00b-b507-0763@-:Scripts>
  ; valos:name "Scripts"
Structural medias infer a type, fixed owner (folder) and name.
§ Structural Media triple inference
`<urn:valos:$~u4:f00b@':foo.vs>` has a fixed name "foo.vs", dominant type Media, $~u4 resource f00b as the owning folder and a structurally homologous prototype inside f00b-b507-0763 and thus infers triples:
<urn:valos:$~u4:f00b@':foo.vs>
    a valos:Media
  ; valos:folder <urn:valos:$~u4:f00b>
  ; valos:inheritancePrototype <urn:valos:$~u4:f00b-b507-0763@':foo.vs>
  ; valos:name "foo.vs"
The VPath grammar is an LL(1) grammar. It is recursive as param-value productions can contain nested vpaths without additional encoding.
The list of definitive rules:
vpath = "@" vgrid-tail / verbs-tail
vgrid-tail = "$" vgrid "@" [ verbs-tail ]
vgrid = format-term ":" param-value [ params ]
format-term = "~" context-term
verbs-tail = verb "@" [ verbs-tail ]
verb = verb-type params
verb-type = 1*unencoded
params = context-tail / value-tail
context-tail = "$" [ context-term ] [ ":" param-value [ params ] / context-tail ]
value-tail = ":" param-value [ params ]
context-term = ALPHA *unreserved-nt
param-value = vpath / 1*( unencoded / pct-encoded )
unencoded = unreserved / "!" / "*" / "'" / "(" / ")"
unreserved = unreserved-nt / "~"
unreserved-nt = ALPHA / DIGIT / "-" / "_" / "."
pct-encoded = "%" HEXDIG HEXDIG
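To illustrate how the definitive rules can be parsed with one character of lookahead, here is a minimal recursive-descent recognizer sketched in Python. It is not the @valos/raem implementation and all names in it are hypothetical. It assumes that the 'vpath' rule is meant as `"@" ( vgrid-tail / verbs-tail )`, which is how the 'vrid' and 'verbs' pseudo-rules read it, and it only checks syntax without building a tree.

```python
import string

ALPHA = set(string.ascii_letters)
UNRESERVED_NT = ALPHA | set(string.digits) | set("-_.")
UNENCODED = UNRESERVED_NT | set("~!*'()")
HEXDIG = set(string.hexdigits)


class VPathSyntaxError(ValueError):
    """Raised when the input does not match the grammar."""


def _peek(s: str, i: int) -> str:
    return s[i] if i < len(s) else ""  # "" stands for end of input


def _expect(s: str, i: int, ch: str) -> int:
    if _peek(s, i) != ch:
        raise VPathSyntaxError(f"expected {ch!r} at offset {i}")
    return i + 1


def _context_term(s: str, i: int) -> int:
    if _peek(s, i) not in ALPHA:
        raise VPathSyntaxError(f"expected context-term at offset {i}")
    i += 1
    while _peek(s, i) in UNRESERVED_NT:
        i += 1
    return i


def _param_value(s: str, i: int) -> int:
    if _peek(s, i) == "@":  # nested vpath, embedded without extra encoding
        return _vpath(s, i)
    start = i
    while True:
        if _peek(s, i) in UNENCODED:
            i += 1
        elif _peek(s, i) == "%" and len(s) >= i + 3 and set(s[i + 1:i + 3]) <= HEXDIG:
            i += 3
        else:
            break
    if i == start:
        raise VPathSyntaxError(f"expected param-value at offset {i}")
    return i


def _params(s: str, i: int) -> int:
    if _peek(s, i) == "$":  # context-tail
        i += 1
        if _peek(s, i) in ALPHA:
            i = _context_term(s, i)
        if _peek(s, i) == "$":  # another context-tail follows directly
            return _params(s, i)
    elif _peek(s, i) != ":":
        raise VPathSyntaxError(f"expected params at offset {i}")
    if _peek(s, i) == ":":  # ":" param-value [ params ]
        i = _param_value(s, i + 1)
        if _peek(s, i) in ("$", ":"):
            i = _params(s, i)
    return i


def _verbs_tail(s: str, i: int) -> int:
    while True:
        start = i
        while _peek(s, i) in UNENCODED:  # verb-type = 1*unencoded
            i += 1
        if i == start:
            raise VPathSyntaxError(f"expected verb-type at offset {i}")
        i = _expect(s, _params(s, i), "@")
        if _peek(s, i) not in UNENCODED:  # no further verb follows
            return i


def _vpath(s: str, i: int) -> int:
    i = _expect(s, i, "@")
    if _peek(s, i) != "$":
        return _verbs_tail(s, i)
    i = _expect(s, i + 1, "~")  # vgrid-tail: "$" "~" context-term ":" param-value
    i = _expect(s, _context_term(s, i), ":")
    i = _param_value(s, i)
    if _peek(s, i) in ("$", ":"):
        i = _params(s, i)
    i = _expect(s, i, "@")
    return _verbs_tail(s, i) if _peek(s, i) in UNENCODED else i


def is_valid_vpath(s: str) -> bool:
    """True if the whole string matches the 'vpath' rule as read by this sketch."""
    try:
        return _vpath(s, 0) == len(s)
    except VPathSyntaxError:
        return False
```

With this sketch `is_valid_vpath("@$~u4:f00b@.:myProp@")` and `is_valid_vpath("@!$valk:add$number:10:@!:myVal@@")` hold, while `@$~u4:f00b` fails for lack of the terminating "@". Because 'context-term' is read literally as starting with ALPHA, the sketch also rejects verb params that carry a "~"-prefixed term.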
In addition there are pseudo-rules which are not used by an LL(1) parser but which have well-defined meaning and can thus be referred to from other documents.
The list of informative pseudo-rules:
vrid = "@" "$" vgrid "@" [ verbs-tail ]
verbs = "@" verbs-tail
vparam = [ "$" [ context-term ] ] [ ":" param-value ]
context-term-ns = ALPHA 0*30unreserved-nt ( ALPHA / DIGIT )
There are a couple of notes that are not explicitly expressed by the grammar itself. These notes primarily relate to LL(1)-parseability:
Pseudo-rule 'vparam': this class contains all 'context-tail' and 'value-tail' expansions while excluding their '[ params ]' and 'context-tail' right recursive expansions.
Also note how the 'params' rule is right recursive. This is to ensure that the string "$foo:bar" will be properly LL(1)-parsed as a singular 'context-tail' with 'param-value', instead of a 'context-tail' (without 'param-value') that is followed by 'value-tail'. To represent a 'context-tail' (without 'param-value') that is followed by a 'value-tail' an empty context must be added: "$foo$:bar". To represent an empty param an empty "$" can be inserted: "$$foo:bar". As a consequence, if the param that follows an empty param has no context it must also be prepended with "$" like so: "$$:bar".
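The separator rules above can be made concrete with a small serializer sketch in Python. `format_params` and the `VParam` tuple shape are hypothetical. The sketch assumes each param is a pair of an optional context-term and an optional, already encoded param-value, matching the 'vparam' pseudo-rule.

```python
from typing import Iterable, Optional, Tuple

# (context-term, param-value); either part may be absent.
VParam = Tuple[Optional[str], Optional[str]]


def format_params(vparams: Iterable[VParam]) -> str:
    """Serialize vparams into a 'params' string with unambiguous separators."""
    parts = []
    previous_has_value = True  # nothing precedes the first param
    for context_term, value in vparams:
        text = ""
        # A "$" is needed when there is a context-term, when the param is empty,
        # or when the previous param ended without a value (else ":" would bind to it).
        if context_term is not None or value is None or not previous_has_value:
            text += "$" + (context_term or "")
        if value is not None:
            text += ":" + value
        parts.append(text)
        previous_has_value = value is not None
    return "".join(parts)
```

For instance `format_params([("foo", None), (None, "bar")])` gives `$foo$:bar`, whereas the single param `("foo", "bar")` gives `$foo:bar`, which keeps the two cases distinguishable for an LL(1) parser.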
Pseudo-rule 'context-term-ns': this class contains all 'context-term'
expansions which match this more restrictive specification (max 32
chars, special chars only in the middle). All 'context-term's which
are plain namespace prefixes should be restricted to this rule as
this is the prefix grammar of some relevant prefix context.
Editorial Note: which context was this again? Neither
SPARQL, Turtle nor JSON-LD have this limitation.
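The 'context-term-ns' pseudo-rule translates directly into a regular expression. The following Python sketch uses a hypothetical helper name and encodes only what the pseudo-rule states: an ALPHA, up to 30 unreserved-nt characters, and a final ALPHA or DIGIT.

```python
import re

# context-term-ns = ALPHA 0*30unreserved-nt ( ALPHA / DIGIT )
_CONTEXT_TERM_NS = re.compile(r"[A-Za-z][A-Za-z0-9\-_.]{0,30}[A-Za-z0-9]")


def is_context_term_ns(term: str) -> bool:
    """True if term is a context-term usable as a plain namespace prefix."""
    return _CONTEXT_TERM_NS.fullmatch(term) is not None
```

Note that, read this way, the rule needs at least two characters, so a single-letter term matches 'context-term' but not 'context-term-ns'.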
The nesting hierarchy can be quickly established by hand by first splitting a valid vpath string by the delimiter regex /([@$:])/ (retaining these delimiters in the result). A tree structure is then formed by traversing the array from left to right and dividing it into different nesting depths. The nesting depth is increased for the initial "@" and for each "@" that is preceded by a ":" (this corresponds to the 'vpath' production prefix of some 'param-value' production). The nesting depth is reduced for each other "@" that is succeeded by a "$", ":", "@" or EOF (this corresponds to the terminator of the last 'vgrid' or 'verb' production of some 'vpath' production). All remaining "@" correspond to non-final 'vgrid' or 'verb' production terminators of some 'vpath' rule production and thus don't change the nesting depth.
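The splitting procedure above can be sketched in Python as follows. `nest_vpath` is a hypothetical helper. It assumes the input is already a valid vpath, reads the delimiter pattern as a character class matching any single "@", "$" or ":", and lets the opening rule win when an "@" is both preceded by ":" and succeeded by "$".

```python
import re


def nest_vpath(vpath: str) -> list:
    """Split a valid vpath into tokens and nest each embedded vpath as a sub-list."""
    # Split on the delimiters while retaining them; drop the empty strings
    # that re.split produces between adjacent delimiters.
    tokens = [token for token in re.split(r"([@$:])", vpath) if token]
    root: list = []
    stack = [root]
    for index, token in enumerate(tokens):
        previous = tokens[index - 1] if index > 0 else None
        following = tokens[index + 1] if index + 1 < len(tokens) else None
        if token == "@" and (previous is None or previous == ":"):
            # Initial "@" or "@" right after ":" opens a new (nested) vpath.
            child = ["@"]
            stack[-1].append(child)
            stack.append(child)
        elif token == "@" and following in (None, "$", ":", "@"):
            # Terminator of the last vgrid or verb of the current vpath.
            stack[-1].append("@")
            stack.pop()
        else:
            # Plain token, or a non-final "@" that keeps the depth unchanged.
            stack[-1].append(token)
    return root[0]
```

Applied to `@!$valk:add$number:10:@!:myVal@@`, the result is a flat token list for the outer vpath that contains the list `["@", "!", ":", "myVal", "@"]` as the nested value of its last param.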
This section contains considerations on the choice of character set and on where and how VPaths need or don't need to be encoded. There's a historical emphasis on the decision of which characters to use as delimiters (ie. "@", ":" and "$").
If a character is a delimiter in some context within a VPath then this character must always be encoded when not used as a delimiter.
Characters which encodeURIComponent does not encode are ruled out as structural delimiters. This leaves "?" | "#" and "/" | ":" | "@" and "$" | "+" | ";" | "," | "=" | "&" as candidates.
In general VPaths don't require encoding in contexts where the VPath
delimiters "@" / ":" / "$" and the encodeURIComponent result character
set ALPHA / DIGIT / "-" / "_" / "." / "~" / "!" / "*" / "'" / "(" / ")"
can be used.
Editorial Note: "(" and ")" can in principle be
substantially inconvenient in many contexts. But as they're grouped
with "!" / "*" / "'" which have their uses in verb-type's all
five are for now retained as allowed characters.
VPaths can be used as-is in URI path parts (except as segment-nz-nc, see below). This rules out "?", "#", "/" from structural delimiters.
Rules out "," | ";" from structural delimiters
VPath can be, and is intended to be, used as-is in the query part (even as the right-hand side value of "=") as long as the URI consumer or possible middlewares don't perform x-www-form-urlencoded (or other) decoding of the key-value pairs before VPath expansion.
Rules out "=" , "&" from structural characters.
Note: This is completely regular. If the consumer is
known to explicitly decode query values and because VPaths can
contain "%" characters they must be appropriately symmetrically
encoded. This can result in double encoding. However as the intent is that VPath
expansion should be considered to be part of the URI parsing and separation itself any separate encoding and decoding should not be needed.
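The double encoding mentioned in the note can be illustrated with Python's standard query string handling. This is a sketch under the assumption that the consumer decodes query values once, as `urllib.parse.parse_qs` does; the function name and the `path` query key are hypothetical.

```python
from urllib.parse import parse_qs, quote


def encode_vpath_for_decoding_consumer(vpath: str) -> str:
    """Percent-encode a whole VPath so that one decoding pass restores it exactly."""
    # safe="" also encodes "%", so existing triplets such as "%2F" become "%252F".
    return quote(vpath, safe="")


vpath = "@.:my%2Fprop@"
encoded = encode_vpath_for_decoding_consumer(vpath)

# A decoding consumer recovers the original VPath from the encoded form...
restored = parse_qs("path=" + encoded)["path"][0]

# ...but corrupts a VPath that was placed in the query as-is.
corrupted = parse_qs("path=" + vpath)["path"][0]
```

Here `restored` equals the original `@.:my%2Fprop@`, whereas `corrupted` is `@.:my/prop@`: the single decoding pass has turned the encoded "/" of the param-value into a literal one.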
Doesn't rule out any delimiter options not yet ruled out.
Doesn't rule out any delimiter options not yet ruled out. Specifically this does not rule out ":" as that is allowed in NSS sub-parts.
Covered by URI query and fragment sections.
URI's in general need to be quoted here and VPath is URI-like. This retains "@" as an allowed delimiter.
This retains ":" as an allowed delimiter which segment-nz-nc would otherwise prevent.
Encoded and serialized as per https://url.spec.whatwg.org/#urlencoded-serializing
RFC 2141 reserves "~" but encodeURIComponent doesn't encode it. To maintain direct drop-in 2141 compatibility would require disallowing "~" from the character set. This in turn would complicate specific javascript domain implementations as they would have to encode "~" separately without being able to solely rely on encodeURIComponent.
As this concern is not likely to be a problem in practice anyway, we choose to refer to RFC 8141 for URNs, which removes "~" from the set of reserved characters. This resolves this (relatively theoretical) issue.