Specification: Grammar & Node Tests
NOTE: This is an attempt at a formal grammar for the ExtensibilityFramework, or EF. It is, however, a work in both flux and progress. There are Python/SAX and XSLT implementations for this grammar.
Synopsis: Compact RELAX NG Grammar
default namespace = ""
namespace atom = "http://purl.org/atom/ns#"
start = element atom:atom { XMLAtts, Property* } | Property
Property = SimpleProp | LiteralProp | ComplexLiteralProp
SimpleProp = element * { (XMLAtts, RefAttr) | (XMLAtts, RefAttr?, Property*) }
LiteralProp = element * { (XMLAtts, text) | (XMLAtts, ModeAttr, any) }
ComplexLiteralProp = element * { XMLAtts, NonSpecialAttr+, ModeAttr?, any }
RefAttr = attribute ref { xsd:anyURI }
ModeAttr = attribute mode { text }
XMLAtts = attribute xml:* { text }*
NonSpecialAttr = attribute * - (mode | ref | xml:*) { text }
any = (element * { attribute * { text }*, any } | text)*
There is an example of an Atom XML document that conforms to the .rnc schema above according to James Clark's Jing. Note that xml:base and xml:lang are both supported.
Description
Terminology, for an element event e:-
- e.xmlns is the XML namespace of the element
- e.name is the element's local name
- e.parent is the element's parent
- e.children is an ordered list of the element's child element nodes
- e.URI is a URI constructed by concatenating e.xmlns and e.name
- e.subject is initialized to Null and may be overridden
- e.base is set to the value of @xml:base if present, or e.parent.base, or Null
- e.lang is set to the value of @xml:lang if present, or e.parent.lang, or Null
The attributes @ref, @mode, and @xml:* are treated as special; non-special attributes are all attributes outside that set. (Note that this specification only supports @xml:base and @xml:lang; all other attributes in the xml:* namespace are ignored.)
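The terminology above can be sketched as a small Python record over xml.etree.ElementTree. This is only an illustration, not the Python/SAX implementation mentioned earlier; the class name Event and the helpers split_tag and non_special are hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Any, Optional

XML_NS = "http://www.w3.org/XML/1998/namespace"
SPECIAL = ("ref", "mode")  # @xml:* is handled separately below


def split_tag(tag):
    """Split ElementTree's '{ns}local' naming into (ns, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag


@dataclass
class Event:
    """Hypothetical record for an element event e."""
    node: ET.Element
    parent: Optional["Event"] = None
    subject: Any = None  # Null until a rule overrides it

    @property
    def xmlns(self):
        return split_tag(self.node.tag)[0]

    @property
    def name(self):
        return split_tag(self.node.tag)[1]

    @property
    def uri(self):
        # e.URI: concatenation of e.xmlns and e.name
        return self.xmlns + self.name

    @property
    def children(self):
        # Built fresh on each access; a real parser would likely cache these.
        return [Event(child, self) for child in self.node]

    def _inherited(self, local):
        # Own xml:* attribute if present, else the parent's value, else None.
        value = self.node.get(f"{{{XML_NS}}}{local}")
        if value is not None:
            return value
        return self.parent._inherited(local) if self.parent is not None else None

    @property
    def base(self):
        return self._inherited("base")

    @property
    def lang(self):
        return self._inherited("lang")

    def non_special(self):
        """Attributes other than @ref, @mode, and @xml:*."""
        return {k: v for k, v in self.node.attrib.items()
                if k not in SPECIAL and not k.startswith(f"{{{XML_NS}}}")}
```

Here e.base and e.lang are looked up lazily through the parent chain, which gives the same result as setting them when each element event is received.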
start = element atom:atom { XMLAtts, Property* } | Property
- If e.xmlns is the atom namespace and e.name is 'atom', parse each child element in e.children as a Property element. Otherwise, parse e itself as a Property element.
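A minimal sketch of this dispatch in Python, assuming xml.etree.ElementTree as the tree model; parse_start and the parse_property callback are hypothetical names, and the callback stands in for the Property rule described next.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"


def parse_start(root, parse_property):
    """Apply the start rule; parse_property is supplied by the caller."""
    if root.tag == f"{{{ATOM_NS}}}atom":
        # atom:atom is only a wrapper: each child is a Property.
        for child in root:
            parse_property(child)
    else:
        # Any other root element is itself a Property.
        parse_property(root)


seen = []
doc = ET.fromstring(
    '<atom xmlns="http://purl.org/atom/ns#"><title>Hi</title><id>x</id></atom>'
)
parse_start(doc, lambda el: seen.append(el.tag))
```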
Property = SimpleProp | LiteralProp | ComplexLiteralProp
- If there are no non-special attributes, no @ref, and either an @mode or no elements in e.children, then it is a LiteralProp. Otherwise, if there are non-special attributes, it is a ComplexLiteralProp. Otherwise, it is a SimpleProp.
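The decision above might be written as follows. This is a sketch over xml.etree.ElementTree that follows the prose rule; classify and non_special are hypothetical names.

```python
import xml.etree.ElementTree as ET

XML_PREFIX = "{http://www.w3.org/XML/1998/namespace}"


def non_special(el):
    """Attributes other than @ref, @mode, and @xml:*."""
    return {k: v for k, v in el.attrib.items()
            if k not in ("ref", "mode") and not k.startswith(XML_PREFIX)}


def classify(el):
    """Return the name of the Property production that el falls under."""
    has_non_special = bool(non_special(el))
    has_ref = "ref" in el.attrib
    has_mode = "mode" in el.attrib
    has_children = len(el) > 0  # child elements only; text is not counted

    if not has_non_special and not has_ref and (has_mode or not has_children):
        return "LiteralProp"
    if has_non_special:
        return "ComplexLiteralProp"
    return "SimpleProp"
```

For instance, an element with only text is classified as a LiteralProp, an element with a non-special attribute such as href as a ComplexLiteralProp, and an element with only @ref or only child elements as a SimpleProp.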
SimpleProp = element * { (XMLAtts, RefAttr) | (XMLAtts, RefAttr?, Property*) }
- If there is an @ref attribute present, set e.subject to an entity with a URI that is the value of @ref resolved in the context of e.base. Otherwise, set e.subject to a new unnamed entity. If e.parent.subject is not Null, i.e. is an entity, create a property/edge/arc on e.parent.subject labelled e.URI and with e.subject as a value. If there are elements in e.children, parse each one as a Property.
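A rough Python sketch of the SimpleProp steps follows. The Entity class, the edge tuples, and the parse_property callback are assumptions made for illustration; resolving @ref against e.base is done here with urllib.parse.urljoin, and when no callback is given the sketch simply treats every child as a SimpleProp rather than classifying it first.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


class Entity:
    """Hypothetical graph node; uri is None for an unnamed entity."""
    def __init__(self, uri=None):
        self.uri = uri


def element_uri(el):
    """e.URI: namespace and local name concatenated."""
    if el.tag.startswith("{"):
        ns, local = el.tag[1:].split("}", 1)
        return ns + local
    return el.tag


def simple_prop(el, parent_subject, parent_base, edges, parse_property=None):
    # e.base: own @xml:base if present, else the parent's base.
    base = el.get(XML_BASE, parent_base)
    ref = el.get("ref")
    if ref is not None:
        subject = Entity(urljoin(base, ref) if base else ref)
    else:
        subject = Entity()  # new unnamed entity
    if parent_subject is not None:
        edges.append((parent_subject, element_uri(el), subject))
    # Simplification: without a callback, children are handled as SimpleProps.
    handler = parse_property or simple_prop
    for child in el:
        handler(child, subject, base, edges)
    return subject
```

Each edge is recorded as a (subject, label, value) tuple appended to the caller's list, which keeps the sketch independent of any particular graph library.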
LiteralProp = element * { (XMLAtts, text) | (XMLAtts, ModeAttr, any) }
- If e.parent is Null, return. Otherwise, if there is no @mode and there are no elements in e.children, add a property/edge/arc to e.parent.subject using e.URI as a label, e's text content as the value, and, if present, e.lang as the value's language. Else, if @mode's value is "xml", add a property/edge/arc as above with e's any content as the value and XMLLiteral as the datatype. Otherwise, if there is an @mode, add a property/edge/arc as above, with e's any content as the value, and using a URI that is the concatenation of the atom namespace and the @mode value as a datatype.
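The three LiteralProp cases could be sketched like this. Several things are assumptions of the sketch rather than part of the specification: literals are modelled as (value, datatype, language) tuples, a missing parent is represented by parent_subject being None, the XMLLiteral datatype is a placeholder string, and the any content is re-serialized with ElementTree (which may rewrite namespace prefixes).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
XML_LITERAL = "XMLLiteral"  # placeholder; the text names no datatype URI


def element_uri(el):
    if el.tag.startswith("{"):
        ns, local = el.tag[1:].split("}", 1)
        return ns + local
    return el.tag


def any_content(el):
    """Leading text plus each child serialized together with its tail text."""
    return (el.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in el)


def literal_prop(el, parent_subject, parent_lang, edges):
    if parent_subject is None:  # stands in for "e.parent is Null"
        return
    mode = el.get("mode")
    lang = el.get(XML_LANG, parent_lang)  # e.lang
    if mode is None and len(el) == 0:
        value = (el.text or "", None, lang)  # plain text, language-tagged
    elif mode == "xml":
        value = (any_content(el), XML_LITERAL, None)
    elif mode is not None:
        value = (any_content(el), ATOM_NS + mode, None)
    else:
        return  # child elements without @mode: not a LiteralProp
    edges.append((parent_subject, element_uri(el), value))
```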
ComplexLiteralProp = element * { XMLAtts, NonSpecialAttr+, ModeAttr?, any }
- Set e.subject to a new unnamed entity. If e.parent is not Null, add a property/edge/arc to e.parent.subject using e.URI as the edge label, and e.subject as the value. For each non-special attribute, add a property/edge/arc to e.subject, using as the edge URI the concatenation of the attribute's namespace URI (or e.xmlns, if the attribute has none) and the attribute's local name, with the attribute's value as the string value and, if present, e.lang as the value's language. Add a property/edge/arc to e.subject using "value" as the label, and the element's content as the value. If there are elements in e.children, or @mode's value is "xml", set the datatype to XMLLiteral. Else, if @mode is present, set the datatype URI to a concatenation of the atom namespace and @mode's value. Otherwise, if present, use e.lang as the value's language.
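As a sketch of the ComplexLiteralProp steps, under some assumptions: an unnamed entity is a bare Entity object, literals are (value, datatype, language) tuples, XMLLiteral is a placeholder string, and an attribute without a namespace is taken to fall back to e.xmlns. All function and class names are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"
XML_PREFIX = "{http://www.w3.org/XML/1998/namespace}"
XML_LITERAL = "XMLLiteral"  # placeholder; the text names no datatype URI


class Entity:
    """Hypothetical unnamed graph node."""


def split_tag(tag):
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag


def any_content(el):
    return (el.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in el)


def complex_literal_prop(el, parent_subject, parent_lang, edges):
    subject = Entity()
    ns, local = split_tag(el.tag)
    if parent_subject is not None:
        edges.append((parent_subject, ns + local, subject))
    lang = el.get(XML_PREFIX + "lang", parent_lang)  # e.lang
    for name, text in el.attrib.items():
        if name in ("ref", "mode") or name.startswith(XML_PREFIX):
            continue  # special attributes produce no edge
        attr_ns, attr_local = split_tag(name)
        # Assumed fallback: unqualified attributes use the element namespace.
        edges.append((subject, (attr_ns or ns) + attr_local, (text, None, lang)))
    mode = el.get("mode")
    if len(el) > 0 or mode == "xml":
        value = (any_content(el), XML_LITERAL, None)
    elif mode is not None:
        value = (any_content(el), ATOM_NS + mode, None)
    else:
        value = (any_content(el), None, lang)
    edges.append((subject, "value", value))
    return subject
```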
XPath Node Matches
For use in, e.g., XSLT. Transcribed from Sjoerd Visscher's XSLT for AtomEF.
- LiteralProp: *[count(@mode)=count(@*)][not(*) or @mode]
- ComplexLiteralProp: *[count(@*) > count(@ref) + count(@mode)]
- otherwise, SimpleProp: *
One may also, of course, parse the three in SimpleProp, LiteralProp, ComplexLiteralProp order, but the order above was found to produce the most readable results.
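For readers without an XSLT processor at hand, the three patterns can be mirrored in plain Python over xml.etree.ElementTree, testing them in the order listed. This is a hand translation of the predicates, not the original stylesheet, and xpath_kind is a hypothetical name. ElementTree keeps namespace declarations out of el.attrib, which matches XPath's @* (namespace declarations are not attribute nodes).

```python
import xml.etree.ElementTree as ET


def xpath_kind(el):
    """Evaluate the three node tests in the order given above."""
    n_attrs = len(el.attrib)                 # count(@*)
    n_mode = 1 if "mode" in el.attrib else 0  # count(@mode)
    n_ref = 1 if "ref" in el.attrib else 0    # count(@ref)
    no_child_elements = len(el) == 0          # not(*)

    # *[count(@mode)=count(@*)][not(*) or @mode]
    if n_mode == n_attrs and (no_child_elements or n_mode == 1):
        return "LiteralProp"
    # *[count(@*) > count(@ref) + count(@mode)]
    if n_attrs > n_ref + n_mode:
        return "ComplexLiteralProp"
    # *
    return "SimpleProp"
```

Because @* counts every attribute, an element carrying only xml:lang or xml:base falls into the ComplexLiteralProp branch under these patterns; whether that is intended is not stated here.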