Resolvers

Resolvers are used to download and parse content referenced from an initial DATALOG-TEXT resource. In general these are datasets that are external to the Datalog source and often in alternate representations.

URI References

Relative URIs are resolved against base URIs as per Uniform Resource Identifier (URI): Generic Syntax RFC3986, using only the basic algorithm in section 5.2. Neither Syntax-Based Normalization nor Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of RFC3986) is performed.
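As an illustration only, reference resolution can be sketched with Python's standard `urllib.parse.urljoin`, which follows the RFC3986 resolution rules. The function name `resolve` is hypothetical, and `urljoin` is an approximation here: a strict processor would need to confirm that it performs no normalization beyond section 5.2.

```python
from urllib.parse import urljoin


def resolve(base_uri: str, reference: str) -> str:
    """Resolve a URI reference against a base URI (hypothetical helper).

    Assumption: urljoin is close enough to RFC3986 section 5.2 for
    illustration; it is not a statement about any real processor.
    """
    return urljoin(base_uri, reference)


# A relative reference is appended to the base path.
print(resolve("https://example.com/datalog/", "data/humans.csv"))
# https://example.com/datalog/data/humans.csv

# An absolute reference ignores the base entirely.
print(resolve("https://example.com/datalog/", "http://example.com/data/humans.csv"))
# http://example.com/data/humans.csv
```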

The base pragma defines the Base URI used to resolve relative URIs per RFC3986 § 5.1.1, “Base URI Embedded in Content”. § 5.1.2, “Base URI from the Encapsulating Entity”, defines how the In-Scope Base URI may come from an encapsulating document, such as a SOAP envelope with an xml:base directive or a MIME multipart document with a Content-Location header. The “Retrieval URI” identified in § 5.1.3, “Base URI from the Retrieval URI”, is the URL from which a particular DATALOG-TEXT resource was retrieved. If none of the above specifies the Base URI, the default Base URI (§ 5.1.4, “Default Base URI”) is used.

In the case that multiple base pragmas exist, each replaces the previous value.
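The precedence described above can be sketched as a simple fallback chain. This is a minimal sketch, not a normative algorithm: the function name, its parameters, and the idea of passing in the list of base pragmas seen so far are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def in_scope_base_uri(
    base_pragmas: Sequence[str],
    encapsulating_base: Optional[str],
    retrieval_uri: Optional[str],
    default_base: str,
) -> str:
    """Pick the Base URI in the order given by RFC3986 section 5.1.

    base_pragmas holds the values of the base pragmas seen so far, in
    document order (an assumption about how a processor tracks them).
    """
    if base_pragmas:
        # Each base pragma replaces the previous value, so the last one wins.
        return base_pragmas[-1]
    for candidate in (encapsulating_base, retrieval_uri):
        if candidate:
            return candidate
    return default_base
```

For example, with two base pragmas in scope the second value is used, and with none the retrieval URI is used when no encapsulating entity supplies a base.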

Content Negotiation

This section applies to URI schemes, such as http and https, that provide content negotiation.

For URIs with the http or https scheme, the resolver MAY use a provided media type to construct an HTTP Accept header (see RFC7231 § 5.3.2 and RFC2616 § 14.1).

In the following example no media type is provided.

.assert human(name: string).
.input human(uri="http://example.com/data/humans").

Because no media type is given, the HTTP request lists the acceptable response types along with language and encoding preferences. Here the client accepts both CSV and TSV representations and indicates a preference for CSV over TSV.

GET /data/humans HTTP/1.1
Host: example.com
Accept: text/csv; q=0.8, text/tab-separated-values; q=0.4
Accept-Language: en-us
Accept-Encoding: gzip, deflate

While Accept-Encoding (RFC7231 § 5.3.4) and Accept-Language (RFC7231 § 5.3.5) are RECOMMENDED, Accept-Charset (RFC7231 § 5.3.3), if present, MAY only take the value UTF-8.
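A resolver's choice of Accept header might be sketched as follows. The function name `accept_header` and the constant `DEFAULT_ACCEPT` are hypothetical; the default value simply reuses the preferences from the request shown above, and the parameter formatting mirrors the `text/csv;header=present` form used elsewhere in this section.

```python
from typing import Mapping, Optional

# Assumed default: the preferences from the example request in this section.
DEFAULT_ACCEPT = "text/csv; q=0.8, text/tab-separated-values; q=0.4"


def accept_header(
    media_type: Optional[str] = None,
    parameters: Optional[Mapping[str, str]] = None,
) -> str:
    """Build an Accept header value (hypothetical helper)."""
    if media_type is None:
        # No media type provided: advertise every supported representation.
        return DEFAULT_ACCEPT
    parts = [media_type]
    for name, value in (parameters or {}).items():
        parts.append(f"{name}={value}")
    return ";".join(parts)


print(accept_header())
# text/csv; q=0.8, text/tab-separated-values; q=0.4
print(accept_header("text/csv", {"header": "present"}))
# text/csv;header=present
```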

Examples

In the following program the input file is identified by an absolute URI and can be fetched directly; the media type and its parameters are also provided.

.assert human(name: string).
.input human(
    uri="http://example.com/data/humans.csv", 
    type="text/csv", 
    header=present).

The following is the resulting HTTP request.

GET /data/humans.csv HTTP/1.1
Host: example.com
Accept: text/csv;header=present

Given a relative URI, as shown below, the resolver requires a base URI to determine the absolute URI to fetch. An explicit base URI may be provided using the base pragma, or by some other means such as processor command-line parameters, environment variables, or the current working directory of the host process (see footnote 1).

.pragma base="https://example.com/datalog/".
.assert human(name: string).
.input human(uri="data/humans.csv", type="text/csv", header=present).

Dataset Media Types

The following media types are documented here; however, only text/csv and text/tab-separated-values MUST be supported by a conforming DATALOG-TEXT processor.

Media Type                 Short    Parameters  Conform  Expectations
text/csv                   csv      header      MUST     See RFC4180
text/tab-separated-values  tsv      N/A         MUST     See IANA-TSV
application/json           json     N/A         N/A      See RFC4627
application/vnd.sqlite3    sqlite3  N/A         N/A      See IANA-SQLITE3
application/vnd.datalog    datalog  N/A         N/A      Used on output only to write DATALOG-TEXT from an in-memory representation

Common Parameters

  • uri – this MUST be interpreted as a URI (RFC3986) and handled as per § URI References above.
  • type – this string value MUST be either the type name and subtype name of a supported media type (e.g. text/csv) or one of the supported media type short form identifiers described in the § Dataset Media Types table.

Common Errors

text/csv Parameters

  • header – one of the following values {present, absent} from RFC4180.
  • columns – a comma-separated list of either positive integers (the first column is column 1) or inclusive range specifiers of the form [min:max], where min and max are optional positive integers. If min is not present it is assumed to be the initial attribute index of 1; if max is not present it is assumed to be the last attribute index $\small j$ in $\small(\alpha_1,\ldots,\alpha_j)$.
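One way to expand a columns value into explicit indices is sketched below. The function name `parse_columns` and its `last` argument (the index used when max is omitted) are assumptions for illustration, and the error handling is only one possible choice.

```python
import re
from typing import List

_INTEGER = re.compile(r"[0-9]+")
_RANGE = re.compile(r"\[([0-9]*):([0-9]*)\]")


def parse_columns(spec: str, last: int) -> List[int]:
    """Expand a columns value such as "[1:2],4" into 1-based indices.

    `last` stands in for the last attribute index, used when a range
    omits its max (hypothetical parameter).
    """
    indices: List[int] = []
    for item in (part.strip() for part in spec.split(",")):
        range_match = _RANGE.fullmatch(item)
        if range_match:
            low = int(range_match.group(1)) if range_match.group(1) else 1
            high = int(range_match.group(2)) if range_match.group(2) else last
            if low < 1 or high < low:
                raise ValueError(f"invalid range: {item!r}")
            indices.extend(range(low, high + 1))  # ranges are inclusive
        elif _INTEGER.fullmatch(item) and int(item) >= 1:
            indices.append(int(item))
        else:
            raise ValueError(f"invalid column specifier: {item!r}")
    return indices


print(parse_columns("[1:2],4", last=4))
# [1, 2, 4]
```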

Errors

Example

.assert human(name: string).
.input human( 
    uri="http://example.com/data/humans.csv", 
    type="text/csv", 
    columns="2",
    header=present).

text/tab-separated-values Parameters

According to the tsv production in IANA-TSV (copied below), the name line MUST be present, and so no header parameter is exposed for this type.

tsv      ::= nameline record+ ;

Example

.assert car(make: string, model: string, year: integer).
.input car( 
    uri="http://example.com/data/cars.tsv", 
    type="text/tab-separated-values", 
    columns="[1:2],4").

application/json Parameters

This section is non-normative.

The JSON data interchange syntax ECMA-JSON is a well-defined and simple interchange format and can be used to represent datasets, given the following structural restrictions.

  1. The JSON root value MUST be an array.
  2. Each member of this array MUST be either:
    1. An array where each value MUST be an atomic value (string, number, boolean).
    2. An object where each member value MUST be an atomic value (string, number, boolean).
  3. All members of this array MUST be homogeneous.
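The restrictions above can be checked with a short validation pass. This is a sketch under stated assumptions: the function name `dataset_shape` is hypothetical, and "homogeneous" is read here as meaning only that all members are arrays or all are objects.

```python
import json

# Atomic JSON values as listed above; JSON null is deliberately excluded.
ATOMIC = (str, int, float, bool)


def dataset_shape(text: str) -> str:
    """Return "array", "object", or "empty" for a valid dataset.

    Raises ValueError when a structural restriction is violated.
    """
    root = json.loads(text)
    if not isinstance(root, list):
        raise ValueError("root value must be an array")
    shapes = set()
    for member in root:
        if isinstance(member, list):
            values, shape = member, "array"
        elif isinstance(member, dict):
            values, shape = list(member.values()), "object"
        else:
            raise ValueError("each member must be an array or an object")
        if not all(isinstance(value, ATOMIC) for value in values):
            raise ValueError("values must be strings, numbers, or booleans")
        shapes.add(shape)
    if len(shapes) > 1:
        raise ValueError("members must be homogeneous")
    return shapes.pop() if shapes else "empty"
```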

Additional, more complex structures MAY be supported by a DATALOG-TEXT processor.

  • columns – this MUST only be used when each member of the JSON array is also an array. See § text/csv Parameters.
  • members – a comma-separated list of member names; this MUST only be used when each member of the JSON array is an object.

Errors

Examples

Given the following, array-shaped, JSON value.

[
  ["ford", "fiesta", "uk", 2010],
  ["ford", "escort", "uk", 2008]
]

The input processing instruction for the relation car specifies columns 1, 2, and 4 to skip the unused third column. By default an array-shaped input will match by position and there MUST be the same number of values in each array as there are attributes in the relation.

.assert car(make: string, model: string, year: integer).
.input car( 
    uri="http://example.com/data/cars.json",
    type="application/json",
    columns="1,2,4").

Given the following, object-shaped, JSON value.

[
  {"make": "ford", "model": "fiesta", "geo": "uk", "year": 2010},
  {"make": "ford", "model": "escort", "geo": "uk", "year": 2008}
]

The input processing instruction for the relation car specifies the members make, model, and year but not the geo value. By default an object-shaped input is matched by label if and only if all attributes in the relation are named; otherwise it is matched by position. If matched by label, the object may have more members than the relation has attributes; if matched by position, the number of members in the object MUST equal the number of attributes in the relation.

.assert car(make: string, model: string, year: integer).
.input car( 
    uri="http://example.com/data/cars.json",
    type="application/json",
    members="make,model,year").
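The effect of the columns and members parameters on the two JSON shapes above can be sketched as a projection into tuples. The function names are hypothetical, and the sketch assumes the parameter values have already been split into a list of indices or names.

```python
from typing import Any, List, Mapping, Sequence, Tuple


def project_arrays(rows: Sequence[Sequence[Any]], columns: Sequence[int]) -> List[Tuple[Any, ...]]:
    """Select 1-based columns from array-shaped members."""
    return [tuple(row[index - 1] for index in columns) for row in rows]


def project_objects(rows: Sequence[Mapping[str, Any]], members: Sequence[str]) -> List[Tuple[Any, ...]]:
    """Select named members from object-shaped members, in the order given."""
    return [tuple(row[name] for name in members) for row in rows]


array_rows = [
    ["ford", "fiesta", "uk", 2010],
    ["ford", "escort", "uk", 2008],
]
object_rows = [
    {"make": "ford", "model": "fiesta", "geo": "uk", "year": 2010},
    {"make": "ford", "model": "escort", "geo": "uk", "year": 2008},
]

# Both projections yield the same facts for the relation car.
print(project_arrays(array_rows, [1, 2, 4]))
# [('ford', 'fiesta', 2010), ('ford', 'escort', 2008)]
print(project_objects(object_rows, ["make", "model", "year"]))
# [('ford', 'fiesta', 2010), ('ford', 'escort', 2008)]
```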

application/vnd.sqlite3 Parameters

This section is non-normative.

  • table – the name of the table in the database to read from, or write to. If missing the processor MAY use the relation name as the table name, or MAY signal the error ERR_IO_INSTRUCTION_PARAMETER.
  • columns – a comma-separated list of column names in the table, these are matched positionally with relation attributes. If missing the processor MAY use the attribute names as column names, or MAY signal the error ERR_IO_INSTRUCTION_PARAMETER.

According to SQLITE3-URI the URI specified for the database location MUST have the scheme file RFC8089.

A conforming DATALOG-TEXT processor MAY choose to validate these parameters at parse time or MAY wait until the actual input/output operation is required. In either case, if the processor determines that the table or columns specified in these parameters do not match the schema of the relation, the processor MUST signal the error ERR_IO_INSTRUCTION_PARAMETER.

Example

In the following the intensional relation mortal is output to a SQLite database.

.infer mortal(name: string).
.output mortal( 
    uri="file:///var/db/results.sql?cache=private&mode=rwc", 
    type="application/vnd.sqlite3",
    table="mortals",
    columns="short_name").
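How a processor might carry out such an output instruction is sketched below with Python's standard `sqlite3` module. The function name `write_relation`, the decision to create the table when it is missing, the identifier check, and the sample fact are all assumptions for illustration; an in-memory database stands in for the file URI above.

```python
import sqlite3
from typing import Any, Iterable, Sequence, Tuple


def write_relation(
    conn: sqlite3.Connection,
    table: str,
    columns: Sequence[str],
    facts: Iterable[Tuple[Any, ...]],
) -> None:
    """Write facts to a table, matching columns to attributes by position."""
    for name in (table, *columns):
        # Identifiers cannot be bound as SQL parameters, so restrict them.
        if not name.isidentifier():
            raise ValueError(f"unsupported identifier: {name!r}")
    column_list = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Assumption: the table is created when absent; a real processor may
    # instead signal ERR_IO_INSTRUCTION_PARAMETER.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_list})')
    conn.executemany(
        f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})',
        facts,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
write_relation(conn, "mortals", ["short_name"], [("socrates",)])  # sample fact
print(conn.execute("SELECT short_name FROM mortals").fetchall())
# [('socrates',)]
```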

Text/Datalog

This section is non-normative.

In either the input or output cases the representation is a list of facts that corresponds to the grammar production assertion, see § Relations & Facts.

 


1. See The Open Group man page for the getcwd() function.