Part 5: Application schema discovery for profiles
Current situation
- Multiple variants share the same media types, application/json and application/geo+json
- Profiles are introduced to flag additional semantics for a variant (constraints, conventions, extensions)
- Multiple profiles can be applied on each individual resource (if compatible with each other)
- Profiles can be negotiated using the profile query-parameter
- Profiles which are applied are communicated at runtime via response links (RFC 6906, header or payload)
The problem
- The OpenAPI Specification is not profile-aware and cannot key schema selection on an RFC 6906 profile
- This forces the use of bundled schemas, using oneOf (or similar), sacrificing the correlation between profile and schema
- As a result, it is not discoverable what the impact of an individual profile is on the response schema
For example:
application/geo+json:
  schema:
    oneOf:
      - $ref: "#/components/schemas/FeatureRfc7946"
      - $ref: "#/components/schemas/FeatureJsonFg"
      - $ref: "#/components/schemas/MyCustomOgcRecord"
A machine cannot automatically determine that, when retrieving a document with profile https://www.opengis.net/def/profile/OGC/0/ogc-record applied, the expected response schema is #/components/schemas/MyCustomOgcRecord.
Suggestion 1: Introduce discovery mechanism for profile schemas
Introduce a standard location where the mapping between profiles and schemas is published (from a global context). The location could be, for example, the landing page or a dedicated "profiles" page.
The mapping could look similar to this:
{
  "profileSchemas": {
    "http://www.opengis.net/def/profile/OGC/0/rfc7946": {
      "application/geo+json": "https://schemas.example.org/v1/FeatureRfc7946.json"
    },
    "http://www.opengis.net/def/profile/OGC/0/jsonfg": {
      "application/geo+json": "https://schemas.example.org/v1/FeatureJsonFg.json"
    },
    "https://www.opengis.net/def/profile/OGC/0/ogc-record": {
      "application/geo+json": "https://schemas.example.org/v1/MyCustomOgcRecord.json"
    }
  }
}
The schema must be a resolvable URL pointing to a schema document, corresponding to the media type. Schema documents can also be documents other than JSON Schema, such as an XSD document for a GML application schema.
The above would enable static validation / codegen without losing determinism.
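To illustrate how a client or code generator might consume such a mapping, here is a minimal sketch. It assumes the mapping document has already been fetched and parsed into a dictionary; the function name resolve_schema_urls and the choice to return a list (because several profiles can be applied to one resource) are illustrative assumptions, not part of any specification.

```python
from typing import Iterable, Mapping

# The "profileSchemas" member of the mapping proposed above:
# profile URI -> media type -> schema URL.
PROFILE_SCHEMAS: Mapping[str, Mapping[str, str]] = {
    "http://www.opengis.net/def/profile/OGC/0/rfc7946": {
        "application/geo+json": "https://schemas.example.org/v1/FeatureRfc7946.json"
    },
    "http://www.opengis.net/def/profile/OGC/0/jsonfg": {
        "application/geo+json": "https://schemas.example.org/v1/FeatureJsonFg.json"
    },
    "https://www.opengis.net/def/profile/OGC/0/ogc-record": {
        "application/geo+json": "https://schemas.example.org/v1/MyCustomOgcRecord.json"
    },
}


def resolve_schema_urls(
    profile_schemas: Mapping[str, Mapping[str, str]],
    profiles: Iterable[str],
    media_type: str,
) -> list[str]:
    """Return the schema URLs published for the applied profiles and media type.

    Hypothetical helper: profiles without an entry for the media type are
    skipped, and duplicate URLs are reported only once.
    """
    urls: list[str] = []
    for profile in profiles:
        url = profile_schemas.get(profile, {}).get(media_type)
        if url is not None and url not in urls:
            urls.append(url)
    return urls
```

With this lookup, a client that sees the ogc-record profile in a response link can resolve the matching schema URL directly instead of trying every branch of a oneOf.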
Suggestion 2: Support retrieving a compiled application schema for a given (set of) profile(s)
In addition to retrieving the logical schema for every collection (by requesting /collections/{collectionId}/schema), also support retrieving application schemas by passing a (set of) profile(s) as a query parameter.
@cportele can you share your current LDproxy implementation as a starting point?
Regarding retrieving the application schema of a collection for validation, we currently support four profiles in ldproxy. The profiles are:
- "validation-returnables-geojson": JSON Schema for validating a GeoJSON feature that is returned as a response to a query
- "validation-receivables-geojson": JSON Schema for validating a GeoJSON feature that is sent to the server as part of a create or update operation
- "validation-returnables-jsonfg": JSON Schema for validating a JSON-FG feature that is returned as a response to a query
- "validation-receivables-jsonfg": JSON Schema for validating a JSON-FG feature that is sent to the server as part of a create or update operation
Here is an example: https://gvd.ldproxy.net/gvd-level-a/collections/ax_gebaeude/schema?f=json&profile=codelists-ref,validation-returnables-geojson
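As a rough sketch of how a client could build such a request, the following helper assembles the schema URL from a base URL, a collection id and a list of profile names. The helper name build_schema_url is hypothetical; the comma-separated profile parameter and the f parameter follow the example URL above.

```python
from urllib.parse import quote, urlencode


def build_schema_url(base_url: str, collection_id: str, profiles: list[str], fmt: str = "json") -> str:
    """Build a /collections/{collectionId}/schema URL with a profile query parameter.

    Hypothetical helper; profiles are joined with commas as in the example above.
    """
    # Keep the comma unescaped so that the profile list stays readable.
    query = urlencode({"f": fmt, "profile": ",".join(profiles)}, safe=",")
    return f"{base_url.rstrip('/')}/collections/{quote(collection_id, safe='')}/schema?{query}"


url = build_schema_url(
    "https://gvd.ldproxy.net/gvd-level-a",
    "ax_gebaeude",
    ["codelists-ref", "validation-returnables-geojson"],
)
```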
For the GeoJSON variant, all properties except those with role id and role primary-geometry are moved to properties. The property with role id is mapped to the id member, the one with role primary-geometry to the geometry member. We also remove everything not relevant for validation (title, description, other x-ogc fields, etc.). For the returnables profiles, all writeOnly properties are removed, too.
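The restructuring described above can be sketched roughly as follows. This is not the ldproxy implementation: it assumes that roles are expressed with an x-ogc-role keyword on each property of the logical schema, it only cleans the top level of each property, and the function names are made up for illustration.

```python
from typing import Any

# Annotations that are not relevant for validation.
STRIP_KEYS = {"title", "description"}


def _strip_annotations(prop: dict[str, Any]) -> dict[str, Any]:
    """Drop title, description and x-ogc-* keywords (shallow, top level only)."""
    return {
        key: value
        for key, value in prop.items()
        if key not in STRIP_KEYS and not key.startswith("x-ogc-")
    }


def to_geojson_returnables_schema(logical: dict[str, Any]) -> dict[str, Any]:
    """Derive a GeoJSON-shaped validation schema from a logical schema (sketch)."""
    members: dict[str, Any] = {}
    feature_properties: dict[str, Any] = {}
    for name, prop in logical.get("properties", {}).items():
        if prop.get("writeOnly"):
            continue  # returnables never contain writeOnly properties
        role = prop.get("x-ogc-role")  # assumed keyword for the property role
        cleaned = _strip_annotations(prop)
        if role == "id":
            members["id"] = cleaned
        elif role == "primary-geometry":
            members["geometry"] = cleaned
        else:
            feature_properties[name] = cleaned
    members["properties"] = {"type": "object", "properties": feature_properties}
    return {"type": "object", "properties": members}
```

A real implementation would also have to handle nested objects and map the geometry property to the corresponding GeoJSON geometry schema; both are left out here to keep the sketch short.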
The profiles for the receivables are only active when CRUD support is enabled for the collection. If a POST or PUT request with Content-Type: application/geo+json is submitted with Prefer: handling=strict, the payload is validated against the schema. If the request also indicates the JSON-FG profile (Link: <http://www.opengis.net/def/profile/OGC/0/jsonfg>; rel=profile), the JSON-FG variant is used, otherwise the GeoJSON variant.
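The selection logic described above could look roughly like the sketch below. It is an illustration under simplifying assumptions, not the ldproxy code: the function names are hypothetical, the Link and Prefer headers are parsed naively (no commas inside quoted parameter values), and the check whether CRUD support is enabled for the collection is passed in as a flag.

```python
import re
from typing import Mapping, Optional

JSONFG_PROFILE = "http://www.opengis.net/def/profile/OGC/0/jsonfg"

_LINK_RE = re.compile(r"<([^>]*)>([^,]*)")
_REL_RE = re.compile(r'\brel\s*=\s*"?([^";]*)"?', re.IGNORECASE)


def _profile_links(link_header: str) -> list[str]:
    """Return the targets of all links with rel=profile (naive parser)."""
    uris = []
    for uri, params in _LINK_RE.findall(link_header):
        match = _REL_RE.search(params)
        if match and "profile" in match.group(1).lower().split():
            uris.append(uri)
    return uris


def _prefers_strict(prefer_header: str) -> bool:
    """Check whether the Prefer header contains handling=strict."""
    for preference in prefer_header.split(","):
        token = preference.split(";")[0].replace(" ", "").lower()
        if token == "handling=strict":
            return True
    return False


def select_receivables_profile(
    method: str, headers: Mapping[str, str], crud_enabled: bool = True
) -> Optional[str]:
    """Return the validation profile to apply, or None if the payload is not validated."""
    h = {name.lower(): value for name, value in headers.items()}
    content_type = h.get("content-type", "").split(";")[0].strip().lower()
    if not crud_enabled or method.upper() not in {"POST", "PUT"}:
        return None
    if content_type != "application/geo+json" or not _prefers_strict(h.get("prefer", "")):
        return None
    if JSONFG_PROFILE in _profile_links(h.get("link", "")):
        return "validation-receivables-jsonfg"
    return "validation-receivables-geojson"
```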
We also add additionalProperties: false, since the returnables cannot contain additional properties (by design) and, for the receivables, we reject unknown properties because they cannot be stored.
PS: In the OpenAPI definition ldproxy currently only represents the GeoJSON schema for application/geo+json, not the JSON-FG extensions.
I did some research and found related discussions in the OpenAPI spec repo and related provisions in the JSON API spec:
- https://github.com/OAI/OpenAPI-Specification/issues/2342
- https://github.com/OAI/OpenAPI-Specification/issues/1342
- JSON API