arktype icon indicating copy to clipboard operation
arktype copied to clipboard

Allow string format as metadata associated with JSON schema

Open semohr opened this issue 1 year ago • 4 comments

Issue

When using the string.date.parse type in combination with toJsonSchema I expected the type to be parsed to string, maybe even with the format keyword (see json-schema).

Example

import {type} from "arktype";

const date = type("string.parse.date");
date.in.toJsonSchema(); // <-- Errors

This errors with Uncaught ParseError: Predicate $ark.isParsableDate is not convertible to JSON Schema.

Expected output:

{
"type":string",
 "format": "date"
}

Solution (proposed by @ssalbdivad)

Add format as a metadata key. This would have no effect on validation but would allow custom types like string.date that rely on non-serializable predicates in the type system to be converted to JSON schema.

The format key should be added to the output JSON schema alongside any other constraints. It should be specifically added to the non-serializable conditions it should replace, in this case a predicate for validating whether a string can be parsed as a Date. If multiple format constraints exist on the same IntersectionNode, they must be identical.

semohr avatar Aug 30 '24 14:08 semohr

I had a small look into the source and you could probably just quickly patch it by adding

parsableDate.toJsonSchema = () => ({
"anyOf": [
  {
  "type": "string",
  "format": "date-time",
  },
  {
  "type": "string",
  "format": "date",
  }
]
})

into ark/schema/type/keywords/string/date.

I'm not sure if this is where you want to do that, further you probably want some tests for that.

semohr avatar Aug 30 '24 14:08 semohr

I'll have to think about this a bit since format does not fit into the type system the same way pattern, divisor, etc. do.

If you're okay with a specific date format like ISO8601, you can use something like string.date.iso.parse:

const user = type({
	name: "string",
	birthday: "string.date.iso.parse"
})

const schema = user.in.toJsonSchema()

const result = {
	type: "object",
	properties: {
		birthday: {
			type: "string",
			pattern:
				"^([+-]?\\d{4}(?!\\d{2}\\b))((-?)((0[1-9]|1[0-2])(\\3([12]\\d|0[1-9]|3[01]))?|W([0-4]\\d|5[0-3])(-?[1-7])?|(00[1-9]|0[1-9]\\d|[12]\\d{2}|3([0-5]\\d|6[1-6])))([T]((([01]\\d|2[0-3])((:?)[0-5]\\d)?|24:?00)([.,]\\d+(?!:))?)?(\\17[0-5]\\d([.,]\\d+)?)?([zZ]|([+-])([01]\\d|2[0-3]):?([0-5]\\d)?)?)?)?$"
		},
		name: { type: "string" }
	},
	required: ["birthday", "name"]
}

I think having a way to add "format" as metadata to a string type though would be useful, so I'm going tweak this issue a bit to reflect the broader goal.

ssalbdivad avatar Aug 30 '24 15:08 ssalbdivad

In theory format date or date-time means it complies with the RFC 3339 specification. I'm not sure if the current string.date.parse functionalities fulfill the specification so using the regex might event be the correct choice.

semohr avatar Aug 30 '24 15:08 semohr

Yeah I can likely create some new keywords to specifically align with these standards, but generally these subtypes are stricter the further you chain them. string.date should be the most permissive, so it allows anything that can be parsed via new Date() that doesn't result in an invalid date.

Once we have the capability to associate JSON schema format as metadata, we'll have to ensure all the other format keywords are integrated with the type system as well.

ssalbdivad avatar Aug 30 '24 16:08 ssalbdivad

As the Temporal API starts to roll out (it's available now behind flags in Node, Deno and Bun), it will be nice to have RFC 9557 support:

2025-02-20T20:39:03-05:00[America/New_York]
2025-02-20T20:39:03Z[America/New_York]
2025-02-20T20:39:03-05:00[America/New_York][foo=bar]

jacksteamdev avatar Feb 20 '25 21:02 jacksteamdev