When filtering unused schemas, schemas that are only self-referenced still show up
We're trying to use this filterFile:
{
  "methods": [],
  "inverseMethods": [],
  "tags": [],
  "inverseTags": [],
  "operationIds": [],
  "inverseOperationIds": [],
  "operations": [],
  "flags": [],
  "inverseFlags": [],
  "flagValues": [],
  "inverseFlagValues": [],
  "unusedComponents": [
    "schemas"
  ],
  "stripFlags": [],
  "responseContent": [],
  "inverseResponseContent": [],
  "requestContent": [],
  "inverseRequestContent": []
}
to filter our OpenAPI specs. It works for most use cases, but schemas like this one, which are not referenced anywhere else in the filtered spec:
components:
  schemas:
    selfRefSchema:
      type: object
      properties:
        children:
          type: array
          items:
            $ref: "#/components/schemas/selfRefSchema"
are still kept in the filtered output.
hi @aneeshnazar
I recreated your example in the playground
If I understand correctly, you also want to remove schemas that only reference themselves? And if they were used somewhere else, they would have to remain?
I'm a bit surprised that a schema would reference itself. Can you share that case, so we can look into building the logic to filter it out?
hi @aneeshnazar
Have you had time to look at my questions, so we can look into this further?
I'm running into this issue as well. For my use case, when removing unused schemas, I want to remove all schemas that do not show up in any of the endpoints under paths.
So even though a schema references itself in schemas, it is still unused by any of the paths, and I am seeing these self-referenced schemas kept although they are unused.
hi @shaun-jacks
By "unused schemas", you mean schemas that are not used in "Paths". The current logic checks whether a schema is unused anywhere in the document, so if a schema is still referenced within the schemas section itself, it is kept for now.
Am I correct that your use case is to have a clean spec with only "used" schemas, that is, schemas that are referenced in Paths one way or another?
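To make the "unused anywhere" behaviour described above concrete, here is a minimal Python sketch. It is not openapi-format's actual code (that is JavaScript); the helper names `iter_refs` and `unused_if_never_referenced` and the extra `orphan` schema are made up for illustration. It only shows why a check based on "is this schema the target of any $ref in the document" never flags a schema that points at itself.

```python
from typing import Any, Iterator

SCHEMA_PREFIX = "#/components/schemas/"


def iter_refs(node: Any) -> Iterator[str]:
    """Yield every $ref string found anywhere under node."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def unused_if_never_referenced(spec: dict) -> set[str]:
    """Return schema names that no $ref in the whole document points to."""
    referenced = {
        ref[len(SCHEMA_PREFIX):]
        for ref in iter_refs(spec)
        if ref.startswith(SCHEMA_PREFIX)
    }
    schemas = spec.get("components", {}).get("schemas", {})
    return set(schemas) - referenced


# Hypothetical document: no paths, one self-referencing schema, one plain orphan.
spec = {
    "paths": {},
    "components": {
        "schemas": {
            "selfRefSchema": {
                "type": "object",
                "properties": {
                    "children": {
                        "type": "array",
                        "items": {"$ref": "#/components/schemas/selfRefSchema"},
                    }
                },
            },
            "orphan": {"type": "string"},
        }
    },
}

unused = unused_if_never_referenced(spec)
print(unused)  # {'orphan'}
```

Only `orphan` is reported as unused. `selfRefSchema` counts as referenced because of its own `$ref`, even though nothing under paths uses it.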
Yeah, I definitely see why it is still kept with the current logic!
Yes, by unused schemas I mean schemas that are not referenced at all within "Paths". My use case is to have a clean spec where the only schemas left are the ones referenced in Paths.
hi @shaun-jacks
The case you explained makes total sense, so I'll try to come up with an extension of the unused logic that also removes nested schemas that are not referenced in paths and have no reason to remain in the OpenAPI document.
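One way to picture "only keep what is referenced from paths, directly or indirectly" is a reachability walk that starts at the `$ref`s under paths and follows `$ref`s between schemas. The Python sketch below is an illustration under assumptions, not the implementation that ended up in openapi-format: the function names, the `Pet`/`Owner` schemas and the `/pets` path are hypothetical, and it only looks at `components.schemas`, ignoring other component types such as responses or parameters.

```python
from collections import deque
from typing import Any, Iterator

SCHEMA_PREFIX = "#/components/schemas/"


def iter_schema_refs(node: Any) -> Iterator[str]:
    """Yield the schema name of every $ref to components.schemas under node."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                if value.startswith(SCHEMA_PREFIX):
                    yield value[len(SCHEMA_PREFIX):]
            else:
                yield from iter_schema_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_schema_refs(item)


def schemas_reachable_from_paths(spec: dict) -> set[str]:
    """Breadth-first walk: start at paths, then follow refs between schemas."""
    schemas = spec.get("components", {}).get("schemas", {})
    queue = deque(iter_schema_refs(spec.get("paths", {})))
    reachable: set[str] = set()
    while queue:
        name = queue.popleft()
        if name in reachable or name not in schemas:
            continue  # already visited (handles cycles) or dangling ref
        reachable.add(name)
        queue.extend(iter_schema_refs(schemas[name]))
    return reachable


def prune_schemas(spec: dict) -> dict:
    """Return a copy of spec that keeps only schemas reachable from paths."""
    keep = schemas_reachable_from_paths(spec)
    components = spec.get("components", {})
    kept = {n: s for n, s in components.get("schemas", {}).items() if n in keep}
    return {**spec, "components": {**components, "schemas": kept}}


self_ref = {"$ref": "#/components/schemas/selfRefSchema"}
spec = {
    "paths": {
        "/pets": {"get": {"responses": {"200": {"content": {"application/json": {
            "schema": {"$ref": "#/components/schemas/Pet"}}}}}}}
    },
    "components": {"schemas": {
        "Pet": {"type": "object",
                "properties": {"owner": {"$ref": "#/components/schemas/Owner"}}},
        "Owner": {"type": "object"},
        "selfRefSchema": {"type": "object",
                          "properties": {"children": {"type": "array", "items": self_ref}}},
    }},
}

pruned = prune_schemas(spec)
print(sorted(pruned["components"]["schemas"]))  # ['Owner', 'Pet']
```

`Pet` is kept because a path uses it, `Owner` is kept because `Pet` uses it, and `selfRefSchema` is dropped because no walk starting from paths ever reaches it. Each schema is visited at most once, so a self-reference or a longer cycle does not cause an endless loop.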
Thank you really appreciate it!
hi @shaun-jacks & @aneeshnazar
I just released a new version of openapi-format (1.25.3), which includes improvements to handle unused components that are not referenced/used in any of the paths. This should result in a more compact OpenAPI document, which keeps only the components that are actually used.
I created a playground to showcase the latest version.
It contains a self-referenced schema, which is now detected and removed.
Let me know if this works for you as expected. If it does, feel free to close this issue. If it did not solve your case, please provide some more details so I can reproduce it.
Hi @thim81 thanks for working on this so quickly! I've tested this out myself.
So far, it does work as expected! However, I am noticing a performance hit when using this on a very large OpenAPI spec.
For example, I run the filter multiple times on a large spec. It used to take about 10 seconds, but it now takes 4-5 minutes.
Would it be possible to add a flag to opt-out of removing self-referenced schemas, or opt-in to removing self-referenced schemas?
Depending on how large the spec is, this would let users choose whether to use it, and opt out of the unused self-referenced schema filtering if it takes a very long time to run.
In summary, it does solve the issue, but the performance took a big hit for my use-cases.
Hi @shaun-jacks
Thanks for sharing this vital feedback.
This is indeed an unacceptable performance degradation. It should take seconds, not minutes.
Can you share how big (in number of lines) the OpenAPI document is before and after the filtering?
I'll work on a new version, where I might have to introduce an option (like you suggested) or change the approach. For this release I added a loop, which seems to be causing the performance impact.
FYI: I'm going to do some performance testing with the Stripe OpenAPI document which is a 5MB YAML file.
Before: 7061.96 ms or 7 sec
Now: 298219.74 ms or 4.97 min
So your case can be reproduced.
I'll try to figure out how to make it fast again.
hi @shaun-jacks
We just released openapi-format v1.25.4, which should resolve the performance degradation.
Performance testing with the Stripe OpenAPI document (5MB YAML file):
1.25.2 version: 7061.96 ms or 7 sec
1.25.3 version: 298219.74 ms or 4.97 min
=> 1.25.4 version: 7325.22 ms or 7.3 sec
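For a quick sense of scale, the three timings above can be compared against the 1.25.2 baseline. This small Python snippet only does arithmetic on the numbers already quoted in this thread; the variable names are arbitrary.

```python
# Timings quoted above for the Stripe OpenAPI document, in milliseconds.
timings_ms = {
    "1.25.2": 7061.96,
    "1.25.3": 298219.74,
    "1.25.4": 7325.22,
}

baseline = timings_ms["1.25.2"]

# Ratio of each version's run time to the 1.25.2 run time.
ratios = {version: ms / baseline for version, ms in timings_ms.items()}

for version, ms in timings_ms.items():
    print(f"{version}: {ms / 1000:.1f} s, {ratios[version]:.2f}x the 1.25.2 time")
```

By these numbers, 1.25.3 was roughly 42 times slower than 1.25.2 on this document, while 1.25.4 is within about 4% of the 1.25.2 time.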
The outcome is the same, meaning that the unused $ref items in the components should still be removed properly.
Can you let me know how the latest version behaves for your OpenAPI document?
Hi @thim81 thank you very much for the quick fix again! I can confirm the performance is now a lot better, maybe even faster than it was at 1.25.2. I am seeing both the unused circular schemas removed, and nice performance as well!
hi @shaun-jacks
It sounds like openapi-format is working for you as expected, with good performance. I'm going to close this item.
Thank you and @aneeshnazar again for taking the time to share all your feedback and for reporting the performance degradation. Great to see that openapi-format is being used and gets better thanks to active users like yourselves.