SchemaDefinition.Create picks up internal fields instead of just public only - affects F#
This used to work before but now I cannot use CreateFromEnumerable<T> in F# now.
In F#, we define mutable classes by annotating F# 'record' types with CLIMutable:
[<CLIMutable>]
type D =
{
Data : float[]
}
F# compiler generates IL that looks as follows:
public class D [ interface list ]
{
internal Data@ = Double[];
public Double[] Data
{
get
{
return Data@;
}
set
{
Data@ = value;
}
}
[ more methods including a default constructor ]
}
The SchemaDefinition.Create method picks up both "Data@" and "Data" as fields required by the schema. It should only pickup the public fields.
Please add unit tests for F# as well so these issues are caught earlier.
@dsyme
I just found out that one workaround for the above issue is to set "ignoreMissingColumns=true" when calling CreateEnumerable(...)
Glad you found a workaround, but yeah that isn't the behavior we want I don't think.
When did you notice this change? We haven't made a recent changes to SchemaDefinition.Create that I am aware of. I wonder if it has to do with a .NET version change? What version are you using now? Could you also try an older version?
I think the change was F# tooling rather than ML.Net (rethinking the issue). Some newer version of F# (maybe 5.x) changed the IL generation for mutable records, surfacing this issue.
Totally agree that only public fields should be considered for serialization/deserialization.
Note also that, as-is, when creating a DataView (LoadFromEnumerable…) , the xxx@ fields also get added into the dataset, bloating its size.
Sent from Mailhttps://go.microsoft.com/fwlink/?LinkId=550986 for Windows
From: Michael @.> Sent: Monday, June 13, 2022 1:51 PM To: @.> Cc: @.>; @.> Subject: Re: [dotnet/machinelearning] SchemaDefinition.Create picks up internal fields instead of just public only - affects F# (Issue #6209)
Glad you found a workaround, but yeah that isn't the behavior we want I don't think.
When did you notice this change? We haven't made a recent changes to SchemaDefinition.Create that I am aware of. I wonder if it has to do with a .NET version change? What version are you using now? Could you also try an older version?
— Reply to this email directly, view it on GitHubhttps://github.com/dotnet/machinelearning/issues/6209#issuecomment-1154210278, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AACGZMFVW53NGRAHXS6NG2TVO5YKFANCNFSM5XKT6S2Q. You are receiving this because you authored the thread.Message ID: @.***>
Since this is a change by the F# tooling there is nothing that we can do about it unfortunately. Closing this for now, but if things change in the future please create a new issue and we will take another look.
@michaelgsharp Just to check....
There are some situations where F# needs to emit public fields - notably when emitting code into multiple assemblies in F# Interactive.
My understanding is that serialization of fields should ignore
- internal fields
- private fields
- public fields marked "CompilerGenerated".
This is what System.Json and Newtonsoft.Json both do. Does SchemaDefinition do that? It would be great if it does.
See also https://github.com/dotnet/fsharp/pull/13494 which is related
@fwaris Do you know if you hit this issue when using F# Interactive?
Thanks Don
@dsyme let me reopen the issue for now while I look into that. I am not actually sure off of the top of my head.