Validate `consumes` and infer `produces` for Lightweight Python components
When the user uses Lightweight Python components (https://github.com/ml6team/fondant/issues/558), we want to derive the information we currently get from the component spec from the provided Python code instead.
For the consumes section, we can assume it matches the schema of the dataset the operation is applied to, possibly altered by the consumes argument passed to the apply method.
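A minimal sketch of how the effective consumes schema could be resolved. The `resolve_consumes` helper, the plain `dict` schema representation, and the assumption that the `consumes` argument maps component-side column names to dataset column names are all hypothetical and only illustrate the idea; they are not the actual Fondant API.

```python
from typing import Dict, Optional


def resolve_consumes(
    dataset_schema: Dict[str, str],
    consumes: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return the schema the component sees (hypothetical helper).

    dataset_schema maps dataset column names to dtype names.
    consumes (assumed semantics) maps the column name used inside the
    component to the dataset column it should be read from.
    """
    if consumes is None:
        # Without an override, the component consumes the dataset schema as-is.
        return dict(dataset_schema)

    resolved = {}
    for component_name, dataset_name in consumes.items():
        if dataset_name not in dataset_schema:
            raise KeyError(f"Column '{dataset_name}' is not in the dataset schema")
        resolved[component_name] = dataset_schema[dataset_name]
    return resolved
```

For example, `resolve_consumes({"image": "binary", "caption": "string"}, {"text": "caption"})` yields a schema with a single `text` column of type `string`.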
For the produces section, the user can either provide a schema via the produces argument on the apply method, or we can try to infer it by simulating the transform function. We could do this by generating dummy data based on the consumes schema, and applying the transform method on it.
This only makes sense for Transform components since we always expect the user to provide a produces schema for a Read component, and a Write component doesn't produce anything.
Inferring the produces schema by simulation would also validate the consumes schema if it succeeds. A failed simulation does not invalidate it, though, since the failure can have several causes: the consumes schema is incorrect, there's a bug in the component, or there's a bug in the dummy data generation.
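The simulation idea could be sketched roughly as follows for a pandas-based transform. The names `DUMMY_VALUES`, `make_dummy_frame`, and `infer_produces` are hypothetical, the schema is assumed to be a plain mapping of column name to pandas dtype name, and only a few dtypes are covered.

```python
import pandas as pd

# Hypothetical dummy values per dtype; a real implementation would need
# to cover every type the schema supports.
DUMMY_VALUES = {
    "int64": [0, 1],
    "float64": [0.0, 1.0],
    "bool": [True, False],
    "string": ["a", "b"],
}


def make_dummy_frame(consumes):
    """Build a small DataFrame whose columns match the consumes schema."""
    return pd.DataFrame(
        {
            name: pd.Series(DUMMY_VALUES[dtype], dtype=dtype)
            for name, dtype in consumes.items()
        }
    )


def infer_produces(transform, consumes):
    """Run the transform on dummy data and read the schema off the result.

    Returns None when the simulation fails. A failure is inconclusive: the
    consumes schema may be wrong, the component may have a bug, or the
    dummy data may be unsuitable.
    """
    dummy = make_dummy_frame(consumes)
    try:
        result = transform(dummy)
    except Exception:
        return None
    return {name: str(dtype) for name, dtype in result.dtypes.items()}
```

In this sketch the inferred schema covers every column of the transform's output; whether the columns already present in the consumes schema should be filtered out is left open.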
https://www.coiled.io/blog/dask-dtype-astype
Happy to hear additional opinions on #806. It implements `produces` inference for the PandasTransformer components, provided that all required dependencies are installed on the local machine.