Issues with taking in and outputting a DataFrameDirectory in a custom component on Azure ML Studio
I am having issues with creating a custom component. Not sure where to ask for help, so I will post it here for now. Please direct me to the right channel if this is not the right one.
Basically, I am trying to create a component that takes in a DataFrameDirectory, cleans the data up, and outputs a cleaned-up version of the DataFrameDirectory. I am having issues with retrieving the data frame.
I have this in my yml file:

    inputs:
      dataset:
        display_name: Input Dataset
        type: DataFrameDirectory
        optional: false
        description: the data set containing the text data for cleaning
    outputs:
      cleaned_dataset:
        display_name: Cleaned Dataset
        type: DataFrameDirectory
        description: the cleaned up version of the input dataset
    command: >-
      python textDataPreprocess.py --dataset {inputs.dataset} --output_dir {outputs.cleaned_dataset}

In my Python file I have this:

    import argparse

    from azureml.studio.core.io.data_frame_directory import (
        load_data_frame_from_directory,
        save_data_frame_to_directory,
    )
    from azureml.studio.core.data_frame_schema import DataFrameSchema

    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", help="Dataset")
    parser.add_argument("--output_dir", help="set the output location")
    args = parser.parse_args()

    # Load the data frame from the input directory
    df = load_data_frame_from_directory(args.dataset).data

    # clean_up is my own cleaning function (definition not shown)
    cleaned_data = clean_up(df)

    # Write the cleaned data frame, with its schema, to the output directory
    save_data_frame_to_directory(
        save_to=args.output_dir,
        data=cleaned_data,
        schema=DataFrameSchema.data_frame_to_dict(cleaned_data),
    )

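
To narrow down where retrieval fails, it may help to first check what the component actually receives on `--dataset` before calling any `azureml.studio` loader. The following is only a diagnostic sketch using the standard library. The helper names `parse_args`, `describe_directory`, and `main` are hypothetical and not part of any Azure ML API, and creating the output directory up front is an assumption that may be unnecessary if the platform already provides it.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the two arguments passed by the component's command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", required=True, help="Input directory path")
    parser.add_argument("--output_dir", required=True, help="Output directory path")
    return parser.parse_args(argv)


def describe_directory(path):
    """Return the sorted relative paths of all files under `path`."""
    if not os.path.isdir(path):
        raise NotADirectoryError(f"Expected a directory, got: {path!r}")
    found = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), path))
    return sorted(found)


def main(argv=None):
    """Log what was received, then make sure the output directory exists."""
    args = parse_args(argv)
    # Printing the file list shows in the run logs whether the input
    # directory holds the files a data frame loader would expect.
    print("input path:", args.dataset)
    print("input files:", describe_directory(args.dataset))
    # Assumption: creating the directory is harmless if it already exists.
    os.makedirs(args.output_dir, exist_ok=True)
    print("output path:", args.output_dir)
    return args
```

In the component script, `main()` would be called under the usual `if __name__ == "__main__":` guard, with the actual load, clean, and save steps following once the logged paths and file list look right.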
I tried running the pipeline containing this component many times while changing things here and there to try to get it working.
The latest error I have is this:

Any help would be much appreciated! Thanks!