Path handling with Input

Open fetimo opened this issue 7 months ago • 0 comments

Hi!

I noticed some strange behaviour when using the Path input and I'm not sure if this is a feature or a bug.

If I have a training definition like so:

from cog import Input, Path

def train(
  dataset_path: Path = Input("Path to dataset directory.")
) -> TrainingOutput:
  print(dataset_path)

and invoke it with cog train -i dataset_path="training_data", where training_data is a local directory, it throws an error complaining that the value doesn't use one of the http, https, or data URL schemes.

{
  "detail": [
    {
      "loc": [
        "body",
        "input",
        "dataset_path",
        "is-instance[Path]"
      ],
      "msg": "Input should be an instance of Path",
      "type": "is_instance_of"
    },
    {
      "loc": [
        "body",
        "input",
        "dataset_path",
        "function-plain[validate()]"
      ],
      "msg": "Value error, '' is not a valid URL scheme. 'data', 'http', or 'https' is supported.",
      "type": "value_error"
    }
  ]
}
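To illustrate where the empty '' in that message could come from, here is a minimal sketch using only the standard library. This is not cog's actual validation code; the helper names are hypothetical and the set of supported schemes is copied from the error message above. It just shows that a bare relative path such as training_data has no URL scheme at all when parsed as a URL.

```python
from urllib.parse import urlparse

# Schemes listed in the error message above (assumed to be the full set).
SUPPORTED_SCHEMES = {"data", "http", "https"}


def url_scheme(value: str) -> str:
    """Return the URL scheme of `value`, or an empty string if it has none."""
    return urlparse(value).scheme


def looks_like_supported_url(value: str) -> bool:
    """Hypothetical check mirroring the wording of the error message."""
    return url_scheme(value) in SUPPORTED_SCHEMES


if __name__ == "__main__":
    for candidate in ("training_data", "https://my-bucket.com/bucket"):
        print(repr(candidate), repr(url_scheme(candidate)),
              looks_like_supported_url(candidate))
```

If the validator does something along these lines, a plain directory name would always be rejected before any filesystem lookup happens, which would match what I'm seeing.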

If I then point it at a hosted file and run it again with cog train -i dataset_path="https://my-bucket.com/bucket", it passes schema validation but fails to download the file and create a temporary file. (The file is a public zip hosted on GCP; I've had success with Cloudinary, so this is odd.) I'm not sure what's causing the error here, but the docs say that "[Path] represents a path to a file on disk.", so it's a bit surprising that you can also pass it a URL.

I've worked around it by changing the definition slightly to:

from cog import Input, Path

def train(
  dataset_path: str = Input("Path to dataset directory.")
) -> TrainingOutput:
  dataset_path = Path(dataset_path)
  print(dataset_path)

Note that it now declares dataset_path as a plain str and converts it to a Path in the function body.
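Since the str version skips any Path validation, the body has to check the value itself. A minimal sketch of that check, assuming the goal is a local directory and using the standard library's pathlib.Path as a stand-in for cog's Path so it runs on its own (resolve_dataset_dir is a hypothetical helper, not part of cog):

```python
from pathlib import Path


def resolve_dataset_dir(raw: str) -> Path:
    """Convert a plain string input into a Path and check it is a directory.

    Hypothetical helper: it uses pathlib.Path rather than cog.Path so the
    example is self-contained.
    """
    path = Path(raw).expanduser()
    if not path.is_dir():
        # Fail early with a readable message instead of a later I/O error.
        raise ValueError(f"{raw!r} is not an existing directory")
    return path
```

Inside train this would be called as dataset_path = resolve_dataset_dir(dataset_path) before the directory is used.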

Is the first example, passing a local filepath and getting an error, a bug? And is the behaviour of Path and Input expected?

fetimo, Jun 25 '25 15:06