Dean Wyatte

9 comments by Dean Wyatte

This issue looks like it used to be focused on Google Cloud Run, but I'm interested in deploying using GCP AI Platform's [custom container](https://cloud.google.com/ai-platform/prediction/docs/use-custom-container) functionality which looks to be more...

I could use this functionality, so I put together a PR using @kyamagu's suggestion to use `fsspec` in `datasets.utils.file_utils`: https://github.com/huggingface/datasets/pull/5580

The current implementation depends on gcsfs/s3fs being able to authenticate through some other means, e.g., environment variables. For AWS, it looks like you can set `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN`. Note...
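
As a rough illustration of the environment variables mentioned above, here is a minimal sketch that collects them into a `storage_options`-style dict. The helper name `s3_storage_options_from_env` is hypothetical and is not part of `datasets`, `s3fs`, or `fsspec`; the `key`/`secret`/`token` names are, as far as I know, the keyword arguments `s3fs.S3FileSystem` accepts, but treat that mapping as an assumption to verify against your installed version.

```python
import os


def s3_storage_options_from_env(env=None):
    """Hypothetical helper: map AWS credential variables to s3fs-style options.

    Only variables that are set and non-empty are included, so an empty
    result means the filesystem has to authenticate some other way.
    """
    env = os.environ if env is None else env
    mapping = {
        "key": "AWS_ACCESS_KEY_ID",
        "secret": "AWS_SECRET_ACCESS_KEY",
        "token": "AWS_SESSION_TOKEN",
    }
    return {option: env[var] for option, var in mapping.items() if env.get(var)}


if __name__ == "__main__":
    # Print only which options were found, never the secret values themselves.
    print(sorted(s3_storage_options_from_env()))
```

With `fsspec` and `s3fs` installed, the result could presumably be passed along as `fsspec.filesystem("s3", **s3_storage_options_from_env())`; when the dict is empty, the filesystem falls back to whatever credential discovery it does on its own.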

> Note that while testing this just now, I did note a discrepancy between gcsfs and s3fs that we might want to address where gcsfs passes the timeout from storage_options...

@lhoestq I've been using this feature for the last year on GCS without problem, but I think we need to fix an issue with S3 and then document the supported...

> In the meantime you can install ray manually first.

Turns out this doesn't work because of the pinned version of trlx from git. We should be able to pin...

After digging in, I think blocks is doing the optimal thing here by reading all datafile frames into a list and then concatenating. Pandas will always require 2x memory when...
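
For reference, the read-into-a-list-then-concatenate pattern described above looks roughly like the sketch below. This is not the code from blocks; `read_and_concat` and the in-memory CSV inputs are hypothetical stand-ins for the datafiles. The point is that while `pd.concat` runs, the input frames and the newly allocated result exist at the same time, which is where a peak of about twice the data size comes from.

```python
import io

import pandas as pd


def read_and_concat(sources):
    """Hypothetical sketch: read each source into a frame, then concatenate once.

    A single concat over the whole list avoids repeated copying, but the
    inputs and the output still coexist until the list is released.
    """
    frames = [pd.read_csv(source) for source in sources]
    combined = pd.concat(frames, ignore_index=True)
    # Drop the references to the per-file frames so they can be freed.
    del frames
    return combined


# In-memory stand-ins for two datafiles with the same columns.
parts = [io.StringIO("a,b\n1,2\n3,4\n"), io.StringIO("a,b\n5,6\n")]
combined = read_and_concat(parts)
```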

Assuming **text-generation-inference 3.0.0** from here unless otherwise noted. With `CUDA_LAUNCH_BLOCKING=1`, the source of the error looks to be `flashinfer`/`BatchPrefillWithPagedKVCache`:

```
2025-01-17T15:08:27.088671Z  INFO text_generation_launcher: Args { model_id: "/tmp/tgi/model", revision: None,...
```

It looks like this may have been added in https://github.com/huggingface/text-generation-inference/pull/2046