Problem
When you try to download a large (multiple GBs) Hugging Face model, you encounter an insufficient storage error.
OSError: [Errno 28] No space left on device
Cause
You’re trying to download the model to a location on the root disk, such as /tmp. This storage can’t autoscale, so it can’t accommodate the model artifacts.
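Before changing anything, it can help to confirm which volume is actually short on space. The following is a minimal sketch using only the Python standard library; the helper name free_gib is hypothetical, and the two paths are the ones discussed in this article, which may not both exist on every cluster.

```python
import shutil
from pathlib import Path
from typing import Optional


def free_gib(path: str) -> Optional[float]:
    """Return the free space at `path` in GiB, or None if the path does not exist."""
    if not Path(path).exists():
        return None
    return shutil.disk_usage(path).free / 1024**3


# Compare the root-disk location with the scalable one.
for candidate in ("/tmp", "/local_disk0"):
    free = free_gib(candidate)
    status = "not present" if free is None else f"{free:.1f} GiB free"
    print(f"{candidate}: {status}")
```

If the free space reported for /tmp is smaller than the size of the model you want to download, the download is likely to fail with the error shown above.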
Solution
Use a scalable storage location such as /local_disk0 or Databricks File System (DBFS).
You can adapt and use the following code. It:
- Imports classes from the Hugging Face library.
- Specifies the pre-trained model to use and sets a custom scalable storage location, /local_disk0, to save model files instead of fixed local storage.
- Downloads and saves the tokenizer to the custom path.
- Downloads and saves the model weights to the same path.
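from transformers import AutoModel, AutoTokenizer

# Replace the placeholder with the Hugging Face model ID you want to download.
model_name = "<your-model-name>"
# Scalable local storage used as the download (cache) directory.
custom_dir = "/local_disk0"
# Download the tokenizer and the model weights into custom_dir.
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=custom_dir)
model = AutoModel.from_pretrained(model_name, cache_dir=custom_dir)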
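Passing cache_dir works per call. If several notebooks or libraries download models, an alternative is to redirect the whole Hugging Face cache by setting the HF_HOME environment variable before transformers is imported. The sketch below assumes that approach suits your setup; the helper name point_hf_cache_at and the huggingface subdirectory are illustrative choices, not part of the original solution.

```python
import os
from pathlib import Path


def point_hf_cache_at(base_dir: str) -> str:
    """Create <base_dir>/huggingface and set HF_HOME to it.

    Call this before importing transformers, because the cache
    location is read when the Hugging Face libraries are imported.
    """
    cache_root = Path(base_dir) / "huggingface"
    cache_root.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(cache_root)
    return str(cache_root)


# On a cluster, you would call this first, then import transformers:
# point_hf_cache_at("/local_disk0")
```

With HF_HOME set this way, from_pretrained calls that omit cache_dir should also store their files under the scalable location.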