Description
I'm currently evaluating setfit for a proof of concept. Unfortunately, I'm working behind a company firewall, where I do not have access to the world wide web, only to company-internal URLs.
That's a bit annoying in terms of downloading models, but I can work around that. More importantly, it seems there are calls to huggingface.co:443 after the training is done, which obviously cannot succeed due to the blocked internet access.
That wouldn't be a big problem if the timeout were 1 minute or so, but it seems to be more like 5-10 minutes, which is a lot of time wasted just waiting for the results.
How can I disable these blocking HTTP requests?
My minimal training pipeline looks somewhat like this (shortened for readability, especially data loading):
from setfit import SetFitModel, Trainer, TrainingArguments

# Load the model from a local directory only (no download from the Hub)
model = SetFitModel.from_pretrained(
    "/local/path/local-bge-small-en-v1.5",
    local_files_only=True,
    multi_target_strategy="multi-output",
)
train_dataset, test_dataset = a_bunch_of_loading_and_sampling_code_thats_irrelevant_here()
args = TrainingArguments(
    batch_size=128,
    num_epochs=10,
    report_to=None,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    metric="f1",
    callbacks=None,
    column_mapping={"column": "mapping"},
    metric_kwargs={"average": "samples"},
)
trainer.train()
After all training steps are done, I get the following console logs:
INFO:sentence_transformers.trainer:Saving model checkpoint to checkpoints/checkpoint-258
INFO:sentence_transformers.SentenceTransformer:Save model to checkpoints/checkpoint-258
Request [id]: GET https://huggingface.co/api/models/setfit-test/local-bge-small-en-v1.5 (authenticated: False)
DEBUG:huggingface_hub.utils._http:Request [id]: GET https://huggingface.co/api/models/setfit-test/local-bge-small-en-v1.5 (authenticated: False)
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): huggingface.co:443
Then nothing happens for about 10 minutes, until I get a "Batches: 100% [tqdm progress bar]" line, which then completes almost immediately.
Is there any parameter I can set to disable this call to huggingface.co? Setting "report_to=None" or "callbacks=None" doesn't seem to do the trick.
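One workaround that may be worth trying, assuming the request really originates in huggingface_hub (as the DEBUG lines above suggest): huggingface_hub and transformers document the environment variables HF_HUB_OFFLINE and TRANSFORMERS_OFFLINE for offline use. The sketch below sets them from Python before the training libraries are imported. The helper name enable_hf_offline_mode is hypothetical, and whether this suppresses the specific call made after checkpoint saving is not confirmed here.

```python
import os


def enable_hf_offline_mode() -> None:
    """Hypothetical helper: ask the Hugging Face libraries to avoid network calls.

    Assumption: the variables are read when huggingface_hub / transformers are
    imported, so this has to run before those imports.
    """
    os.environ["HF_HUB_OFFLINE"] = "1"        # documented by huggingface_hub
    os.environ["TRANSFORMERS_OFFLINE"] = "1"  # documented by transformers


enable_hf_offline_mode()

# Only import the training libraries after the variables are set, e.g.:
# from setfit import SetFitModel, Trainer, TrainingArguments
```

Exporting the same variables in the shell before starting Python should be equivalent and avoids any dependence on import order.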