Define "do_sample" explicitly in generation_config.json

#6
by Corellios - opened

Hi again :)

I also ran into the following issue. Loading the model prints a warning about invalid generation flags:

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, dtype=torch.bfloat16, low_cpu_mem_usage=True, device_map="cuda", local_files_only=True)
Loading checkpoint shards: 100%|████████████████████████████████████████| 3/3 [00:10<00:00, 3.42s/it]
The following generation flags are not valid and may be ignored: ['temperature', 'top_p']. Set TRANSFORMERS_VERBOSITY=info for more details.

Saving the model then fails, because "do_sample" defaults to False when it is not explicitly defined, while temperature and top_p are still set in generation_config.json:

model.save_pretrained(SAVE_DIR, safe_serialization=True, max_shard_size="10GB")
2025-11-28T17:19:45.738718+0100 | get_model_compressor | INFO - skip_sparsity_compression_stats set to True. Skipping sparsity compression statistic calculations. No sparsity compressor will be applied.
Compressing model: 224it [00:05, 40.05it/s]
Traceback (most recent call last):
File "/home/thibaut/tools/llm-compressor/.venv/lib/python3.12/site-packages/transformers/generation/configuration_utils.py", line 723, in save_pretrained
self.validate(strict=True)
File "/home/thibaut/tools/llm-compressor/.venv/lib/python3.12/site-packages/transformers/generation/configuration_utils.py", line 684, in validate
raise ValueError("GenerationConfig is invalid: \n" + info_message)
ValueError: GenerationConfig is invalid:
temperature: do_sample is set to False. However, temperature is set to 0.6 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature.
top_p: do_sample is set to False. However, top_p is set to 0.95 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset top_p.
If you're using a pretrained model, note that some of these attributes may be set through the model's generation_config.json file.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "", line 1, in
File "/home/thibaut/tools/llm-compressor/.venv/lib/python3.12/site-packages/llmcompressor/transformers/compression/compressed_tensors_utils.py", line 96, in save_pretrained_wrapper
original_save_pretrained.get(model, model_class)(
File "/home/thibaut/tools/llm-compressor/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 3930, in save_pretrained
model_to_save.generation_config.save_pretrained(save_directory)
File "/home/thibaut/tools/llm-compressor/.venv/lib/python3.12/site-packages/transformers/generation/configuration_utils.py", line 725, in save_pretrained
raise ValueError(str(exc) + "\n\nFix these issues to save the configuration.")
ValueError: GenerationConfig is invalid:
temperature: do_sample is set to False. However, temperature is set to 0.6 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature.
top_p: do_sample is set to False. However, top_p is set to 0.95 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset top_p.
If you're using a pretrained model, note that some of these attributes may be set through the model's generation_config.json file.
Fix these issues to save the configuration.
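
In the meantime, the save can be unblocked by making the generation config self-consistent before calling save_pretrained. This is just a sketch of a workaround on my side, assuming the sampling parameters from the original generation_config.json should be kept rather than dropped:

# Workaround sketch: define do_sample explicitly so the GenerationConfig validates.
# Alternatively, temperature and top_p could be unset instead of enabling sampling.
model.generation_config.do_sample = True
model.save_pretrained(SAVE_DIR, safe_serialization=True, max_shard_size="10GB")

The equivalent manual fix would be adding "do_sample": true to the model's generation_config.json, which is what the title of this discussion asks for.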

