Quantization_1
June 6, 2025
10:28
Quantization is normally an inference-time technique; for training, full precision (FP32/BF16) plus gradient clipping/scaling is recommended.
In practice, however, with only 16 GB of VRAM, an unquantized model forces a very small batch_size (e.g. 1), which makes train_loss unstable.
Combining 4-bit quantization with BF16 training is not recommended; that combination is very likely to produce errors or a failed run. Prefer full-precision training, or quantize only the model used for inference.
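A standard way to get an effective large batch within 16 GB without quantizing is gradient accumulation (not mentioned in the note; a minimal sketch with a toy `nn.Linear` and hypothetical sizes). Scaling each micro-batch loss by the number of accumulation steps makes the accumulated gradient match the full-batch gradient, so a size-1 micro-batch no longer means a size-1 effective batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(8, 1)
loss_fn = nn.MSELoss()
x = torch.randn(16, 8)
y = torch.randn(16, 1)
accum_steps = 16  # effective batch = accum_steps * micro_batch_size(=1)

# Accumulated gradient from 16 micro-batches of size 1:
# memory stays at the size-1 footprint.
model.zero_grad()
for i in range(accum_steps):
    loss = loss_fn(model(x[i:i+1]), y[i:i+1]) / accum_steps  # scale so grads average
    loss.backward()  # gradients accumulate in .grad across micro-batches
accum_grad = model.weight.grad.clone()

# Reference gradient from one full batch of 16.
model.zero_grad()
loss_fn(model(x), y).backward()
full_grad = model.weight.grad.clone()
# accum_grad and full_grad match up to float rounding, so calling
# optimizer.step() once per accumulation cycle is equivalent.
```

In a real training loop, `optimizer.step()` and `optimizer.zero_grad()` would run once per `accum_steps` micro-batches instead of once per batch.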
When training with SentenceTransformer (model: stella_en_1.5B) under the quantization settings below, mine_hard_negatives reported NaN similarity for the positives, and train_loss was 0.
Metric   Positive  Negative  Difference
Count       6,775    13,550
Mean          nan    0.4492         nan
Median        nan    0.4470         nan
Std           nan    0.0342         nan
Min           nan    0.3391         nan
25%           nan    0.4241         nan
50%           nan    0.4470         nan
75%           nan    0.4739         nan
Max           nan    0.5625         nan
import torch
from transformers import BitsAndBytesConfig
from sentence_transformers import SentenceTransformer

# 4-bit NF4 quantization with BF16 compute (the config that produced the NaNs)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = SentenceTransformer(
    params.model_name,  # stella_en_1.5B
    trust_remote_code=True,
    model_kwargs={
        "quantization_config": bnb_config,
        # "device_map": "auto",
    },
)
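A cheap sanity check before mining hard negatives is to encode a few sentences and test the output for NaN, since NaN embeddings propagate into NaN cosine similarities like the ones in the table above. A minimal sketch (`has_nan` is a hypothetical helper; the arrays simulate a healthy and a broken output, and in practice the input would be `model.encode(sentences)`):

```python
import numpy as np

def has_nan(embeddings: np.ndarray) -> bool:
    """Return True if any entry of the embedding matrix is NaN."""
    return bool(np.isnan(embeddings).any())

# Simulated encode() outputs: a quantization failure typically
# surfaces as entire rows of NaN, not isolated entries.
ok_emb = np.random.rand(2, 4).astype(np.float32)
bad_emb = np.full((2, 4), np.nan, dtype=np.float32)
```

Running this check right after loading the quantized model would have flagged the problem before any training step.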
Found a related thread on Reddit:
- Weird, what model is this? Some models use special methods that cannot just be quanted. But what model is it?
- Stella 1.5 billion. It's an embedding model.
- GGUF only really works for chat models, not for prediction models like this. Try quanting BERT and the same thing will happen.
So the issue is likely the stella model itself: it is an embedding model, not a chat model, and models like this cannot simply be quantized.
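Given that conclusion, the safer path is to drop the BitsAndBytes config entirely and load the model in BF16, which halves memory versus FP32 without the NF4 quantization that produced the NaNs. A sketch of the alternative load (a config fragment, not from the note; `params.model_name` as above):

```python
import torch
from sentence_transformers import SentenceTransformer

# Unquantized BF16 load: smaller than FP32, no 4-bit weight compression.
model = SentenceTransformer(
    params.model_name,  # stella_en_1.5B
    trust_remote_code=True,
    model_kwargs={"torch_dtype": torch.bfloat16},
)
```

If BF16 alone still does not fit, combining it with the gradient-accumulation workaround above is preferable to re-enabling 4-bit quantization for this model.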
Created with OneNote.