Description
System Info
- `transformers` version: 4.32.1
- Platform: Linux-3.10.0-1160.80.1.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.8.17
- Huggingface_hub version: 0.16.4
- Safetensors version: 0.3.2
- Accelerate version: 0.21.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
My code:

```python
import torch
from transformers import AutoTokenizer

model_name_or_path = 'llama-2-7b-hf'
use_fast_tokenizer = False
padding_side = "left"
config_kwargs = {'trust_remote_code': True, 'cache_dir': None, 'revision': 'main', 'use_auth_token': None}
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=use_fast_tokenizer, padding_side=padding_side, **config_kwargs)
```
The error is:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/anaconda3/envs/llama_etuning/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 727, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/root/anaconda3/envs/llama_etuning/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/root/anaconda3/envs/llama_etuning/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2017, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/anaconda3/envs/llama_etuning/lib/python3.8/site-packages/transformers/models/llama/tokenization_llama.py", line 156, in __init__
    self.sp_model = self.get_spm_processor()
  File "/root/anaconda3/envs/llama_etuning/lib/python3.8/site-packages/transformers/models/llama/tokenization_llama.py", line 164, in get_spm_processor
    model_pb2 = import_protobuf()
  File "/root/anaconda3/envs/llama_etuning/lib/python3.8/site-packages/transformers/convert_slow_tokenizer.py", line 40, in import_protobuf
    return sentencepiece_model_pb2
UnboundLocalError: local variable 'sentencepiece_model_pb2' referenced before assignment
```
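For context, this kind of `UnboundLocalError` typically means the failing helper only assigns `sentencepiece_model_pb2` inside a conditional branch (e.g. one guarded by a protobuf-availability check), so the final `return` hits an unbound local when that branch is skipped. A minimal sketch of that control-flow pattern (the function name and stand-in object here are hypothetical, not the actual transformers code):

```python
def import_protobuf_sketch(protobuf_available: bool):
    # The local is only bound when the dependency check passes;
    # otherwise the return statement references an unassigned name.
    if protobuf_available:
        sentencepiece_model_pb2 = object()  # stand-in for the real module import
    return sentencepiece_model_pb2  # UnboundLocalError when the branch was skipped

try:
    import_protobuf_sketch(False)
except UnboundLocalError as err:
    print("reproduced:", err)
```

If this reading is right, the error is a symptom of a missing optional dependency rather than of the tokenizer files themselves.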
Expected behavior
What do I need to do to solve this problem?
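One avenue worth checking first (an assumption based on the traceback, not something confirmed in this issue): whether the `protobuf` package is importable in the environment at all, since the failing helper appears to assign the module only when protobuf is available.

```python
import importlib.util

# Check whether google.protobuf can be found in this environment;
# if it cannot, installing it (e.g. `pip install protobuf`) is a plausible fix.
protobuf_installed = importlib.util.find_spec("google.protobuf") is not None
print("protobuf installed:", protobuf_installed)
```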