You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
model_path = "LinkSoul/Chinese-Llama-2-7b"
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_path).half().cuda()
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
instruction = """[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n<</SYS>>\n\n{} [/INST]"""
prompt = instruction.format("用中文回答,When is the best time to visit Beijing, and do you have any suggestions for me?")
generate_ids = model.generate(tokenizer(prompt, return_tensors='pt').input_ids.cuda(), max_new_tokens=4096, streamer=streamer)
You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at huggingface/transformers#24565
[2023-07-28 10:55:34,419] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
进程已结束,退出代码137 (interrupted by signal 9: SIGKILL)
运行readme.md的快速测试代码
至
model = AutoModelForCausalLM.from_pretrained(model_path).half().cuda()
处,内存暴涨至100%后终止,报如下信息:You are using the legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at huggingface/transformers#24565
[2023-07-28 10:55:34,419] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
进程已结束,退出代码137 (interrupted by signal 9: SIGKILL)
请问是什么原因?
(使用 /~https://github.com/lvwerra/trl 进行微调完全可以顺利运行,显存基本可以满载,说明cuda应该是没问题的)
The text was updated successfully, but these errors were encountered: