My GPU only has 8 GB, and the fans go straight to max speed when it runs. Still, it should be fine for everyday things like explaining small commands.
One last question: models pulled by ollama go to the C: drive by default. How do I move them to another drive? And can models that are already downloaded be migrated directly?
Set OLLAMA_MODELS
ollama/docs/windows.md at main · ollama/ollama
The Ollama install does not require Administrator, and installs in your home directory by default. You’ll need at least 4GB of space for the binary install. Once you’ve installed Ollama, you’ll need additional space for storing the Large Language models, which can be tens to hundreds of GB in size. If your home directory doesn’t have enough space, you can change where the binaries are installed, and where the models are stored.
To install the Ollama application in a location different than your home directory, start the installer with the following flag:
OllamaSetup.exe /DIR="d:\some\location"
To change where Ollama stores the downloaded models instead of using your home directory, set the environment variable OLLAMA_MODELS in your user account to the path where you want the models stored. If Ollama is already running, quit the tray application and relaunch it from the Start menu, or from a new terminal started after you saved the environment variable.
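The doc quoted above boils down to two Windows commands. A minimal sketch, assuming you want the models on the D: drive; the `D:\ollama\models` path is just an example, and the default store is `%USERPROFILE%\.ollama\models`:

```shell
:: Persist OLLAMA_MODELS for your user account (takes effect in new terminals).
setx OLLAMA_MODELS "D:\ollama\models"

:: Move already-downloaded models out of the default location so they
:: don't have to be pulled again (/E includes subfolders, /MOVE deletes
:: the source after copying).
robocopy "%USERPROFILE%\.ollama\models" "D:\ollama\models" /E /MOVE
```

After this, quit the Ollama tray app and relaunch it so it picks up the new variable.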
Looks good.
Oh, so that's how it works.
I installed a 1.5B toy model. The 7B crashes my little 8 GB machine as soon as it runs.
The 8B isn't as strong as the 7B, though.
Huh? What's the reasoning there?
The 7B uses Qwen 7B as its base, which is newer than Llama 3.1 8B, so its benchmark scores and Chinese ability should be better too.
It's just that time-to-first-token and inference may be quite a bit slower.
An 8 GB GPU can already run it? Not bad at all.
Wow, it actually runs.
According to localllama, the 1.5B is already pretty strong.
Can I ask what the minimum GPU requirement is? I want to deploy one to play with, but I don't know if my 4060 can handle it.
I'm on a 4060 myself.
I ran the 14B on a Mac mini. Very smooth.
I installed the 7B on a machine with 4 GB of VRAM. Apart from being a bit slow, it's fine.
Running the 7B on 4 GB of VRAM, the problem is just that output is too slow.
I deployed the 14B on an Arc 770 16 GB plus 64 GB of RAM. It runs at about 9 t/s, which is enough for daily chat.
Why would the 7B be slower than the 8B?
No idea, maybe it's related to the architecture. Qwen just emits tokens more slowly than Llama...
No way. Could it be that VRAM wasn't released when you loaded the 7B?