Model swapping for llama.cpp (or any local OpenAI-compatible server)
Run multiple resource-heavy large language models on the same machine with a limited amount of VRAM or other resources by exposing each model on its own port and loading and unloading them on demand.
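A minimal sketch of how a client drives that on-demand loading: the request is plain OpenAI-compatible JSON, and the `model` field tells the swapping proxy which model to bring into memory before answering. The port (8080) and the model name are assumptions for illustration, not defaults of any particular server.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Hypothetical model name; the proxy is assumed to map it to a
	// backend it can start, swapping out whatever was loaded before.
	body, _ := json.Marshal(map[string]any{
		"model": "qwen2.5-7b",
		"messages": []map[string]string{
			{"role": "user", "content": "Say hello."},
		},
	})

	// Assumed local proxy address; adjust to your setup.
	resp, err := http.Post("http://localhost:8080/v1/chat/completions",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```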
Generate synthetic datasets using local LLMs via Ollama and LM Studio with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1, and other major language models.
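A hedged sketch of such a synthetic-data loop against a local server. It assumes Ollama's OpenAI-compatible endpoint at its default port (http://localhost:11434/v1); LM Studio typically serves a compatible API at http://localhost:1234/v1. The model tag, seed prompts, and output filename are placeholders.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// chatResponse captures only the fields we need from an
// OpenAI-compatible chat completion response.
type chatResponse struct {
	Choices []struct {
		Message struct {
			Content string `json:"content"`
		} `json:"message"`
	} `json:"choices"`
}

func main() {
	// Placeholder seed prompts; a real pipeline would generate or load many.
	seeds := []string{
		"Write a short question about Go error handling.",
		"Write a short question about Go goroutines.",
	}

	out, err := os.Create("synthetic.jsonl")
	if err != nil {
		panic(err)
	}
	defer out.Close()
	enc := json.NewEncoder(out)

	for _, seed := range seeds {
		body, _ := json.Marshal(map[string]any{
			"model": "llama3.3", // placeholder; any locally pulled model tag
			"messages": []map[string]string{
				{"role": "user", "content": seed},
			},
		})
		resp, err := http.Post("http://localhost:11434/v1/chat/completions",
			"application/json", bytes.NewReader(body))
		if err != nil {
			panic(err)
		}
		var parsed chatResponse
		if err := json.NewDecoder(resp.Body).Decode(&parsed); err != nil {
			resp.Body.Close()
			panic(err)
		}
		resp.Body.Close()
		if len(parsed.Choices) > 0 {
			// One JSONL record per prompt/completion pair.
			enc.Encode(map[string]string{
				"prompt":     seed,
				"completion": parsed.Choices[0].Message.Content,
			})
		}
	}
	fmt.Println("wrote synthetic.jsonl")
}
```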