Post by @utopiah

IMHO LLM usage isn't coherent with independence. That being said I wrote quite a bit on self-hosting LLMs. There are quite a few tools available, like ollama itself relying on llama.cpp that can both work locally and provide an API compatible replacement to cloud services. As you suggested though typically at home one doesn't have the hardware, GPUs with 100+GB of VRAM, to run the state of the art. There is a middle ground though between full cloud, API key, closed source vs open source at home on low-end hardware : running STOA open models on cloud. It can be done on any cloud but it's much easier to start with dedicated hardware and tooling, for that HuggingFace is great but there are multiples. TL;DR: closed cloud -> models on clouds -> self-hosted provide a better path to independence, including training.

Conversation (1)