An LLM is two things: the model and the weights. The model is basically a description of how different layers fit together. It’s usually not that complicated (you can create quite good ones in a few hundred lines of code with modern frameworks). But, by itself, the model is useless, because each layer is something like ‘take an input and transform it with this operation, using an NxM matrix as the other operand’. That other operand is not part of the model, it’s in the weights. The weights are large. They are the result of training. You process a lot of data to generate them.
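To make the split concrete, here is a minimal sketch (using NumPy, with made-up shapes) of the separation: the ‘model’ is a few lines of code describing an operation, and it does nothing useful until you hand it the other operand, the weights, which in a real system are gigabytes of trained numbers loaded from disk.

```python
import numpy as np

# The "model": code describing one layer, a linear transform plus a
# non-linearity. On its own, this function is useless.
def layer(x, W, b):
    # x is combined with the NxM matrix W -- the "other operand"
    return np.maximum(0, x @ W + b)

# The "weights": the other operand. A real model would load billions of
# trained values from disk; random numbers stand in for them here.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))  # the NxM matrix
b = np.zeros(3)

x = rng.standard_normal(4)       # some input
y = layer(x, W, b)               # model + weights together do something
print(y.shape)
```

The point of the sketch is that nothing about the behaviour lives in the code: swap in a different `W` and the same three lines of ‘model’ compute something completely different.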
In a classical neural network, the model defines the topology, but each neurone has weighted inputs and an activation threshold. When you train it, you feed a bunch of data through it and this adjusts the weights and threshold values. Eventually, you stop and now you have a trained model. Modern deep learning models work in a similar way, but with a huge pile of optimisations. The weights are the valuable thing because it takes vast amounts of compute and data to produce them. They’re also completely opaque. They’re just a massive blob of data, so trying to figure out the behaviour of a trained model by looking at the weights is almost impossible, as is working out what went into their training sets.
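The classical version of that process fits in a few lines. This is a hedged sketch of a single-neurone perceptron learning the logical AND function (the data, learning rate, and epoch count are all illustrative choices): training is just ‘feed data through, nudge the numbers’, and the only output is an opaque set of numbers.

```python
import numpy as np

# Training data: the AND truth table.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(2)  # weights on the neurone's two inputs
b = 0.0          # the activation threshold (as a bias)

for _ in range(20):                # a few passes over the data set
    for xi, target in zip(X, y):
        pred = 1.0 if xi @ w + b > 0 else 0.0
        err = target - pred
        w += 0.1 * err * xi        # nudge the weights towards the target
        b += 0.1 * err             # nudge the threshold too

print(w, b)  # the trained values: just numbers, opaque on their own
```

Even at this toy scale, the final `w` and `b` tell you nothing by inspection about what data produced them, which is the problem the paragraph above describes, multiplied by billions of parameters.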
Very few ‘open’ LLMs have weights that were trained on known and reproducible data sets. Things like Meta’s LLaMa are ‘open’ in that you can recreate the model yourself (as llama.cpp did) and download their weights, but you have no visibility into what the weights were trained on and can’t reproduce the training (unless you have a data centre and a massive pile of lawyers who will be able to defend you against copyright infringement lawsuits). Oh, and the license says that you agree never to sue Meta for any IP infringement, so if @pluralistic is using one of the ‘open’ LLaMa weights, he has just given Meta a perpetual license to use all of his work for any purpose. I’m sure he considers that a great deal for a grammar checker with a 50% false positive rate.
This, by the way, is why I really like Mozilla’s translation models (which are much simpler than a general purpose LLM, though they use much of the same underlying technology). They are trained on curated open datasets designed for training machine-translation systems and they are specifically designed so that you can redo the training on a single (powerful, but affordable [at least, before the bubblers decided to buy everything]) machine. That made them things that people could experiment with, exploring different model structures to see how they affected speed and accuracy.
So, yes, a local model will not send data across the network when you use it (hopefully: unfortunately, most are distributed as Python code, and a load of the ones on Hugging Face also came with bundled malware. I hope they’ve managed to fix that now). But they’re not open in any meaningful way, they are still subject to the whims of massive corporations, and using them builds a dependency on the exact companies that Doctorow criticises and hands them a load of control over your workflow.
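The ‘distributed as Python code’ problem is worth spelling out. Many model files are (or embed) Python pickles, and unpickling can run arbitrary code. This is a minimal sketch using only the standard library (the class name and message are made up for illustration; a real attack would do something nastier than `print`):

```python
import pickle

# A malicious "weights file": __reduce__ tells pickle what to call when
# the object is loaded. Here it's print; an attacker would use os.system
# or similar.
class NotActuallyWeights:
    def __reduce__(self):
        return (print, ("arbitrary code ran during load!",))

payload = pickle.dumps(NotActuallyWeights())

# Merely loading the file executes the attacker's code.
pickle.loads(payload)
```

This is why the ecosystem has been moving towards inert formats like safetensors, which store only the raw numbers and can’t smuggle code in alongside them.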