LLMs look up and interpolate programs; they don’t reason

François Chollet captures the nature of LLMs nicely in an interview on the Mindscape podcast. Apologies if I’ve misinterpreted any of his points.

LLMs, like much of deep learning, are about fitting curves to data. When they respond to a prompt, they are effectively looking up a program on that curve. When the program is run, it generates a word (or words; tokens, really).
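
To ground the "run the program and get tokens" part, here is a minimal sketch of greedy next-token generation using the Hugging Face transformers library. The model choice (gpt2), the prompt, and the token count are illustrative assumptions on my part, not anything from the interview.

```python
# A minimal sketch of next-token generation with a small causal LM.
# Model (gpt2) and prompt are illustrative assumptions, not from the post.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("The capital of France is", return_tensors="pt").input_ids

for _ in range(5):  # generate five tokens, one at a time
    with torch.no_grad():
        logits = model(ids).logits          # scores over the vocabulary
    next_id = logits[0, -1].argmax()        # greedy: take the most likely token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tok.decode(ids[0]))
```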

Their power is that they can interpolate between different learned data points. That is, they don’t just memorize the examples they have seen; they also generalize between them. How well they generalize is an open question, and outside the domain of the training data, they likely fail.
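
A toy illustration of that interpolation-versus-extrapolation point, using an ordinary curve fit rather than an LLM: a polynomial fit to noisy samples of sin(x) predicts well inside the range it was fit on and falls apart outside it. The function, polynomial degree, and query points here are arbitrary choices for the sketch.

```python
# Toy curve-fitting illustration (not an LLM): the fitted model does well
# inside the training range (interpolation) and fails outside it (extrapolation).
import numpy as np

rng = np.random.default_rng(0)

# "Training data": noisy samples of sin(x) on [0, 2*pi]
x_train = np.linspace(0.0, 2 * np.pi, 50)
y_train = np.sin(x_train) + rng.normal(scale=0.05, size=x_train.shape)

# Fit a curve to the data
coeffs = np.polyfit(x_train, y_train, deg=5)

# Query inside the training range: small error (interpolation)
x_in = np.pi / 3
print("error inside training range: ", abs(np.polyval(coeffs, x_in) - np.sin(x_in)))

# Query outside the training range: the fit blows up (extrapolation)
x_out = 4 * np.pi
print("error outside training range:", abs(np.polyval(coeffs, x_out) - np.sin(x_out)))
```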

But that’s it: they don’t reason. They are still useful, and we’re still learning how they work and how they form semantic representations. It’s a fascinating area.

One quote, on why FC doesn’t think LLMs will have much impact on science:

What we have seen is that LLMs are very good at turning people who have no skill into people who are capable of an average, mediocre outcome. They are extremely bad at helping someone who’s already extremely good get better. It basically doesn’t work.

And there are many reasons why, but empirically, this is what you see. So this is why I don’t think LLMs are going to have much impact in science. Science is not about more mediocre papers.