11 Comments

So, at some point, no matter how impressive the LLM, the principle of "GIGO" (garbage in, garbage out) is always with us... Thanks for a very interesting article: I'll need to chew it over a bit before the acronyms "stick", but this stuff is conceptually fascinating (and very current).

Hi Richard,

Yep, lots of acronyms... I'm working on a pitch for a start-up venture, expanding access to LLMs for industries not directly doing "AI," and a lot of that tech-speak got lodged in that post. Anyway, thanks for your comment, and I hope you emerge from chewing over the tech-speak not too much the worse for wear....

Very good. I don't think fine-tuning is entirely out of the picture; there will be hundreds of thousands of fine-tuned models. It will be a combination of pre-training by the OpenAIs (with competition from open-source Llama etc.), expensive but important fine-tuning, and in-context learning (prompts). Maybe GOFAI could actually be a source for fine-tuning and not only for ICL.

Hi Gerben,

Interesting. I'm wondering how GOFAI would work for fine-tuning. I suppose if you take the labels in the supervised training as concepts, you're in a sense using an ontology to fine-tune the model. I'd love to hear further thoughts on this!

Yep. So not "How LLMs save GOFAI" but "how GOFAI can 'save' LLMs". Expert systems in particular could possibly be repurposed to create fine-tuning material for LLMs. Your story got me realising that this could work. The strong point of LLMs is that they are the ultimate interactive documentation (you can have a conversation with the documentation, which is very cool), but being able to converse with it unavoidably brings unpredictability and thus unreliability.
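
To make that concrete: a minimal sketch, assuming a toy rule base with illustrative field names, of how IF/THEN rules from an expert system could be flattened into instruction-style fine-tuning pairs in the chat-formatted JSONL that many fine-tuning pipelines accept.

```python
import json

# Hypothetical expert-system rules; the shape and field names here are
# assumptions for illustration, not any particular rule engine's format.
rules = [
    {"if": ["engine cranks", "engine does not start", "no fuel at injector"],
     "then": "suspect a failed fuel pump"},
    {"if": ["engine overheats", "coolant level is normal"],
     "then": "suspect a stuck thermostat"},
]

def rule_to_example(rule):
    """Flatten one IF/THEN rule into a user/assistant pair for fine-tuning."""
    observations = "; ".join(rule["if"])
    return {"messages": [
        {"role": "user", "content": f"Observed: {observations}. What should I check first?"},
        {"role": "assistant", "content": f"Given those observations, {rule['then']}."},
    ]}

# One JSON object per line: the JSONL layout most fine-tuning APIs expect.
with open("finetune_data.jsonl", "w") as f:
    for rule in rules:
        f.write(json.dumps(rule_to_example(rule)) + "\n")
```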

Definitely agree.

It's interesting, for me, to read about the way these models and systems are organized. It brings me back to my undergraduate degree in Linguistics. One could, for instance, take the programmatic elements you describe here for these transformational functions and plug them into something like Langacker's theory of Cognitive Linguistics. I'm surprised the Chomskyites aren't all over the LLMs as proof of their "transformational" approach to language description (Chomsky put out a piece in the NYT undermining ChatGPT last year: https://www.nytimes.com/2023/03/08/opinion/noam-chomsky-chatgpt-ai.html). If only the discussion centered on algorithmic efficiency!

From my perspective, many of these approaches to defining what language is and how it works are interesting and instructive failures. Nevertheless, if AI researchers spent more time trying to sync up AI systems that mostly consume language with a more viable theory of mind than {brain = computer}, what I see in contemporary discourse would be more compelling, and I'm surprised nobody has tried this approach.

How much linguistics training do AI researchers generally undergo?

Hi Jeffrey, I love this: "Nevertheless, if AI researchers spent more time trying to sync up AI systems that mostly consume language with a more viable theory of mind than {brain = computer}, what I see in contemporary discourse would be more compelling..."

There are some very good computational linguists out there, but it seems a linguistics-first approach is too "brittle" for real-world applications. There are too many ways language can branch out, garden path, and create complexity for a computer. It turned out to be a lot better to put the comp. ling in the background, as, say, the popular NLP library spaCy does, using a dependency grammar rather than the phrase structure grammar more common in linguistics proper. You might look at it this way: we need to "dumb down" linguistics to make it usable on a computer.
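
To illustrate the "comp. ling in the background" point, here is a minimal spaCy sketch (assuming the small English model has been installed): each token simply points to a head word with a labeled dependency relation, rather than being placed in a deep phrase-structure tree.

```python
# Minimal spaCy sketch; assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The ontologist fine-tuned the model on legal documents.")

# Each token points to a head with a labeled dependency relation,
# a flat, practical structure rather than a deep phrase-structure tree.
for token in doc:
    print(f"{token.text:<12} {token.dep_:<10} head: {token.head.text}")
```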

On your question, typically very little. I doubt many NLP data scientists today know much about Chomskyan ideas at all. There are exceptions to this, of course, but it seems to have been left behind as everything turned to data-driven methods and away from deep structure.

Thanks for your thoughts as always.

I paid very little attention to computational linguistics when I was studying. My impression is that they spent a lot of time compiling statistics on the incidence and correlation of certain grammatical features or lexical items in big language corpora. This type of information can be mildly interesting, but it's trivia in the larger scheme of things. However, this was 20-odd years ago. Is there more to it?

What's the difference between dependency grammar and phrase structure grammar?

In terms of grammar, my focus was more traditional, I suppose you could say. I am not a fan of Chomskyan linguistics. For English, I worked through Quirk & Greenbaum's seminal grammar, and know this material reasonably well (although it has been a long time now). Their approach is structured around the interaction of linguistic form and linguistic function across grammatical constituents at different levels of grammatical organization from word to complex sentence. As I mentioned, I was mostly interested in paralanguage and rhythm, both highly neglected topics. I'm guessing LLMs can't process this aspect of language.

I see that symbolism is still not dead. Back in the 80s I was against it, but things have now swung too far to the other extreme.

The embeddings in LLMs intuitively show that 'concepts' are present in the system. It could be argued that attention makes ontologies emerge from the lattice. They are implicit, not explicit.
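
One way to see that implicitness (a minimal sketch, assuming the sentence-transformers package and the all-MiniLM-L6-v2 encoder, though any sentence encoder would do): related terms sit close together in embedding space even though no explicit ontology was ever written down.

```python
from sentence_transformers import SentenceTransformer, util

# Embed a few terms and see which neighbours the model groups together.
model = SentenceTransformer("all-MiniLM-L6-v2")
terms = ["dog", "wolf", "poodle", "carburetor", "spark plug", "piston"]
embeddings = model.encode(terms, convert_to_tensor=True)

# Cosine similarities: animal terms cluster with animals, engine parts with
# engine parts, with no "is-a" or "part-of" relations ever stated explicitly.
sims = util.cos_sim(embeddings, embeddings)
for i, term in enumerate(terms):
    nearest = max((j for j in range(len(terms)) if j != i),
                  key=lambda j: float(sims[i][j]))
    print(f"{term:<12} is closest to {terms[nearest]}")
```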

I share the view that real intelligence needs them to be explicit.

It looks to me like recent developments are going in this direction, like Q* or the DeepMind Olympiad Geometry paper from last week. There is also ongoing research analysing the contents of trained networks.

Also, Gemini is allowed to learn online, so you can add an ontology to the system and then use it. Expect OpenAI to release something to answer that (specific, automated fine-tuning per client?).

It is moving fast!

Great comment, dpi, thanks. I worked as an ontologist and have done much "symbolic" work over the years. It always felt like we were building a car without an engine. Maybe the engine will have to be a deep neural network with attention, with the explicit representation as the rest of the car. It'll be interesting to see what happens.
