11 Comments

So, at some point, no matter how impressive the LLM, the principle of "GIGO" (garbage in, garbage out) is always with us... Thanks for a very interesting article: I'll need to chew it over a bit before the acronyms "stick", but this stuff is conceptually fascinating (and very current).

Hi Richard,

Yep, lots of acronyms... I'm working on a pitch for a start-up venture, expanding access to LLMs for industries not directly doing "AI," and a lot of that tech-speak got lodged in that post. Anyway, thanks for your comment, and I hope you emerge from chewing over the tech-speak not too much the worse for wear....

Very good. I don't think fine-tuning is entirely out of the picture; there will be hundreds of thousands of fine-tuned models. It will be a combination of pre-training by the OpenAIs (with competition from open-source Llama etc.), expensive but important fine-tuning, and in-context learning (prompts). Maybe GOFAI could actually be a source for fine-tuning and not only for ICL.

Hi Gerben,

Interesting. I'm wondering how GOFAI would work for fine-tuning. I suppose if you take the labels in the supervised training as concepts, you're in a sense using an ontology to fine-tune the model. I'd love to hear further thoughts on this!

Yep. So not "How LLMs save GOFAI" but "how GOFAI can 'save' LLMs". Expert systems in particular could possibly be repurposed to create fine-tuning material for LLMs. Your story got me realising that this could work. The strong point of LLMs is that they are the ultimate interactive documentation (you can have a conversation with the documentation, which is very cool), but being able to converse with it unavoidably brings unpredictability and thus unreliability.
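
To make that concrete: a minimal sketch, assuming a toy rule base with illustrative field names, of how IF/THEN rules from an expert system could be flattened into instruction-style fine-tuning pairs in the chat-formatted JSONL that many fine-tuning pipelines accept.

```python
import json

# Hypothetical expert-system rules; the shape and field names here are
# assumptions for illustration, not any particular rule engine's format.
rules = [
    {"if": ["engine cranks", "engine does not start", "no fuel at injector"],
     "then": "suspect a failed fuel pump"},
    {"if": ["engine overheats", "coolant level is normal"],
     "then": "suspect a stuck thermostat"},
]

def rule_to_example(rule):
    """Flatten one IF/THEN rule into a user/assistant pair for fine-tuning."""
    observations = "; ".join(rule["if"])
    return {"messages": [
        {"role": "user", "content": f"Observed: {observations}. What should I check first?"},
        {"role": "assistant", "content": f"Given those observations, {rule['then']}."},
    ]}

# One JSON object per line: the JSONL layout most fine-tuning APIs expect.
with open("finetune_data.jsonl", "w") as f:
    for rule in rules:
        f.write(json.dumps(rule_to_example(rule)) + "\n")
```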

Definitely agree.

It's interesting, for me, to read about the way these models and systems are organized. It brings me back to my undergraduate degree in Linguistics. One could, for instance, take the programmatic elements you describe here for these transformational functions and plug them into something like Langacker's theory of Cognitive Linguistics. I'm surprised the Chomskyites aren't all over the LLMs as proof of their "transformational" approach to language description (Chomsky put out a piece in the NYT undermining ChatGPT last year: https://www.nytimes.com/2023/03/08/opinion/noam-chomsky-chatgpt-ai.html). If only the discussion centered on algorithmic efficiency!

From my perspective, many of these approaches to defining what language is and how it works are interesting and instructive failures. Nevertheless, if AI researchers spent more time trying to sync up AI systems that mostly consume language with a more viable theory of mind than {brain = computer}, what I see in contemporary discourse would be more compelling, and I'm surprised nobody has tried this approach.

How much linguistics training do AI researchers generally undergo?

Hi Jeffrey, I love this: "Nevertheless, if AI researchers spent more time trying to sync up AI systems that mostly consume language with a more viable theory of mind than {brain = computer}, what I see in contemporary discourse would be more compelling..."

There are some very good computational linguists out there, but it seems a linguistics-first approach is too "brittle" for real-world applications. There are too many ways language can branch out, garden path, and create complexity for a computer. It turned out to be a lot better to put the comp. ling in the background, as, say, the popular NLP library spaCy does, using a dependency grammar rather than the phrase structure grammar more common in linguistics proper. You might look at it this way: we need to "dumb down" linguistics to make it usable on a computer.
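
To illustrate the "comp. ling in the background" point, here is a minimal spaCy sketch (assuming the small English model has been installed): each token simply points to a head word with a labeled dependency relation, rather than being placed in a deep phrase-structure tree.

```python
# Minimal spaCy sketch; assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The ontologist fine-tuned the model on legal documents.")

# Each token points to a head with a labeled dependency relation,
# a flat, practical structure rather than a deep phrase-structure tree.
for token in doc:
    print(f"{token.text:<12} {token.dep_:<10} head: {token.head.text}")
```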

On your question, typically very little. I doubt many NLP data scientists today know much about Chomskyan ideas at all. There are exceptions to this, of course, but it seems to have been left behind as everything turned to data-driven methods and away from deep structure.

Thanks for your thoughts as always.

I paid very little attention to computational linguistics when I was studying. My impression is that they spent a lot of time compiling statistics on the incidence and correlation of certain grammatical features or lexical items in big language corpora. This type of information can be mildly interesting, but it's trivia in the larger scheme of things. However, this was 20-odd years ago. Is there more to it?

What's the difference between dependency grammar and phrase structure grammar?

In terms of grammar, my focus was more traditional, I suppose you could say. I am not a fan of Chomskyan linguistics. For English, I worked through Quirk & Greenbaum's seminal grammar, and know this material reasonably well (although it has been a long time now). Their approach is structured around the interaction of linguistic form and linguistic function across grammatical constituents at different levels of grammatical organization from word to complex sentence. As I mentioned, I was mostly interested in paralanguage and rhythm, both highly neglected topics. I'm guessing LLMs can't process this aspect of language.

I see that symbolism is still not dead. Back in the 80s I was against it, but things have now swung too far to the other extreme.

The embeddings in LLMs intuitively show that 'concepts' are present in the system. It could be argued that attention makes ontologies emerge from the lattice. They are implicit, not explicit.
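
One way to see that implicitness (a minimal sketch, assuming the sentence-transformers package and the all-MiniLM-L6-v2 encoder, though any sentence encoder would do): related terms sit close together in embedding space even though no explicit ontology was ever written down.

```python
from sentence_transformers import SentenceTransformer, util

# Embed a few terms and see which neighbours the model groups together.
model = SentenceTransformer("all-MiniLM-L6-v2")
terms = ["dog", "wolf", "poodle", "carburetor", "spark plug", "piston"]
embeddings = model.encode(terms, convert_to_tensor=True)

# Cosine similarities: animal terms cluster with animals, engine parts with
# engine parts, with no "is-a" or "part-of" relations ever stated explicitly.
sims = util.cos_sim(embeddings, embeddings)
for i, term in enumerate(terms):
    nearest = max((j for j in range(len(terms)) if j != i),
                  key=lambda j: float(sims[i][j]))
    print(f"{term:<12} is closest to {terms[nearest]}")
```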

I share the view that real intelligence needs them to be explicit.

It looks to me like recent developments are going in this direction, like Q* or the DeepMind Olympiad Geometry paper from last week. There is also ongoing research analysing the contents of trained networks.

Also, Gemini is allowed to learn online, so you can add an ontology to the system and then use it. Expect OpenAI to release something to answer that (specific, automated fine-tuning per client?).

It is moving fast!

Great comment, dpi, thanks. I worked as an ontologist and have done much "symbolic" work over the years. It always felt like we were building a car without an engine. Maybe the engine will have to be a deep neural network with attention, with the explicit representation as the rest of the car. It'll be interesting to see what happens.
