Discussion about this post

Jon Aarbakke

What do you make of the LLM's ability to carry out this complex reasoning, with double negations and whatnot, and still fail at simple cases that require a model of the world?

Do you - or anyone - have a theory of how concepts are treated inside the LLM, where they "live", as it were?

I imagine there is a certain amount of smoke and mirrors here, too, in the sense that there is human input in the mix in the form of fine-tuning, where the LLM has been trained on the specific case you are putting to it. But the question remains: how does it do it?

I have seen most of the 3Blue1Brown videos, and others on the topic.

In one of the videos, 3Blue1Brown (I think) speculates that long vectors (embeddings) that are (somehow) not entirely orthogonal can store or represent a massive number of different concepts, orders of magnitude more than if they were strictly orthogonal.
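
The near-orthogonality idea can be checked numerically on a small scale. The following is only a rough sketch, not anything taken from the video: it assumes random Gaussian directions stand in for "concepts", and the names `random_unit_vector` and `max_abs_cosine`, as well as the sizes 128 and 256, are made up for illustration. A 128-dimensional space holds at most 128 exactly orthogonal vectors, yet twice that many random unit vectors still end up fairly close to orthogonal to one another.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction in `dim` dimensions (hypothetical helper)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def max_abs_cosine(vectors):
    """Largest |cosine similarity| over all pairs of unit vectors."""
    worst = 0.0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            # For unit vectors the dot product is the cosine of the angle.
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            worst = max(worst, abs(dot))
    return worst


rng = random.Random(0)   # fixed seed so the run is repeatable
dim = 128                # illustrative embedding size (assumption)
count = 256              # twice as many vectors as dimensions

vectors = [random_unit_vector(dim, rng) for _ in range(count)]
worst = max_abs_cosine(vectors)
print(f"{count} vectors in {dim} dimensions, largest |cosine| = {worst:.3f}")
```

The largest pairwise cosine stays well below 1, so no two of these directions come close to coinciding even though there are more of them than dimensions. As the dimension grows, the number of directions that fit under a given overlap tolerance grows very quickly, which is the intuition behind the claim above.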
