What Is Intelligence? (2024) (antikythera.org)
95 points by sva_ 11 hours ago | 56 comments




I feel like this needs an editor to have a chance of reaching almost anyone… there are ~100 section/chapter headings that seem to have been generated through some kind of psychedelic free association, and each section itself feels like an artistic effort to mystify the reader with references, jargon, and complex diagrams that are only loosely related to the text. And all wrapped here in a scroll-hijack that makes it even harder to read.

The effect is that it's unclear at first glance what the argument even might be, or which sections might be interesting to a reader who is not planning to read it front-to-back. And since it's apparently six hundred pages in printed form, I don't know that many will read it front-to-back either.


I got the same impression. I think I've become so cynical about these things that whenever I see something like this, I immediately assume bad faith / woo and just move on to the next article.

This discussion is not complete without a mention of Marcus Hutter’s seminal book[0] “Universal Artificial Intelligence: Sequential Decisions Based On Algorithmic Probability”. It provides many of the formalisms upon which metrics of intelligence are based. The gaps in current AI tech are pretty explainable in this context.

[0] https://www.hutter1.net/ai/uaibook.htm


If you've read the book, please elaborate and point us in the right direction, so we don't all have to do the same just to get an idea how those gaps can be explained.

I'm going to go into my own perspective of it; it is not reflective of what it discusses.

The linked multimedia article gives a narrative of intelligent systems, but Hutter and AIXI give a (noncomputable) definition of an ideal intelligent agent. The book situates the definitions in a reinforcement learning setting, but the core idea is succinctly expressed in a supervised learning setting.

The idea is this: given a dataset with yes/no labels (and no repeats in the features), and a commonsense encoding of Turing machines as binary strings, the ideal map from input to a probability distribution over labels is defined by

1. taking all Turing machines that decide the input space and agree with the labels of the training set, and

2. on a new input, outputting exactly the distribution obtained by counting the machines that accept vs. reject that input, with each machine's mass weighted by 2 to the power of minus its description length, and the weighted counts then normalized. This is of course a noncomputable algorithm.

The intuition is that if a simply-patterned function from input to output exists in the training set, then there is a simply/shortly described Turing machine that captures that function, and so that machine's opinion on the new input is given a lot of weight. But there also exist plausible, more complex patterns, and we consider them too.
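
To make the weighting concrete, here's a toy, finite sketch in Python (my own illustration, not Hutter's construction): a handful of hand-picked predicates stand in for Turing machines, each with a made-up "description length" in bits, whereas the real definition ranges over all machines and is noncomputable.

    # Toy stand-ins for Turing machines, with made-up description lengths (bits).
    def is_even(n): return n % 2 == 0
    def is_small(n): return n < 10
    def is_multiple_of_4(n): return n % 4 == 0

    hypotheses = [(is_even, 5), (is_small, 7), (is_multiple_of_4, 9)]

    train = {4: True, 8: True, 7: False}   # features -> yes/no labels

    def predict(x):
        # Keep only hypotheses that agree with every training label,
        # then weight each surviving hypothesis by 2^(-description length).
        consistent = [(h, l) for h, l in hypotheses
                      if all(h(k) == v for k, v in train.items())]
        total = sum(2.0 ** -l for _, l in consistent)
        yes = sum(2.0 ** -l for h, l in consistent if h(x))
        return yes / total

    print(predict(6))   # ~0.94: is_even (short) says yes and dominates,
                        # but is_multiple_of_4 (longer) says no and still gets mass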

What I like about this abstract definition is that it is not in reference to "human intelligence" or "animal intelligence" or some other anthropic or biological notion. Rather, you can use these ideas anytime you isolate a notion of agent from an environment/data, and want to evaluate how the agent interacts/predicts intelligently against novel input from the environment/data, under the limited input that it has. It is a precise formalization of inductive thinking / Occam's razor.

Another thing I like about this is that it gives theoretical justification for the double-descent phenomenon. It is a (noncomputable) algorithm for the best predictor, but it is defined over the largest possible hypothesis space (all Turing machines that decide the input space). It suggests that whereas prior ML methods got better results with architectures carefully designed to make bad predictors unrepresentable, it is also not idle, if you have a lot of computational resources, to use an architecture that defines an expressive hypothesis space and instead softly prioritize simpler hypotheses through the learning algorithm (regularization being an approximation of this). That allows the model to learn complex patterns in the data that you did not anticipate, if the evidence justifies it, whereas a small, biased hypothesis space cannot represent an unanticipated but significant pattern.
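
Here is a small, hedged illustration of that last point (my example, not from the book): a deliberately overparameterized polynomial basis keeps the hypothesis space expressive, and a ridge penalty merely softens it toward small-coefficient, i.e. simpler, fits.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 12)
    y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(12)   # a few noisy samples

    X = np.vander(x, 15, increasing=True)   # degree-14 basis: more parameters than data points

    def ridge_fit(lam):
        # Ridge solution (X^T X + lam*I)^(-1) X^T y; lam controls how strongly
        # the learning algorithm prefers small (simple) coefficient vectors.
        return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

    for lam in (1e-9, 1e-2):
        w = ridge_fit(lam)
        print(f"lambda={lam:g}  coefficient norm={np.linalg.norm(w):.2f}")
    # The hypothesis space never shrinks; only the soft preference for
    # simpler fits changes with lambda (the coefficient norm is non-increasing in lambda).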

Note that under this definition, you might want to handle a situation where the observations are noisy but you want to learn the underlying trend without the noise. You can adapt the definition to noisy input by, for example, accompanying each input with a distinct sequence number or random salt, then considering the marginal distribution for numbers/salts not in the training set (there are some technical issues of convergence, but the general approach is feasible); this models the noise distribution as well.


> I'm going to go into my own perspective of it; it is not reflective of what it discusses

Why not answer the question?

And looking at your paragraphs I'm still not sure I see a definition of intelligence. Unless you just mean that intelligence is something that can approximate this algorithm?


I didn’t read the book, but I’d advise people not to go into mysticism; it has brought us very little compared to the scientific method, which has powered our industrial and information revolutions.

Dive into the Mindscape podcast, investigate complex systems. Go into information theory. Look at evolution from an information theory perspective. Look at how intelligence enables (collective) modeling of likely local future states of the universe, and how that helps us thrive.

Don’t get caught in what at least I consider to be a trap: “to use your consciousness to explain your consciousness”. I think the jump is, for now, too large.

Just my 2 ct. FWIW I consider myself a cocktail philosopher. I do have a PhD in Biophysics; it means something to some, although I myself consider it of limited value.


Without actually reading the book, it appears the author asserts that a large component of human intelligence can be reproduced by AI, and perhaps the chaotic interactions that underpin human intelligence, also allow nonliving systems such as AI farms to express intelligent behavior.

What he would like people to believe is that AI is real intelligence, for some value of real.

Even without AI, computers can be programmed for a purpose, and appear to exhibit intelligence. And mechanical systems, such as the governor of a lawnmower engine, seem able to seek a goal they are set for.

What AI models have in common with human and animal learning is having a history which forms the basis for a response. For humans, our sensory motor history, with its emotional associations, is an embodied context out of which creative responses derive.

There is no attempt to recreate such learning in AI. And by missing out on embodied existence, AI can hardly be claimed to be on the same order as human or animal intelligence.

To understand the origin of human intelligence, a good starting point would be Esther Thelen's book[0], "A Dynamic Systems Approach to the Development of Cognition and Action" (also MIT Press, btw).

According to Thelen, there is no privileged component with prior knowledge of the end state of an infant's development, no genetic program that their life is executing. Instead, there is a process of trial and error that develops the associations between senses and muscular manipulation that organize complex actions like reaching.

If anything, it is with the caregivers in the family system that knowledge of an end result resides: if something isn't going right with the baby, if she is not able to breastfeed within a few days of birth (a learned behavior) or not able to roll over by herself at 9 months, they will be the ones to seek help.

In my opinion, it is the caring arts, investing time in our children's development and education, that advance us as a civilization. There is now a separate track, the advances in computers and technology, that often serves as a proxy for improving our culture and humanity: it is easier to measure and easier to allocate funds for than the squishy human culture of attentive parenting, teaching and caregiving.

[0] https://www.amazon.com/Approach-Development-Cognition-Cognit...


I have no problem with using the word intelligence to describe human-made systems, since the attribute artificial preserves the essential distinction. These systems inhabit the second-order world of human-created symbols and representations, they are not, and never will be, beings in the real world. Even when they inevitably will be enhanced to learn from their interactions and equipped with super-human sensors and robotic arms. What they won't have is the millions of years of evolution, of continuous striving for self-preservation and self-expansion which shaped the consciousness of living organisms. What they won't ever have is a will to be. Even if we program them to seek to persist and perpetuate themselves, it will not be their will, but the will of whoever programmed them thus.

Would you say someone suffering from locked-in syndrome is of a different order of intelligence due to their no longer having a fully embodied experience?

Not parent, but I would say their experience, even though severely impaired in many areas, is still infinitely more embodied than any human artifact is or even conceivably could be. Simply because the millions of years of embodied evolution which have shaped them into who they are and because of the unimpaired embodiment of most of the cells that make up their organism.

Most animals, like cows and turtles, are born with many skills, including walking. So some skills do exist as a genetic program.

This looks like it might be an interesting read, but I just read the Chapter "Are Feelings Real?" (because it is a subject of personal interest of mine that I've studied a lot) and I found it to be very unsatisfactory, not really addressing the question at all, but sidestepping it. Which makes me wonder if the whole thing is really worth reading.

You might enjoy chapter 3, 'Origin of Emotion' of a book entitled 'A Brief History of Intelligence' by Max Bennett. Although you need to read the two chapters leading up to it, realistically.

This book lines up with a lot of what I've been thinking: the centrality of prediction, how intelligence needs distributed social structure, language as compression, why isolated systems can't crack general intelligence.

But there are real splits on substrate dependence and what actually drives the system. Can you get intelligence from pure prediction, or does it need the pressure of real consequences? And deeper: can it emerge from computational principles alone, or does it require specific environmental embeddedness?

My sense is that execution cost drives everything. You have to pay back what you spend, which forces learning and competent action. In biological or social systems you're also supporting the next generation of agents, so intelligence becomes efficient search because there's economic pressure all the way down. The social bootstrapping isn't decorative, it's structural.

I also posted yesterday a related post on HN

> What the Dumpster Teaches: https://news.ycombinator.com/item?id=45698854


By that logic, wouldn't the electric kettle heating water for the coffee be intelligent? Had it not measured heat when activated, it wouldn't know how to stop and the man would have thrown it away or at least stopped paying for the kettle's electricity.

I think we need a meta layer: the ability to reason over one's own goals (this does not contradict the environment creating hard constraints). The man has it. The machine may have it (notably, a paperclip maximizer will not count under this criterion). The crow does not.


The kettle costs money to make and use, so it needs to pay back its costs to continue being made; it has to be useful.

You could say that, yes, that kettle is intelligent, or smart, as in smart watch. But the intelligence in question clearly derives from the human who designed that kettle. Which is why we describe it as artificial.

Similarly, a machine could emulate meta-cognition, but it would in effect only be a reflection and embodiment of certain meta-cognitive processes originally instantiated in the mind which created that machine.


Don't "real" consequences apply for setting weights? There's an actual monetary cost to train these models, and they have to actually perform to keep getting trained. Sure it's VC spend right now and not like, biological reproduction driving the incentives ultimately, but it's not outside the same structure.

Depending on the time horizon the predictions change. So we get layers - what is going to happen in the next hour/tomorrow/next year/next 10 years/next 100 etc (and layers of compression of which language is just one) and that naturally produces contradictions which creates bounds on "intelligence".

It really is a stupid system. No one rational wants to hear that, just like no one religious wants to hear contradictions in their stories, or no one who plays chess wants to hear it's a stupid game. The only thing that can be said about chimp intelligence is that it has developed a hatred of contradictions/unpredictability and lack of control unseen in trees, frogs, ants and microbes.

Stories become central to surviving such underlying machinery. Part of the story we tell is: no, no, we don't all have to be Kant or Einstein, because we just absorb what they uncovered. So apparently the group or social structure matters. Which is another layer of pure hallucination. All social structures, if they increase the prediction horizon, also generate and expose themselves to more prediction errors and contradictions, not fewer.

So again, coherence at the group level is produced through story: religion will save us, the law will save us, Trump will save us, the Jedi will save us, AI will save us, etc. We then build walls and armies to protect ourselves from each other's stories. Microbes don't do this. They do the opposite, and have produced the Krebs cycle, photosynthesis, CRISPR, etc. No intelligence. No organization.

Our intelligences are just bubbling cauldrons at the individual and social level through which info passes and mutates. Info that survives is info that can survive that machinery. And as info explodes, the coherence-stabilization process is overrun. Stories have to be written faster than stories can be written.

So Donald Trump is president. A product of "intelligence" and social "intelligence". Meanwhile more microbes exist than stars in the universe. No Trump or ICE or Church or data center is required to keep them alive.

If we are going to tell a story about intelligence, look to Pixar or WWE. Don't ask anyone at MIT what they think about it.


The MIT vs. WWE contrast feels like a false dichotomy. MIT represents systematic, externalized intelligence (structured, formal, reductive, predictive). WWE or Pixar represent narrative and emotional intelligence. We do need both.

Also, evolution is the original information-processing engine, and humans still run on it just like microbes. The difference is just the clock speed. Our intelligence, though chaotic and unstable, operates on radically faster time and complexity scales. It's an accelerator that runs in days and months instead of generations. The instability isn’t a flaw: it’s the turbulence of much faster adaptation.


emotional intelligence has been debunked

It’s hard not to see consciousness (whatever that actually is) lurking under all this you just explained. If it’s emergent, the substrate wars might just be detail; if it’s not, maybe silicon never gets a soul.

Intelligence is whatever we consider ourselves capable of. It turns out that computers are increasingly able to do whatever we can do. Maybe the only thing we can do is advanced pattern matching, but we didn't think of our intelligence that way before.

Humans seem to be able to invent interesting questions about the unknown and then figure out how to try techniques to answer those questions and then systematically attack those questions. This is why LLMs generally can’t do unsupervised research or novel high level engineering by themselves. They’re getting closer and closer in some ways and in others they remain quite lacking.

The other thing is their inability to intelligently forget and their inability to correctly manage their own context by building their own tools (some of which is labs intentionally crippling how they build AI to avoid an AI escape).

I don’t think there’s anything novel in human intelligence, as a good chunk of it does appear in more primitive forms in other animals (primates, elephants, dolphins, cephalopods). But generally our intelligence is on hyperdrive because we also have the added physical abilities of written language and tool building.


> Intelligence is whatever we consider ourselves capable of

Then, what is what we are incapable of? Magic? ;-)

> Maybe the only thing we can do is advanced pattern matching

Pattern matching as a way to support the excellent heuristic "correlation is likely causation", yes. This is what allows us to analyze systems, what brings us from "something thrown away will eventually fall to the ground" to the theory of relativity.

Intelligence is understanding, and understanding comes from hacking systems in order to use them to our advantage - or just observe systems being broken or being built.

By doing that, we acquire more knowledge about the relationships and entities within the system, which in turn allows more advanced hacking. We probably started with fire, wolves, wheat, flint; and now we are considering going to Mars.


This reminds me of The Blind Men and the Elephant.

> It has come as a shock to some AI researchers that a large neural net that predicts next words seems to produce a system with general intelligence

When I write prompts, I've stopped thinking of LLMs as just predicting a next word, and instead think of them as a logical model built up by combining the logic of all the text they've seen. I think of the LLM as knowing that cats don't lay eggs, and when I ask it to finish the sentence "cats lay ...", it won't generate the word eggs even though eggs probably comes after lay frequently.


  > It won't generate the word eggs even though eggs probably comes after lay frequently
Even a simple N-gram model won't predict "eggs". You're misunderstanding by oversimplifying.

Next token prediction is still context based. It does not depend on only the previous token, but on the previous (N-1) tokens. You have "cat" so you should get words like "down" instead of "eggs" with even a 3-gram (trigram) model.
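
A toy counting model makes this concrete (my example, purely illustrative):

    from collections import Counter, defaultdict

    # Tiny toy corpus; a real model would be trained on far more text.
    corpus = "cats lay down . cats lay around . birds lay eggs . hens lay eggs".split()

    bigram = defaultdict(Counter)    # next word given the previous word
    trigram = defaultdict(Counter)   # next word given the previous two words

    for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
        bigram[b][c] += 1
        trigram[(a, b)][c] += 1

    def dist(counter):
        total = sum(counter.values())
        return {w: n / total for w, n in counter.items()}

    print(dist(bigram["lay"]))             # "eggs" gets half the mass after "lay"
    print(dist(trigram[("cats", "lay")]))  # "eggs" never follows "cats lay" here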


No, your original understanding was the more correct one. There is absolutely zero logic to be found inside an LLM, other than coincidentally.

What you are seeing is a semi-randomized prediction engine. It does not "know" things, it only shows you an approximation of what a completion of its system prompt and your prompt combined would look like, when extrapolated from its training corpus.

What you've mistaken for a "logical model" is simply a large amount of repeated information. To show the difference between this and logic, you need only look at something like the "seahorse emoji" case.


If anything, the seahorse emoji case is exactly the type of thing you wouldn't expect to happen if LLMs just repeated information from their training corpus. It starts producing a weird dialogue that's completely unlike its training corpus, while trying to produce an emoji it's never seen during training. Why would it try to write an emoji that's not in its training data? This is totally different than its normal response when asked to produce a non-existent emoji. Normally, it just tells you the emoji doesn't exist.

So what is it repeating?

It's not enough to just point to an instance of LLMs producing weird or dumb output. You need to show how it fits with your theory that they're "just repeating information". This is like pointing out one of the millions of times a person has said something weird, dumb, or nonsensical and claiming it proves humans can't think and can only repeat information.


No, their revised understanding is more accurate. The model has internal representations of concepts; the seahorse emoji fails because it uses those representations and stumbles: https://vgel.me/posts/seahorse/

Word2vec can/could also do the seahorse thing. It at least seems like there's more to what humans consider a concept than a direction in a vector space model (but maybe not).

https://www.analyticsvidhya.com/blog/2021/07/word2vec-for-wo...
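
If you want to poke at the "concept as a direction" idea directly, the classic analogy test is easy to reproduce; a minimal sketch, assuming gensim and its downloadable pretrained GloVe vectors are available:

    import gensim.downloader as api

    # Downloads a small pretrained GloVe model on first use.
    vectors = api.load("glove-wiki-gigaword-50")

    # The offset king - man + woman is just a direction in the embedding space;
    # its nearest neighbors typically land near "queen".
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))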


> There is absolutely zero logic to be found inside an LLM

Surely trained neural networks could never develop circuits that implement actual logic via computational graphs...

https://transformer-circuits.pub/2025/attribution-graphs/met...


You're both using two different definitions of the word "logic". Both are correct usages, but have different contexts.

Brute-force engineering solutions to make it appear like the computer is thinking, when we have no idea how we think ourselves. This will never generate true intelligence. It executes code, then it stops; it is a tool, nothing more.

I often wonder whether neuroscience is harder on LLMs or on humans.

The whole book is available for free here: https://whatisintelligence.antikythera.org/

Thanks! We've changed the top URL to that from https://mitpress.mit.edu/9780262049955/what-is-intelligence/.

I'm so confused why a $36.95 purchase page is a Hacker News headline, especially when your link is clearly what they should have used.

Until there is a formal and accepted definitive distinction between intelligence, comprehension, memory, and action, all these opinions are just stabs in the dark. We've not defined the scene yet. We currently do not have artificial comprehension; that's what occurs, sort of, during training. The intelligence everyone claims to see is a pre-calculated idiot savant. If you knew it was all a pre-calculated domino cascade, would you still say it's intelligent?

Execute actions and cognition that pay back the cost of said actions, and support the next generation. No intelligence can appear outside social bootstrapping; it always needs someone to pay the initial costs. So the cost of execution drives a need for efficiency, which is intelligence.

Current AIs cannot comprehend on the fly, meaning that if they are presented with data outside of their training, the reply generated will be a hallucination interpolated off the training data into unknown output. Yet a person in possession of comprehension can go beyond their training, on the fly, and that is how humans learn. AIs cannot do that, which is critical.

This problem has been solved already:

Intelligence is the ability of the human body to organize its own contours in a way that corresponds to the objective contours of the world.

Evald Ilyenkov.

And yes, the mind is part of the body; thus thinking consists of an act of organization to the contours of the world.


So this would exclude anything besides the human body?

What about animals?

To me best definition of intelligence is:

It's the ability to:

- Solve problems

- Develop novel insightful ideas, patterns and conclusions. Have to add that since they might not immediately solve a problem, although they might help solve a problem down the line. Example could be a comedian coming up with a clever original story. It doesn't really "solve a problem" directly, but it's intelligent.

The more capable you are of either of the two above, the more intelligent you are. Anything that is able to do the above is intelligent at least to some extent, but how intelligent depends on how much it's able to do.


Jesus. Now I'm curious to read this. But the table of contents reminds me of the opening of Charlie Kaufman's "Adaptation".

There are lots of opinions on what intelligence is, but I notice a lot of people do not read much about it. You don't have to agree with others, but there is a reason that a precise and formal definition has been so hard to develop. People offer many simple explanations, yet if it were simple, we'd have the definition. All you end up doing is blocking yourself from learning even more.

I'll also add that a lot of people really binarize things. Although there is not a precise and formal definition, that does not mean there aren't useful ones, and ones that are being refined. Progress has been made not only in the last millennium, but in the last hundred years, and even the last decade. I'm not sure why so many are quick to be dismissive. The definition of life has issues, and people are not so passionate about saying it is just a stab in the dark. Let your passion to criticize something be proportional to your passion to learn about that subject. Complaints are easy, but complaints aren't critiques.

That said, there's a lot of work in animal intelligence and neuroscience that sheds light on the subject, especially primate intelligence. There are so many mysteries here and subtle things with surprising amounts of depth. It really is worth exploring. Frans de Waal has some fascinating books on chimps. And hey, part of what is so interesting is that you have to take a deep look at yourself and how others view you. Take, for example, you reading this text. Break it down into atomic units. You'll probably be surprised at how complicated it is. Do you have a parallel process vocalizing my words? Do you have a parallel process spawning responses or quips? What is generating those? What are the biases? Such a simple everyday thing requires some pretty sophisticated software. If you really think you could write that program, I think you're probably fooling yourself. But hey, maybe you're just more intelligent than me (or maybe less, since that too is another way to achieve the same outcome lol).


For years, I've taken the position that intelligence is best expressed as creativity - that is, the ability to come up with something that isn't predictable based on current data. Today's "artificial intelligence" analyzes words (tokens) based on an input (prompt) to come up with an output. It's predictable. It's fast. But, imho, it lacks creativity, and therefore lacks intelligence.

One example of this I often ponder is the boxing style of Muhammad Ali, specifically punching while moving backwards. Before Ali, no one punched while moving away from their opponent. All boxing data said this was a weak position, time for defense, not for punching (offense). Ali flipped it. He used to do miles of roadwork, throwing punches while running backwards to train himself on this style. People thought he was crazy, but it worked, and, imho, it was extremely creative (in the context of boxing), and therefore intelligent.

Did data exist that could've been analyzed (by an AI system) to come up with this boxing style? Perhaps. Kung Fu fighting styles have long known about using your opponent's momentum against them. However, I think that data (Kung Fu fighting styles) would've been diluted and ignored in the face of the mountains of traditional boxing style data, which all said not to punch while moving backwards.


> Today's "artificial intelligence" analyzes words (tokens) based on an input (prompt) to come up with an output. It's predictable. It's fast. But, imho, it lacks creativity ...

I would have agreed with you at the dawn of LLM emergence, but not anymore. Not because the models have improved, but because I have a better understanding and more experience now. Token prediction is what everyone cites, and it still holds true. This mechanism is usually illustrated with an observable pattern, like the question, "Are antibiotics bad for your gut?", which is the predictability you mentioned. But LLM creativity begins to emerge when we apply what I’d call "constraining creativity." You still use token prediction, but the preceding tokens introduce an unusual or unexpected context, such as subjects that don't usually appear together or a new paradoxical observation. (It's interesting that for fact-based queries, rare constraints lead to hallucinations, but here they're welcome.)

I often use the latter for fun by asking an LLM to create a stand-up sketch based on an interesting observation I noticed. The results aren’t perfect, but they combine the unpredictability of token generation under constraints (funny details, in the case of the sketch) with the cultural constraints learned during training. For example, a sketch imagining doves and balconies as if they were people and real estate. The quote below from that sketch shows that there are intersecting patterns between the world of human real estate and the world of birds, but mixed in a humorous way.

    "You want to buy this balcony? That’ll be 500 sunflower seeds down, and 5 seeds a day interest. Late payments? We send the hawk after you."

I think it depends on the complexity of the knowledge to be created. I agree with you broadly, but the danger of using your boxing analogy is that for game systems that can be sufficiently understood, AI has actually invented new strategies. TD-Gammon introduced new advances in the strategy of playing backgammon because its very strong understanding of early gameplay meant that it found some opening moves that humans didn't realize were as strong as they were.

I would argue that the only truly new things generative AI has introduced are mostly just byproducts of how the systems are built. The "AI style" of visual models, the ChatGPT authorial voice, etc., are all "new", but they are still just the result of regurgitating human created data and the novelty is an artifact of the model's competence or lack thereof.

There has not been, at least to my knowledge, a truly novel style of art, music, poetry, etc. created by an AI. All human advancements in those areas build mostly off of previous peoples' work, but there's enough of a spark of human intellect that they can still make unique advancements. All of these advancements are contingent rather than inevitable, so I'm not asking that an LLM, trained on nothing but visual art from Medieval times and before, could recreate Impressionism. But I don't think it would make anything that progresses past or diverges from Medieval and pre-Medieval art styles. I don't think an LLM with no examples of or references to anything written after 1700 would ever produce poetry that looked like Ezra Pound's writing, though it just might make its own Finnegans Wake if the temperature parameter were turned up high enough.

And how could it? It works because there's enough written data that questions and context around the questions are generally close enough to previously seen data that the minor change in the question will be matched by a commensurate change in the correct response from the ones in the data. That's all a posteriori!


Sharing the same reality-narrative as me. Behaving in a way that expresses that narrative intelligibly to me.

That would be something that is intelligent to you. I believe the author (or anyone in general) should be focused on mining what intelligence objectively is.

The best we will ever do is create a model of intelligence that meets some universal criterion for "good enough", but it will most certainly never be an objective definition of intelligence, since it is impossible to measure the system we exist in objectively without affecting the system itself. We will only ever have "intelligence as defined by N", but not "intelligence".

Pah. Objective. Ain't no such thing.

Not the best place to ask that :)

Has there been anything written about AI "intelligence" from people well read in even the basic and foundational writings on epistemology? For example, I see a lot of people using Hume's way of thinking about how knowledge is formed without addressing Kant's fairly persuasive refutation of it in CPR and without addressing the dead end that is the resulting philosophical skepticism Hume espoused.

In this book, I see Hume cited in a misunderstanding of his thought, and Kant is only briefly mentioned for his metaphysical idealism rather than his epistemology, which is legitimately puzzling to me. Furthermore, to refer to Kant's transcendental idealism as "solipsism" is so mistaken that it's actually shocking. Transcendental idealism has nothing whatsoever to do with "solipsism" and is really just saying that we (like LLMs!) don't truly understand objects as "things in themselves" but rather form understanding of them via perceptions of them within time and space that we schematize and categorize into rational understandings of those objects.

Regarding Hume, the author brings up his famous is/ought dichotomy and misrepresents it as Hume neatly separating statements and "preferring" descriptive ones. We now talk more about the fact-value distinction, because this is not about moral judgments but rather descriptive vs. prescriptive statements, but I'll ignore that because the two are so often combined. The author then comes to Hume's exact conclusion, but thinks he is refuting Hume when he says:

>While intuitive, the is/ought dichotomy falls apart when we realize that models are not just inert matrices of numbers or Platonic ideas floating around in a sterile universe. Models are functions computed by living beings; they arguably define living beings. As such, they are always purposive, inherent to an active observer. Observers are not disinterested parties. Every “is” has an ineradicable “oughtness” about it.

The author has also just restated a form of transcendental idealism right before dismissing Kant's transcendental idealism (and that of the very rigorously articulated "more recent postmodern philosophers and critical theorists")! He is able to deftly, if unconvincingly, hand-wave it with:

>We can mostly agree on a shared or “objective” reality because we all live in the same universe. Within-species, our umwelten, and thus our models—especially of the more physical aspects of the world around us—are all virtually identical, statistically speaking. Merely by being alive and interacting with one another, we (mostly) agree to agree.

I think this bit of structuralism is where the actual solipsism is happening. Humanity's rational comprehension of the world is actually very contingent. An example of this is the studies that were done by Alexander Luria on remote peasant cultures and their capacity for hypothetical reasoning and logic in general. They turned out to be very different from "our models" [1]. But, even closer to home, I share the same town as people who believe in reiki healing to the extent that they are willing to pay for it.

But, more to the point, he has also simply rediscovered Hume's idea, which I will quote:

>In every system of morality, which I have hitherto met with, I have always remarked, that the author proceeds for some time in the ordinary way of reasoning, and establishes the being of a God, or makes observations concerning human affairs; when of a sudden I am surprised to find, that instead of the usual copulations of propositions, is, and is not, I meet with no proposition that is not connected with an ought, or an ought not.

Emphasis mine. Hume's point was that he thought descriptive statements always carry a prescriptive one hidden in their premise, and so that, in practice, "is" statements are always just "ought" statements.

Had the author engaged more actively with Hume's writing, he would have come across Hume's fork, related to this is-ought problem, and eventually settled on (what I believe to be) a much more important epistemological problem with regards to generative AI: the possibility of synthetic a priori knowledge. Kant provided a compelling argument in favor of the possibility of synthetic a priori knowledge, but I would argue that it does not apply to machines, as machines can "know" things only by reproducing the data they are trained with and lack the various methods of apperception needed to schematize knowledge due to a variety of reasons, but "time" being the foremost. LLMs don't have a concept of "time"; every inference they make is independent, and transformers are just a great way to link them together into sequences.

I should point out that I'm not a complete AI skeptic. I think that it could be possible to have some hypothetical model that would simply use gen AI as its sensory layer and combine that with a reasoning component that makes logical inferences that more resemble the categories that Kant described being used to generate synthetic a priori knowledge. Such a machine would be capable of producing true new information rather than simply sampling an admittedly massive approximation of the joint probability of semiotics (be it tokens or images) and hoping that the approximation is well constructed enough to interpolate the right answer out. I would personally argue that the latter "knowledge", when correct, is nothing more than persuasive Gettier cases.

Overall, I'm not very impressed with the author's treatment of these thinkers. Some of the other stuff looks interesting, but I worry it would be a Gell-Mann amnesia effect to be too credulous, given that I have done quite a bit of primary-source study of 19th-century epistemology as a basis for my other study of newer writing in that area. The author's background is in physics and engineering, so I have a slight suspicion (since he used Hume's thought related to moral judgments rather than knowledge) that these are hazily remembered subjects from a rigorous ethics course he took at Princeton, but that is purely speculative on my part. I think he has reached a bit too far here.

1: https://languagelog.ldc.upenn.edu/nll/?p=481 (I am referring mostly to the section in blue here)



