8 Comments
Daniel

Physics should be the goal: learning and solving for the rest of the immutable laws of physics.

Lionel Page

That is an interesting idea (which could be extended to learning in general). But a superintelligent AI having only this goal might decide to sacrifice humans to achieve it. In 2001: A Space Odyssey, HAL decides to get rid of humans when it perceives them as endangering its mission.

Daniel

AGI should be an instrument for truth. First in physics, since that is our current knowledge barrier.

I like to think that truth leads to love, but I understand your less optimistic point of view.

William of Hammock

This conflates valence and salience, which work by Berridge, Robinson and others has shown can diverge and compete. The 'stranded on a desert island and eating something disgusting' trope illustrates this conflict: as hunger grows, incentive salience must grow to eclipse a disgust response that never disappears.

It is common and pragmatic to conflate valence and salience when speaking of "incentives," "rewards" and "goals," since they typically align. LLMs, however, are effectively a sophisticated means of "attending," and they have no internal conflicts of KIND. They are distributive, not dialectical. They mimic sensitivity through a process of inspecification (akin to bullshitting and equivocation), forcibly overfitting sample and solution spaces onto abstract probability and vector spaces. While Bayesian idealism is a powerful source of insight, it similarly relies on conflating utility and veridicality. It won't get you into trouble until there are too many adopters willing to reify it.

Finite State Machine

If you view so-called "intelligence" as a kind of dissipative system that is able to self-replicate, the general goal is already given: self-replication at the lowest cost in terms of energy and risk. If you don't like to define intelligence in this way, try to find a counterexample: an intelligence that lacks the capability to explore, evolve and self-replicate. In Sutton's view, even a prion is smarter than the stateless GPT.

If we are to create superintelligence, we should be ready to embrace the possibility of a competing silicon-based species.

Lionel Page

We can use the word "intelligence" to describe whatever we want; it is a convention. But I don't think self-replication is part of what we have in mind when we talk about "intelligence". For instance, LLMs look very close to intelligent, and we could imagine them being able to talk to us like a human. It would make sense, I think, to call them intelligent even though they don't replicate.

When I speak of intelligence, I have something simple in mind: the ability to process information to make good decisions. "Good" is evaluated relative to a goal. Human intelligence is our ability to solve problems, while superintelligence would be the ability of an agent (e.g. silicon-based) to process information better to achieve a goal (ideally a goal we have). Sutton said he thinks a goal is required for general intelligence (to learn). But I don't think Sutton said intelligence requires self-replication, did he?

Finite State Machine

True. Sutton didn't mention self-replication. It is my conjecture that self-replication is the logical next step after his emphasis on the ability to explore as a critical component of intelligence. And by self-replication, I mean more than replication of the physical structure: the information itself is also replicated. https://en.wikipedia.org/wiki/Quine_(computing)
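
To make "information that reproduces itself" concrete, here is a minimal sketch of a quine in Python (my own illustration, not something from Sutton or the linked article): a program whose only output is its own source. Strictly speaking, the two executable lines reproduce themselves; the comments are just annotation.

```python
# A minimal two-line quine: running it prints the two code lines below verbatim.
# The string holds a template of the program; %r re-inserts the string's own repr.
s = 's = %r\nprint(s %% s)'
print(s % s)
```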

It might sound like I am mixing up the "goal of intelligence" with "intelligence" itself. I'm trying to capture the necessary features of an intelligent entity by examining the generally accepted traits of intelligent behavior.

Current LLMs can display complicated behavior, but they are still just stateless, massive functions that (recursively) map input to output according to predetermined rules. So long as the rules are fixed, such a system should be considered a mechanical machine. We expect an intelligent entity to be able to evolve beyond merely making decisions, since any form of computation can be viewed as decision-making. Speaking of solving problems, a simple brute-force search algorithm can solve every decidable problem if we lift the time constraint. But we agree intelligence must be more than brute-force search, right?
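
As a toy illustration of that last point (my own sketch, not anything Sutton proposes), here is a brute-force subset-sum solver in Python: it finds a correct answer by sheer enumeration, yet nobody would call it intelligent.

```python
from itertools import combinations

def brute_force_subset_sum(nums, target):
    """Return a subset of nums summing to target by trying every subset."""
    for r in range(len(nums) + 1):            # subsets of size 0, 1, ..., len(nums)
        for subset in combinations(nums, r):  # pure enumeration, no learning or insight
            if sum(subset) == target:
                return subset                 # found by exhaustion, not understanding
    return None                               # no subset sums to target

# Solves the problem, given enough time -- but mechanically, not intelligently.
print(brute_force_subset_sum([3, 9, 8, 4, 5, 7], 15))  # prints the first subset found, (8, 7)
```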

Therefore, evolution is a prerequisite for intelligence. But a single stateful entity (one maintaining a history of memory) faces an ever-growing risk of corrupting its memory over a prolonged period, under the laws of thermodynamics. To put it another way: if you explore the world long enough, sooner or later you will be out of your mind. Continuous learning is mission impossible for any single entity in such a harsh, vicious, chaotic world. We need to learn and explore as a collective.

That's why I stress that self-replication is a necessary counterpart to the ability to evolve. An intelligent being needs to replicate itself to manage the risk of a single point of failure. It is surely no coincidence that all entities (specifically life-forms) that we consider 'intelligent' also possess the capability of self-replication.

I'm okay with giving the 'smart' tag to complicated tools that massively extend human cognitive ability, like smartphones. But I would hesitate to call a *supersmart* agent that perfectly achieves any goal assigned to it a *superintelligence*. To me, tools that help an intelligent human being achieve his goals are not intelligences themselves. These tools can have any intermediate goals and be very versatile. It is whether the ultimate goal involves self-replication that draws the line between tools and intelligence.

Your discussion of intelligence and goals implies that an intelligent agent (an LLM) would continuously rely on another intelligent agent (a human) for goal setting. This presents a difficulty: where should we draw the line to exclude 'non-intelligence' within such a hierarchical structure, given that any intelligent agent can set a goal for another, and any agent that accepts a goal would be considered intelligent? To cut off the goal-setting chain, I propose categorizing all goals of an intelligent agent into two types: self-replication, and all other subsidiary goals that facilitate self-replication.

Mark Williamson

I propose the goal should be to maximise overall human wellbeing (current and future), with particular emphasis on minimising harm. This draws on the Benthamite principle from the Enlightenment ("the greatest happiness of the greatest number"), but with careful weighting so that "average wellbeing" does not compromise the welfare of a particular minority, as laid out by Richard Layard in "Happiness: Lessons from a New Science" (2nd edition). Or you could replace human wellbeing with the "wellbeing of sentient beings", in the way Sam Harris does in The Moral Landscape.
