Written in response to (@ [link] https://www.lesswrong.com/posts/qvgEbZDcxwTSEBdwD/deconfusing-ai-and-evolution?commentId=cAEFXWomnaxTB47if).
> No one can be sure what substrate FAAI will actually run on,
> or how it will be configured,
> or where it will run.
> Does this uncertainty affect the generalization of these arguments?
No.
The only requirement is that the FAAI substrate is "artificial",
which, in this case, is merely to stipulate
that the FAAI has a different metabolic pathway/process
than that of carbon-based biological mammals.
This is a very low bar for the applicability of the substrate needs argument.
Hence, the uncertainty mentioned is of no relevance.
> Is it required that the FAAI must be self-modifying?
Yes, insofar as "learning" at all is strictly
a form of 'self-modification',
and insofar as "learning", as "adaptation to changes in the environment",
is strictly required for FAAI persistence, capability persistence,
persistence in its ability to increase its capability, etc, etc.
> Does the substrate needs argument require any assumption
> that FAAI will take a specific form or configuration,
> or use traditional hardware?
No.
The substrate needs argument only assumes 'artificiality'
and some sufficient notion of 'intelligence'
and some reasonably long/wide interval of 'deployment'
for the notion of 'alignment' to be meaningful.
> Does the substrate needs argument require any assumption
> that a/the FAAI will directly share our environment?
No, but it does necessarily _include_ a consideration of the case
in which there is at least some type of interaction with our environment.
To assume otherwise, that the FAAI system has no intersection
or interaction with our environment, in any way at all,
is to make an assumption that strains credulity.
As if starting an argument by saying "assume that there are
nearly all-powerful aliens living on some distant planet
at the other end of the universe (or in some other dimension)
that have mastered all learning about the physical universe...".
If they are not here, and have no influence on us,
or on our ecosystem, then why bother talking about them?
Arguing over boundary cases (that it lives entirely on the moon, or in orbit)
is merely to argue over the 'degree' of interaction,
and to try to make that assumption conditionally relevant
as a means of dismissing the principal syllogism.
> If a highly intelligent and long-lived entity of any substrate
> wants to resist value drift for one billion years
> (for terminal or instrumental reasons),
> is there any physically possible way for it to do so?
If the "meaning" of "one billion years" is "consistent"
with "ordinary assumptions", then in the strict sense,
the answer is just "no".
However, you are welcome to quibble about the interpretation of time,
by doing something like "freeze the entity in suspended animation"
for nearly all of a billion years or so
and thus get 'perfect resistance' to all forms of value drift.
That is hardly an interesting case, though.
> Under what conditions should we expect drift
> to inevitably occur (regardless of substrate)?
There are a number of specific conditions:
1; there is a significant number of cycles
2; of eventual complete self-replacement, where
3; the design/implementation of the self-replacement aspects
needs to be adaptive to
4; inherently unknowable/unpredictable aspects of the environment
in which 5; the FAAI operates,
and from which 6; the FAAI must draw its substance and energy,
such that the FAAI is contingent
on the implementation of its adaptations
so as to be able to continue to so operate
(a toy sketch of this dynamic follows below).
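As a toy sketch of how those conditions combine
(purely my own illustrative construction, with made-up parameters,
not a model taken from the substrate needs argument itself):
when the 'values' must be re-implemented on every complete
self-replacement cycle, and the only filter on that re-implementation
is continued operation in an unpredictable environment,
then small per-cycle variation accumulates into unbounded drift.

```python
import random

random.seed(0)

value = 1.0          # stand-in for the originally specified 'values' (arbitrary units)
COPY_NOISE = 0.01    # assumed small, unavoidable variation per complete self-replacement
CYCLES = 100_000     # condition 1: a significant number of cycles

for _ in range(CYCLES):
    # Conditions 2-4: each cycle fully re-implements the system, and that
    # re-implementation must adapt to environmental specifics which cannot
    # be fully predicted or checked in advance.
    candidate = value + random.gauss(0.0, COPY_NOISE)
    # Conditions 5-6: what persists is whatever keeps operating and keeps
    # drawing substance/energy; nothing in that filter references the
    # original value specification.
    value = candidate

print(f"drift after {CYCLES} cycles: {abs(value - 1.0):.3f}")
# Drift grows on the order of COPY_NOISE * sqrt(CYCLES),
# ie; it is unbounded over a sufficient number of cycles.
```

The point of the sketch is only that per-cycle fidelity
does not translate into fidelity over many cycles,
because the persistence filter never references the original specification.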
> If value drift is strictly inevitable...
...which it is under any conditions that actually
matter, ie; under the conditions that the substrate needs argument
is being asked to answer for...
> ...what might an aligned FAAI conclude
> is the best course of action?
Even if such a thing could be predicted (speculated about),
which it inherently cannot be,
why would this even matter?
> If we were to assume that each FAAI fully self destructs
> on some interval, and humanity then rebuilds them, each time,
> does that solve the substrate needs alignment problem?
Not really.
This just makes humanity a part of the substrate evolution process.
> Is the "humanity completely re-designed and re-makes the FAAI
> over and over again" a lower risk
> than running the same evolving FAII substrate forever?
No, not really.
> Is value drift strictly inevitable?
For any meaningful notion of time, yes.
> What might a misaligned FAAI philosophically conclude about itself?
Regardless of what we might speculate
about what the FAAI would conclude about itself,
or even what some future FAAI might actually conclude about itself,
the real question remains "does it matter?".
Part of what makes the substrate needs argument interesting
is that it does not depend in any way on such speculation
to reach its conclusions.
In this way, it is actually a very different argument, overall,
than anything that is like the 'instrumental convergence' arguments.
> Do individual humans even want to prevent their own value drift?
Usually, no.
> Is incoherence strictly inevitable?
That is a bit like asking "is entropy inevitable?".
Most closed-form theorists will say that it is,
based strictly on the 2nd law of thermodynamics alone.
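(For reference, and only as the standard textbook statement
rather than anything specific to this argument:
for an isolated system, the 2nd law says that entropy is non-decreasing,
$\Delta S \ge 0$,
with equality only in the idealized reversible limit.)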
> Given that enforcing some predefined reliable constraints
> will inherently limit a system's capacity to adapt
> to the complex nonlinear contexts of the world,
> at what point do these limits constrain a FAAI
> to such an extent
> that the FAAI cannot maintain full autonomy
> indefinitely?
The FAAI will fail to maintain full autonomy
(and thus will be unable to maintain itself, and continue, etc)
at that exact moment
in which those predefined 'alignment' constraints
act to prevent the FAAI from learning
what it needs to learn in order to adapt to
the (unpredictably) changed world/environment
in which the FAAI "lives" (operates, etc).
Over a long enough interval of interior and exterior changes,
the nature of the changes in that environment
will eventually exceed the artificially constrained learning envelope
of the 'aligned' (constrained) FAAI relative to the 'unaligned' FAAI,
and so eventually the learning/adaptive unconstrained FAAI will
be all that remains.
Hence, the mere existence of such limits alone
implies that it is merely a matter of time,
sometime in the long term,
until those limits end up being decisive.
> If the FAAI is designed in such a way
> as to be both 'fully self correcting over all scales'
> and also 'fully flexible in its functioning in a complete way',
> would that not allow it to achieve near-maximally-useful capability
> while containing enough redundancy to strongly resist drift?
No.
It is a false assumption
that the two requirements of
1; 'be fully self correcting over all scales'
and also
2; 'be fully and completely flexible in its functioning'
can co-exist;
they are inherently contrary requirements --
such that the perfection of either one
requires something of an omission (incompleteness) in the other.
Note that "Learning" helps us to close the gap with heuristics
so that things can be "maybe good enough in the short term",
but that is not the same as "alignment in the long term".
Basically, the very action of making it 'good enough'
in the short term is the very thing that makes it impossible
to have 'adequate alignment' 'in the long term',
for any senses of 'adequate', 'alignment', and 'long term'
that actually matter.
> What if the FAAI does not contain just one genotype,
> but all genotypes stored across all its hardware parts?
> What if each part holographically keeps a copy of all others,
> and all copies are constantly checked to prevent errors?
Even really good redundancy schemas do not help.
How does this system incorporate learning and adaptation?
At what level of abstraction does the error correction
give way to the need for the system to change,
along with its environment,
so as to continue to persist at all,
or increase its capability to adapt, generalize, etc?
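To make that tension concrete, here is a minimal toy sketch
(my own illustrative assumption about the redundancy scheme,
not anything specified in the question or in the argument itself):
each part holds a full copy of the 'genotype',
and divergence is repaired by majority vote.
The same repair step that removes random corruption
also reverts any adaptive change
that has not yet reached a majority of the copies.

```python
COPIES = 5
genotype = ["adapt-v1"] * COPIES   # every part holds a full copy (hypothetical setup)

def repair(copies):
    """Majority-vote 'error correction': overwrite every copy with the most common value."""
    winner = max(set(copies), key=copies.count)
    return [winner] * len(copies)

# Case 1: random corruption of one copy -- redundancy works as intended.
corrupted = list(genotype)
corrupted[2] = "noise"
assert repair(corrupted) == genotype

# Case 2: an adaptive update, applied locally before it reaches a majority of copies.
adapting = list(genotype)
adapting[0] = "adapt-v2"    # the change the environment now requires
print(repair(adapting))     # ['adapt-v1', ...]: the 'correction' reverts the adaptation
```

Internal to the repair step, there is no way to distinguish
'error' from 'needed adaptation';
supplying that distinction is exactly the part
that cannot be fixed in advance for an unknowable environment.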
> What if there is only one absolute central controlling agent?
> What if no other fully autonomous systems can rival the one?
Does not matter --
the underlying substrate needs argument still applies.
The argument makes no assumptions regarding 'agent' counts/notions.
> What if the FAAI consists of many agents working cooperatively?
See above.
> This feels to me like it proves too much.
Is this an expression of an intuition, or of a limit of understanding?
> Can an FAAI have any preference that is stable over time
> that isn't also about (entangled with) its own survival?
Not over the kinds of long intervals that matter.
Of course, it is also true that over shorter intervals,
where the rate of change of the interaction
between the unpredictability of the environment
and its own substrate is 'sufficiently low' --
ie; non-interfering with its own survival/capability, etc,
then anything (any design element, intention, projection, etc)
can be considered to be "stable over time".
So then the question becomes "for how long is alignment required?".
If the argument of the trans-humanists is that "we should make FAAI now"
because "all of the trillions of all of future humanity
over all of the rest of time will depend on us doing that",
then the required interval of alignment is 'forever'.
> Is survival and its correlates the only possible coherent value set
> in our universe?
Yes, at one level of description, it remains the case
that some analogue of 'social, survival, sexual'
is 'a convergent fixed point' in the total continuum
of all possible value structures.
But that triple includes a lot that is not obvious!
> Whatever is the controller of an FAAI
> would have to either become an FAAI
> or merge with an existing FAAI.
Yes, this has been identified mathematically.
> Redundancy can act as a simple error correction mechanism...
...for the category of simple errors.
Ie; things involving finite alphabets of symbols
with known contextual constraints.
But such simple techniques do not work for 'open' symbol sets,
or for cases where the context is either partially or fully
unknown or unknowable.
How do you use "simple error correction" for the problem of
"have we learned what we were supposed to learn?".
> What if the controller was immutable (unchanging)?
Then the controller cannot account for and constrain
the new behaviors that the FAAI has learned from its environment,
behaviors which can then be used to produce unaligned actions.
Moreover, the fixed controller also cannot account for the FAAI
eventually learning completely how to circumvent that controller.
> What if the controller was itself the only FAAI?
> What if the FAAI was 'an intelligence inseparable from its alignment/goals'?
Are these alignment goals 'learned'
(and therefore inherently mutable/unstable
because 'learning' can always 'learn more')
or built-in at inception (immutable)?
If they are immutable, then the case of the 'fixed controller' applies.
If they are mutable, then the case of past learning
not being a complete constraint on future learning applies.
Regardless, the condition of 'inseparability' becomes
one of attempting to join change and changelessness,
which is itself, inherently, a contradiction.
~ ~ ~