In response to [this comment](https://www.lesswrong.com/posts/qvgEbZDcxwTSEBdwD/deconfusing-ai-and-evolution?commentId=cAEFXWomnaxTB47if).

> No one can be sure what substrate FAAI will actually run on,
> or how it will be configured,
> or where it will run.
> Does this uncertainty affect the generalization of these arguments?

No. The only requirement is that the FAAI substrate is "artificial", which in this case merely stipulates that the FAAI has a different metabolic pathway/process than that of carbon-based biological mammals. This is a very low bar for the applicability of the substrate needs argument. Hence, the uncertainty mentioned is of no relevance.

> Is it required that the FAAI must be self-modifying?

Yes, insofar as "learning" is itself a form of 'self-modification', and "learning", as "adaptation to changes in the environment", is strictly required for FAAI persistence, for persistence of its capabilities, for persistence of its ability to increase those capabilities, and so on.

> Does the substrate needs argument require any assumption
> that FAAI will take a specific form or configuration,
> or use traditional hardware?

No. The substrate needs argument only assumes 'artificiality', some sufficient notion of 'intelligence', and some reasonably long/wide interval of 'deployment' over which the notion of 'alignment' is meaningful.

> Does the substrate needs argument require any assumption
> that a/the FAAI will directly share our environment?

No, but it does necessarily _include_ consideration of the case in which there is at least some type of interaction with our environment. To assume otherwise, that the FAAI system has no intersection or interaction with our environment in any way at all, is to make an assumption that strains credulity. It is like starting an argument with "assume that there are nearly all-powerful aliens living on some distant planet at the other end of the universe (or in some other dimension) that have mastered all learning about the physical universe...". If they are not here, and have no influence on us or our ecosystem, then why bother talking about them? Arguing over boundary cases (it lives entirely on the moon, or in orbit) is merely arguing over the 'degree' of interaction, and trying to make that assumption conditionally relevant as a means of disregarding the principal syllogism.

> If a highly intelligent and long-lived entity of any substrate
> wants to resist value drift for one billion years
> (for terminal or instrumental reasons),
> is there any physically possible way for it to do so?

If the "meaning" of "one billion years" is "consistent" with "ordinary assumptions", then in the strict sense, the answer is just "no". However, you are welcome to quibble about the interpretation of time, by doing something like "freeze the entity in suspended animation" for nearly all of a billion years or so, and thus get 'perfect resistance' to all forms of value drift. This is hardly an interesting case, however.
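For intuition about why "no" is the strict answer, here is a minimal toy sketch (an illustrative simplification of my own, not part of the substrate needs argument itself): assume each cycle of self-maintenance/self-replacement carries some small, fixed, independent probability of an uncorrected change persisting. The cycle length and per-cycle rates below are made-up placeholders.

```python
import math

def p_no_drift(eps: float, n_cycles: float) -> float:
    """Probability of zero uncorrected changes across n_cycles,
    assuming a fixed, independent per-cycle slip probability eps
    (a toy assumption; real error processes are not this simple)."""
    return math.exp(n_cycles * math.log1p(-eps))

# Hypothetical scale: one full maintenance/replacement cycle per day, for a billion years.
cycles = 365.0 * 1e9
for eps in (1e-15, 1e-12, 1e-9):
    print(f"per-cycle slip rate {eps:g}: P(no drift at all) ~ {p_no_drift(eps, cycles):.3e}")
```

The only point of the sketch is that the probability of zero drift decays exponentially in the number of cycles, so 'perfect resistance' would require forcing the per-cycle rate below roughly one over the total number of cycles, and holding it there across every future cycle.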
> Under what conditions should we expect drift
> to inevitably occur (regardless of substrate)?

There are a number of specific conditions:
1; where there is a significant number of cycles of
2; eventual complete self-replacement, where
3; there is a need for the design/implementation of the self-replacement aspects to be adaptive to
4; inherently unknowable/unpredictable aspects of the environment in which
5; the FAAI operates, and from which
6; the FAAI must draw its substance and energy,
such that the FAAI is contingent on the implementation of its adaptations so as to be able to continue to so operate.

> If value drift is strictly inevitable...

...which it is, under any conditions that actually matter for what the substrate needs argument is being asked to answer for...

> ...what might an aligned FAAI conclude
> is the best course of action?

Even if such a thing could be predicted (speculated about), which it inherently cannot be, why would this even matter?

> If we were to assume that each FAAI fully self-destructs
> on some interval, and humanity then rebuilds them, each time,
> does that solve the substrate needs alignment problem?

Not really. This just makes humanity a part of the substrate evolution process.

> Is the "humanity completely re-designs and re-makes the FAAI
> over and over again" a lower risk
> than running the same evolving FAAI substrate forever?

No, not really.

> Is value drift strictly inevitable?

For any meaningful notion of time, yes.

> What might a misaligned FAAI philosophically conclude about itself?

Regardless of what we might speculate about what the FAAI would conclude about itself, or even what some future FAAI might actually conclude about itself, the real question remains "does it matter?". Part of what makes the substrate needs argument interesting is that it does not depend in any way on such speculation to reach its conclusions. In this way, it is actually a very different argument, overall, from anything like the 'instrumental convergence' arguments.

> Do individual humans even want to prevent their own value drift?

Usually, no.

> Is incoherence strictly inevitable?

That is a bit like asking "is entropy inevitable?". Most closed-form theorists will say that it is, based strictly on the 2nd law of thermodynamics alone.

> Given that enforcing some predefined reliable constraints
> will inherently limit a system's capacity to adapt
> to the complex nonlinear contexts of the world,
> at what point do these limits constrain a FAAI
> to such an extent
> that the FAAI cannot maintain full autonomy
> indefinitely?

The FAAI will fail to maintain full autonomy (and thus will be unable to maintain itself, continue, etc) at that exact moment at which those predefined 'alignment' constraints act to prevent the FAAI from learning what it needs to learn in order to adapt to the (unpredictably) changed world/environment in which the FAAI "lives" (operates, etc). Over a long enough interval of interior and exterior changes, given that the nature of the changes in that environment will eventually exceed the artificially constrained learning envelope of the 'aligned' (constrained) FAAI relative to the 'unaligned' FAAI, eventually the learning/adaptive unconstrained FAAI will be all that remains. Hence, the mere existence of such limits inherently implies that it is only a matter of extended eventuality, sometime in the long term, before those limits become decisive.
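To make the 'constrained learning envelope' point concrete, here is a deliberately crude simulation sketch (my illustration; the random-walk environment, the clipping rule, and every parameter are assumed, not derived from the argument): one agent may track an unpredictably drifting environment without limit, while the other may only adapt within a fixed band around its initial state.

```python
import random

def simulate(steps: int = 10_000, envelope: float = 5.0, seed: int = 0) -> None:
    """Toy dynamics (assumed for illustration only): the environment drifts as a
    random walk; the 'constrained' agent may only adapt within +/- envelope of
    its starting state, while the 'unconstrained' agent tracks the environment
    without limit."""
    rng = random.Random(seed)
    env = 0.0
    first_big_lag = None
    for t in range(1, steps + 1):
        env += rng.gauss(0.0, 1.0)                        # unpredictable external change
        unconstrained = env                                # free to adapt fully
        constrained = max(-envelope, min(envelope, env))   # clipped to the permitted envelope
        if first_big_lag is None and abs(env - constrained) > envelope:
            first_big_lag = t                              # environment has left the reachable range
    print("final mismatch, unconstrained agent:", abs(env - unconstrained))
    print("final mismatch, constrained agent:  ", abs(env - constrained))
    print("constrained agent first fell badly behind at step:", first_big_lag)

simulate()
```

Once the environment wanders outside the permitted band, the constrained agent falls behind no matter how well it behaves inside the band; the gap is a property of the limit itself, not of how cleverly the limit is implemented.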
> If the FAAI is designed in such a way
> as to be both 'fully self correcting over all scales'
> and also 'fully flexible in its functioning in a complete way',
> would that not allow it to achieve near-maximally-useful capability
> while containing enough redundancy to strongly resist drift?

No. It is a false assumption that the two requirements of 1; 'be fully self correcting over all scales' and 2; 'be fully and completely flexible in its functioning' can co-exist; they are inherently contrary requirements, such that the perfection of either one requires some omission (incompleteness) in the other. Note that "learning" helps us close the gap with heuristics so that things can be "maybe good enough in the short term", but that is not the same as "alignment in the long term". Basically, the very action of making it 'good enough' in the short term is the very thing that makes it impossible to have 'adequate alignment' 'in the long term', for any senses of 'adequate', 'alignment', and 'long term' that actually matter.

> What if the FAAI does not contain just one genotype,
> but all genotypes stored across all its hardware parts?
> What if each part holographically keeps a copy of all others,
> and all copies are constantly checked to prevent errors?

Even really good redundancy schemes do not help. How does this system incorporate learning and adaptation? At what level of abstraction does the error correction give way to the need for the system to change, along with its environment, so as to continue to persist at all, or to increase its capability to adapt, generalize, etc?

> What if there is only one absolute central controlling agent?
> What if no other fully autonomous systems can rival the one?

It does not matter; the underlying substrate needs argument still applies. The argument makes no assumptions regarding 'agent' counts/notions.

> What if the FAAI consists of many agents working cooperatively?

See above.

> This feels to me like it proves too much.

Is this an expression of an intuition, or of a limit of understanding?

> Can an FAAI have any preference that is stable over time
> that isn't also about (entangled with) its own survival?

Not over the kinds of long intervals that matter. Of course, it is also true that over shorter intervals, where the rate of change of the interaction between the unpredictability of the environment and its own substrate is 'sufficiently low' (ie; non-interfering with its own survival/capability, etc), anything (any design element, intention, projection, etc) can be considered to be "stable over time". So then the question becomes "for how long is alignment required?". If the argument of the trans-humanists is that "we should make FAAI now" because "all of the trillions of all of future humanity over all of the rest of time will depend on us doing that", then the required interval of alignment is 'forever'.

> Is survival and its correlates the only possible coherent value set
> in our universe?

Yes, at one level of description, it remains the case that some analogue of 'social, survival, sexual' is 'a convergent fixed point' in the total continuum of all possible value structures. But that triple includes a lot that is not obvious!

> Whatever is the controller of an FAAI
> would have to either become an FAAI
> or merge with an existing FAAI.

Yes, this has been identified mathematically.

> Redundancy can act as a simple error correction mechanism...

...for the category of simple errors. Ie; things involving finite alphabets of symbols with known contextual constraints. But such simple techniques do not work for 'open' symbol sets, or for cases where the context is either partially or fully unknown or unknowable. How do you use "simple error correction" for the problem of "have we learned what we were supposed to learn?".
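As a concrete sketch of the 'simple errors only' point (my own toy example; the encoding, the error model, and all values are assumed): a repetition code with majority voting can reliably restore stored content after random bit flips, precisely because the alphabet and the error model are known in advance. Nothing in it can say whether the faithfully restored content was the right thing to have learned.

```python
import random
from collections import Counter

def store(bits: list[int], copies: int = 5) -> list[list[int]]:
    """Redundant storage: a simple repetition code with `copies` replicas."""
    return [list(bits) for _ in range(copies)]

def corrupt(replicas: list[list[int]], flip_prob: float, rng: random.Random) -> None:
    """The 'known, simple' error model: independent random bit flips."""
    for rep in replicas:
        for i in range(len(rep)):
            if rng.random() < flip_prob:
                rep[i] ^= 1

def recover(replicas: list[list[int]]) -> list[int]:
    """Majority vote per position restores the original, so long as fewer than
    half the copies of any given bit were flipped."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*replicas)]

rng = random.Random(1)
learned = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical 'learned content', encoded as bits
replicas = store(learned)
corrupt(replicas, flip_prob=0.05, rng=rng)
print("restored intact:", recover(replicas) == learned)
# Note: the scheme can only defend the bits as stored; whether `learned` is what
# the system was *supposed* to learn lies outside its error model entirely.
```

The same limitation applies to the holographic/redundancy proposals above: they defend a representation against a known class of corruptions, not the appropriateness of what that representation has come to encode.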
> What if the controller was immutable (unchanging)?

Then the controller cannot account for, and constrain, the new behaviors that the FAAI has learned about its environment, behaviors that can be used to create unaligned actions. Moreover, the fixed controller also cannot account for the FAAI eventually learning, completely, how to circumvent that controller.

> What if the controller was itself the only FAAI?
> What if the FAAI was 'an intelligence inseparable from its alignment/goals'?

Are these alignment goals 'learned' (and therefore inherently mutable/unstable, because 'learning' can always 'learn more'), or built in at inception (immutable)? If they are immutable, then the case of the 'fixed controller' applies. If they are mutable, then the case of past learning not being a complete constraint on future learning applies. Regardless, the condition of 'inseparability' becomes one of attempting to join change and changelessness, which is itself, inherently, a contradiction.

~ ~ ~