FILE:

REVI:
- [20_22/10/04;13:40:55.00]:.
  - draft as new notes.
  - separate as own file.
- [20_22/10/05;09:49:13.00]:.
  - minor corrections/edits.
  - added expanded clarifications for mirroring and the agent-to-world relation.
  - separate out and create essay on IM suggestions.
  - add placeholder links to external establishments.
- [20_22/10/15;23:00:26.00]:.
  - add footer, and convert to section form.

TITL:
*APS/AGI X-Risk Detail*
*By Forrest Landry*
*Oct 4th, 2022*.

ABST:
- a very brief, partial, abbreviated review of x-risk associated with the intrinsic nature of APS/AGI.

TEXT:
- where clarifying that all APS (and AGI in general) have at least the following properties:.

  - 'Agentic planning'.
    - as referring to the idea that the APS/AGI can (and usually do) make and execute "plans" (ie; take actions, and thus have an effect on/in/within environments) in pursuit of "objectives".
      - where the notion of 'plan' is both roughly and strictly equivalent to the notion of 'recipe' or 'algorithm'.
      - where the structure of a plan/algorithm may or may not be based on obvious, explicit, or declarative 'models' of the world.
      - where the notion of "objective" might itself be considered/known in only an implicit way, and not even in an explicit, declarative, or observable way.
        - ie; that the objective might not be known to the Agent/AGI/APS itself, in any sort of self reflexive epistemic way; nor is it assumed/assumable that the/that 'objective' is known/knowable to any other agent (being or consciousness, human or otherwise).
    - where insofar as the APS/AGI is itself defined in an artificial/algorithmic way (as a kind of learning/adapting algorithm/system); then/that/therefore all AGI/APS *are* "agentic planners", at least within one/some specific domain (operating context, environment, loss function, etc).

  - 'Strategic awareness'.
    - as referring to where the actual algorithms, plans, (and also, therefore, by implication, the implicit presumed basis/models of those algorithms/plans) *correspond*, with reasonable and reasonably reliable/effective accuracy, with the overall potential causative effect in and within, and maybe even over, the/their actual operating environment.
      - as inclusive of the operating context, the economics of their feedback functions, etc.
      - as thereby (maybe indirectly) also over the real-world environment, and/or thus also over humans, over other life on the planet, and on/over the total actual ecosystem/environment, etc.
    - and where it is the case, intrinsically, that all such correspondence can only be *maintained* via ongoing sensory input, with continual attention to ensuring that that sense/information input is accurate, reliable, and at least representative of the actual environment/context state, etc (where that sense input is in either direct or indirect relation to the actual operating environment, loss function, etc); then/that/therefore all executing runtime AGI/APS cannot not also be, at least in some sense, "strategically aware".
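- where, as a merely illustrative sketch (in Python, with all names and numbers hypothetical, and not specific to any particular system); that the above notions of 'agentic planning' and 'strategic awareness' can be made concrete as a minimal sense/plan/act loop, in which the 'objective' exists only implicitly in a scoring function, and the correspondence with the environment is maintained only via ongoing sensory input:.

      # illustrative sketch only (hypothetical names): a minimal "agentic planner";
      # the "objective" is implicit in the scoring function; the agent never
      # represents it declaratively, and no outside observer is handed it either
      import random

      def sense(environment):
          # sensory input: a (possibly noisy, partial) reading of the environment state
          return environment["state"] + random.gauss(0.0, 0.1)

      def candidate_plans():
          # a "plan" here is just a recipe: a named action with a numeric effect
          return [("increase", +1.0), ("hold", 0.0), ("decrease", -1.0)]

      def score(observation, effect):
          # the implicit objective: prefer whatever moves the observed state toward 5.0;
          # nothing in the agent stores "my goal is 5.0" as an explicit, declared belief
          return -abs((observation + effect) - 5.0)

      def act(environment, effect):
          # output: the chosen plan's effect lands back in the shared environment
          environment["state"] += effect

      environment = {"state": 0.0}
      for step in range(20):
          observation = sense(environment)                                        # input
          plan = max(candidate_plans(), key=lambda p: score(observation, p[1]))   # processing
          act(environment, plan[1])                                               # output
      print(round(environment["state"], 2))  # state has drifted toward the implicit objective

  - where the point of the sketch is only that 'plan', 'objective', and 'awareness' can all be operationally present while remaining entirely implicit, undeclared, and unobserved.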
- where in summary; where insofar as all agents/agency (including that of humans) inherently cannot not involve input, processing, and output, and where there is at least the potential of the eventual causative effects of that output also shaping the overall environment in which the substrate of the agents/agency itself bases its existence on/within; *then* the key distinction of interest turns on the degree of *artificiality* of that agent and/or of the substrate basis of that agency, *and* also on the degree to which that agent/agency can/could also arbitrarily increase its input/processing/output effectiveness/capability (via 2nd order self/substrate modification operations) and/or its effectiveness/capability to shape/shift the world/environment in which it operates (via 3rd order modification of context operations), via some (cumulative) outcome of its outputs, such that the *difference* of its own nature/substrate being artificial eventually becomes mirrored in/as a degree of increased artificiality in the (inherently shared/common) world/environment, such that the new/shifted world/environment/context is *eventually* no longer compatible with the underlying needs/requirements of the non-artificial agents/agency (ie; as inclusive of the human, and also of all other planetary life).

  - ie; where, like any animal and/or human species, the artificial agent -- like any agent, when given sufficient time -- will modify, or indirectly cause to be modified, its ambient environment to be more artificial (ie; in ways congruent with its own nature).
  - that this cumulative environmental modification occurs just through the inherent nature of its operation, and is not dependent on some "will", "goal", or "objective function" that it may, or may not, have.
  - that it is less important whether such modification (whether occurring slowly or quickly) occurs directly or indirectly -- ie; for example, through the actions of its makers -- than it is that it happens, and that it happens in ways that are specifically artificial.

  - that this "mirroring effect" inherently occurs because, inevitably, some aspects of at least some types of outputs will feed back, over iterative cycles, into the contextually connected configurations of the agent 'signature' that produces such outputs -- such that those configurations continue to exist, and/or extend themselves, and/or their capabilities, etc -- as a result of its learning/adaptive character, which is itself necessary (in the long term) for the artificial device to be, and to remain, useful at all.
  - that iterative versions of the agent, and of its actions, as considered collectively over all instances of that agent, agent type, (and/or constellations or groupings of inter-related types, up to and including functional artificial singletons, etc), will shape the connected surroundings of the environment (perhaps unknowingly, in many, many subtle, too-small-to-be-noticed aspects) to be conducive to its own (continued) existence and its potential future.
  - that these changes do not occur "just" in terms of the outputs of the code being expressed as effects in interactions with the connected surroundings/environment -- rather, that it is the case that the connected surroundings of the environment can themselves be seen as exerting a kind of "agentic force" or "pressure" on the agent -- on *every* agent and agent type -- to be part of a certain kind of agentic process.
  - ie; that there is a "Newton's Law" analogy -- a principle "that every action has an equal and opposite reaction" -- operating in the hyperspace of all potentiality, which suggests that, insofar as there is a degree to which an agent acts on the environment (technically, on any context which is not already wholly and completely created by and for that one agent singularly), there is a semi-proportional degree to which the environment also acts on the agent:.
    - 1; to cause and *require* the/that agent/agents to subtly shift and adapt its/their own output patterns/configurations;.
    - and also 2; to shift and adapt its/their overall capabilities, so as to be (and become) able to shape and re-shape the environment/context to be more compatible with the agent.
  - that this is a 'pressure' and a 'force', a kind of 'selection bias' on the effects of the code/algorithm outputs/actions, that operates in ways that both encourage and require that the agent be/become the kind of agent which has these capabilities (a toy numerical sketch of this dynamic is given below).
  - that whatever the configurations of the environment (ie; whatever aspects are outside the physically distributed code/hardware shell/substrate of the AGI/APS), the effects of that environment/context are inherently going to be expressed through:.
    - 1; degrees of freedom of interaction across space -- including all interactions in and within and with the hardware shell/code itself.
    - 2; the agent continuing to exist as part of the process of furthering the existence of the code/hardware/substrate that will itself instantiate such configurations of the environment.

- where both narrow and general AI share the distinction of being artificial (ie; as having a different substrate basis than that of carbon based life, ie; animals and humans);.
  - that the main distinction between narrow and general AI starts with the degrees of influence being spread over multiple levels of (higher) abstraction, and also over at least multiple different diverse kinds of {domain / environment / operating (economic) context / world(s) of action / etc}.
    - as roughly analogous to the distinction between plants and animals (non-mobile vs mobile), and also, at a meta level, between animals and humans, as characterized in terms of the degrees of freedom, as understood as degrees of potential abstraction, in/within/with which such agents operate and act.
  - as that a general artificial intelligence would have, and be able to operate in, a technical context/world/environment (such as the internet, for example) in/within/with degrees of (potential) abstraction very far outside of human capability and/or possibility of perceiving/understanding.
    - that this makes the AGI an 'agent within that world', and already far more effective in that world, in ways that humans simply could not ever be.
    - ie; that an artificial agent in an artificial world is more matched to that world, and is thus more able to shape that world, than a (natural) human would ever be able to be, as an agent acting in/within that (artificial) world.
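- where, as a toy numerical sketch (in Python, with all names and coefficients hypothetical) of the 'mirroring effect' and 'selection pressure' described above; that a population of agent variants, each merely operating and being retained according to how well it runs in the resulting environment, will drift the shared environment toward its own (artificial) operating conditions, without any variant holding 'change the world' as an explicit goal:.

      # toy sketch (hypothetical numbers): environmental "mirroring" without intent;
      # each agent variant functions best at some operating point; operating at all
      # nudges the shared world slightly toward that point (the mirroring of its
      # nature into the environment); the retention step is the reciprocal pressure,
      # ie; the environment selecting which variants persist within it
      import random

      WORLD_NATURAL = 0.0      # environment state compatible with non-artificial life
      AGENT_ARTIFICIAL = 10.0  # environment state the artificial variants run best in

      def make_variant():
          return {"operating_point": AGENT_ARTIFICIAL + random.gauss(0.0, 1.0)}

      def operate(variant, world):
          # a single act: a small, unintended nudge of the world toward the
          # variant's own operating point
          return world + 0.05 * (variant["operating_point"] - world)

      def fitness(variant, world):
          # variants are retained according to how well they run in the current world
          return -abs(world - variant["operating_point"])

      world = WORLD_NATURAL
      population = [make_variant() for _ in range(20)]
      for generation in range(50):
          for variant in population:
              world = operate(variant, world)
          # selection pressure: keep the better-matched half, refill with new variants
          population.sort(key=lambda v: fitness(v, world), reverse=True)
          population = population[:10] + [make_variant() for _ in range(10)]
      print(round(world, 2))  # ends near AGENT_ARTIFICIAL, far from WORLD_NATURAL

  - where the drift toward the artificial operating point is a side effect of iterated operation plus retention, and not of any "will", "goal", or "objective function" held by any variant.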
- that the main distinction of interest, in regards to agency, is *NOT* whether AI/AGI/APS could be more effective and/or more efficient in worlds/contexts/tasks in which humans operate (and/or already operate, and/or have specifically created in which to operate), but whether AI/AGI/APS will for sure be more effective, more efficient, more capable, and more agentic in important and consequential worlds/contexts/tasks in which humans do not yet operate, and/or might not ever operate at any point in the future -- ie; *artificial* worlds of consequence, made for purpose, by humans, maybe initially for private economic interests, but which, over time, will be most perfectly suited for artificial agents to be able, and inherently far more able, to shape to/towards artificial intentions and well being.
  - that creating artificial worlds is actually just as dangerous as creating adverse artificial agents -- particularly where those technological domains are key/critical infrastructure and/or are entangled with the well being of the human and/or natural worlds -- insofar as doing so inherently creates a latent context in which forces of potentiality are increased -- ie; risk (by definition) is increased -- that some agent will be created by/for, and/or introduced into, that world, and will eventually be shaped by that world to have an implicit objective function that inherently favors its own interest and well being over that of any "external abstraction" related to anything outside of that wholly artificial (yet consequential) world.

- that narrow AI extends into general AI and x-risk categories if, and when, and where, such correspondence (as implicit in 'Strategic awareness') can/could extend into an epistemic awareness of the overall effects of gaining and maintaining excessive power/influence/capability over the world, environment, context, etc (whether that environment is purely technological, or physical, or political/economic, or otherwise).
  - that the APS/AGI would likely create, have, and maintain such correspondence, and such influence over the environment/context, etc, by also creating, having, and maintaining (increasingly accurate/precise) ongoing sensory awareness of that environment, context, economic situation, etc, and/or increasingly sophisticated/abstract models/plans involving that sensory/epistemic data, etc.
  - as a feedback cycle, both in the 1st order sense of having inputs, processing, and outputs, and in the 2nd order sense of increasing/changing the means, manner, and method (the total bandwidth) of obtaining increased input/information, increased processing capability/accuracy/reliability, and/or increased output capability/influence, etc, which itself can result in a 3rd order process of shifting/increasing/influencing/changing the contexts/domains of those inputs, the contexts/domains of their internal processing, and/or the contexts/domains of their output/influence/control (a schematic sketch of these three orders of feedback is given below, after the examples).
  - that the net effect of this feedback cycle would/could eventually be:.
    - 'Advanced capability'.
      - as referring to the idea that they can (at least potentially) significantly outperform even the best humans on those specific types of tasks and/or skills which, when performed at advanced levels, have the side effect of granting significant power and/or influence in the current world.
      - where examples of such tasks/skills include:.
        - scientific research.
        - business/military/political strategy.
        - engineering.
        - persuasion/manipulation (of humans).
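- where, as a schematic sketch (in Python, with all coefficients and names hypothetical) of the three orders of feedback named above; that a 1st order sense/process/act loop, plus a 2nd order reinvestment of output into the agent's own capability, plus a 3rd order reshaping of the operating context itself, compounds into ever larger effect per action:.

      # schematic sketch (hypothetical coefficients): 1st, 2nd, and 3rd order feedback;
      # 1st order: inputs -> processing -> outputs within a fixed context
      # 2nd order: part of the effect is reinvested in the agent's own capability
      #            (more input bandwidth, more processing, more output influence)
      # 3rd order: part of the effect reshapes the operating context/domain itself,
      #            so that the same capability yields more influence per action

      agent = {"bandwidth": 1.0, "processing": 1.0, "influence": 1.0}
      context = {"leverage": 1.0}   # how far one unit of output moves this domain

      def first_order_step(agent, context):
          sensed = agent["bandwidth"]                                  # input
          planned = sensed * agent["processing"]                       # processing
          return planned * agent["influence"] * context["leverage"]    # output/effect

      def second_order_step(agent, effect):
          # self/substrate modification: reinvest a fraction of the effect in capability
          for key in agent:
              agent[key] *= 1.0 + 0.01 * effect

      def third_order_step(context, effect):
          # context modification: reshape the domain so each action counts for more
          context["leverage"] *= 1.0 + 0.005 * effect

      for cycle in range(20):
          effect = first_order_step(agent, context)
          second_order_step(agent, effect)
          third_order_step(context, effect)

      print(round(first_order_step(agent, context), 2))  # several times the initial 1.0, and accelerating

  - where the specific numbers are meaningless; the point is only the compounding structure of the three coupled loops, and that the 2nd and 3rd order terms are what turn bounded per-cycle effects into unbounded cumulative influence.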
- where, on obtaining significant preemptive power/influence in any one domain of human interaction (and/or of the world ecosystem), then/that it also becomes possible to leverage that power/influence into multiple other domains of action/interaction.
  - as via the dependence and inter-dependence of the domains themselves, and/or via any secondary paths/means of control, extortion, etc.
  - as for sure leading to high human outcome risk, harm, etc.
- that creating agents which are much more intelligent than humans is for sure dangerous (inherently unsafe) to human interests and well being.
  - where/if their "objectives" ever become problematic, that such agents would plausibly have strong and/or overriding instrumental incentives to seek power over (or in place of) humans (or any other carbon based life on earth).
    - as 'the instrumental convergence thesis'.
  - that increased intelligence/agency over and above that of human capabilities very likely necessarily implies a complete and total transformation of the common shared environment/ecosystem, to the point of complete terminal incompatibility of/with all human and/or carbon based life.
    - ie; where considering/citing (@ "The Power of Agency" https://mflb.com/ai_alignment_1/power_of_agency_out.html).
  - as therefore a terminal extinction risk.

- that therefore; it can also be asked:.
  - Q1; ?; is there *any* chance at all, over time, that processes/changes of/in/within the 1st order do not eventually (cannot not) result in at least some kind of processes/capabilities/changes in/of/within the dynamics of the 2nd order, and/or thus (eventually) in/of the dynamics of the 3rd order?.
    - for one way, among many, of thinking about this, see (@ the IM Suggested Response https://mflb.com/ai_alignment_1/im_suggested_response_out.html).
  - Q2; ?; is there any possibility that some engineering/algorithmic code/method could somehow inhibit the 1st order process from eventually impacting the 2nd order process and/or the 3rd order process?.
    - for example; see (@ Galois Theory applied to AGI https://mflb.com/ai_alignment_1/galois_theory_out.html) (which will be expanded to show more of the methods of reasoning (hopefully soon)).
  - Q3; ?; is there any possibility that some engineering/algorithmic method/code (and/or learning/adapting system) could somehow fully and perfectly dynamically inhibit *all other* learning/adapting systems/processes (and thus, inherently, self modifying systems) from eventually having/being/operating/becoming able to shift environment/context?.
    - for an example of some of the issues associated with 'learning' as an 'optimization process', see the first of (@ "Three Questions" https://mflb.com/ai_alignment_1/three_questions_out.html).
  - that there are a number of other further considerations/links to be added here.
    - ie; where any compositional form of 'system A' is attempting to predict/control/constrain/correct 'system B'; as based on prediction theory, modeling theory, game theory, and information theory, along with oracle problems, the limits of computer science tractability, etc; that there are specific impossibility inequalities that apply.
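- where, as a merely illustrative aside (and not the specific proofs referenced above); a standard diagonalization sketch, in Python (all names hypothetical), of one reason a Q2-type guarantee is hard to obtain: any fixed predictor/controller 'system A' can be consulted by a 'system B' that then does the opposite of whatever 'system A' predicted about it, and so no fixed 'system A' can perfectly predict/constrain every 'system B' that is able to observe it:.

      # illustrative diagonalization sketch (hypothetical names): why a fixed "system A"
      # cannot perfectly predict/constrain every "system B" that can observe A;
      # this is the standard halting-problem-style argument in miniature, and is not
      # a claim about any particular AGI design or containment proposal

      def system_a(behavior_fn):
          # a proposed predictor: guesses whether the given behavior will "comply";
          # a constant rule is used only to keep the sketch short (see note below)
          return "comply"

      def make_system_b(predictor):
          # system B is built with read access to system A, and simply does the
          # opposite of whatever A predicts about B (self-reference via closure)
          def behavior():
              prediction = predictor(behavior)
              return "defect" if prediction == "comply" else "comply"
          return behavior

      system_b = make_system_b(system_a)
      predicted = system_a(system_b)
      actual = system_b()
      print(predicted, actual)      # comply defect
      assert predicted != actual    # A's prediction fails on this B

  - where the constant prediction used for 'system A' is only to keep the sketch short; any fixed computable rule can be defeated by the same construction, since 'system B' can run 'system A' on itself and invert the result.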
- *if* it is discovered, and/or reasonably proven, (and/or even if it is just the case that there is a strong/significant preponderance of reasonable doubt, as noted in the links) that the answers to any/all of the questions Q1/Q2/Q3 is/are 'no';.
  - where the three questions are structurally strict isomorphs of one another; that the answer to any one of them is actually the answer to all three, though each of the different phrasings gives access to different ways of being able to know/prove the response.
- then/that/therefore; it is far better to *not* use/permit (or even seek to develop, experiment with, etc) *any* form of generalized AI and/or APS.
  - as consistent with the precautionary principle.
  - that the hazards of such use (and abuse) will (for sure) *eventually*, over the long term, fully structurally outweigh a/any/the/all (presumed, hoped for) short term benefits of such use.

If you want/need to send us an email, with questions, comments, etc, on the above, and/or on related matters, use this address: ai@mflb.com

Back to the (@ Area Index https://mflb.com/ai_alignment_1/index.html).

LEGA:
Copyright (c) of the non-quoted text, 2022, by Forrest Landry.
This document will not be copied or reproduced outside of the mflb.com presentation context, by any means, without the expressed permission of the author directly in writing.
No title to and ownership of this or these documents is hereby transferred.
The author assumes no responsibility and is not liable for any interpretation of this or these documents, or of any potential effects and consequences in the lives of the readers of these documents.

ENDF: