FILE:

REVI:
- [20_22/11/29;10:11:53.00]:.
  - initial argument form draft.
- [20_22/12/24;01:15:56.00]:.
  - setup for abstracted publication.

TEXT:

The 'threat model' described herein is something that is inherently and unavoidably involved in *all* AGI safety/alignment work, inclusively, with no exceptions.

That *any* and *every* AGI 'safety/alignment' process is a special case of a 'causative feedback process'.
- that there are no/zero exceptions to this truth.
- as anything and everything that is true of the use of causation as a feedback process, and/or as an 'algorithm', also as a 'modeling process', and/or of the 'signaling' inherent in any 'feedback', will for sure also be true of any AGI safety/alignment technique, method, proposal, methodology, algorithm, etc.

That *any* and *every* 'causative _feedback_ process' depends on at least *all* of the following (as also sketched in code, below, after this part of the argument):
- 1; the completeness of the sense input.
  - as the input data that is processed or used by the proposed AGI alignment technique/etc, *and* also as the input data taken in by the AGI system itself, so that the _model_ of the key aspects of the AGI system has reasonably correct input data.
- 2; the completeness of process modelability.
  - ie; that the proposed AGI alignment technique/etc has to actually be that, some sort of algorithm, itself conditional on only causation and logic, *and* that the AGI system itself has to be at least partially modelable, in whatever key aspects are necessary to establish 'alignment'.
- 3; the completeness of predictability.
  - ie; the assumption that the model of the AGI system, when also given the model of the input data, will allow for sufficient prediction of whatever the key aspects of the future AGI outputs will be, before the real (non-modeled) AGI actually acts, in a way that is assessable and comparable to desired outcome states.
- 4; the completeness of some type of comparison to a reference (where that reference implicitly defines whatever is the meaning of "aligned" or "safe").
- 5; the completeness of some type of signaling (ie; what the "feedback" actually is, how the model itself works internally, and how the model alerts the actuator, so that the model 'controls' and constrains the AGI into acting/operating in an aligned way).
- 6; the completeness of some type of conditionalization (ie; what "effectiveness" means) of at least one of the following AGI aspects:
  - output actions.
  - internal process.
  - sense input.
  - ie; where/if the conditionalization does not have sufficient "power" to actually constrain the AGI from doing bad things, then even a very good alignment algorithm/technique/process will not result in an aligned/safe AGI.

That aspects 1 thru 6 (inclusive) are true of *any* and *every* 'causative feedback process'.
- that there are exactly none/zero exceptions.
- that *all* causative feedback processes will have/require all six of these aspects (with no exceptions).

Therefore, any and all AGI alignment/safety enforcement protocols, techniques, methodologies, etc, will also (for sure) be required to have and implement all six aspects (with no exceptions), *and*, given that requirement, if *any* of these six aspects, for whatever reason, cannot be implemented, then AGI alignment/safety cannot be implemented.

Unfortunately, _every_single_one_ of the listed dependencies (all six of them) is individually impossible (ie; none of them can be 'complete enough' to reach even the minimum threshold required for AGI alignment/safety).
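As a purely illustrative sketch (not any actual or proposed alignment method), the six dependencies listed above can be read directly off of the generic shape of any feedback controller. Every name in the following Python sketch is hypothetical; it only restates, in code form, the structure that the six-item list describes in prose.

  # Hypothetical sketch only; every name below is invented for illustration
  # and does not refer to any real or proposed alignment system.

  from dataclasses import dataclass
  from typing import Any, Callable

  @dataclass
  class CausativeFeedbackProcess:
      sense: Callable[[Any], Any]            # 1; sense input (of the world and of the AGI)
      model: Callable[[Any], Any]            # 2; modelability (an algorithmic model of the AGI)
      predict: Callable[[Any], Any]          # 3; predictability (forecast of key future outputs)
      reference: Any                         # 4; the reference that defines "aligned"/"safe"
      compare: Callable[[Any, Any], float]   # 4; comparison of the prediction to that reference
      signal: Callable[[float], Any]         # 5; signaling (feedback routed to the actuator)
      actuate: Callable[[Any, Any], None]    # 6; conditionalization (constraint of the AGI)

      def step(self, world_state: Any, agi: Any) -> None:
          observation = self.sense(world_state)             # requires completeness of (1)
          model_state = self.model(observation)             # requires completeness of (2)
          prediction = self.predict(model_state)            # requires completeness of (3)
          error = self.compare(prediction, self.reference)  # requires completeness of (4)
          control = self.signal(error)                      # requires completeness of (5)
          self.actuate(agi, control)                        # requires completeness of (6)

The claim of the argument is that each one of these six components must be 'complete enough', and that none of them can be.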
Insofar as *all six* are necessary together to achieve AGI alignment, there are effectively six, at least semi-independent, complete and whole arguments against there being *any* notion of AGI alignment, each based on nothing outside of the notions of causation, conditionalization, and/or logic (ie; all different manifestations of the same idea).

It is a separate exercise to show, for each one of these six aspects (necessary to any feedback system), exactly why it fails and cannot be implemented, due to the logic of causation and the requirements of any rational notion of AGI safety/alignment. However, much of this has already been done elsewhere, and need not be repeated here. All that is necessary herein is to show that this 'threat model' is directly and intrinsically connected to the very notion of what it could ever even possibly mean to be an "AGI Alignment Protocol", process, etc.

As such, it becomes increasingly obvious that there is an inherent and unavoidable contradiction between any notion of 'AGI persistently existing' and 'AGI is safe'. These arguments apply to, and are relevant to, the overall AGI alignment concept, in total, and cannot be escaped by any form of wishful thinking.

~ ~ ~

As a side note, in addition to all of the above (as if that were not already sufficient), the control and/or "error correction" system must itself not make mistakes/errors (ie; it must operate within reasonable limits, be itself predictable, etc). What tracks the errors/mistakes in the system that is modeling the system to which error correction is to be applied? Insofar as the same logic applies in proportion to the complexity of the error correction, as it does to the complexity of the AGI system itself, there is a kind of regression cascade problem.

For example, any system that is complex enough to be Turing complete (which is not very much, overall, as compared to the complexity of the AGI system to be monitored/checked, etc) is also complex enough to implement some sort of virtual machine dynamic. Any VM is a strict indicator of impossibility in regards to predictability (ie; everything depends on the data, which immediately becomes code, and any amount of uncertainty in the data -- even a single bit change -- results in drastically different outcomes; ie; near perfect unpredictability in the face of any single point errors at all). Basically, all sorts of Rice's Theorem limits apply.
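As a purely illustrative sketch of the 'data becomes code' point (assuming nothing beyond a toy, invented three-opcode instruction set), the following shows how a single-bit difference in what looks like passive data is enough to move a program from being trivially predictable to never answering at all.

  # A toy virtual machine, invented here only to illustrate the point above;
  # the instruction set is hypothetical and stands in for any Turing-complete
  # substrate that an error-correction system might (inadvertently) contain.

  def run(program: bytes, max_steps: int = 1000) -> str:
      """Interpret 'program'; return its result, or give up at the step limit."""
      acc, pc, steps = 0, 0, 0
      while 0 <= pc < len(program) and steps < max_steps:
          op = program[pc]
          if op == 0x00:       # HALT: stop and report the accumulator
              return f"HALT with accumulator = {acc}"
          elif op == 0x01:     # INC: add one to the accumulator
              acc += 1
          elif op == 0x02:     # JMP0: jump back to the start of the program
              pc = -1
          pc += 1
          steps += 1
      return "NO ANSWER (step limit reached)"

  benign  = bytes([0x01, 0x01, 0x00])  # INC, INC, HALT
  flipped = bytes([0x01, 0x01, 0x02])  # identical, except one flipped bit in the last byte

  print(run(benign))   # -> HALT with accumulator = 2
  print(run(flipped))  # -> NO ANSWER (step limit reached); it now loops forever

Whether an arbitrary program has any given non-trivial behavioral property (eg; "always halts in a safe state") is exactly the kind of question that Rice's Theorem shows to be undecidable in general; the step limit above does not answer that question, it only gives up on it.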
ENDF: