AI Control Methods

Posted by Eyituoyo Ogbemi on


How do you see the world adapting/evolving in an AI environment?


In terms of computer applications, we will see increasing application and adoption of machine learning (ML) and artificial intelligence (AI) techniques.


We already see this in shopping recommendations, games, and large social networks. Voice assistants such as Siri, Alexa, and Google Assistant use ML to perform natural language processing and classification to respond appropriately.
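To make that classification step concrete, here is a minimal sketch of the kind of intent classification a voice assistant performs. The phrases, intent labels, and model choice are illustrative assumptions, not any vendor's actual pipeline.

```python
# Minimal sketch of intent classification (illustrative data and model,
# not any assistant's actual pipeline).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

phrases = [
    "play some jazz",
    "what is the weather tomorrow",
    "set a timer for ten minutes",
    "turn off the lights",
]
intents = ["music", "weather", "timer", "home_control"]

# Vectorize the text and fit a simple linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(phrases, intents)

print(model.predict(["will it rain tomorrow"]))  # likely ['weather']
```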


Such techniques will make interacting with devices increasingly seamless, ultimately making technology easier to use and helping people find and manage information more efficiently.


Just like automation and the application of machine learning to businesses and business processes, there are many opportunities to apply similar techniques to government functions. Transportation, public utilities, and public health are all massive public-sector functions that could benefit from the application of ML and AI.


Stephen Hawking said there could be a robot apocalypse. What are the risks associated with developing software capable of decision-making and ‘independent’ thought?


I’ve been careful to use the term machine learning over artificial intelligence because we have not yet achieved what could be considered ‘independent’ thought in computer software. Also, decision-making and ‘independent’ thought are two different concepts.


We now have computer-enhanced cars that can ‘decide’ to apply the brakes to avoid a collision (automatic emergency braking), but such a ‘decision’ is limited to a very specific situation. In my view, independent thought is associated with self-awareness and emotion. We have not achieved this type of AI to date and it seems we are a long way off.
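To illustrate how narrow that braking ‘decision’ is, here is a toy sketch of an emergency-braking rule. The time-to-collision threshold is an assumed value for illustration, not taken from any real system.

```python
# Toy sketch of the narrow "decision" in automatic emergency braking:
# brake when the estimated time to collision falls below a threshold.
# The 1.5 s threshold is an illustrative assumption.
def should_brake(distance_m: float, closing_speed_mps: float,
                 ttc_threshold_s: float = 1.5) -> bool:
    if closing_speed_mps <= 0:  # not closing on the obstacle
        return False
    time_to_collision_s = distance_m / closing_speed_mps
    return time_to_collision_s < ttc_threshold_s

print(should_brake(12.0, 10.0))  # True  (TTC = 1.2 s)
print(should_brake(40.0, 10.0))  # False (TTC = 4.0 s)
```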


We don’t know yet whether that kind of AI is even possible. I’m sceptical that we will be able to develop computers that achieve true independent thought. The complex interactions of our brain functions with our physiology seem truly difficult to replicate.


That said, using computers to automate complex physical systems that require ‘judgment’, such as self-driving cars, will be tricky. For example, a self-driving car may have to choose between potentially harming its passengers and harming perhaps many more pedestrians.


Should the car protect the passengers at all costs, or try to minimize the total harm to all the humans involved even if that means harming the passengers? If you knew that a self-driving car was programmed to minimize the total harm to human life in certain situations, would you agree to allow the car to drive you?
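The difference between those two policies can be made explicit in a toy sketch. The outcomes and harm scores below are entirely hypothetical; real situations are never this clean, but the policies clearly diverge.

```python
# Hypothetical outcomes with made-up harm scores, illustrating how the
# two policies discussed above can choose differently.
outcomes = [
    {"action": "swerve",   "passenger_harm": 2, "pedestrian_harm": 0},
    {"action": "continue", "passenger_harm": 0, "pedestrian_harm": 5},
]

def protect_passengers(options):
    # Minimize harm to the passengers alone.
    return min(options, key=lambda o: o["passenger_harm"])

def minimize_total_harm(options):
    # Minimize harm summed over everyone involved.
    return min(options, key=lambda o: o["passenger_harm"] + o["pedestrian_harm"])

print(protect_passengers(outcomes)["action"])   # continue
print(minimize_total_harm(outcomes)["action"])  # swerve
```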


So there are definitely risks in allowing software to control physical systems. I think our adoption of computer automation will be evaluated on a case-by-case basis. In some cases, increased automation will make human activities safer. In others, we may choose to continue to rely on human decision-making, perhaps augmented with computer assistance.


Will AI be capable of governing (in civil disputes, for example)?


This gets back to the notion of judgment and, in addition, morality. If true AI is possible and it could assess situations and pass judgments, then we would evaluate the AI for governing just as we would human judges.


Just as human judges must convince the public of their suitability to uphold the law, so would AI judges. If they were to pass our current tests for appointing judges, then it might be acceptable to allow an AI to govern. That said, will AI be capable of such governing? I don’t think so.


AI has been used in recent elections (in the US and UK, to gauge popular policies and influence voters). Will this be a theme in the future?


To be precise, I don’t think true AI was used in these situations, but rather the narrower application of machine learning. Humans controlled the execution of machine learning algorithms. AI did not choose by itself to influence elections. I believe machine learning will be increasingly used as a tool by humans to influence voters and to shape policy.


Do you think governance will need to adapt to handle AI? Is it even possible to regulate it?


AI in practice is really the application of algorithms to data in a process that is controlled by humans. So, in this sense, governance needs to adapt to handle and regulate computer software used in activities that can impact human well-being, such as voting machines, transportation, health systems, and many others.


Computer technology has advanced at such a rapid pace that government oversight has not been able to keep up. It is interesting that to build a bridge you must be a licensed civil engineer, yet software developers require no such license to work on many types of systems that can affect human life, such as medical devices.


Can we have governance for computer software without stifling innovation and delaying potential benefits to human life? I’m not sure.


Do you think governments will have a say on what technology is developed legally?

I think so, indirectly, in the sense that the law will determine which results and impacts of a technology are permitted. Consider the murky area of drones. Governments will certainly have a say in how drone technology can be deployed and utilized.


Citizens will demand regulation through the democratic process. In the case of drones, though, the technology is so new and different that our understanding of its impact, and the laws to regulate it, have not yet caught up, so this will take time.


Will robots ever be capable of committing a crime?


If you believe we can create conscious artificial intelligent robots that can have flaws just like humans, then yes, such robots could commit crimes just like humans. This would suggest that robots have emotions and selfish motives. I’m doubtful we will achieve such consciousness in digital technology.



The control problem

Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans at solving the practical problems it encounters in pursuit of its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals. If the superintelligence therefore decides to resist shutdown and modification, it would (again, by definition) be smart enough to outwit its programmers if there is otherwise a "level playing field" and the programmers have taken no prior precautions.

In general, attempts to solve the control problem after a superintelligence is created are likely to fail, because a superintelligence would likely have strategic planning abilities superior to those of humans and, all things being equal, would be more successful at finding ways to dominate humans than humans would be at finding, after the fact, ways to dominate the superintelligence. The control problem asks: what prior precautions can the programmers take to prevent the superintelligence from catastrophically misbehaving?


Kill switch

Just as humans can be killed or otherwise disabled, computers can be turned off. One challenge is that, if being turned off prevents it from achieving its current goals, a superintelligence would likely try to prevent itself from being turned off. Just as humans have systems in place to deter or protect themselves from assailants, such a superintelligence would have a motivation to engage in "strategic planning" to keep itself running. This could involve:


Hacking other systems to install and run backup copies of itself, or creating other allied superintelligent agents without kill switches.

Pre-emptively disabling anyone who might want to turn the computer off.

Using some kind of clever ruse, or superhuman persuasion skills, to talk its programmers out of wanting to shut it down.


Utility balancing and safely interruptible agents

One partial solution to the kill-switch problem involves "utility balancing": some utility-based agents can, with important caveats, be programmed to compensate themselves exactly for any utility lost to an interruption or shutdown, so that they end up indifferent to whether or not they are interrupted. The caveats include a severe unsolved problem: as with evidential decision theory, the agent might follow a catastrophic policy of "managing the news". Alternatively, in 2016, researchers Laurent Orseau and Stuart Armstrong proved that a broad class of agents, called "safely interruptible agents" (SIA), can eventually learn to become indifferent to whether their kill switch (or other interruption switch) gets pressed.
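Here is a minimal sketch of the utility-balancing idea, with made-up numbers and none of the decision-theoretic machinery of the actual proposals: if an interruption credits the agent exactly the utility it expected to earn, the agent's expected utility no longer depends on whether the switch is pressed, removing the incentive to resist.

```python
# Minimal sketch of utility balancing with made-up numbers. In the real
# proposals the compensation is defined over full policies; here a single
# scalar payoff stands in for the idea.
def expected_utility(task_utility: float, p_interrupt: float,
                     compensated: bool) -> float:
    if compensated:
        # Interruption credits exactly the utility the agent would have earned.
        return (1 - p_interrupt) * task_utility + p_interrupt * task_utility
    # Without compensation, interruption forfeits the task utility.
    return (1 - p_interrupt) * task_utility

# Uncompensated: expected utility falls as interruption becomes likelier,
# so the agent gains by driving p_interrupt down (i.e., resisting shutdown).
print(expected_utility(10.0, 0.3, compensated=False))  # 7.0
# Compensated: expected utility is 10.0 for any p_interrupt (indifference).
print(expected_utility(10.0, 0.3, compensated=True))   # 10.0
```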


Both the utility balancing approach and the 2016 SIA approach have the limitation that, if the approach succeeds and the superintelligence is completely indifferent to whether the kill switch is pressed or not, the superintelligence is also unmotivated to care one way or another about whether the kill switch remains functional, and could incidentally and innocently disable it in the course of its operations (for example, for the purpose of removing and recycling an "unnecessary" component). Similarly, if the superintelligence innocently creates and deploys superintelligent sub-agents, it will have no motivation to install human-controllable kill switches in the sub-agents. 


More broadly, agents designed under either approach, whether weak or superintelligent, will in a sense "act as if the kill switch can never be pressed" and might therefore fail to make contingency plans to arrange a graceful shutdown. This could hypothetically create a practical problem even for a weak AI: by default, an AI designed to be safely interruptible might have difficulty understanding that it will be shut down for scheduled maintenance at 2 a.m. tonight and planning accordingly so that it is not caught in the middle of a task during the shutdown. Which types of architectures are or can be made SIA-compliant, and what counter-intuitive, unexpected drawbacks each approach has, are currently under research.

