Author Topic: Bringing Precision to the AI Safety Discussion (Read 504 times)

ExFreeper · « **on:** June 22, 2016, 03:33:10 pm »

http://www.youtube.com/watch?v=f71RwCksAmI

Bringing Precision to the AI Safety Discussion

Tuesday, June 21, 2016

Posted by Chris Olah, Google Research

We believe that AI technologies are likely to be overwhelmingly useful and beneficial for humanity. But part of being a responsible steward of any new technology is thinking through potential challenges and how best to address any associated risks. So today we’re publishing a technical paper, Concrete Problems in AI Safety, a collaboration among scientists at Google, OpenAI, Stanford and Berkeley.

While possible AI safety risks have received a lot of public attention, most previous discussion has been very hypothetical and speculative. We believe it’s essential to ground concerns in real machine learning research, and to start developing practical approaches for engineering AI systems that operate safely and reliably.

We’ve outlined five problems we think will be very important as we apply AI in more general circumstances. These are all forward thinking, long-term research questions -- minor issues today, but important to address for future systems:

Avoiding Negative Side Effects: How can we ensure that an AI system will not disturb its environment in negative ways while pursuing its goals, e.g. a cleaning robot knocking over a vase because it can clean faster by doing so?

Avoiding Reward Hacking: How can we avoid gaming of the reward function? For example, we don’t want this cleaning robot simply covering over messes with materials it can’t see through.

Scalable Oversight: How can we efficiently ensure that a given AI system respects aspects of the objective that are too expensive to be frequently evaluated during training? For example, if an AI system gets human feedback as it performs a task, it needs to use that feedback efficiently because asking too often would be annoying.

Safe Exploration: How do we ensure that an AI system doesn’t make exploratory moves with very negative repercussions? For example, maybe a cleaning robot should experiment with mopping strategies, but clearly it shouldn’t try putting a wet mop in an electrical outlet.

Robustness to Distributional Shift: How do we ensure that an AI system recognizes, and behaves robustly, when it’s in an environment very different from its training environment? For example, heuristics learned for a factory workfloor may not be safe enough for an office.

We go into more technical detail in the paper. The machine learning research community has already thought quite a bit about most of these problems and many related issues, but we think there’s a lot more work to be done.

We believe in rigorous, open, cross-institution work on how to build machine learning systems that work as intended. We’re eager to continue our collaborations with other research groups to make positive progress on AI.

https://research.googleblog.com/2016/06/bringing-precision-to-ai-safety.html

http://www.youtube.com/watch?v=tV8EOQNYC-8

Concrete Problems in AI Safety

Abstract

Rapid progress in machine learning and artificial intelligence (AI) has brought increasing attention to the potential impacts of AI technologies on society. In this paper we discuss one such potential impact: the problem of accidents in machine learning systems, defined as unintended and harmful behavior that may emerge from poor design of real-world AI systems.

We present a list of five practical research problems related to accident risk, categorized according to whether the problem originates from having the wrong objective function (“avoiding side effects” and “avoiding reward hacking”), an objective function that is too expensive to evaluate frequently (“scalable supervision”), or undesirable behavior during the learning process (“safe exploration” and “distributional shift”). We review previous work in the se areas as well as suggesting research directions with a focus on relevance to cutting-edge AI systems. Finally, we consider the high-level question of how to think most productively about the safety of forward-looking applications of AI.

Introduction

The last few years have seen rapid progress on long-standing, difficult problems in machine learning and artificial intelligence (AI), in areas as diverse as computer vision, video game playing, autonomous vehicles, and Go. These advances have brought excitement about the positive potential for AI to transform medicine, science , and transportation, along with concerns about the privacy, security, fairness, economic, and military implications of autonomous systems, as well as concerns about the longer-term implications of powerful AI. The authors believe that AI technologies are likely to be overwhelming ly beneficial for humanity, but we also believe that it is worth giving serious thought to potential challenges and risks. We strongly support work on privacy, security, fairness, economics, and policy, but in this document we discuss another class of problem which we believe is also relevant to the societal impacts of AI: the problem of accidents in machine learning systems. We define accidents as unintended and harmful behavior that may emerge from machine learning systems when we specify the wrong objective function, are not careful about the learning process, or commit other machine learning-related implementation errors.

There is a large and diverse literature in the machine learning community on issues related to accidents, including robustness, risk-sensitivity, and safe exploration; we review these in detail below. However, as machine learning systems are deployed in increasingly large-scale, autonomous, opendomain situations, it is worth reflecting on the scalability of such approaches and understanding what challenges remain to reducing accident risk in modern machine learning systems. Overall, we believe there are many concrete open technical problems relating to accident prevention in machine learning systems.

There has been a great deal of public discussion around accidents. To date much of this discussion has highlighted extreme scenarios such as the risk of misspecified objective functions in uperintelligentagents. However, in our opinion one need not invoke these extreme scenarios to productively discuss accidents, and in fact doing so can lead to unnecessarily speculative discussions that lack precision, as noted by some critics. We believe it is usually most productive to frame accident risk in terms of practical (though often quite general) issues with modern ML techniques. As AI capabilities advance and as AI systems take on increasingly important societal functions, we expect the undamental challenges discussed in this paper to become increasingly important. The more successfully the AI and machine learning communities are able to anticipate and understand these fundamental technical challenges, the more successful we will ultimately be in developing increasingly useful, relevant, and important AI systems.

Our goal in this document is to highlight a few concrete safety problems that are ready for perimentation today and relevant to the cutting edge of AI systems, as well as reviewing existing literature on these problems. In Section 2, we frame mitigating accident risk (often referred to as “AI safety” in public discussions)” in terms of classic methods in machine learning, such as supervised classification and reinforcement learning. We explain why we feel that recent directions in machine learning, such as the trend toward deep reinforcement learning and agents acting in broader environments, suggests an increasing relevance for research around accidents. In Sections 3-7, we explore five concrete problems in AI safety. Each section is accompanied by proposals for relevant experiments. Section 8 discusses related efforts, and Section 9 concludes.

https://arxiv.org/pdf/1606.06565v1.pdf

http://www.youtube.com/watch?v=Lb2T6McJnso

Author Topic: Bringing Precision to the AI Safety Discussion (Read 504 times)

ExFreeper

Bringing Precision to the AI Safety Discussion