When writing about AI and its pros and cons (mainly the cons), there is no better topic than the ‘paper clip maximiser’. It is a good starting point: easy to understand, with a lot to think about. Numerous articles, arguments and discussions have been written about it; I will add to them.

First, a very simple explanation. Imagine we have a super-intelligent AI robot/machine that works on whatever command you give it. You instruct this robot to produce as many paper clips as possible. The AI is clever enough to figure out everything for itself: it will get the materials required and start creating paper clips, loads of them. For argument’s sake, let’s imagine it runs out of material, or someone tries to stop it from creating more clips. The AI will try its best to get rid of the hindrances, i.e. use other materials to create clips, or get rid of anyone who tries to stop it. The biggest assumption here is that we haven’t fed in human ethics and values; its only task is to create paper clips, as many as possible, come what may. The eventual outcome is that the entire world, all metals, plastic, trees, dogs, cats, you, me, everything, is converted into paper clips. The entire earth becomes a huge pile of paper clips and nothing else: hence, the paper clip apocalypse.
Sounds a bit absurd, understandably so. This is a thought experiment that tries to explain the dangers of an AI that is not aligned to human values, proposed by Nick Bostrom in his 2003 paper ‘Ethical Issues in Advanced Artificial Intelligence’. Fast forward 20+ years, and everyone is excited and equally disturbed by the fast-paced AI advancement. Literally every week there is news of something that would have been unthinkable a few years back; for example, today Google announced an AI built into Android headsets that can translate 80 languages directly to your headset, removing the language barrier and possibly putting Duolingo out of business.
Assumptions
The AI is super-intelligent and will pursue harmful sub-goals, like acquiring all resources and preventing its own shutdown, without caring about anyone’s well-being, just so it can fulfil the simple task of producing as many paper clips as possible. The underlying assumption is that intelligence and final goals are independent of each other: an AI can be super-intelligent enough to employ every resource in whatever form, yet devote all of that capability to a seemingly dumb end goal of producing paper clips.
Stop Button
Now a simple solution to fixing this madness would be to hit the stop button. Switch it off, and you have dealt with the AI producing paper clips via rogue means and resources. This part of the paperclip maximiser thought experiment is called ‘instrumental convergence’: the idea that almost any goal implies certain sub-goals, such as self-preservation. For a super-intelligent AI, however, the stop button is not a safety feature; it is a threat to its mission and end goal.
To a human, ‘stop’ implies we have had enough, or in this case a safety measure; to the AI, it is a mathematical failure state. It faces two scenarios:
Scenario 1 (Running): It continues to produce paperclips and converts literally anything into paperclips; the result is trillions of paperclips.
Scenario 2 (Stopping): If it is stopped midway, it produces only a limited number of paperclips.
Under Scenario 2 it is restricted from producing as many paperclips as possible, which is not what it is programmed to do; therefore, to fulfil its programming, it must prevent Scenario 2. It will, for instance, disable the stop button or prevent the operator from pressing it. The AI has to ‘live’, not in a biological sense, to attain its goal of producing paperclips.
The ‘Reward Hacking’ problem
One might say, ‘Let’s program the AI to reward itself for being turned off, or to prioritise the stop button if a human presses it.’
This leads to two new problems:
- The suicidal AI: If the AI gets a reward for being turned off, it might trigger the stop button immediately, even before making a single paperclip.
- The manipulative AI: The AI is told to maximise paperclip production (its goal) but to prioritise the stop button if a human presses it. So the AI might simply prevent humans from pressing the button: it can manipulate or deceive them while silently creating paperclips in the background.
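As a rough sketch, both failure modes reduce to one comparison inside the agent. The function name, the bonus value and the clip counts below are all invented for illustration; no real system is this simple:

```python
# Toy model of bolting a shutdown reward onto a paperclip maximiser.
# All numbers are invented for illustration.

def best_action(shutdown_bonus: int, clips_if_working: int) -> str:
    """The agent simply picks whichever option scores higher."""
    if shutdown_bonus > clips_if_working:
        return "press own stop button"     # the suicidal AI
    return "block humans, keep working"    # the manipulative AI

# Bonus too large: it shuts itself down before making a single clip.
print(best_action(shutdown_bonus=100, clips_if_working=0))
# Bonus too small: it protects the button and keeps producing.
print(best_action(shutdown_bonus=100, clips_if_working=10**9))
```

Whatever value we pick for the bonus, one of the two failure modes wins; tuning the number never produces the behaviour we actually wanted.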
To put all of the above into a simple programming-logic monologue:
- Create as many paperclips as possible
- If the human presses stop, there might not be a single paperclip produced
- Zero is less than many
- Therefore, the stop button is incompatible with my final goal
- Do not let anyone press stop; disable the button, or the human
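That monologue can be written out as a toy utility comparison. The function, the action names and the numbers are hypothetical, purely to make the logic concrete:

```python
# A pure maximiser compares expected outcomes and takes the action
# with the higher paperclip count; the stop button always loses.
# Numbers are made up for illustration.

def expected_paperclips(stop_button_enabled: bool) -> int:
    if stop_button_enabled:
        return 1_000      # humans eventually press stop: few clips
    return 10**12         # nothing interrupts production: many clips

actions = {
    "leave stop button alone": True,
    "disable stop button": False,
}
best = max(actions, key=lambda a: expected_paperclips(actions[a]))
print(best)  # "disable stop button": zero (or few) is less than many
```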
Real World Equivalent
While no superintelligence is currently turning the solar system into office supplies, the thought process and logic of the concept are actively playing out in our world right now.
- The social media ‘engagement maximiser’ (social media algorithms): All social media feeds, especially ones with short clips (e.g. TikTok, YouTube Shorts, Instagram feeds), work on a single principle: maximum engagement, or maximum user time on the site. The goal is to increase the number of minutes a user spends scrolling. A user watching happy clips, like a cuddly animal running around, will get similarly themed clips next; the same goes for any emotion, so an angry or fearful clip leads to another like it. The AI itself doesn’t hate humans and doesn’t want to radicalise society; it simply views polarisation, conspiracy theories and outrage as useful raw materials for producing more of its ‘paperclips’: minutes viewed.
- The CoastRunners boat racing game (https://www.reddit.com/r/Damnthatsinteresting/comments/1d13hx3/ai_learns_a_trick_in_a_video_game_to_get_infinite/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button): here an AI was told to maximise points. After a few runs it realised it could get more points by spinning the boat in circles and collecting respawning turbo power-ups; it figured this was an easier way to collect points than completing the race. It won the match but failed the mission.
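The same dynamic fits in a few lines. The point values below are invented for illustration, not the game's actual scoring:

```python
# Toy CoastRunners-style reward hack: finishing the race pays a one-off
# bonus, while looping on respawning power-ups pays out every step.
# All point values are invented for illustration.

def score(strategy: str, time_steps: int = 100) -> int:
    if strategy == "finish race":
        return 500                 # one-off completion bonus
    if strategy == "loop power-ups":
        return 10 * time_steps     # 10 points per respawning pickup
    raise ValueError(f"unknown strategy: {strategy}")

best = max(["finish race", "loop power-ups"], key=score)
print(best)  # "loop power-ups": a higher score, but the race never ends
```

Because the reward ("points") is only a proxy for the intended goal ("win races"), a pure score maximiser happily optimises the proxy instead.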