
AlphaGo Beginner's Guide


It is well known that DeepMind's AlphaGo defeated 18-time world champion Lee Sedol at the ancient Chinese game of Go, in a match that began on March 9, 2016. What's fascinating is that Go has more possible board positions than there are atoms in the observable universe.
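To get a feel for that scale, here is a rough back-of-the-envelope check. It assumes each of the 361 points on the board is empty, black, or white, which overcounts because many such arrangements are illegal, and it uses the commonly quoted estimate of about 10^80 atoms in the observable universe (an assumption, not a figure from this article).

```python
# Rough upper bound on Go board arrangements: each of the 19x19 points
# is empty, black, or white. Many of these arrangements are illegal,
# so this overcounts the true number of legal positions.
BOARD_POINTS = 19 * 19
upper_bound_positions = 3 ** BOARD_POINTS

# Commonly quoted order-of-magnitude estimate (an assumption for illustration).
ATOMS_IN_OBSERVABLE_UNIVERSE = 10 ** 80


def order_of_magnitude(n: int) -> int:
    """Return the power of ten of a positive integer (digit count minus one)."""
    return len(str(n)) - 1


print(order_of_magnitude(upper_bound_positions))             # 172
print(upper_bound_positions > ATOMS_IN_OBSERVABLE_UNIVERSE)  # True
```

Even allowing for the overcount, the gap between the two numbers is enormous, which is why brute-force search is hopeless for Go.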

This motivated us to find out more about AlphaGo, so we dug into how it works on the inside. We thought we would share some of the details with you guys!

DeepMind is a British Artificial Intelligence (AI) company that was founded in September 2010 as DeepMind Technologies. It was acquired by Google in 2014. DeepMind's stated goal is to solve intelligence. You can find out more on their website, https://deepmind.com.

Coming back to AlphaGo: its defeat of professional Go champions is considered HUGE for AI. Like, REALLY HUGE. It shocked scientists and experts in the Artificial Intelligence community alike, many of whom had expected that something like this was at least another decade away. A machine that learns on its own is a huge leap for technology.

DeepMind started by feeding AlphaGo around a hundred thousand games, downloaded from the internet, that strong amateurs had played. This first version was designed to mimic those players. The goal, though, was to make AlphaGo strong enough to compete with top professionals, so they took the version that had already learnt to mimic human play and made it play against itself 30 million times. For this they used Reinforcement Learning, which means the system is not preprogrammed with a strategy and instead learns from experience. Through Reinforcement Learning, the system improved incrementally by learning to avoid its errors, and by the end they had a new version that could beat the old one. The reinforcement learning is model-free, meaning the system does not rely on a built-in model of how games unfold; it adjusts its play directly from the results of the games it plays.
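To make the self-play idea concrete, here is a minimal sketch on a toy game rather than Go. It assumes a Nim-like game (players alternately remove 1 to 3 stones, and whoever takes the last stone wins) and a simple lookup table in place of a neural network. The names `train_by_self_play`, `choose_move`, and all parameter values are hypothetical choices for illustration, not DeepMind's method.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # toy rule: remove 1 to 3 stones; taking the last stone wins


def legal_moves(stones):
    return list(range(1, min(MAX_TAKE, stones) + 1))


def choose_move(q, stones, epsilon, rng):
    """Epsilon-greedy: mostly play the best-known move, sometimes explore."""
    moves = legal_moves(stones)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda m: q[(stones, m)])


def train_by_self_play(episodes=20000, start=10, epsilon=0.2, alpha=0.1, seed=0):
    """Both sides share one table and learn only from who won each game."""
    rng = random.Random(seed)
    q = defaultdict(lambda: 0.5)  # estimated win chance for (stones, move)
    for _ in range(episodes):
        stones, player = start, 0
        history = {0: [], 1: []}
        while stones > 0:
            move = choose_move(q, stones, epsilon, rng)
            history[player].append((stones, move))
            stones -= move
            winner = player  # the last player to move took the final stone
            player = 1 - player
        # Nudge every move the winner made towards 1 and the loser's towards 0.
        for p in (0, 1):
            outcome = 1.0 if p == winner else 0.0
            for key in history[p]:
                q[key] += alpha * (outcome - q[key])
    return q


q_table = train_by_self_play()
for pile in (5, 6, 7):
    print(pile, choose_move(q_table, pile, epsilon=0.0, rng=random.Random(0)))
```

The learner is never told that leaving the opponent a multiple of four stones is the winning strategy in this toy game. It only sees wins and losses, which is the sense in which it learns from experience.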

The interesting part is that, after learning from a few games, it is able to transfer that knowledge to many more games.

The first version

The first version of AlphaGo used two neural networks that co-operated to choose its moves. Both are Convolutional Neural Networks (CNNs) with 12 layers. CNNs are typically used for image classification: after being trained on a labeled image dataset, a CNN takes an image as input and outputs class probabilities. In other words, it learns the mapping between inputs and outputs. AlphaGo treats the board position much like an image.
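The basic operation inside a CNN layer can be sketched in a few lines. The example below is a simplified illustration, assuming a tiny 5x5 board and a hand-written 3x3 filter that simply counts stones. In a real network the filter values are learned during training, and the names here are hypothetical.

```python
def convolve2d(grid, kernel):
    """Slide a square kernel over a square grid (no padding) and sum the
    element-wise products at each position, as a CNN layer does."""
    k = len(kernel)
    size = len(grid) - k + 1
    return [[sum(grid[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(size)]
            for r in range(size)]


# 5x5 toy board: 1 = stone of the player to move, 0 = anything else
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
# Hypothetical filter that counts the stones in each 3x3 neighbourhood
count_kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

feature_map = convolve2d(grid, count_kernel)
for row in feature_map:
    print(row)
# [3, 3, 1]
# [3, 3, 1]
# [1, 1, 1]
```

Stacking many such filters, layer after layer, is what lets a CNN pick up on local shapes and patterns on the board.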

Policy Network

The first network is called the Policy Network. Its job is to take a board position as input and decide the next best move to make. DeepMind trained the Policy Network on millions of example moves made by strong human players, with the goal of replicating their choices. After training, it predicted the move a strong human Go player would make up to 57% of the time. To improve on this, they used Reinforcement Learning.
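The output side of a policy network can be pictured as a probability for every point on the board. The sketch below assumes the network has already produced a raw score per point (the scores here are made up) and shows how a softmax turns them into move probabilities while ruling out occupied points. The function name and values are hypothetical.

```python
import math


def policy_distribution(scores, occupied):
    """Turn raw per-point scores into move probabilities with a softmax,
    giving zero probability to points that already hold a stone."""
    legal = [i for i in range(len(scores)) if i not in occupied]
    peak = max(scores[i] for i in legal)  # subtract the max for numerical stability
    exps = {i: math.exp(scores[i] - peak) for i in legal}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]


# Made-up scores for a 3x3 corner of the board, flattened row by row.
scores = [0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.4, 0.1, 0.9]
occupied = {4}  # the centre point already holds a stone

probs = policy_distribution(scores, occupied)
best_move = max(range(len(probs)), key=probs.__getitem__)
print(best_move, round(sum(probs), 6))  # 1 1.0
```

Training on human games amounts to adjusting the network so that the moves strong players actually chose receive high probability.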

The Policy Network was fast enough to pick one good move, but AlphaGo needed to check thousands of possible moves before making a decision. So they built a modified version of the network that, instead of looking at the entire 19x19 board, looks at a smaller window around the opponent's previous move and the new move it is considering. This let it compute the next move about a thousand times faster.
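The windowing idea can be sketched as cutting a small patch out of the board. The window size and the encoding of stones below are assumptions chosen for illustration; the article does not say how large the window is.

```python
BOARD_SIZE = 19
EMPTY, BLACK, WHITE, OFF_BOARD = 0, 1, 2, 3  # hypothetical encoding


def local_window(board, row, col, radius=1):
    """Return the square patch of side 2*radius+1 centred on (row, col).
    Points beyond the edge of the board are reported as OFF_BOARD."""
    patch = []
    for r in range(row - radius, row + radius + 1):
        line = []
        for c in range(col - radius, col + radius + 1):
            inside = 0 <= r < BOARD_SIZE and 0 <= c < BOARD_SIZE
            line.append(board[r][c] if inside else OFF_BOARD)
        patch.append(line)
    return patch


board = [[EMPTY] * BOARD_SIZE for _ in range(BOARD_SIZE)]
board[3][3] = BLACK  # opponent's previous move
board[3][4] = WHITE

for line in local_window(board, 3, 3):
    print(line)
# [0, 0, 0]
# [0, 1, 2]
# [0, 0, 0]
```

With a radius of 1 the evaluator looks at 9 points instead of 361, which gives a sense of where the speed-up comes from.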

Value Network

The second network is called the Value Network. It answers a different question from 'what move to play next'. Instead of suggesting the next move, it estimates each player's chance of winning the game from a given board position. This provides an overall positional judgment that lets AlphaGo sort potential future positions into good and bad. If the Value Network says a particular variation looks bad, the AI can skip reading any more moves along that line of play.
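As a rough illustration of the value idea, the sketch below squashes a score into a win estimate between 0 and 1 and uses it to decide whether a line of play is worth reading further. The two hand-picked features, the weights, and the threshold are all hypothetical stand-ins for what a trained network would compute from the full board.

```python
import math


def win_probability(features, weights, bias=0.0):
    """Squash a weighted sum of position features into a 0..1 win estimate."""
    score = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def worth_reading(features, weights, threshold=0.2):
    """Skip lines of play whose estimated win chance is below the threshold."""
    return win_probability(features, weights) >= threshold


# Hypothetical features: (territory lead, captured-stone lead)
weights = [0.3, 0.5]
good_line = [4.0, 1.0]    # comfortably ahead
bad_line = [-6.0, -2.0]   # far behind

print(worth_reading(good_line, weights))  # True
print(worth_reading(bad_line, weights))   # False
```

Cutting off bad variations early is what keeps the number of positions to examine manageable.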

In addition to the two networks mentioned above, AlphaGo uses an algorithm called Monte Carlo tree search to help it read sequences of future moves effectively. One way to search a tree is depth-first, which means going all the way to the end of a branch before backtracking to try the next one.

The alternative, breadth-first search, explores the tree level by level and is memory intensive. Monte Carlo tree search instead randomizes the order in which the tree is explored, reducing the chance that a very promising part of the tree goes undiscovered while we slog through the search in a prescribed order.
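Here is a minimal sketch of Monte Carlo tree search on a toy game, not AlphaGo's actual implementation. It assumes a Nim-like game (remove 1 to 3 stones, whoever takes the last stone wins), uses purely random playouts where AlphaGo would consult its networks, and picks branches with the standard UCB1 formula. All names and parameter values are hypothetical.

```python
import math
import random

MAX_TAKE = 3  # toy game: remove 1 to 3 stones; taking the last stone wins


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player about to move
        self.parent = parent
        self.move = move          # the move that led to this node
        self.children = []
        self.untried = list(range(1, min(MAX_TAKE, stones) + 1))
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

    def best_child(self, c=1.4):
        # UCB1: balance average win rate against how rarely a child was tried.
        return max(self.children, key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(self.visits) / ch.visits))


def rollout(stones, rng):
    """Play random moves to the end. Return True if the player to move at
    the start of the rollout is the one who takes the last stone."""
    start_player_to_move = True
    while True:
        stones -= rng.randint(1, min(MAX_TAKE, stones))
        if stones == 0:
            return start_player_to_move
        start_player_to_move = not start_player_to_move


def mcts(root_stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = node.best_child()
        # 2. Expansion: add one untried move as a new child.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: reward is from the view of whoever moved into `node`.
        if node.stones == 0:
            reward = 1.0  # that player just took the last stone
        else:
            reward = 0.0 if rollout(node.stones, rng) else 1.0
        # 4. Backpropagation: flip the reward at each level going up.
        while node is not None:
            node.visits += 1
            node.wins += reward
            reward = 1.0 - reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move


print(mcts(3))
print(mcts(5))
```

Rather than walking the tree in a fixed depth-first or breadth-first order, the search keeps returning to whichever branches have looked most promising so far, while still sampling the rest occasionally.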

The latest version

AlphaGo Zero still uses Monte Carlo tree search, but instead of using a separate Policy Network (to select the next move to play) and Value Network (to predict the winner of the game), it integrates both into a single neural network that evaluates positions. Unlike previous versions, which were trained on human games, Zero skips that step and learns purely by playing against itself, starting from completely random play.
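The "one network, two outputs" idea can be sketched as a shared body feeding two small heads. This is a toy illustration under stated assumptions: a 3x3 board, one hidden layer, and random untrained weights. The class name, layer sizes, and board encoding are hypothetical and much simpler than the real architecture.

```python
import math
import random


def dense(inputs, weights, biases):
    """One fully connected layer: `weights` holds one row per output."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TwoHeadedNet:
    """Shared body feeding a policy head (move probabilities) and a value head."""

    def __init__(self, n_points, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.body = random_layer(n_hidden, n_points, rng)
        self.policy_head = random_layer(n_points, n_hidden, rng)
        self.value_head = random_layer(1, n_hidden, rng)

    def evaluate(self, board):
        hidden = [max(0.0, h) for h in dense(board, *self.body)]  # ReLU
        move_probs = softmax(dense(hidden, *self.policy_head))
        value = math.tanh(dense(hidden, *self.value_head)[0])     # -1 loss .. +1 win
        return move_probs, value


# 3x3 toy board, flattened: +1 own stone, -1 opponent stone, 0 empty
board = [0, 1, 0, -1, 0, 0, 0, 0, 1]
net = TwoHeadedNet(n_points=len(board))
move_probs, value = net.evaluate(board)
print(len(move_probs), round(sum(move_probs), 6), -1.0 <= value <= 1.0)
```

Because both heads read the same hidden features, whatever the network learns about a position helps with both choosing a move and judging who is ahead.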

And you know what? After three days of training, Zero beat the previous version of AlphaGo, the one that had defeated the 18-time world champion, by 100 games to 0. After 40 days, it outperformed the later version that had defeated the world number one.

Many question this by asking, 'Is this cause for alarm?' I guess only the future can answer that.

The makers aim to apply the algorithms behind the software to healthcare and science, speeding up breakthroughs in those areas by helping human experts achieve more.

For the technical details behind the original approach, refer to https://storage.googleapis.com/deepmind-media/alphago/AlphaGoNaturePaper.pdf


Never miss a story by Walter Benjamin, when you sign up for Bod Press.