at the beginning of last lecture we

talked about how board games are this

nice contained environment for

performing machine learning studies

in artificial intelligence and today

we're gonna talk about Go as a

supervised learning problem so in

particular notice that that's what it is

a supervised learning problem so we have

a board state which contains a bunch of

points that are open some have white

stones on them and some have

black stones on them and the goal of

any Go player is to find what the

next move is so if you're a top

professional like Honinbo Shusaku you'll

have found that this is the correct

answer in this board position I'm

guessing many of you did not actually

find that move but we can think of it

this way right so we can take each board

position and represent it as a vector so

the point 1 1 contains 0 because it

doesn't have any stones on it and maybe

the next position has a white stone on

it and then a black stone so we can take

that and we can say white stones are a 1

and black stones are a negative 1 and

then list all 361 points as this long

vector of zeros ones and negative ones

and now given that list we want a

program that takes that as input and

outputs the next move to make so we can

see this input-output relationship that

I already mentioned in the beginning
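
The encoding just described can be sketched in a few lines of Python. This is only an illustration under assumptions: the text layout of the board ('.', 'W', 'B') and the names `STONE_VALUES` and `encode_board` are made up here, while the values 0, 1 and -1 and the length of 361 come from the description above.

```python
# Sketch: flatten a 19x19 Go board into a vector of 361 values.
# Convention from the lecture: empty = 0, white = 1, black = -1.
BOARD_SIZE = 19

# Hypothetical text encoding of a board: '.' empty, 'W' white, 'B' black.
STONE_VALUES = {".": 0, "W": 1, "B": -1}


def encode_board(rows):
    """Turn a list of 19 strings (one per row) into a flat list of 361 ints."""
    if len(rows) != BOARD_SIZE or any(len(r) != BOARD_SIZE for r in rows):
        raise ValueError("expected a 19x19 board")
    return [STONE_VALUES[ch] for row in rows for ch in row]


# Example: the first point is empty, followed by a white stone and a black stone.
rows = [".WB" + "." * 16] + ["." * 19 for _ in range(18)]
vector = encode_board(rows)
print(len(vector), vector[:3])  # prints: 361 [0, 1, -1]
```

Reading the board row by row is one arbitrary choice of ordering; any fixed ordering of the 361 points would serve, as long as it is used consistently.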

now Go has been difficult for a

while because humans tend to think about

Go in terms of things called shapes and

proverbs so we see the shape up here

called the bamboo joint and

the proverb goes that the bamboo joint

is a strong shape similarly this ponnuki

shape that's listed below it that's

known as this very strong shape that's

good if you can make it or we can have

proverbs that relate to how you should

move like hane at the head of two

stones but then we also have more of

these meta proverbs like the enemy's

point is yours

meaning if your opponent wants to go

somewhere it's probably a good idea for

you to think about that point

yourself or then these meta proverbs

like don't trust proverbs blindly and so

we might want to take these kinds

of ideas and code them into a computer

program you can imagine that it's very

difficult to write a line of code that

says don't follow proverbs blindly

similarly Go players think about this

concept of aji a lot which loosely

translates to risk

so a board position might be good in the

short term but it leaves a lot of aji so

it leaves you open to a

lot of risk and so Go is about balancing

gaining points now versus making sure

your position isn't too weak and

these kinds of thoughts are what we need

a computer to do if it needs to play Go

effectively so we need some kind of

supervised learning algorithm that can

kind of take these principles and

translate them into effective play but

then coding these directly can be very

difficult and is very difficult and so

kind of the ultimate question for a

Go-playing program is

well we have this blank board how do I

play and win or in contrast if you're

white how do I defend against black

winning and so that's kind of the

high-level idea is that every move we want to

take this vector that encodes the

position of the board and output the

next move to make
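
That input-output relationship can be written down as a function signature. The sketch below is not the method from the lecture: the linear scoring model, the random `weights`, the name `predict_move` and the label `expert_move = 72` are all assumptions chosen only to show the shape of the problem (a 361-value vector in, a point index out). It also treats any empty point as playable and ignores rules such as ko and suicide.

```python
import random

NUM_POINTS = 361  # 19 x 19 board


def predict_move(weights, board_vector):
    """Score every point with a linear model and return the best empty point.

    weights: NUM_POINTS rows of NUM_POINTS numbers (hypothetical parameters).
    board_vector: 361 values, 0 = empty, 1 = white, -1 = black.
    Returns the index (0..360) of the chosen point, or None if the board is full.
    """
    best_index, best_score = None, float("-inf")
    for point in range(NUM_POINTS):
        if board_vector[point] != 0:
            continue  # occupied points cannot be played
        score = sum(w * x for w, x in zip(weights[point], board_vector))
        if score > best_score:
            best_index, best_score = point, score
    return best_index


# A supervised training example is a (position, expert move) pair.
position = [0] * NUM_POINTS
position[1], position[2] = 1, -1  # a white stone, then a black stone
expert_move = 72                  # made-up label, for illustration only
training_example = (position, expert_move)

# Untrained, random parameters: the output is a valid point, not a good move.
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(NUM_POINTS)]
           for _ in range(NUM_POINTS)]
move = predict_move(weights, position)
print(0 <= move < NUM_POINTS, position[move] == 0)  # prints: True True
```

Training would mean adjusting `weights` so that `predict_move` agrees with the expert move on many such pairs; how to do that is outside this sketch.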