Alpha Zero plays chess [Subject Thread]
Aurelian Florea wrote on Sat, Dec 9, 2017 07:54 AM UTC:

Here is a commentary on one of the games from AlphaZero's 10-0 win against Stockfish, third place in this year's TCEC: https://www.youtube.com/watch?v=4ebzevCLGbQ

I was curious whether it could be modified to play other chess variants. I'm quite sure it can. There are probably legal matters to take into consideration, but beyond that, programming-wise, it should just be a matter of feasible time :)!

 

You may find the whole match on chess24.com:

https://chess24.com/en/watch/live-tournaments/alphazero-vs-stockfish/1/1/1

@Greg & @HG

Any ideas about it? You two are probably the leading people in CV computer play :)!

I vowed a year ago to do something like AlphaGo for CVs, but I still have not started :)! I guess that boat has sailed... or maybe not, I don't know :)!


Greg Strong wrote on Sat, Dec 9, 2017 11:58 PM UTC:

Retrofitting an existing high-quality engine to play variants is problematic.  It is not so bad for variants that are very close, like Fischer Random.  Even things like Siriwan that are played on 8x8 might not be too bad.  But adding the ability to change board size to a program which was created without that in mind would be very difficult, and it probably would not be worth the time.  The really top-level engines like Stockfish are so good because they have lots and lots of code that is very specifically designed and optimized to play Chess.  Most of this code is, at best, throw-away for playing variants, and at worst downright harmful and hard to remove.

Best to create from scratch.  If you want to dip your toes into these waters, I would recommend you download the source code for SjaakII and see if you can improve it, or at least understand it.

Cheers :)


H. G. Muller wrote on Sun, Dec 10, 2017 08:40 AM UTC:

I think you are a bit too pessimistic on this. There actually exists a version of Stockfish that can play variants, some of them very different from Chess (Crazyhouse). And it blows away everything that was written before, so the general search strategy must count for a lot. The Stockfish development system also includes advanced tuning methods.


Aurelian Florea wrote on Sun, Dec 10, 2017 10:14 AM UTC:

HG & Greg both,

That was mostly about AlphaZero rather than Stockfish; AlphaZero is a machine-learning program :)! A totally different monster. They changed a program designed to play Go into a program that plays chess, and trained it for only a few hours. I think it is obvious that the differences between Go and Chess are far greater than those between, say, Chess and Grand Chess or Omega Chess. This is what I was asking about. I hope I have made myself a bit clearer :)!


Greg Strong wrote on Mon, Dec 11, 2017 12:56 AM UTC:

It's certainly possible I'm too pessimistic.  Playing Crazyhouse is pretty impressive, as Stockfish has no concept of drops, and that's a huge change.  That said, it's still an 8x8 game.  Stockfish has bitboards deeply embedded throughout.  On the other hand, it is very object-oriented and template-driven, so replacing the bitboard class with one that uses more bits might not be too bad.  But I still think it would actually be harder to make it play Omega Chess than Crazyhouse.  Anyway, Stockfish's selectivity is incredible.  It makes sense to me to just design an engine with universality in mind and then copy the fantastic elements of Stockfish's search algorithm.
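To illustrate why board size gets so deeply embedded, here is a minimal bitboard sketch (Python for brevity; Stockfish itself is C++, and this is not its actual code, just the general technique). Every shift offset and file mask below silently assumes 8 columns and a 64-bit word, which is exactly what breaks on a 10x10 variant:

FULL = (1 << 64) - 1                          # one bit per square of an 8x8 board
FILE_A = sum(1 << (8 * r) for r in range(8))  # leftmost file mask
FILE_B = FILE_A << 1
FILE_G = FILE_A << 6
FILE_H = FILE_A << 7

def knight_attacks(sq):
    """Attack bitboard for a knight on square 0..63 (a1 = 0, h8 = 63).
    The offsets 6, 10, 15, 17 encode 'one rank = 8 squares'; the file
    masks stop moves from wrapping around the board edge."""
    b = 1 << sq
    a  = (b << 17) & ~FILE_A | (b >> 15) & ~FILE_A
    a |= (b << 15) & ~FILE_H | (b >> 17) & ~FILE_H
    a |= (b << 10) & ~(FILE_A | FILE_B) | (b >> 6)  & ~(FILE_A | FILE_B)
    a |= (b << 6)  & ~(FILE_G | FILE_H) | (b >> 10) & ~(FILE_G | FILE_H)
    return a & FULL

# Move to 10x10 and every constant above is wrong: the offsets become
# 8, 12, 19, 21, the masks need 10 bits per rank, and the word needs
# 100 bits, which is why 'just use a wider bitboard class' touches
# nearly everything in an engine built this way.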

Regarding AlphaChess, apparently this is a completely different beast.  I should have read the article before commenting :)  I might have some thoughts after I read it.


Joe Joyce wrote on Mon, Dec 11, 2017 04:19 AM UTC:

AlphaZero is a neural net which learns by playing against itself, starting with random moves and working up to a rating of over 3300, iirc. It runs on some very fancy hardware, so its learning time is misleading. It played millions of games against itself to learn Go. I'm curious about just what it learns when it's teaching itself a game. How dependent is it on the exact board geometry, and on each player moving only one (or a few) pieces per turn? If the game is a very large (~1000-10,000 squares or more) massively multi-move abstract strategy war game with a board that can change between games, does that sort of thing make any significant difference, or merely add some time to the AI learning process?


Kevin Pacey wrote on Mon, Dec 11, 2017 05:16 AM UTC:

Perhaps board-game players should bear in mind that self-teaching AI is not quite perfect yet (if it ever will be); there are still driverless-vehicle crashes making the news now and then.


Aurelian Florea wrote on Mon, Dec 11, 2017 07:49 PM UTC:

@Greg Strong

It is named AlphaZero. Also, if you find good materials, please share; I could not find any :(!

@Joe Joyce

I don't think scalability is a problem, or at least not a big one. Not even changing boards or weird piece properties: as long as they can be approximated with numbers, these algorithms are fine, and honestly, to my mind everything can be approximated with numbers. The technical term is that the Stone-Weierstrass theorem should be applicable, and then it works. Things could probably go wrong initially sometimes, but it will always be a matter of training time, and it most likely will never be a matter of decades. Of course, one could imagine something totally unfeasible for any hardware in this universe, like a game on a 10 billion by 10 billion board, but I think that is too far-fetched :)!
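As a toy illustration of "approximated with numbers": an AlphaZero-style network sees the board only as stacked numeric planes, one 0/1 plane per piece type per side, so a different board size just changes the array shape. A sketch under that assumption (not DeepMind's actual encoding, which also includes history and auxiliary planes):

import numpy as np

PIECE_TYPES = ["P", "N", "B", "R", "Q", "K"]   # extend freely for a variant

def encode(board, rows, cols):
    """board maps (row, col) -> piece codes like 'wN' or 'bQ'.
    Returns a (2 * piece types, rows, cols) array of 0/1 planes."""
    planes = np.zeros((2 * len(PIECE_TYPES), rows, cols), dtype=np.float32)
    for (r, c), piece in board.items():
        side = 0 if piece[0] == "w" else 1
        planes[side * len(PIECE_TYPES) + PIECE_TYPES.index(piece[1]), r, c] = 1.0
    return planes

# The same function handles a 10x10 Omega-Chess-like board unchanged:
x = encode({(0, 4): "wK", (9, 4): "bK"}, rows=10, cols=10)
print(x.shape)   # (12, 10, 10)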

@Kevin Pacey

These algorithms are indeed not perfect; by definition they are heuristic, so they will never try to achieve perfect play, but from a statistically relevant sample they choose a solution that is very likely very good. To my knowledge there has been only one self-driving car accident ever on a public road, in more than 1 billion kilometers, and it was not purely a software glitch but rather a poor-visibility problem leading to not enough information.


Kevin Pacey wrote on Mon, Dec 11, 2017 07:54 PM UTC:

Here's a link to a Google search result on 'self-driving car accidents'. Note, however, that at least some of these were partly caused by the human in the vehicle at the time:

https://www.google.ca/search?source=hp&ei=0-EuWr_MI8vcjwSI-aXoDQ&q=self-driving+car+accidents&oq=self-driving+car+accidents&gs_l=psy-ab.3..0l2j0i22i30k1l8.3636.17029.0.18692.26.22.0.4.4.0.292.4491.0j1j18.19.0....0...1c.1.64.psy-ab..3.23.4691...46j0i131k1j0i46k1.0.SLyPINMzj88


Aurelian Florea wrote on Mon, Dec 11, 2017 07:57 PM UTC:

I was talking about full control by the software. Anyway, I doubt we should prolong this discussion here, as it is not the objective of this website. If you want to talk more, please email me :)!


Greg Strong wrote on Tue, Dec 12, 2017 04:51 PM UTC:

I don't think a comparison to self-driving cars is valid.  Go and Chess are games of perfect knowledge and clearly defined rules.  Self-driving cars certainly don't have perfect knowledge, and, while driving does have rules, not all players follow them (at least until we reach 100% self-driving).


H. G. Muller wrote on Tue, Dec 12, 2017 08:33 PM UTC:

I have no doubt that AlphaZero could easily do most Chess variants. My previous posting in this thread was a reaction to Greg's remarks on Stockfish. There, boards larger than 8x8 would indeed be a problem. For AlphaZero, not at all.

As to the maximum capacity in terms of board size: I am sure there is one in the current system. But I am also pretty sure expanding those limits would just be a matter of recompiling the software, and perhaps throwing more hardware at the problem. Note that the effort on Chess used overwhelming computing power by doing things in parallel that could just as easily have been done sequentially, like generating the self-play games. As a result it took the machine only 4 hours to teach itself to play Chess at the 3000+ Elo level starting from just the rules, rather than 2 years.
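The parallelism point can be pictured like this: self-play games are independent of one another, so generating them scales with however many workers you have. A stub sketch (the placeholder game is mine; in the real system each worker plays a full MCTS-guided game):

import random
from multiprocessing import Pool

def play_one_game(seed):
    """Placeholder for one self-play game: returns (move count, outcome).
    A real worker would run a complete network-guided game here."""
    rng = random.Random(seed)
    return rng.randrange(40, 200), rng.choice([1, 0, -1])

if __name__ == "__main__":
    with Pool() as pool:              # one worker per CPU core by default
        games = pool.map(play_one_game, range(1000))
    # 1000 games take roughly 1000/N 'game-times' of wall clock on N
    # workers, which is why 4 hours of wall clock can hide what would be
    # years of sequential computation.
    print(len(games), "games generated")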

Everything will just get slower if you try larger games. Bigger doesn't always mean more complex, though, and I can imagine that there are large games that do not need much finesse to play, and can still be learned in a small number of self-play games. (I imagine something like Checkers on a very large board.)

Game play by the trained machine would also become slower if the number of moves in a typical position goes up. This will expand both the game tree needed to see the essential tactics and the neural network needed to guide the search. But of course all other methods of playing the game, such as human thinking, will suffer similarly.

There is one problem: AlphaZero will be able to master any given Chess variant quickly, but after that, it still cannot tell you how the variant should be played. Even for simple things like piece values, you would have to reverse-engineer them from the neural net, by presenting it with sets of positions containing material imbalances, and looking at how this affects the win probabilities it predicts on average for those positions.
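A sketch of that reverse-engineering procedure, assuming a hypothetical net.win_probability(position) in (0, 1) and two sets of otherwise-similar positions, with and without the piece under study (the logistic pawn scale used here is a common rule of thumb, not anything from the AlphaZero paper):

import math

def piece_value_in_pawns(net, positions_with, positions_without):
    """Average the net's predicted win probability over each set, then
    map both averages onto a pawn scale via the logistic model
    p = 1 / (1 + 10 ** (-advantage / 4)) and take the difference."""
    avg = lambda ps: sum(net.win_probability(p) for p in ps) / len(ps)
    to_pawns = lambda p: -4 * math.log10(1 / p - 1)   # inverse of the model
    return to_pawns(avg(positions_with)) - to_pawns(avg(positions_without))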


Joe Joyce wrote on Tue, Dec 12, 2017 09:23 PM UTC:

It's true that humans can't handle ever more complex calculations, but it's also true that humans are good at pattern recognition. Further, a highly complex situation with many, many equivalent moves, one that effectively precludes good forecasting of enemy replies, would, I think, prevent AlphaZero from becoming significantly better than all humans. In a purely combinatorial abstract-strategy military or military-economic conflict game, where mathematical chaos is how the massively multi-move game 'works' in a military sense, there isn't a good way to project future game states, and this, I believe, would keep a calculating machine close enough to human strength that a human or human team could still win against the AI. This is what I'm curious about: is there a ceiling to ability in complex enough abstracts, and does this mean humans can win against the best machines in such games?


Aurelian Florea wrote on Wed, Dec 13, 2017 07:07 AM UTC:

@Joe Joyce,

Well, first, it is rather obvious there is a ceiling to human performance, due to the given hardware. Yes, humans are indeed very scalable, but the point of AlphaZero is that neural networks are too. I'm not sure how scalable, but probably roughly the same as humans. From a sportsmanship point of view computer games are again superior, as the machines can just play continuously, where humans need rules to keep them from getting tired. I think a 9-round Swiss tournament of Renn Chess, for example, would probably take around 20 days for humans, and that is not in any way an extreme example.


Aurelian Florea wrote on Wed, Dec 13, 2017 07:10 AM UTC:

I did foresee one difficulty even for AlphaZero that I see has not been commented on yet: I think one can craft weird piece properties that are difficult to teach. I can't think of something that would surely work, but I might :)!


Aurelian Florea wrote on Wed, Dec 13, 2017 10:31 AM UTC:

More on the topic from GM Pepe Cuenca :)!

https://www.youtube.com/watch?v=9CoNk3EYOpc


V. Reinhart wrote on Wed, Dec 13, 2017 11:53 PM UTC:

A few comments ago someone asked about materials on AlphaZero. Here is an academic paper with several authors; I am not sure how many (or whether all) were funded by DeepMind (which is owned by Google, and created AZ):

https://arxiv.org/pdf/1712.01815.pdf

Most new technologies seem to be used first for military applications and then for general consumer products. I'm surprised AZ appeared so quickly in the chess-playing world. We aren't insignificant!


Joe Joyce wrote on Thu, Dec 14, 2017 09:12 AM UTC:

Aurelian, I've read the first part of the paper V. Reinhart linked a bit after our comments. My math was always bad, but I think this is a relevant paragraph in the paper:

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root s_root to leaf. Each simulation proceeds by selecting in each state s a move a with low visit count, high move probability and high value (averaged over the leaf states of simulations that selected a from s) according to the current neural network f_θ. The search returns a vector π representing a probability distribution over moves, either proportionally or greedily with respect to the visit counts at the root state.
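That selection rule is essentially the paper's PUCT formula: each candidate move is scored by its mean value so far plus an exploration bonus that grows with the network's prior and shrinks with the visit count. A minimal sketch (the names Edge, select, and the constant c_puct are illustrative, not the paper's):

import math
from dataclasses import dataclass

@dataclass
class Edge:
    P: float         # prior move probability from the network
    N: int = 0       # visit count
    W: float = 0.0   # total value backed up through this edge

def select(edges, c_puct=1.5):
    """Return the edge maximizing Q + U: high value, high prior, low visits."""
    total = sum(e.N for e in edges)
    def score(e):
        q = e.W / e.N if e.N else 0.0                        # mean value
        u = c_puct * e.P * math.sqrt(total + 1) / (1 + e.N)  # exploration bonus
        return q + u
    return max(edges, key=score)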

I believe it would take a truly remarkable neural net to significantly outperform all humans, either individually or as teams playing as a general staff, because the sheaves of probability explode from each potential group of moving pieces interacting with each different board, or even with different entry squares or entry times.

Let me offer you a link to a website under construction that steps through the first "day" of a purely combinatorial abstract-strategy combat simulation. It includes 24 sequential "daylight" turns alternating between blue and red, and a lesser number of "night" turns to finish all combat, separate the 2 sides, and "rally" troops - return 1/3rd of each side's losses to the owning player, to drop by friendly leaders. Marked reinforcements come in between turns 8 & 9 (4 turns for each side) on their assigned entry areas, and are unmarked and move normally from the start of the next daylight turn. The sequence above is repeated, with the on-board sides each being reinforced twice, once on daylight turns 29/30 and again on 39/40. After a second night, a 3rd day with no reinforcements is played. If none of the 3 criteria for victory has been achieved by either player, both lose. Otherwise, a victor or a draw is determined.

http://anotherlevel.games/?page_id=193 (please wait for it to load - thanks! As I said, it's under construction!)

Note terrain blocks movement and is completely variable. There are a handful of elements I put in each version of the scenario: a "city" of around 10 squares in the center of the board, a "mountain" in the northwest quadrant of the board, a "forest" in the south, a "ridge" running from NE to SE of the city's east edge, a light scattering of terrain to break up and clog up empty areas on the board, and a dozenish entry areas. Nothing need be fixed from game to game. How does even a great neural net do better than any human or team every single time? There are far too many possibilities for each game state, and truly gigantic numbers of game states, in my semi-skilled opinion.


Aurelian Florea wrote on Thu, Dec 14, 2017 11:44 AM UTC:

@Joe Joyce,

You don't need a remarkable neural net, just a big one :)!

Let me put it this way: you know how, when you go to job interviews, they ask what experience you have? Well, such a program has thousands of chess lifetimes of experience. Is that remarkable? Maybe!... That is probably for everyone to decide for themselves :)!

Moreover, you seem not to understand that these algorithms are highly scalable. Size, below the point of ridiculousness (which could be around a 1600-square board, maybe), is almost irrelevant. Yes, tricky rules like restrictions on capture are harder to grasp, and we may come up with more tricks to make things even more difficult :)! But at the end of the day these are akin to academic parlour tricks, nothing too difficult to grasp :)!


Joe Joyce wrote on Fri, Dec 15, 2017 12:31 AM UTC:

I agree, Aurelian. I think it's obvious that neural nets could 'easily' (with much hardware, time, and $$) play games like I've described at human level, and possibly a bit beyond. My point is that there are far too many indeterminacies for even the best neural nets to successfully predict game states (i.e. what the opponent, or even the AI itself, will do in a couple of turns) for the software to consistently outperform the best humans or human teams. The game tree for even a specific game of Macysburg (a 32x32 abstract-strategy war game riff on the Battle of Gettysburg during the American Civil War) is ridiculous. If AlphaZero depends in part on the exact board configuration, that can and does change significantly game to game. And predicting future game states does not work except in the most limited of circumstances.

The best the AI can achieve is a generalized knowledge of how terrain affects movement and combat. It can apply those rules very well in limited situations and be a brilliant tactician, but so can humans. The AI clearly has the potential to be better at tactics, but how much better? And I don't think the AI can be significantly better at strategy without teaching us more about strategy.

I think people find it very hard to grasp the total range of possibilities. The game starts with about 42 pieces on the board, all of which can move every turn if they have a nearby leader. And there are 3 reinforcement turns which bring in another ~42 pieces each time. Expect to have ~100 pieces maneuvering in the middle of the game. Exactly where each type of piece stands each turn, the exact order in which they are moved, exactly where terrain is in relation to each piece, as well as what the terrain is - different pieces get different effects - determine what attacks can be made each turn, and changing any of those conditions changes what happens *each turn*. I maintain that unless quantum computers work exactly as advertised, the AI *cannot* effectively predict future game states to any overwhelmingly useful degree. Thus, based on Monte Carlo statistical approaches, such AIs can be at best only marginally better than the best humans/human teams.


Aurelian Florea wrote on Fri, Dec 15, 2017 05:34 AM UTC:

@Joe Joyce

I'm sorry to say this to you again, but you don't seem to grasp the fact that these algorithms are highly scalable. Once again: SIZE DOES NOT MATTER. If you make the game significantly bigger, you would most certainly move them further away from perfect play, but they would still be better than humans. With 100+ pieces on a 32x32 board and multiple moves per turn, no human can even grasp the tactical implications 5-6 turns ahead. But such an AI will have enough experience, from sheer mountains of trial and error, always to put itself in very favorable situations. Hardware was important in the sense that with '90s hardware ML would not have been useful. Now that we have passed that barrier, more is still better, as it produces reasonable training times, but the deciding factor is that we have passed that threshold :)!


H. G. Muller wrote on Fri, Dec 15, 2017 10:08 AM UTC:

Note that AlphaZero is not just a neural network. It is a tree search guided by a NN, with the NN also used for evaluation in the leaf nodes. The tactical abilities mainly depend on the search; the NN is just good at deciding which positions require further search to resolve the tactics.

It is certainly true that more complex games need larger search trees to resolve their tactics, and that larger boards with more pieces will also require a larger width of the NN to interpret the position. All of this increases the required computing power. But humans suffer from larger complexity too. So AlphaZero might not get nearly as close to perfect play in a complex war game as it can in a simple game like Chess. But it is not possible to draw any conclusions from that on how it will fare against strong human opponents. The way it 'thinks' is actually quite close to how humans approach such games, so you would expect it to suffer equally. Then it just matters who has the most computing power. In Chess, AlphaZero was examining 80,000 positions per second, which is far more than any human could do.
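A compressed sketch of that division of labor: the network supplies move priors and a leaf value, and the tree search turns those into tactics. Here net and apply_move are assumed inputs (net(pos) returning a {move: prior} dict and a value in [-1, 1]); terminal-position handling is omitted for brevity:

import math

def search_value(root, net, apply_move, simulations=800):
    """Run NN-guided simulations from `root` (positions must be hashable)
    and return the root's mean backed-up value."""
    tree = {}                            # position -> {move: [N, W, P]}

    def simulate(pos):
        if pos not in tree:              # new leaf: ask the network
            priors, value = net(pos)
            tree[pos] = {m: [0, 0.0, p] for m, p in priors.items()}
            return value
        stats = tree[pos]
        total = sum(n for n, _, _ in stats.values())
        def puct(m):                     # same Q + U rule as quoted earlier
            n, w, p = stats[m]
            return (w / n if n else 0.0) + 1.5 * p * math.sqrt(total + 1) / (1 + n)
        move = max(stats, key=puct)
        value = -simulate(apply_move(pos, move))   # side to move flips
        stats[move][0] += 1                        # back up the result
        stats[move][1] += value
        return value

    for _ in range(simulations):
        simulate(root)
    visits = sum(n for n, _, _ in tree[root].values())
    return sum(w for _, w, _ in tree[root].values()) / max(visits, 1)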


Aurelian Florea wrote on Fri, Dec 15, 2017 11:52 AM UTC:

I really have to read that article :)!


V. Reinhart wrote on Fri, Dec 15, 2017 05:16 PM UTC:

As for AlphaZero (AZ) playing chess against humans, this much is pretty clear:

Stockfish >> human
AZ >> Stockfish

So obviously:

AZ >> human

("Stockfish" denotes the chess engine supported by a typical desktop CPU. Its performance against AZ with stronger hardware has not been tested)

Two comments I have about AZ:

1) AZ (currently) requires supercomputer-equivalent support (application-specific devices its developers call TPUs or "tensor processing units").

2) AZ and its related programs have also become very good at playing Shogi and Go. I don't see any reason why it could not master every chess variant I've ever seen. It's just the time (programming of the rules) and the required hardware that would deter its developers from doing this. There are certainly many other things for neural networks to be studying, so I don't anticipate AZ will "invade" the chess variant world.


H. G. Muller wrote on Fri, Dec 15, 2017 07:52 PM UTC:

Well, the 64-core setup used for Stockfish in the match (or was it in reality 32 cores with hyper-threading?) was a bit more powerful than the 'typical PC', which nowadays has only 4 cores.

Note that the TPUs are not really more powerful than top-of-the-line CPU chips, in terms of number of transistors or power consumption. It is just that they do completely different things, things useful for running AlphaZero. If AlphaZero had to run on an ordinary CPU, it would be orders of magnitude slower. OTOH, if Stockfish had to run on a TPU, it probably would not be able to run at all.

But as applications using neural nets get more common, it is conceivable that future PCs will have a built-in TPU as a standard feature. There was a time when floating-point calculations were considered such a difficult and specialized task that you needed a separate co-processor chip for them (the 8087) next to the CPU (the 8086). From the 80486 on, the floating-point unit was included in the CPU chip. TPUs might go the same way: first available on a plug-in card about as expensive as your motherboard (+ components), like powerful video cards for gaming now; then as an add-on feature on the motherboard itself, where you just plug in the optional chip; and then integrated on the CPU chip itself. There is a limit to the usefulness of ever more cores for the average PC user; having more than two cores is already a dubious asset for most people. Having 2 cores plus a TPU would probably be much better, once neural networks are more commonly used in software.

