Edited by wetware (20 December 2012, 02:23:11)
Czuch: I agree with those who've stated that your opponent should have doubled at every opportunity at this match score, beginning with their very first turn. Each opportunity they missed was a mistake. At this point--now that they've gained just a single point--they should be thinking: "If only I had doubled sooner, and my opponent had taken! If I'd won, I'd only have been 2 points away from victory--and that would have meant my only needing to win 1 more game, because I'll double immediately in the next game, too."
I believe that you were correct to pass when you did, Czuch. At that point, your opponent was about a 60% favorite to win the game. You must also consider the relatively few--but exceedingly painful--gammon losses that could instantly lose the match with the cube on 2. All considered, it's a bit too much risk for you to be taking. By passing, you only allow your opponent's match winning chances to creep up slightly--from 31% when this game began, to 32.26% now that the match score has reached 3 away / 1 away. You're not sacrificing much by doing that.
toedder: there are various METs (Match Equity Tables). gnubg--and any other capable playing software--should let you specify which MET you want to use for cube decisions. METs are tables showing the theoretically expected match winning percentages from various match scores...assuming equal-strength opponents.
METs are less important at scores such as your 17 away / 16 away. But I'm currently trying to memorize those values out to about 7 away / 7 away for real-life tournament play.
toedder: discussion of a similar position can be found on pp. 66-68 of Improve Your Backgammon by Lamford and Gasquoine.
However, their initial example is at a much different match score: with the cube already "yours" at 2, and with you leading in the match by a score of 3 away / 5 away. Given that situation, the correct cube action is: No Double / Take. (By their methods, you would need to be a 62.5% favorite to bear off in that situation for the redouble to be correct.)
They do also discuss a general approach to the cube decision, given any particular match score. (So, I'd love to know the exact match score in your case--just how early in the match was it?)
Also mentioned in the book is the correct cube action if the position were played as a money game: Double / Redouble / Take.
wetware: I've abandoned my efforts to gather any more data relating to 2nd roll behavior. Have submitted a Bug Tracker entry containing a summary of my 2008-2009 games, despite BK's a priori claim that problem reports involving dice are groundless.
Thanks to those of you who helped with the statistical analysis or provided other insights.
No signs of anything unusual in the individual dice values from the opening rolls of my 702 games from 2008. The distribution looks close to uniform to me.
INDIVIDUAL DICE:
1 on die = 232 occurrences
2 on die = 245 occurrences
3 on die = 234 occurrences
4 on die = 236 occurrences
5 on die = 231 occurrences
6 on die = 226 occurrences
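For anyone who wants to put a number on "nothing unusual", here is a minimal sketch of a chi-square goodness-of-fit check on the counts above. The fair-die model and the 5% threshold are my assumptions for illustration. It also treats the 1404 dice as independent draws, which is only approximately true because opening rolls never contain doubles.

```python
# Chi-square goodness-of-fit check on the 2008 opening-roll die faces.
# Counts are copied from the post; the fair-die model and the 5% threshold
# are assumptions made for this illustration.
observed = {1: 232, 2: 245, 3: 234, 4: 236, 5: 231, 6: 226}

total = sum(observed.values())   # 1404 dice = 702 games x 2 dice
expected = total / 6             # 234 per face if every face is equally likely

chi_square = sum((count - expected) ** 2 / expected for count in observed.values())

# Critical value of the chi-square distribution for 5 degrees of freedom, alpha = 0.05.
CRITICAL_5DF_5PCT = 11.070

print(f"total dice = {total}, expected per face = {expected:.0f}")
print(f"chi-square = {chi_square:.3f} (critical value {CRITICAL_5DF_5PCT})")
if chi_square < CRITICAL_5DF_5PCT:
    print("no evidence against equally likely faces")
else:
    print("face frequencies look suspicious")
```

The statistic comes out well under 1, far below the critical value, which agrees with the impression that the individual faces are fine.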
I'm not as confident about the paired values observed in these 702 games. But I'm not alarmed by these, either.
I think most of us didn't suspect that there was anything odd about opening rolls, considered in isolation. But whether we were skeptical or not on that point...we really had no data. But now we do. Sometimes it's worthwhile to gather data that helps us eliminate areas of concern; it can help us identify exactly what is broken here.
Next on my to-do list: analysis of my next to last pairs of rolls from my games from 2009. Between the first pairs of rolls, we've already seen greatly excessive reappearance of at least 1 die from the opening rolls. What I'd like to learn: whether there is any similar excess evident between rolls that occur later in games, or whether the focus needs to be placed upon the second rolls of games--and the routines that generate them. I have no preconception here. I'm going where the data leads.
[Raw data for what follows is available on request. Just send me a message with your email address.]
Some figures based upon every one of my year 2008 BrainKing backgammon games (n=702) in which at least 2 rolls are saved in the system:
Out of 702 games played, the expected number of games in which opener's and responder's rolls are identical = 39 games (702 x 1/18). Observed number of games in which identical rolls were seen = 123 games.
Roughly 3 times the expected frequency!
Out of 702 games played, the expected number of games in which responder's dice both differ from opener's dice = 312 games (702 x 4/9). Observed number of games in which this occurred = 156 games. (Somebody should double-check this. It's very close if not correct, but I'm tired.)
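To express how far these 2008 counts sit from expectation, here is a rough sketch that treats each game as an independent trial (a binomial model, which is my assumption) and reports the gap in standard deviations. The helper name `binomial_z` is made up for this example.

```python
import math

def binomial_z(n_games, p, observed):
    """Return (expected count, standard deviation, z-score) under a binomial model."""
    mean = n_games * p
    sd = math.sqrt(n_games * p * (1 - p))
    return mean, sd, (observed - mean) / sd

N_2008 = 702  # games with at least 2 saved rolls, from the post

# Responder's roll identical to opener's roll: fair probability 1/18.
match_mean, match_sd, match_z = binomial_z(N_2008, 1 / 18, 123)

# Both responder dice differ from both opener dice: fair probability 4/9.
differ_mean, differ_sd, differ_z = binomial_z(N_2008, 4 / 9, 156)

print(f"identical rolls: expected {match_mean:.1f}, sd {match_sd:.2f}, z = {match_z:+.1f}")
print(f"both dice differ: expected {differ_mean:.1f}, sd {differ_sd:.2f}, z = {differ_z:+.1f}")
```

Both counts land more than ten standard deviations from expectation, in opposite directions. The same function can be reused for the 2009 figures by changing the inputs.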
playBunny: your speculation about the exceptions possibly being caused by the "swap dice" function is intriguing! My data (selected by you and shown below) showed no exceptions. I rarely click "swap dice" on the opening rolls...only when required, to get past an opponent's point.
playBunny wrote: "If one of the dice is always the same and the other is a fair roll then duplication of both would occur with a frequency of 1/6 rather than the 1/18 that's expected, so 3 times more often."
Some error like that could explain all the other figures seen so far: the excessive exact matches, the excessive near misses, etc.
And IF in 1/2 (exactly or approximately) of cases, 1 of the opener's dice is being "re-used" to generate responder's roll...we would overall also expect to see (exactly or approximately) 1/2 the number of expected cases where BOTH of responder's dice differ from opener's roll.
And that is in fact what the data showed for my 2009 games: Average expectation of responder's dice both differing from opener's dice out of 137 played = 60.888_ games. Observed number=28 games
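The "re-used die" idea can be turned into a toy model: in some fraction of games one responder die is a copy of an opener die and the other is fair, and in the rest both dice are fair. This is only a sketch of that hypothesis, not a claim about how the site's dice routine works. The function name `mixture` and the reuse fraction are assumptions.

```python
from fractions import Fraction

FAIR_MATCH = Fraction(1, 18)        # fair responder roll equals opener's non-double roll
FAIR_BOTH_DIFFER = Fraction(4, 9)   # fair responder roll shares no die with opener
REUSE_MATCH = Fraction(1, 6)        # one die copied, the other fair (playBunny's case)
REUSE_BOTH_DIFFER = Fraction(0)     # impossible once a die has been copied

def mixture(reuse_fraction):
    """Probabilities of (exact match, both dice differ) for a given reuse fraction."""
    f = Fraction(reuse_fraction)
    match = f * REUSE_MATCH + (1 - f) * FAIR_MATCH
    both_differ = f * REUSE_BOTH_DIFFER + (1 - f) * FAIR_BOTH_DIFFER
    return match, both_differ

match, both_differ = mixture(Fraction(1, 2))
print(f"P(exact match) = {match}, P(both differ) = {both_differ}")

for n_games in (137, 702):
    print(f"{n_games} games: expect {float(n_games * match):.1f} exact matches, "
          f"{float(n_games * both_differ):.1f} with both dice different")
```

With a reuse fraction of 1/2 the "both differ" probability drops from 4/9 to 2/9, which is the halving described above. The same setting gives 1/9 for exact matches, so a single reuse fraction may not account for every figure; it is a starting point for testing, nothing more.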
playBunny: I've mentioned the excessive frequency of "near-misses" below (when discussing my 2009 data). But that could result, as you suggest, solely from the excessive re-appearance of just 1 of the dice in the responder's roll. In my 2008 data, the frequency of responder's dice not matching either of the opener's dice was only about 1/2 of expectation.
No doubt that the re-appearance of one of the dice is excessive. Later today, I'll have a better idea just how excessive it is. And I will take a look to see whether the "other" die in such cases appears to be completely independent, or also shows signs of unusual influence.
2 other notes regarding the exclusive focus upon the first 2 rolls of the game:
Psychological: I think humans tend to notice/remember items that appear near the beginnings or ends of lists or sequences. It's an effect seen in some memory tasks. That might have been a factor here. I think that repeated rolls would more easily get our attention when they occur from the commonly-seen, symmetrical, initial position. Typically, we don't have much complicated stuff to think about during the opening rolls--maybe trying to remember what's best in a GG situation--so we can afford to think about other stuff...such as the frequencies and patterns of rolled dice.
Practical: As an investigator, I can be more confident that games will contain at least 2 rolls. That doesn't always happen, due to timeouts, etc. But it makes data capture much easier.
wetware: analysis of my year 2008 games is still in progress. 313 of ~700 games complete. (My earlier guesstimate of ~1000 games was off the mark.)
So far I see no sign of anything unusual in the individual dice values on the opening roll. (I'll wait for the values from all ~700 games before I look for any pairwise strangeness.)
Here were the individual dice frequencies from the first 313 games:
1 on die = 111 occurrences
2 on die = 102 occurrences
3 on die = 104 occurrences
4 on die = 100 occurrences
5 on die = 116 occurrences
6 on die = 93 occurrences
Early results from 2008 show an excess (approximately 4 times the mean expected frequency) of responder rolls exactly matching opening rolls. I'm still aiming to finish and report tomorrow. Will save my raw observed values as a text file, for anyone who'd like them. File will also include--for each game--the URL of the page showing the opening and responding dice rolled. You'll be able to check my transcription error rate :-)
playBunny: Look on the bright side, pB: we've been given a new, unannounced game/puzzle/variant!
It's quite a bit like backgammon. It's similar enough, in fact, that those who prefer to make their moves and cube decisions just as their bots dictate (these players know who they are) can continue to do so with success. Others can continue playing, without ever realizing that it's a subtle variant--but is not standard backgammon. Still others eventually wake up to the fact that the normal rules no longer fully apply, and that adjustments will be necessary to maximize their equity during play.
But it's a significant challenge to determine exactly which adjustments are needed. So it also has an added element of mystery: requiring clues, evidence, and deduction.
I rather like the social element of this new game: players working together as a team to find the solution.
nabla: But that would be too easy...cheating me of all the fun of capturing these values by hand! :-)
As I mentioned to you in a message, let me now state on the board:
"...based in part on Alan's recent comments, I'm tempted to also capture the actual rolls for aggregate analysis. Based on observed rolls from 1000+ of my games [all of year 2008], we should then have a fairly solid idea whether or not there's a problem with the basic frequency of the dice rolled. It would be good to know whether we should suspect that a fault lies there, or whether we should spend time looking elsewhere."
Resher: Thanks so much for the SD calculations! I knew that was critical for us to express just how extreme these results are, but I've forgotten too much of my statistics coursework and tools.
FYI: my next data set (matches from 2008) will be ~10 times the size of the results I reported from 2009.
Thanks, guys! I am now planning to conduct a similar review of my 124 matches from 2008. Expected completion: some time this weekend.
Alan: I didn't care about the order of the dice for these purposes. (I.e., I considered 52 and 25 to be a match.)
Another oddity noticed during review, but not yet reported: the "misses" show a strong tendency to be "near-misses". For example, if an opening roll of 42 is not exactly matched by responder's roll, the responder's roll will show excessively high occurrences of 52, 43, 41, or 32. That is to say: even when you do manage to miss, you're too often "off by 1".
The most extreme outlier was a match with Hrqls (Backgammon, Hrqls vs. wetware), in which 9 games out of 14 were exact matches, and the remaining 5 non-matches were all of the "off by 1" variety. Note: I'm inclined to consider 1 to be "next" to 6. I think it's reasonable, especially if some "remainder" function is at play in the dice generation routine(s), as is often the case.
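To make the "near-miss" category reproducible, here is one possible way to code it: one responder die repeats an opener die and the other is one face away, with 1 counted as adjacent to 6. The function names and the exact rule are my reading of the description above, so treat them as assumptions.

```python
from itertools import product

def cyclic_gap(a, b):
    """Distance between two die faces when 1 is treated as adjacent to 6."""
    d = abs(a - b)
    return min(d, 6 - d)

def is_exact_match(opener, responder):
    """Order of the dice is ignored, so 52 and 25 count as the same roll."""
    return sorted(opener) == sorted(responder)

def is_near_miss(opener, responder):
    """One responder die repeats an opener die; the other is off by one face."""
    if is_exact_match(opener, responder):
        return False
    o1, o2 = opener
    # Try the responder dice in both orders against the opener dice.
    for r_a, r_b in (responder, responder[::-1]):
        if r_a == o1 and cyclic_gap(r_b, o2) == 1:
            return True
        if r_b == o2 and cyclic_gap(r_a, o1) == 1:
            return True
    return False

opener = (4, 2)
all_rolls = list(product(range(1, 7), repeat=2))
near_outcomes = [r for r in all_rolls if is_near_miss(opener, r)]
distinct = sorted({tuple(sorted(r, reverse=True)) for r in near_outcomes})

print(f"near-misses for opener 42: {distinct}")
print(f"{len(near_outcomes)} of 36 equally likely responder outcomes")
```

For an opening 42 this picks out 52, 43, 41 and 32, which is 8 of the 36 ordered outcomes with fair dice. Under this rule the count differs for some other opening rolls (for example when the opener's dice are themselves adjacent), so a fair baseline should be computed per opening roll.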
[cross-posting from the "Feature requests" board, where a discussion developed recently]
I wish the system made it easier to gather the data that would demonstrate the severity of this problem. As I've said elsewhere (I think), it could also help pinpoint when the problem began (or was it always this way?), which should help identify what changed to cause it. I also wish the problem occurred 100% of the time, which would make it impossible to dismiss. As it is, we have only the beginnings of statistical support for our claim, and gathering the aggregate data that would provide more solid support is a daunting task...for us, the members.
Fencer is quite correct about the groundless wailing over poker or dice cheats. There's plenty of that online; such complainers are easy to find, and the vast majority deserve our scorn and satire. But the genuine exceptions (in online poker especially) that have come to light should give one pause.
Hats off to those who stood their ground in such cases, labored to gather the data, and revealed the truth at last.
(added comment Wednesday evening / Thursday morning): I plan to make one last good-faith effort to present some useful data. I plan to examine all of my backgammon (but no variants) games stored here from 2009--that's 13 matches, and a total of 137 games in which at least 2 rolls took place. I have no reason to believe that that year is appreciably better or worse than any other year of mine. I'm sorely tempted to separately track the opening rolls that I "won" and "lost", but I don't plan to do so for this exercise. If I'm understanding the situation correctly (and there are good "numbers" people here who will be able to correct me if I'm wrong), the responder's roll should theoretically match the opener's roll 1 in 18 times (or 5.555_ %), on average. I'm also going to report the percentage of responder's rolls where both dice differ from the opener's roll. Theoretically, I think that should happen 4 times in 9 (or 44.444_%), on average. But I think the actual observed value is going to shock and convince even the most diehard skeptics here.
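The 1-in-18 and 4-in-9 figures can be checked by brute force over the 36 equally likely responder outcomes. A small sketch, assuming fair dice and an opening roll that is never a double; `responder_probabilities` is just a name chosen for this example.

```python
from fractions import Fraction
from itertools import product

def responder_probabilities(opener):
    """Return (P(exact match), P(both dice differ)) for a fair responder roll."""
    opener_faces = set(opener)
    match = both_differ = 0
    for roll in product(range(1, 7), repeat=2):
        if sorted(roll) == sorted(opener):
            match += 1            # same two faces, order ignored
        if roll[0] not in opener_faces and roll[1] not in opener_faces:
            both_differ += 1      # neither die repeats an opener die
    return Fraction(match, 36), Fraction(both_differ, 36)

match_p, differ_p = responder_probabilities((5, 2))
print(f"P(exact match) = {match_p}, P(both differ) = {differ_p}")

N_2009 = 137  # games with at least 2 rolls, from the post
print(f"expected exact matches in {N_2009} games: {float(N_2009 * match_p):.4f}")
print(f"expected both-differ games in {N_2009} games: {float(N_2009 * differ_p):.4f}")
```

Any non-double opening roll gives the same two probabilities, so the choice of 52 here is arbitrary.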
Average expectation of responder's roll exactly matching opener's roll (both dice), out of 137 played = 7.6111_ games. Observed number=38 games (27.737%)
That's about 5 times the expected frequency!
Average expectation of responder's dice both differing from opener's dice out of 137 played = 60.888_ games. Observed number=28 games (20.438%)
That's less than half of what's expected.
If I feel energetic, I'll analyze my 124 matches from 2008!
Thom27: I agree, and have spent an inordinate amount of time analyzing results here. (I wish it were easier to do, or that the flaw was ALWAYS evident, or the underlying pattern easier to discern.) I'll continue playing less and less here, until it's remedied. (I want to spend more time playing and studying--not investigating an unusual system flaw.) I'm generally an outspoken skeptic of various "dice cheating" claims, but what's happening here is too much of an outlier to be dismissed cavalierly.
I also tried (unsuccessfully) to identify WHEN this skewness began. I hoped that by doing so, I might help the powers-that-be to identify some code or procedural change that might have triggered it.
Subject: Re: Most games are begin with same rolling dice numbers..
Pedro Martínez: I can't provide statistics, but my impression is that the bias is well worth investigating. I think it happens often enough that some of the opening rolls should be played sub-optimally here.
spirit_66: I suppose someone could create something for the "Zillions of Games" engine that would play Crowded. From what I've seen, Zillions had backgammon, nack-, hyper-, and "deadgammon" (no idea what that is) capability.
playBunny: In such cases, I hate the pretense and the unnecessary effort, pB. Why should I have to go to the trouble of logging in, managing vacation days, and my remaining time, if my opponent is actually gnubg...which I could play at any time without those added steps? And to think I may even have exchanged pleasantries with opponents who did little more than relay moves that were chosen by their program--well, I can't say it makes me furious, but I'm clearly not pleased by such actions.
I must say, however, that I haven't seen very much of that here. I was mistaken when my earlier message said "...coming here to BK". I ought to have corrected that. Sure, there are problems here, but I've seen worse elsewhere--and that's what was on my mind when I wrote.
aaru: At some point, although absolute proof may still be out of reach, it is more reasonable to conclude that cheating has taken place. On another site, I analyzed several players' matches. In one case, I found at least 3 lengthy (21-point) matches where the player had 0 moves marked very bad, 0 moves marked bad, and just 1 marked doubtful--at gnubg's default settings.
If I wanted to play against gnubg, I could do that without coming here to BK.
I've just noticed some absences from the upper tier of the ratings list. I'm not sure when it happened, but I'm glad it finally did. (It was long overdue.)
AlliumCepa: I *think* that they're trying to find some means of analyzing the rolls themselves--in isolation--apart from the way they might be used in backgammon. I suspect there are (or were, a while ago) some fairly regular departures from randomness in the routine(s) that govern the rolls here. But analyzing and demonstrating that would have been difficult.
Subject: Generate A Backgammon Board Interactively
I like this tool. GABBI is--according to their web page--"...a free web utility that allows you to create and save a backgammon board diagram. You can create the board manually and/or import/export to or from gnubg."
Thad: No dice change here, Thad...as far as I know :-)
I was just exploring the specific difference that would have resulted from the kind of cheating that took place, according to a news story that I provided a link to (below) a few days ago.
That news story wasn't too clear as to whether just 1 die or both of them were changed, nor did it specify how or when the change was made.
Thad: And will our maths experts please tell me the new average # of backgammon pips per roll if one die has been changed from a 2 to a 5; and if both dice have been changed from 2's to 5's? I haven't had enough of my morning coffee to contemplate the new permutations.
As we know, the average for a normal pair at backgammon is 8 1/6...
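For those who'd rather let a script do the counting, here is a quick enumeration. It assumes doubles count as four moves (twice the pip total), that every pip can actually be played, and that the altered die simply has its 2 face replaced by a second 5. The `ALTERED` tuple and `average_pips` are names made up for this sketch.

```python
from fractions import Fraction
from itertools import product

NORMAL = (1, 2, 3, 4, 5, 6)
ALTERED = (1, 5, 3, 4, 5, 6)   # hypothetical die with the 2 replaced by a 5

def average_pips(die_a, die_b):
    """Average pips per roll, with doubles played four times."""
    total = 0
    for a, b in product(die_a, die_b):
        total += 2 * (a + b) if a == b else a + b
    return Fraction(total, len(die_a) * len(die_b))

cases = {
    "normal pair": average_pips(NORMAL, NORMAL),
    "one die altered": average_pips(ALTERED, NORMAL),
    "both dice altered": average_pips(ALTERED, ALTERED),
}
for label, value in cases.items():
    print(f"{label}: {value} = {float(value):.4f} pips per roll")
```

Under those assumptions the normal pair gives 49/6 (the familiar 8 1/6), one altered die gives 53/6 (8 5/6), and two altered dice give 89/9 (about 9.89).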
Thad: My suspicion was that they'd just make a switch (of one die? or maybe both--but that would seem too obvious, wouldn't it?) whenever a game was about to become a straight race. But it might be interesting to test whether a switch made at the start would be a help, as Thad suggested.
Edited by wetware (11 January 2008, 06:37:51)
alanback: Yessir! And for Cloning Backgammon, too, it would be helpful to differentiate between the cloned checkers on the bar and the ones sent there by being hit.
Those of us who've played real-life tournaments do have substantial chunks of opening theory memorized--first moves for sure, plus most of the replies (and sometimes even more deeply than that). This includes the proper cube action in some cases. This opening theory is also "tuned" to the current match score, and in some cases even to one's knowledge of an opponent's playing style.
For purposes of learning, I'd suggest trying to think things through yourself and making your move--then consulting references immediately afterward, while the position and roll is still fresh in your mind.
I hope folks like alanback and playBunny weigh in on this subject; they know a lot more than I do.
Puckish: I think that cheating is far from obvious in the case you described (compared with the evidence nabla was able to provide). You probably wouldn't like to play against me--I often take a lot of time to evaluate tough double offers, or when deciding to make such offers. I spent nearly an hour on one this morning, and would hate to think that anyone would conclude that I'm obviously cheating. Some of us just work hard.
Thad: That could also prove a bit tricky. You'd need something extra to tell the difference between a completed match (shown as 1-0) and a match that currently stands at a score of 1-0. A cute solution might involve current match scores being shown on "mouse-over" of each dash or of each table cell. (But at this point, I'm hesitant to suggest adding much more HTML code to our pages!)
nabla: Ah, so it is! I'd searched for an existing bug report, but obviously not well enough.
Have just started playing at DailyGammon (as "wetware" there, and also at NetGammon). Will check their multi-game .MAT output shortly, though I presume it's correct. Since I now play enough bg matches here that are worth analyzing afterward, I might try creating a small Windows executable to auto-reformat BK's .MAT output (no matter how long a match), to make it compliant.
Edited by wetware (19 October 2007, 05:23:10)
Has anyone here seen strange results when exporting .MAT files from multi-game BrainKing matches, and then importing those files into gnubg? I don't think gnubg is reading them properly. The program can follow the individual game move sequences, and allow navigation from game to game, but the running match scores are seriously messed up. (For example, it appears that a player can actually lose points from game to game!) I think it's also not always correct about which player is which.
I'm very skeptical about overall match analyses as a result, because I don't think gnubg is consistently evaluating the same person's play throughout a match!
Anyway, would love to hear your experiences. I'm fairly new to gnubg and the .MAT file format.