BrainKing - Discussion Boards (Backgammon)

Subject: Re: Ratings

tonyh: The problem, and it has been talked about a lot in the past, is that all games on BrainKing currently use the same rating system, one designed for chess, which is (for the sake of argument) 95% skill, 5% luck.

So when you apply that same rating system to games like Backgammon (65% skill, 35% luck), Battleboats (25% skill, 75% luck), Ludo (30% skill, 70% luck), Dice Poker (35% skill, 65% luck), etc., a system designed for a mostly-skill game does not produce the same results for luck games.

PLEASE NOTE: I'm not trying to start an argument about how much luck/skill goes into each game - the percentages I wrote above are just quick numbers I made up.

What the site really needs is at least 2 different rating systems - 1 for mostly skill games (chess, checkers), and 1 for games that deal more with luck (dice games).

Hopefully some day Fencer will add that (and go back and recalculate ratings from the start).

9. November 2010, 15:17:09

Pedro Martínez

Subject: Re:

paully: At dailygammon.com, not only do you get more points for winning larger point games than for single games, but you also get banned for no reason.

9. November 2010, 15:17:00

Bwild

Subject: Re: Ratings

playBunny:

9. November 2010, 15:02:24

Subject: Re: Ratings

Edited by playBunny (9. November 2010, 15:04:05)

tonyh: The solution is simple. First, find a Fencer who cares. Second, ....... Actually I'm not sure what comes next, we've never got that far.

9. November 2010, 12:37:48

Subject: Re:

pgt: at dailygammon.com you get more points for winning larger point games than for single games

9. November 2010, 12:12:51

pgt, New Zealand, Brain Knight, male

pgt

Yes, the system is rubbish. I just checked out a waiting game, and I stood to lose 83 if I lost the game, and get 1 point if I won. Why would I even bother? What encouragement is there for good players to play against novices?
And winning (or losing) by a gammon or backgammon should involve a premium. It's ridiculous that the same points are used when one wins by a backgammon and when there is just one point difference at the end.

9. November 2010, 10:40:35

tonyh

Subject: Ratings

I am playing someone who is rated 180 points below me. If he wins, he gets 12 points; if I win, I get 4 points. So over 10 games, I would need to win 8 games to get 32 points and make a net profit of 8 points. That is just not going to happen against a reasonably skilled player; there is just too much luck in backgammon games. It means that I am virtually forced to play opponents at my current rating and so my choice of opponents is very restricted.
A much more sensible approach would be to have 7/9 range for a difference of 150 rating points and 6/10 difference for all the rest.
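
The arithmetic in this post can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 4 and 12 are the point values quoted above.

```python
from fractions import Fraction

def break_even_win_rate(gain_per_win, loss_per_defeat):
    """Win fraction p at which p * gain equals (1 - p) * loss."""
    return Fraction(loss_per_defeat, gain_per_win + loss_per_defeat)

def net_points(wins, games, gain_per_win, loss_per_defeat):
    """Net rating change after `games` games of which `wins` were won."""
    return wins * gain_per_win - (games - wins) * loss_per_defeat

p = break_even_win_rate(4, 12)
print(p)                          # 3/4
print(net_points(8, 10, 4, 12))   # 8
print(net_points(7, 10, 4, 12))   # -8
```

So with a 4/12 split the higher-rated player has to win three games in four just to stand still.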

23. October 2010, 20:32:35

Subject: Re:

wetware: I've abandoned my efforts to gather any more data relating to 2nd roll behavior. Have submitted a Bug Tracker entry containing a summary of my 2008-2009 games, despite BK's a priori claim that problem reports involving dice are groundless.

Thanks to those of you who helped with the statistical analysis or provided other insights.

22. October 2010, 20:55:52

Subject: Re:

Czuch: I think they have a good 12 months left in them yet

22. October 2010, 17:42:29

I like it fast, 5-point Hyper, 1st November 2010

Subject: Fast Hypergammon 2010

A monthly series of fast-paced Hypergammon tourneys

I like it fast, 5-point Hyper, 1st December 2010

22. October 2010, 17:23:32

Czuch

Just over 5 years now for this match....

Backgammon (cardinal vs. Baked Alaskan)

24. September 2010, 01:20:32

I like it fast, 5-point Hyper, 1st October 2010

Subject: Fast Hypergammon 2010

A monthly series of fast-paced Hypergammon tourneys

I like it fast, 5-point Hyper, 1st November 2010

16. August 2010, 05:15:33

rod03801

Subject: Re: Fast whale sheet censoring, August 2010

paully: This is a board about backgammon, not moderating. The rules are in place, and easy to follow.

16. August 2010, 05:11:33

Subject: Re: Fast whale sheet censoring, August 2010

rod03801: I have no issue with whether another mod would have edited it or not. I am just stating that I think it was silly moderation, IMHO. Each person has their own opinion, right? Obviously yours differs from mine. That's OK, you are a different person than I am.

16. August 2010, 05:09:16

rod03801

Subject: Re: Fast whale sheet censoring, August 2010

paully: Any other mod would have moderated it too. Take it to PM, thank you.

16. August 2010, 05:05:05

Subject: Re: Fast whale sheet censoring, August 2010

playBunny: LOL
Maybe Rod just had a protected childhood or something.
The SH word is far, far from swearing.

Sayings such as "no sh** sherlock", "holy sh**", etc. make the word fairly commonplace.

16. August 2010, 04:39:10

Subject: Re: Fast whale sheet censoring, August 2010

playBunny: I can't believe that the sh word is deemed swearing. That surprised me. I should have said whale dung.

15. August 2010, 23:32:19

Subject: Re: Fast whale sheet censoring, August 2010

paully: comparing a Boeing 747 compared to whale .

Didn't you say something else there? I think it got shovelled. ;o)

15. August 2010, 16:21:30

Subject: Re: Fast hypergammon climb, July 2010

Edited by paully (15. August 2010, 16:49:00)

playBunny: well, considering my probably 6-0 lead in a 9 point match I think I have a 50-50 chance given your skill level compared to mine is similar to comparing a Boeing 747 compared to whale .

My comment was completely due to the "wick" at the end of your link. That was a little uncalled for don't you think? Especially considering you "knew" it wasn't bogus !!!

15. August 2010, 09:48:18

Subject: Re: Fast hypergammon climb, July 2010

paully: I find it interesting you would post such a post to make people think I did some weird stuff to climb.

Lol. I find it interesting that you think I posted it for that reason. ;-)

It's an impressive climb, almost 350 points in two weeks, and a sight to behold - for those who can see it! Lol. ;-P

If I thought it was due to cheating then I'd have said so. If others want to think that, given just a graph and no comment from me, well, it's their choice to jump to uninformed conclusions. Intelligent observers look for more evidence than a line on a graph, they look for the reason for the line on the graph.

In contrast to me not saying anything about your climb being bogus (because it isn't), you have clearly suggested malice and spite as my motivation. Don't you think that such an accusation is rather incompatible with publicly suggesting a phone call? LOL A good time to call? A good time is when you're not being such a flaming galah! ;O)

ps. I'll allow that using a wink smiley with the graph was slightly suggestive. ;-)

pps. Being a much under-rated player in the context of the BrainKing Chessgammon rating formula is bad enough but, with your paranoid projection as well, I think there's good reason to boot you from the tourney!

I won't, of course, 'cos I'm a nice bunny and I like you. But only on one condition.... You have to beat me in our match at Pocket-Monkey! Do you reckon you can do it? ;o)

15. August 2010, 03:08:37

Subject: Re: Fast Hypergammon 2010

playBunny: That link doesn't work for me. Being a pawn, I don't have access to my graphs, and you still own all my won membership.

I find it interesting you would post such a post to make people think I did some weird stuff to climb. I think you know me better than that, or is it out of spite coz we haven't spoken on the phone for a number of weeks? LOL

Speaking of which, we should do another phone hookup soon. Lemme know when a good time is.

Oh and FWIW, my rating is still far below what it should be.

15. August 2010, 02:16:45

Subject: Re: Fast Hypergammon 2010

paully: woo hoo I made it into the tournament !!!

That sure was a fast climb!

15. August 2010, 01:39:13

Subject: Re: Fast Hypergammon 2010

playBunny: woo hoo I made it into the tournament !!!

14. August 2010, 14:55:46

Subject: Re: Fast Hypergammon 2010

paully: actually my rating plummeted when I left for a while and my games timed out

14. August 2010, 14:52:34

Subject: Re: Fast Hypergammon 2010

paully: More like BrainKing's rating formula is too backgammon-unfriendly for you to enter.

14. August 2010, 14:50:07

Subject: Re: Fast Hypergammon 2010

playBunny: oops looks like my rating is too low to enter

14. August 2010, 12:38:48

I like it fast, 5-point Hyper, 1st September 2010

Subject: Fast Hypergammon 2010

A monthly series of fast-paced Hypergammon tourneys

I like it fast, 5-point Hyper, 1st October 2010

24. July 2010, 17:29:37

Subject: Re:

No signs of anything unusual in the individual dice values from the opening rolls of my 702 games from 2008. The distribution looks normal to me.

INDIVIDUAL DICE:
1 on die = 232 occurrences
2 on die = 245 occurrences
3 on die = 234 occurrences
4 on die = 236 occurrences
5 on die = 231 occurrences
6 on die = 226 occurrences
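
For anyone who wants to check the "looks normal" impression numerically, here is a minimal chi-squared goodness-of-fit sketch using the six counts above. It assumes each face should appear equally often; the 11.07 threshold is the standard 5% table value for 5 degrees of freedom, and the variable names are only illustrative.

```python
# Observed face counts from the opening rolls of the 702 games (2 dice each).
observed = {1: 232, 2: 245, 3: 234, 4: 236, 5: 231, 6: 226}

total = sum(observed.values())      # 1404 dice in all
expected = total / 6                # 234 per face if the dice are fair

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((count - expected) ** 2 / expected for count in observed.values())

# Standard table value: 5% critical point for 5 degrees of freedom.
CRITICAL_5PCT_DF5 = 11.07

print(round(chi2, 3))               # 0.863
print(chi2 < CRITICAL_5PCT_DF5)     # True: no evidence against fair single dice
```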

I'm not as confident about the paired values observed in these 702 games. But I'm not alarmed by these, either.

PAIRED DICE:
21 = 54 occurrences
31 = 35 occurrences
32 = 65 occurrences
41 = 48 occurrences
42 = 38 occurrences
43 = 54 occurrences
51 = 37 occurrences
52 = 39 occurrences
53 = 42 occurrences
54 = 64 occurrences
61 = 58 occurrences
62 = 49 occurrences
63 = 38 occurrences
64 = 32 occurrences
65 = 49 occurrences
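
The paired values can be given the same rough treatment. The sketch below assumes that all 15 non-double opening rolls are equally likely (doubles being re-rolled at the start of a game); the two thresholds are standard table values for 14 degrees of freedom, and the names are illustrative only.

```python
# Observed opening pairs from the 702 games, keyed as printed above.
pairs = {
    "21": 54, "31": 35, "32": 65, "41": 48, "42": 38,
    "43": 54, "51": 37, "52": 39, "53": 42, "54": 64,
    "61": 58, "62": 49, "63": 38, "64": 32, "65": 49,
}

games = sum(pairs.values())       # 702
expected = games / len(pairs)     # 46.8 per pair if all 15 are equally likely

chi2 = sum((count - expected) ** 2 / expected for count in pairs.values())

# Standard table values for 14 degrees of freedom.
CRITICAL_5PCT_DF14 = 23.68
CRITICAL_1PCT_DF14 = 29.14

print(round(chi2, 2))             # 33.34
print(chi2 > CRITICAL_1PCT_DF14)  # True
```

Under that equal-likelihood assumption the statistic lands above both table values, so the pairs may deserve a closer look than the single dice do.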

I think most of us didn't suspect that there was anything odd about opening rolls, considered in isolation. But whether we were skeptical or not on that point...we really had no data. But now we do. Sometimes it's worthwhile to gather data that helps us eliminate areas of concern; it can help us identify exactly what is broken here.

Next on my to-do list: analysis of my next to last pairs of rolls from my games from 2009.
Between the first pairs of rolls, we've already seen greatly excessive reappearance of at least 1 die from the opening rolls. What I'd like to learn: whether there is any similar excess evident between rolls that occur later in games, or whether the focus needs to be placed upon the second rolls of games--and the routines that generate them. I have no preconception here. I'm going where the data leads.

21. July 2010, 23:09:13

Subject: Re: Adios, null hypothesis!

playBunny: Thanks for the explanation :-)

21. July 2010, 00:03:28

Subject: Re: Adios, null hypothesis!

alanback: I described simulation in my earlier post. Backgammon (playBunny, 2010-07-18 14:28:23) If you are simulating the real dice action at the start of the game then each player occasionally will get the same dice, as they do in real life, and the roll will have to be done over. If you play GnuBg then you'll see that it does this, for example "A new session has been started --- GnuBg rolls 4, playBunny rolls 4 --- GnuBg rolls 2, playBunny rolls 4". It's not strictly necessary to go through those motions, as a binary coin toss will suffice, but that's how the GnuBg programmers did it and maybe Fencer liked the idea too.
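
As a rough illustration of the two approaches described here, the sketch below shows a simulated "each player rolls one die, re-roll on a tie" opening next to a coin-toss opening. Nothing in it is taken from BrainKing or GnuBg source code; the function names are hypothetical, and drawing two distinct values in the coin-toss version is my own assumption so that the opening is never a double.

```python
import random

def opening_roll(rng):
    """Each side rolls one die; ties are rolled again. Returns (starter, dice)."""
    while True:
        a, b = rng.randint(1, 6), rng.randint(1, 6)
        if a != b:
            starter = "player1" if a > b else "player2"
            return starter, (a, b)

def opening_roll_coin_toss(rng):
    """Pick the starter with a coin toss, then draw a non-double roll."""
    starter = rng.choice(["player1", "player2"])
    a, b = rng.sample(range(1, 7), 2)   # two distinct values, never a double
    return starter, (a, b)

rng = random.Random(2010)
print(opening_roll(rng))
print(opening_roll_coin_toss(rng))
```

Both versions give every non-double roll the same chance; the first simply has one more place, the re-roll loop, where state could be mishandled.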

20. July 2010, 18:16:08

Subject: Re: Adios, null hypothesis!

Resher: Why would anyone want to re-roll if the first rolls were the same? And what do you mean by "simulated"? I personally never had a vision of Fencer rolling an actual pair of dice every time I click ...

20. July 2010, 14:20:45

Resher

Subject: Re: Adios, null hypothesis!

wetware: Your expected numbers of the different types of responder roll look right to me. Using them, I get a chi-squared statistic of 274, when I'd expect a figure of 13.8 or above for only 1 in a 1000 samples (years in this case) if the dice rolls were totally independent and generated fairly. So we're talking odds of many, many millions to one against this being the case.

I think by now that most of us are agreed on this being caused by non-independence of the opening rolls rather than non-fair "dice" being used. But data and stats are fascinating, so feel free to produce more!

Hopefully some of your analysis will give someone some insight as to when at least part of the opener's roll is used as part of the responder's roll too. My guess would be that the actual rolling is simulated and so re-rolls will be generated if both players are assigned the same first roll, and it's this re-rolling that isn't working properly. I think pB's already suggested this. Hard to test though ....

19. July 2010, 05:51:50

Subject: Adios, null hypothesis!

[Raw data for what follows is available on request. Just send me a message with your email address.]

Some figures based upon every one of my year 2008 BrainKing backgammon games (n=702) in which at least 2 rolls are saved in the system:

Out of 702 games played, the average expectation for the number of games in which opener's and responder's rolls will be identical = 39 games.
Observed number of games in which identical rolls were seen=123 games.

Roughly 3 times the expected frequency!

Out of 702 games played, the average expectation for the number of games in which responder's dice will both be different from opener's dice = 312 games.
Observed number of games in which this occurred=156 games. (Somebody should double-check this. It's very close if not correct, but I'm tired.)

That's exactly half of what's expected.
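
The expected figures in this post, and the chi-squared statistic they lead to, can be reproduced as follows. This is a sketch under two assumptions: the fair-dice probabilities are the 2/36, 18/36 and 16/36 quoted elsewhere in this thread, and the "exactly one die shared" count is derived here as the remainder of the 702 games rather than reported directly.

```python
from fractions import Fraction

GAMES = 702

# Fair-dice probabilities for the responder's roll versus a non-double opener.
P_IDENTICAL = Fraction(2, 36)     # same two dice, order ignored
P_ONE_SHARED = Fraction(18, 36)   # exactly one die in common
P_NONE_SHARED = Fraction(16, 36)  # both dice differ from the opener's

expected = [GAMES * p for p in (P_IDENTICAL, P_ONE_SHARED, P_NONE_SHARED)]

observed_identical = 123
observed_none = 156
# Derived, not reported: every remaining game shares exactly one die.
observed_one = GAMES - observed_identical - observed_none

observed = [observed_identical, observed_one, observed_none]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print([int(e) for e in expected])   # [39, 351, 312]
print(round(float(chi2)))           # 274
```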

18. July 2010, 18:49:25

Subject: Re: Dupled, dupled dice are trouble.

wetware: I think humans tend to notice/remember items that appear near the beginnings or ends of lists or sequences. It's an effect seen in some memory tasks.

Yes, respectively, the primacy and recency effects.

18. July 2010, 17:54:22

Subject: Re: Dupled, dupled dice are trouble.

Edited by wetware (18. July 2010, 18:31:42)

playBunny: your speculation about the exceptions possibly being caused by the "swap dice" function is intriguing! My data (selected by you and shown below) showed no exceptions. I rarely click "swap dice" on the opening rolls...only when required, to get past an opponent's point.

18. July 2010, 17:38:31

Subject: Re: Dupled, dupled dice are trouble.

Edited by wetware (18. July 2010, 17:40:40)

playBunny wrote: "If one of the dice is always the same and the other is a fair roll then duplication of both would occur with a frequency of 1/6 rather than the 1/18 that's expected, so 3 times more often."

Some error like that could explain all the other figures seen so far: the excessive exact matches, the excessive near misses, etc.

And IF in 1/2 (exactly or approximately) of cases, 1 of the opener's dice is being "re-used" to generate responder's roll...we would overall also expect to see (exactly or approximately) 1/2 the number of expected cases where BOTH of responder's dice differ from opener's roll.

And that is in fact what the data showed for my 2009 games:
Average expectation of responder's dice both differing from opener's dice out of 137 played = 60.888... games.
Observed number=28 games
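
The "half of expectation" reasoning can be written out as a small calculation. In this sketch the reuse rate of 1/2 is the post's own IF, not a measured value, and the constant names are purely illustrative.

```python
from fractions import Fraction

GAMES = 137
P_NONE_SHARED_FAIR = Fraction(16, 36)   # both responder dice miss the opener's two values
REUSE_RATE = Fraction(1, 2)             # assumed share of games where a die is re-used

# With fair, independent dice:
fair_expectation = GAMES * P_NONE_SHARED_FAIR

# A game with a re-used die can never have zero dice in common, so only the
# remaining games contribute to the "both differ" count.
with_reuse = GAMES * (1 - REUSE_RATE) * P_NONE_SHARED_FAIR

print(round(float(fair_expectation), 2))   # 60.89
print(round(float(with_reuse), 2))         # 30.44
```

The second figure sits close to the 28 games actually observed.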

18. July 2010, 15:31:04

Subject: Re: Dupled, dupled dice are trouble.

playBunny: I've mentioned the excessive frequency of "near-misses" below (when discussing my 2009 data). But that could result, as you suggest, solely from the excessive re-appearance of just 1 of the dice in the responder's roll. In my 2008 data, the frequency of responder's dice not matching either of the opener's dice was only about 1/2 of expectation.

No doubt that the re-appearance of one of the dice is excessive. Later today, I'll have a better idea just how excessive it is. And I will take a look to see whether the "other" die in such cases appears to be completely independent, or also shows signs of unusual influence.

2 other notes regarding the exclusive focus upon the first 2 rolls of the game:

Psychological: I think humans tend to notice/remember items that appear near the beginnings or ends of lists or sequences. It's an effect seen in some memory tasks. That might have been a factor here. I think that repeated rolls would more easily get our attention when they occur from the commonly-seen, symmetrical, initial position. Typically, we don't have much complicated stuff to think about during the opening rolls--maybe trying to remember what's best in a GG situation--so we can afford to think about other stuff...such as the frequencies and patterns of rolled dice.

Practical: As an investigator, I can be more confident that games will contain at least 2 rolls. That doesn't always happen, due to timeouts, etc. But it makes data capture much easier.

18. July 2010, 14:28:23

Subject: Re: Dupled, dupled dice are trouble.

playBunny: I agree with Thad that the number generator itself is probably okay and that it's the use that's at fault.

I think that the problem is only on the opening rolls. I know that the checking of pairs of dice within games hasn't been done yet but I suspect that a high occurrence of duplication, such as there is for the opening rolls, would have been noticed much sooner than now and by many more people.

I know that I'd occasionally notice that the opponent's opening dice came out the same as mine, or vice versa, but I never went beyond that, to seeing it as a pattern. If it were happening throughout the game then I'm sure that I would have noticed and other, more observant people, would have seen it sooner.

So, assuming that it is an opening rolls issue, we must be looking for code that is special to the start of a game. One obvious contender is the rolling of dice for who goes first. In real backgammon, each player rolls a dice and the one with the higher value gets both to play with. After that first move the two players pick up their individual dice and thereafter take care of their own rolls.

If I were coding a backgammon server then I wouldn't bother with that. I'd simply toss a binary digit to see who was to start and then roll the starting player's dice using the same code as every other roll. However, if I were to code a simulation of the real live start action then there'd be the opportunity for error.

What might happen then is that I use one dice from each player for the first player's roll but then re-use one of those dice for the second player, presumably the dice that they rolled to see who started.

This, if Fencer is doing such a simulation, is the prime suspect for the bug. If you look at the example matches below then you can see clearly that there's at least one common dice in the majority of games. It's something that should be frequent (55% - the same odds as getting one man off the bar into a home table with 2 points open) but not that frequent.

If one of the dice is always the same and the other is a fair roll then duplication of both would occur with a frequency of 1/6 rather than the 1/18 that's expected, so 3 times more often.
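
The 55%, 1/18 and 1/6 figures can be confirmed by brute-force enumeration. The sketch below is only a model of the hypothesis in this post: the "buggy" responder roll is assumed to copy one of the opener's dice and roll the other fairly, and the helper name is made up.

```python
from fractions import Fraction
from itertools import product

def same_roll(r1, r2):
    """True when two rolls show the same two dice, order ignored."""
    return sorted(r1) == sorted(r2)

opener = (6, 1)   # any non-double opening roll gives the same answers

# Fair responder: two independent dice, 36 equally likely outcomes.
fair = list(product(range(1, 7), repeat=2))
p_fair_duplicate = Fraction(sum(same_roll(opener, r) for r in fair), len(fair))
p_fair_shared = Fraction(sum(bool(set(opener) & set(r)) for r in fair), len(fair))

# Hypothesised bug: first die copied from the opener, second die fair.
buggy = [(opener[0], y) for y in range(1, 7)]
p_bug_duplicate = Fraction(sum(same_roll(opener, r) for r in buggy), len(buggy))

print(p_fair_duplicate)                 # 1/18
print(p_fair_shared)                    # 5/9, about 55.6%
print(p_bug_duplicate)                  # 1/6
print(p_bug_duplicate / p_fair_duplicate)   # 3
```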

The interesting thing about this bug is that there are exceptions. Although there are none in wetware's matches below, there are a few in mine and more in Resher's. That must be caused by something. One theory is that perhaps when the starting player swaps the dice before moving this somehow breaks the connection between the first and second players' dice. I can't think why that should be the case and haven't played any matches that I can test the theory with.

18. July 2010, 14:23:10

Subject: Re: Dupled, dupled dice are trouble.

wetware: Are you, by any chance, recording how often the opponent gets one of the starter's dice? In the example games that I did back in November, in one of the matches that occurred with every single game.

And here are the first 5 matches on your finished games page. Red denotes one or two common dice, bold black shows where there are no common dice.

Backgammon (blue sky vs. wetware)
46 ... 61
35 ... 63
23 ... 42
53 ... 53
15 ... 52
53 ... 53
63 ... 63

Backgammon (wetware vs. blue sky)
62 ... 62
41 ... 51
41 ... 51
64 ... 64
23 ... 22
65 ... 55

Backgammon (blue sky vs. wetware)
61 ... 26
34 ... 33
31 ... 31

Backgammon (blue sky vs. wetware)
12 ... 21
64 ... 64
65 ... 65

Backgammon (wetware vs. mertos)
12 ... 11
21 ... 21
12 ... 21
25 ... 62
12 ... 21
65 ... 65

And here are the first 7 hypergammon matches of mine:

Hyper Backgammon (playBunny vs. Sarah)
32 ... 31
56 ... 51
12 ... 51
24 ... 62
56 ... 52
35 ... 51

Hyper Backgammon (playBunny vs. Varazslo)
23 ... 42
34 ... 33
13 ... 31

Hyper Backgammon (playBunny vs. Petromil)
63 ... 63
16 ... 61
34 ... 33

Hyper Backgammon (sascham vs. playBunny)
35 ... 63
14 ... 41

Hyper Backgammon (playBunny vs. Ian C)
35 ... 63
43 ... 22
42 ... 32
15 ... 62
52 ... 53
63 ... 65

Hyper Backgammon (Doris vs. playBunny)
63 ... 63
25 ... 62

Hyper Backgammon (Helena vs. playBunny)
36 ... 41
12 ... 11
41 ... 41

And here's a single 11-pointer of Resher's

Backgammon (Karol G. vs. Resher)
63 ... 11
43 ... 45
61 ... 51
23 ... 51
41 ... 11
63 ... 51
34 ... 62
63 ... 52
65 ... 51
65 ... 52
63 ... 62
12 ... 52
64 ... 42
41 ... 51

18. July 2010, 07:06:27

Subject: Re:

wetware: analysis of my year 2008 games is still in progress. 313 of ~700 games complete. (My earlier guesstimate of ~1000 games was off the mark.)

So far I see no sign of anything unusual in the individual dice values on the opening roll. (I'll wait for the values from all ~700 games before I look for any pairwise strangeness.)

Here were the individual dice frequencies from the first 313 games:

1 on die = 111 occurrences
2 on die = 102 occurrences
3 on die = 104 occurrences
4 on die = 100 occurrences
5 on die = 116 occurrences
6 on die = 93 occurrences

Early results from 2008 show an excess (approximately 4 times the mean expected frequency) of responder rolls exactly matching opening rolls. I'm still aiming to finish and report tomorrow. Will save my raw observed values as a text file, for anyone who'd like them. File will also include--for each game--the URL of the page showing the opening and responding dice rolled. You'll be able to check my transcription error rate :-)

16. July 2010, 15:51:04

Subject: Re: Opening rolls for Brainking dice

playBunny: Look on the bright side, pB: we've been given a new, unannounced game/puzzle/variant!

It's quite a bit like backgammon. It's similar enough, in fact, that those who prefer to make their moves and cube decisions just as their bots dictate (these players know who they are) can continue to do so with success. Others can continue playing, without ever realizing that it's a subtle variant--but is not standard backgammon. Still others eventually wake up to the fact that the normal rules no longer fully apply, and that adjustments will be necessary to maximize their equity during play.

But it's a significant challenge to determine exactly which adjustments are needed. So it also has an added element of mystery: requiring clues, evidence, and deduction.

I rather like the social element of this new game: players working together as a team to find the solution.

And it's a whodunit.

16. July 2010, 15:33:47

Subject: Re:

nabla: But that would be too easy...cheating me of all the fun of capturing these values by hand! :-)

As I mentioned to you in a message, let me now state on the board:

"...based in part on Alan's recent comments, I'm tempted to also capture the actual rolls for aggregate analysis. Based on observed rolls from 1000+ of my games [all of year 2008], we should then have a fairly solid idea whether or not there's a problem with the basic frequency of the dice rolled. It would be good to know whether we should suspect that a fault lies there, or whether we should spend time looking elsewhere."

16. July 2010, 12:49:19

nabla

Subject: Re: Opening rolls for Brainking dice

Thad: That is my guess as well. It is much more probable that the error lies in wrongly reusing already used dice than in the random number generator.

16. July 2010, 12:47:33

nabla

Subject: Re:

grenv: Even easier and more reliable, since all games are recorded: query the database for every pair of first and second rolls in every game from the last x months. I'd expect the results to be overwhelmingly abnormal.

16. July 2010, 07:41:08

Thad

Subject: Re: Opening rolls for Brainking dice

My guess would be that the random number generating function is fine. After all, think how hard it would be to try to write a random number function but actually write something that produces the results we are seeing. The bad code would be obvious. Instead, I suspect that the code is not being called properly. Consider this outline for the code:

Whose turn? - Player 1
What do we need to do (accept double, accept draw, roll dice, etc.)
Roll dice
Show dice
Player 1 moves
Player 2's turn
What do we need to do (accept double, accept draw, roll dice, etc.)
Show dice
Player 2 moves

It would be pretty easy to bury a bug that could give us the results we are seeing with something like this.
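
To show how such a slip could hide in an outline like the one above, here is a toy Python sketch. It is purely hypothetical: none of these names come from BrainKing, and it illustrates just one possible way (a skipped roll leaving stale dice in shared state) that a sound random number function could still yield repeated dice.

```python
import random

class Game:
    """Toy game state; every name here is hypothetical, not BrainKing's code."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.dice = None          # last dice rolled, shared by both players

    def roll_dice(self):
        self.dice = (self.rng.randint(1, 6), self.rng.randint(1, 6))

    def start_turn(self, player, needs_roll):
        # The caller decides whether a roll is needed. If that decision is
        # wrong, the previous player's dice are simply shown again.
        if needs_roll:
            self.roll_dice()
        return player, self.dice

game = Game()
_, first = game.start_turn("Player 1", needs_roll=True)
_, second = game.start_turn("Player 2", needs_roll=False)   # the slip
print(first == second)   # True: Player 2 sees Player 1's dice
```

The random number generator is never at fault here; the repetition comes entirely from how the roll is (not) called.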

16. July 2010, 01:06:29

grenv

Surely the easiest way to analyze this is by getting the source code and running a large number of tests... Fencer? I suggest making at least the random number generating code available so others can analyze properly. Either it isn't a defect and can be proved, or it is and can be fixed.

15. July 2010, 19:59:09

Subject: Re: Opening rolls for Brainking dice

Resher: Thanks for the analysis.

Of course, there's no reason to believe this phenomenon is limited to opening rolls. In general, it seems one should bear in mind the enhanced probability of duplicate or similar rolls in planning strategy.

Has anyone run a test on the distribution of single die rolls? One way that these observed deviations from the norm could arise would be if, say, the chance of rolling a 4 on a single die was significantly higher than it should be. Depending on the pseudo RNG that is used, this might be a simpler explanation than any theory involving pairs of dice.

15. July 2010, 19:50:56

Resher

Subject: Re: Opening rolls for Brainking dice

Edited by Resher (16. July 2010, 15:51:07)

I've performed a Chi-squared test on the hypothesis that the probabilities of responder's first roll having 0, 1 and 2 dice the same as opener's roll are as they should be, that is 16/36, 18/36 and 2/36 respectively.

This is a test with 2 degrees of freedom, so the chi-squared statistic has:
a 5% chance of exceeding 6.0 if the probabilities are correct,
a 1% chance of exceeding 9.2
a 0.5% chance of exceeding 10.6, and
a 0.1% chance of exceeding 13.8

alanback's result (from 55 games) is 9.2. A result this high or greater would happen only 1% of the time, so this is enough to cause suspicion that the dice aren't following the desired probabilities. But it's not proof. Also, this test is reckoned to give very accurate results only if the expected outcomes are all greater than 5. So, with our smallest probability being 2/36, this means we need at least 90 games in our sample for me to be happy beyond reasonable doubt about the conclusion.

So, moving on to my results (100 games), I get a statistic of 102.

And lastly, wetware's results (137 games) give a statistic of 139.

Remember, if the dice rolls are working properly, there's only a 1 in a 1000 chance of this chi-squared statistic being 13.8 or higher in any individual test, so the conclusion can't really be in doubt - something is wrong somewhere.
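
The critical values and the 90-game minimum quoted in this post can be reproduced with the standard library alone. The sketch relies on a known property of the chi-squared distribution with 2 degrees of freedom, namely that its upper tail probability is exp(-x/2); the function name is just for illustration.

```python
import math
from fractions import Fraction

def chi2_critical_2dof(tail_probability):
    """For 2 degrees of freedom P(X > x) = exp(-x / 2), so x = -2 * ln(p)."""
    return -2.0 * math.log(tail_probability)

critical = {p: round(chi2_critical_2dof(p), 1) for p in (0.05, 0.01, 0.005, 0.001)}
print(critical)    # {0.05: 6.0, 0.01: 9.2, 0.005: 10.6, 0.001: 13.8}

# Smallest sample for which the rarest class (probability 2/36) has an
# expected count of at least 5.
min_games = math.ceil(Fraction(5) / Fraction(2, 36))
print(min_games)   # 90
```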

15. July 2010, 19:21:18

Subject: Re: Opening rolls for Brainking dice

playBunny: I think we all know the answer to that.

Just out of curiosity I looked at the 55 games in matches I have completed in 2010. There are 8 in which the first two rolls were the same (same two dice, order not considered), versus the predicted 3 and change. Both dice different, predicted 24, actual 19.

15. July 2010, 19:08:41