[Feature Request] Add the matchID or another unique identifier to the replay
Hey,
this is not really a request for the s2protocol, but for the contents of a replay. Currently it is quite hard to determine that two or more replays are from the same match, each saved by a different match participant (player, observer, etc.). There is no matchID in it, the match time is based on the user's PC, gameloops can differ depending on when a player leaves a match (especially for team matches), etc. There must be a matchID somewhere in the Battlenet-universe for a match. Is it possible to include this in the replay, or some other unique hash, to identify the match the replay belongs to?
Thanks and greetings,
Andy!
I think the PC vs server time is the key issue here. If any two replays with the same participants have overlapping time ranges we can assume that they are from the same match, but when user PC times are not in sync we run into problems. I've raised this issue with @koalaling in the past. I hope that they are able to make that addition in a future patch.
That said, the above approach still works 99% of the time since most people seem to sync their PC clocks to the internet. You could augment it further by also using the initData['lobbyState']['randomSeed']. It seems highly unlikely to me that two replays with the same set of players will use the same 32-bit random seed. Even less likely if you require that those two games be within some reasonable time range of each other (to account for bad PC times).
If you use the random seed I would think it almost certain that you wouldn't get any false positives. False negatives could maybe be merged via a more manual process when they occur.
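A minimal sketch of that check, assuming each replay has already been reduced to a small dict. The keys `random_seed`, `players` and `end_time`, the example toon strings and the 30 minute tolerance are all made up for illustration; how you pull the values out of the replay depends on your parser.

```python
from datetime import datetime, timedelta


def same_match(a, b, tolerance=timedelta(minutes=30)):
    """Return True if two replay summaries plausibly describe one match.

    Each summary is a hypothetical dict with a random seed, a frozenset of
    player identifiers, and the end time recorded by the saving PC.
    """
    return (
        a["random_seed"] == b["random_seed"]
        and a["players"] == b["players"]
        and abs(a["end_time"] - b["end_time"]) <= tolerance
    )


players = frozenset({"1-S2-1-111", "1-S2-1-222"})  # invented identifiers
replay_a = {"random_seed": 123456789, "players": players,
            "end_time": datetime(2015, 3, 1, 20, 15, 0)}
# Same match saved on another PC whose clock runs a few minutes ahead.
replay_b = {"random_seed": 123456789, "players": players,
            "end_time": datetime(2015, 3, 1, 20, 18, 30)}
# Same players, but a different game (different seed).
replay_c = {"random_seed": 987654321, "players": players,
            "end_time": datetime(2015, 3, 1, 20, 40, 0)}

print(same_match(replay_a, replay_b))
print(same_match(replay_a, replay_c))
```

The time tolerance only guards against the unlikely case of a repeated seed for the same players; the seed and player set do most of the work.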
Hmm, thanks for the tip with the randomSeed, will check this in the next few days! Unfortunately, not everybody syncs their computer time with a timeserver. I checked three 2on2 matches to test it and every participant had a different time in it, especially because the time is in units of 100 nanoseconds, as you know ;) So you always have a few nanoseconds difference, even if you sync with a timeserver. Maybe this "gap" gets lost if you convert it to seconds and drop the nanoseconds... But because of rounding issues, there will be a one-second difference which will cause "another replay" :/ Maybe the randomSeed helps here, maybe in combination with other values like the map and/or playerList, as you mentioned!
The other thing you can do to help deal with this is take the time and buffer it by a few minutes. Doing that seemed to really help me out. The reason you're going to see a different time on every player is because they don't all save it at the same time, which is why I added something like a 5 min buffer to my logic to weed them out.
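To illustrate the rounding problem, here is a small sketch. It assumes the timestamp is a Windows-FILETIME-style count of 100 ns ticks since 1601-01-01 UTC; that epoch is an assumption on my part, and the function name is made up.

```python
from datetime import datetime, timedelta, timezone

# Assumed epoch for the 100 ns tick counter (Windows FILETIME convention).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000  # 1 second = 10^7 ticks of 100 ns


def ticks_to_datetime(ticks):
    """Convert 100 ns ticks to a UTC datetime, truncated to whole seconds."""
    return FILETIME_EPOCH + timedelta(seconds=ticks // TICKS_PER_SECOND)


# Two timestamps that are a single tick (100 ns) apart...
t1 = ticks_to_datetime(116444736009999999)
t2 = ticks_to_datetime(116444736010000000)

# ...still land in different seconds after truncation.
print(t1, t2, t1 == t2)
```

So dropping the sub-second part does not make the times comparable with `==`; comparing with a tolerance does.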
But ya having a match id would be helpful, and save some guessing.
I agree that having a synced time in the replay would be fantastic. Next platform patch for them will almost certainly be more than a month out though so even if this gets included it won't be any time soon.
It doesn't matter that the times are not exact. The recorded time is the end of the game, use the game length to work backwards and get a start time. Then you can check replays for an overlap in start/end times and a common set of players. Even if a player's clock is off by 10 minutes there will still be overlap on a 15 minute game. See here.
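The overlap check can be sketched roughly like this. The helper names are hypothetical, and the game length is assumed to be already converted from gameloops to a `timedelta`.

```python
from datetime import datetime, timedelta


def match_interval(end_time, game_length):
    """Work backwards from the recorded end time to a (start, end) pair."""
    return end_time - game_length, end_time


def intervals_overlap(a, b):
    """True if two (start, end) intervals share at least one instant."""
    return a[0] <= b[1] and b[0] <= a[1]


length = timedelta(minutes=15)
# Reference replay: 20:00 to 20:15.
ref = match_interval(datetime(2015, 3, 1, 20, 15), length)
# Same game, saved on a PC whose clock is 10 minutes fast: 20:10 to 20:25.
skewed = match_interval(datetime(2015, 3, 1, 20, 25), length)
# A clock that is 20 minutes fast no longer overlaps: 20:20 to 20:35.
too_far = match_interval(datetime(2015, 3, 1, 20, 35), length)

print(intervals_overlap(ref, skewed))   # True
print(intervals_overlap(ref, too_far))  # False
```

The tolerated clock skew scales with the game length, so very short games are the weak spot of this approach.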
@GraylinKim: Yeah, the randomSeed is the same in all replays of a match for my test matches and different between matches, so this seems to work! :) Thanks for that! I think I may combine it with the map name and build a hash over it, so I have a unique identifier (per match)! But what is the randomSeed anyway? :)
The Starcraft II game engine is deterministic. Given the same inputs it will produce the exact same game every single time. That's how the replays work and why they can be so compact.
By basing all pseudo-random numbers in the game off of this fixed random seed the game client can generate the same exact random numbers in the replay as it did in the original game without having to record them all. Likewise for games with multiple clients (e.g. all ladder games), it allows each computer to roll random numbers independently without fear that the numbers will be different on computer A vs computer B.
Imagine for example that critter movement is based on random numbers. If the random numbers are not synced the critter will be in different locations on each client and a valid marine kill in one will be completely invalid in another one. This divergent game state is super deadly so they always make sure to sync on a random seed before starting any game.
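A toy illustration of the idea, using Python's own `random` module rather than anything from the actual game engine. The `simulate_critter` function is invented for the example.

```python
import random


def simulate_critter(seed, steps=5):
    """Walk a critter around a grid using a seeded pseudo-random generator."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    x = y = 0
    path = []
    for _ in range(steps):
        x += rng.choice((-1, 0, 1))
        y += rng.choice((-1, 0, 1))
        path.append((x, y))
    return path


# Two "clients" that agreed on the seed compute identical critter paths
# without ever exchanging the individual random numbers.
client_a = simulate_critter(seed=42)
client_b = simulate_critter(seed=42)
print(client_a == client_b)  # True
```

With different seeds the two paths would drift apart almost immediately, which is the divergent game state described above.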
Anyway, glad I could help.
Aaah, ok :) All clear now, thanks! Btw, I implemented it a bit differently. I didn't use the map name because this can be in the user's language. So if a Korean and a German play together, my hash would be different, because for the German it's "Starstation" and for the Korean whatever it is in Korean characters ;) I used the players' userIds + their toons + the random seed. This makes a semi-unique hash per match, which is OK to determine the rounds of a bestOfX match, but not OK as an overall unique matchId replacement... Here's the code: https://github.com/TurtleEntertainment/aiur/blob/master/teSc2ReplayParser.py#L287 Thanks again and hopefully Blizzard / @koalaling can provide us with a matchId in the future!? :)
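For readers who don't want to follow the link, the general idea looks roughly like the sketch below. This is not the linked implementation; the function name, the separator format and the choice of SHA-1 are assumptions made for the example.

```python
import hashlib


def match_key(random_seed, players):
    """Build a semi-unique match key from the seed and the players.

    `players` is an iterable of (user_id, toon) pairs. They are sorted first
    so the key does not depend on the order in which a replay lists them.
    """
    parts = [str(random_seed)]
    parts += [f"{user_id}:{toon}" for user_id, toon in sorted(players)]
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


# Invented toon strings; the same match seen from two replays.
key_1 = match_key(123456789, [(0, "2-S2-1-111"), (1, "2-S2-1-222")])
key_2 = match_key(123456789, [(1, "2-S2-1-222"), (0, "2-S2-1-111")])
print(key_1 == key_2)  # True
```

Nothing language-dependent (such as the map name) goes into the key, so replays saved by clients in different locales still hash to the same value.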
Just an FYI hotslogs.com uses random seed + player IDs to determine unique matches. Across millions of replays there were only a handful of duplicates with random seed by itself, so I think this is fine.