PV, value and comment parser in CSA
I have added a parser for comments, PVs, and evaluation values in CSA files. It should be useful for analyzing game records from floodgate. An example of the input and output is shown below.
input
'バージョン
V2.2
'対局者名
N+NAKAHARA
N-YONENAGA
'棋譜情報
'棋戦名
$EVENT:13th World Computer Shogi Championship
'対局場所
$SITE:KAZUSA ARC
'開始日時
$START_TIME:2003/05/03 10:30:00
'終了日時
$END_TIME:2003/05/03 11:11:05
'持ち時間:25分、切れ負け
$TIME_LIMIT:00:25+00
'戦型:矢倉
$OPENING:YAGURA
'平手の局面
P1-KY-KE-GI-KI-OU-KI-GI-KE-KY
P2 * -HI * * * * * -KA *
P3-FU-FU-FU-FU-FU-FU-FU-FU-FU
P4 * * * * * * * * *
P5 * * * * * * * * *
P6 * * * * * * * * *
P7+FU+FU+FU+FU+FU+FU+FU+FU+FU
P8 * +KA * * * * * +HI *
P9+KY+KE+GI+KI+OU+KI+GI+KE+KY
'先手番
+
'指し手と消費時間(optional)
+2726FU
'** 22 -8384FU
T12
-3334FU
T6
'** 0 +2625FU -8384FU +6978KI -8485FU +3938GI -7172GI +9796FU
+7776FU
'using csa format is a kind of torment!
%TORYO
output
{'moves': ['2g2f', '3c3d', '7g7f'], 'sfen': 'lnsgkgsnl/1r5b1/ppppppppp/9/9/9/PPPPPPPPP/1B5R1/LNSGKGSNL b - 1', 'names': ['NAKAHARA', 'YONENAGA'], 'win': 'b', 'values': [[22], [0], []], 'pvs': [[['8c8d']], [['2f2e', '8c8d', '6i7h', '8d8e', '3i3h', '7a7b', '9g9f']], []], 'comments': [[], [], ["'using csa format is a kind of torment!"]]}
Discussion
As far as I know, the CSA format does not define strict rules for determining which comments contain PVs and values. This PR therefore supports CSA files generated by floodgate. The comment parser in this PR raises an error when a comment's format differs from floodgate's format, so it is preferable to skip the comment parser when it raises an error.
This parser tries to support MultiPV and multiline comments by gathering the information into lists. However, floodgate itself supports neither MultiPV nor multiline comments.
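As a rough illustration of the floodgate comment format discussed above (`'** <value> <moves...>`, as seen in the example input), here is a minimal sketch. The names `parse_floodgate_comment` and `parse_comment_or_keep_raw` are hypothetical and are not the code in this PR. The sketch keeps PV moves as CSA strings instead of converting them to USI, and it shows the "skip on error" fallback by keeping the raw comment when parsing fails.

```python
import re

# Hypothetical pattern for a floodgate-style comment: "'** <value> <CSA moves...>".
# This is an assumption based on the example input, not a formal specification.
FLOODGATE_COMMENT = re.compile(r"^'\*\*\s+(-?\d+)((?:\s+[+-]\d{4}[A-Z]{2})*)\s*$")


def parse_floodgate_comment(line):
    """Return (value, pv) for a floodgate-style comment, or raise ValueError."""
    match = FLOODGATE_COMMENT.match(line)
    if match is None:
        raise ValueError("not a floodgate-style comment: %r" % line)
    value = int(match.group(1))
    pv = match.group(2).split()  # CSA move strings such as '-8384FU'
    return value, pv


def parse_comment_or_keep_raw(line):
    """Fall back to the raw comment text when the strict parser raises."""
    try:
        value, pv = parse_floodgate_comment(line)
        return {'value': value, 'pv': pv, 'comment': None}
    except ValueError:
        return {'value': None, 'pv': [], 'comment': line}
```

With this sketch, a plain comment such as the last one in the example input is preserved as text rather than aborting the whole parse.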
@qhapaq-49 Thank you for making the pull request. Sorry for my late review.
This pull request is very helpful for parsing floodgate CSA files. Also, I've read your tweet. https://twitter.com/Qhapaq_49/status/1543977762826113025
The discussion section is very helpful. My suggestion is to introduce a parser option that enables floodgate mode explicitly, add a new dict key `floodgate`, and store all the values extracted from the comments under that key.
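One possible shape for this suggestion, as a minimal sketch: the function name `collect_comments` and the `floodgate` keyword argument are hypothetical, and the PV moves are kept as CSA strings for brevity. Raw comments are always returned, and the floodgate-specific values appear only when the option is enabled.

```python
def collect_comments(comment_lines, floodgate=False):
    """Group comments per move; comment_lines holds one list of raw comment strings per move.

    When floodgate is True, values and PVs parsed from "'** ..." comments are
    stored under the 'floodgate' key. Malformed "'** " lines raise an exception
    here (ValueError or IndexError), since this sketch does no recovery.
    """
    result = {'comments': [list(comments) for comments in comment_lines]}
    if floodgate:
        values, pvs = [], []
        for comments in comment_lines:
            move_values, move_pvs = [], []
            for line in comments:
                if line.startswith("'** "):
                    fields = line[4:].split()
                    move_values.append(int(fields[0]))  # evaluation value
                    move_pvs.append(fields[1:])         # PV as CSA moves
            values.append(move_values)
            pvs.append(move_pvs)
        result['floodgate'] = {'values': values, 'pvs': pvs}
    return result
```

Keeping the floodgate data under its own key means callers that parse ordinary CSA files never see keys whose meaning depends on floodgate's conventions.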
Other comments:
- `value` is too common a name, but `evaluation_value` is too long. I don't have a good idea.
- `comments` could be sparse, so it could be a `dict`. If the comment parser is only for reading floodgate logs, as in the above suggestion, and the logs have many comments, keeping the current `list` is okay.
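To make the trade-off in the second point concrete, here is a small sketch of the sparse alternative. The helper name `to_sparse` is hypothetical; it converts the per-move list into a dict keyed by the zero-based move index and keeps only moves that actually have comments.

```python
def to_sparse(comments):
    """Convert a per-move list of comment lists into {move_index: comments}."""
    return {index: lines for index, lines in enumerate(comments) if lines}


# Dense form from the example output: one entry per move, mostly empty.
dense = [[], [], ["'using csa format is a kind of torment!"]]
sparse = to_sparse(dense)
```

The dense list is simpler to index alongside `moves`, while the dict saves space only when most moves have no comment.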
Thank you for your comment, and I am sorry for my late response. I agree with your suggestion. I will push the PR again after I have made an update based on your advice.