Feature Request: Add Search and Bulk Download Options to Contributions Page
Dear David,
I am writing to request two new features for the contributions page on the KataGo training website (https://katagotraining.org/contributions/):
- A search box to filter game files (especially the training game files).
- An option to select and download multiple game files at once (if feasible).
Reason for Request
Currently, the contributions page lists all uploaded self-play game files, but finding specific games or downloading many files individually is time-consuming. A search function would allow users to locate games by criteria such as model name, komi value, game type, or game outcome. A bulk download option would streamline retrieving multiple files, especially for users (like me) who need these games for local training.
Benefits
- Improved User Experience: Users can quickly find and access the games they want, and I believe it would also be helpful for those who just want to enjoy some AI games.
- Enhanced Training Resources: Easier access to multiple games enriches local training datasets, potentially improving model performance.
Specific Use Case
I aim to use these contributed games as loadable resources for local KataGo self-play training. Adding a search box and bulk download option would significantly simplify collecting and utilizing these valuable resources.
If possible, I’d appreciate it if you or some other contributors could consider implementing these features. Thank you for your time and for maintaining this amazing project!
Oh, I just found https://katagoarchive.org/kata1/index.html. The page is a bit rudimentary, and I hadn't looked at it carefully after clicking into it before.
I wrote a Python program to filter the 7-komi draw games of 25/2/23 (the currently strongest model's self-play games), and found that each .sgf file can be mapped to an .npz file by name, e.g., 0CA1B3F4617F63B18BCD501E6070D0D5.sgf and 0CA1B3F4617F63B18BCD501E6070D0D5.npz. So, are they generated from the same game? I'd like to simply reuse these .npz files as training data. This should not result in significant overfitting, since only 154 of the games from 25/2/23 are 7-komi draws, while the total number of training games that day is 17755.
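For reference, a minimal sketch (the function name and directory arguments are my own, for illustration) of how the two file types can be paired by shared base name:

```python
import os

def pair_sgf_npz(sgf_dir, npz_dir):
    """Match .sgf and .npz files that share the same base name."""
    sgf_names = {os.path.splitext(f)[0] for f in os.listdir(sgf_dir) if f.endswith('.sgf')}
    npz_names = {os.path.splitext(f)[0] for f in os.listdir(npz_dir) if f.endswith('.npz')}
    common = sgf_names & npz_names
    return [(os.path.join(sgf_dir, name + '.sgf'),
             os.path.join(npz_dir, name + '.npz'))
            for name in sorted(common)]
```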
The Python script I used to filter out the .sgf files:
# filter_sgf.py
import argparse
import os
import shutil

# Usage:
# python filter_sgf.py --source <source dir> --target <target dir> [--komi <komi>] [--result <result type>] [--size <board size>]
# python filter_sgf.py --source "G:\Projects\KataGo\Training\Trained Models\TrainingGames\2025-02-23sgfs\kata1-b28c512nbt-s8326494464-d4628051565" --target "G:\Projects\KataGo\Training\Trained Models\TrainingGames\kata1-b28c512nbt-s8326494464-d4628051565\Draw" [--komi 7] [--result 0] [--size 19]
# Matching pairs share a base name, e.g.:
# 0CA1B3F4617F63B18BCD501E6070D0D5.npz
# 0CA1B3F4617F63B18BCD501E6070D0D5.sgf

def main():
    # Build the command-line argument parser
    parser = argparse.ArgumentParser(description='Command-line tool to filter and copy .sgf files')
    parser.add_argument('--source', required=True, help='Source directory path (containing .sgf files)')
    parser.add_argument('--target', required=True, help='Target directory path (where filtered files are placed)')
    parser.add_argument('--komi', default='7', help='Komi value, e.g. 7')
    parser.add_argument('--result', default='0', help='Result type, e.g. 0 (a draw)')
    parser.add_argument('--size', default='19', help='Board size, e.g. 19')

    # Parse the user-supplied arguments
    args = parser.parse_args()
    source_dir = args.source
    target_dir = args.target
    komi = args.komi
    result = args.result
    size = args.size

    # Check that the source and target directories exist
    if not os.path.exists(source_dir):
        print(f"Error: source directory {source_dir} does not exist")
        return
    if not os.path.exists(target_dir):
        print(f"Error: target directory {target_dir} does not exist")
        return

    # Walk the source directory tree
    for root, _, files in os.walk(source_dir):
        for file in files:
            if file.endswith('.sgf'):
                file_path = os.path.join(root, file)
                with open(file_path, 'r', encoding='utf-8') as f:
                    content = f.read()
                # Check whether the file matches the filter criteria
                if f'KM[{komi}]' in content and f'RE[{result}]' in content and f'SZ[{size}]' in content:
                    target_path = os.path.join(target_dir, file)
                    if os.path.exists(target_path):
                        print(f"File {file} already exists in target directory, skipping.")
                    else:
                        shutil.copy2(file_path, target_path)
                        print(f"Copied file {file} to {target_dir}")

if __name__ == '__main__':
    main()
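As an aside, plain substring matching can be brittle if an SGF writer formats properties differently (e.g. KM[7.0] instead of KM[7]). A slightly more robust sketch first extracts the header properties with a regex (the property names KM, RE, SZ come from the SGF spec; the numeric tolerance for komi is my own choice, not something KataGo requires):

```python
import re

def sgf_props(content):
    """Extract SGF properties like KM, RE, SZ into a dict."""
    return {m.group(1): m.group(2)
            for m in re.finditer(r'([A-Z]{1,2})\[([^\]]*)\]', content)}

def matches(content, komi='7', result='0', size='19'):
    """Compare komi numerically so KM[7] and KM[7.0] both match komi='7'."""
    props = sgf_props(content)
    try:
        return (abs(float(props.get('KM', 'nan')) - float(komi)) < 1e-9
                and props.get('RE') == result
                and props.get('SZ') == size)
    except ValueError:
        return False
```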
Yes, the npz and sgf files correspond exactly.
I will be curious to see what your results are. I would expect that the training you are proposing to do has a good chance to either
- Make no difference because it is too small of a number of games or you don't train enough
- Or, if you do train for long enough, will eventually harm the model's strength by diminishing its ability to judge positions.
The reason is that if you do a lot of training that only includes draws and never give any training examples that aren't draws, the easiest way for the model's value head to perfectly fit that data is just to predict "all positions are always draws". After enough training, this will likely damage the ability of the value head to actually judge positions and to tell which positions are better and worse. If you start from an already-trained official model, this damage might take a while to happen since the model has built up a strong prior for judging positions correctly from all of its past training, but I would expect it to happen eventually.
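As a toy illustration of this failure mode (this is not KataGo's actual value head; just a one-layer tanh predictor over random stand-in features, fine-tuned with gradient descent on draw-only targets), the spread of value judgments across distinct positions collapses toward the constant "draw" prediction:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))      # stand-ins for position features
w = rng.normal(size=16) * 0.2       # "pretrained" weights: varied value judgments
b = 0.0

def predict(X, w, b):
    return np.tanh(X @ w + b)       # value in (-1, 1); 0 means "draw"

before = predict(X, w, b)

# Fine-tune on draw-only targets (value 0) with plain full-batch gradient descent on MSE
for _ in range(3000):
    v = predict(X, w, b)
    grad_z = 2.0 * v * (1.0 - v ** 2) / len(X)  # d(mean squared error)/d(pre-activation)
    w -= 0.3 * (X.T @ grad_z)
    b -= 0.3 * grad_z.sum()

after = predict(X, w, b)
# The standard deviation of predictions shrinks to near zero:
# the model now calls essentially every position a draw.
print(before.std(), after.std())
```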
Actually, I would like to use the draw and +0.5 games at the same time (especially W+0.5 at 6.5 komi and B+0.5 at 7.5 komi). If the model achieved a draw at 7 komi (which gives a roughly 50-50 win rate at the start), it means it either:
- made no mistakes from beginning to end, or
- made some completely acceptable mistakes, or even better,
- made severe mistakes twice, once as PW and once as PB.
So I think it would be helpful for it to develop the ability to play the proper move under different circumstances, as I mentioned yesterday at https://github.com/lightvector/KataGo/issues/1033#issuecomment-2708365132.
Besides, I'd like to use the .sgf files as loadable resources for local self-play, so as to simulate the replaying process that human players do. With this process, the hope is that the model might find the potential blind spot (case 3 above) and thereby improve its ability efficiently, since these would be "very special positions it hasn't seen before", as you mentioned in https://github.com/lightvector/KataGo/issues/1033#issuecomment-2708343321
since this will be a "very special positions it hasn't seen before"
I think you misunderstand what I meant. The model has definitely seen lots of draw games and 7-komi games before; these are not special positions at all. I'm referring more to things like Igo Hatsuyoron 120, which features a massive hanezeki, or artificially constructed seki shapes that never occur in real games, positions where the entire board is a single gigantic capturing race with a huge spiral, or positions where black has stones all around the entire first line and the komi is 360 so that black wins only if they kill all of white's stones.
These kinds of positions are extremely different from the ones that occur in real games, and normally the model is horribly bad at them but can improve rapidly with training. If you are merely talking about ordinary games with a certain komi, it's possible you could get moderate improvements via finetuning, and we also know a decent amount of improvement is possible via learning rate drops (the LR is being kept deliberately high in official models since slower LR drops tend to be better in the long term), but aside from those factors you are unlikely to get very large improvements.
My inspiration actually comes from a model game, presented below (the strongest vs. the strongest + 100 local training games). At that time the setting was 10s per move, and the best move (marked in green) was not found by either model during the game. The strongest model from the official website took about 20 seconds to find that green move, but after undoing a move it still chose the curve as its first choice.
So what I think is that each draw game might have a potential deciding move that the model did not find during its initial play. Then these games could be viewed as auto-generated Igo Hatsuyoron problems (I wasn't familiar with the English name, so I didn't mention it earlier). If the model finds the deciding move, it might improve a lot; if it doesn't, there won't be a great loss.
Now I have several locally trained (fine-tuned) models (filtered by the gatekeeper):
25_3_Custom_kata1-b28c512nbt-s8326501440-d6998.bin.gz (accepted; 100 local games)
25_3_Custom_kata1-b28c512nbt-s8326540224-d35275.bin.gz (accepted; 500 local); "500"
25_3_Custom_kata1-b28c512nbt-s8326592768-d57019.bin.gz (rejected; 900 local); "900"
25_3_Custom_kata1-b28c512nbt-s8326818144-d384117.bin.gz (rejected; 900 local + 4000 retraining); "900+"
25_3_Custom_kata1-b28c512nbt-s8326877824-d407968.bin.gz (accepted; 900 local + 4000 retraining + 500 local ~ 4000 retraining + 1400 local); "1400+"
25_3_Custom_kata1-b28c512nbt-s8327062496-d437830.bin.gz (accepted; 900 + 4000 + 1000 ~ 4000 + 1000); "1900+"
And I will simply refer to the current official strongest model as "the strongest" below. The copy of the strongest I used was downloaded about a week ago, so there might be small differences between it and the strongest on the website today.
The gatekeeper's log for these models:
500 (150 total):
2025-03-10 01:34:17+0800: Data write loop cleaned up and terminating for kata1-b28c512nbt-s8326501440-d6998 vs kata1-b28c512nbt-s8326540224-d35275
2025-03-10 01:34:17+0800: Candidate won match, score 75.000 to 71.000 in 146 games, accepting candidate kata1-b28c512nbt-s8326540224-d35275
~ 73±2
900 (150 total):
2025-03-10 18:06:43+0800: Data write loop cleaned up and terminating for kata1-b28c512nbt-s8326540224-d35275 vs kata1-b28c512nbt-s8326592768-d57019
2025-03-10 18:06:43+0800: Candidate lost match, score 74.500 to 75.500 in 150 games, rejecting candidate kata1-b28c512nbt-s8326592768-d57019
~ 75±0.5
900+ (200 total):
2025-03-11 11:10:07+0800: Data write loop cleaned up and terminating for kata1-b28c512nbt-s8326540224-d35275 vs kata1-b28c512nbt-s8326818144-d384117
2025-03-11 11:10:07+0800: Candidate lost match, score 87.500 to 100.500 in 188 games, rejecting candidate kata1-b28c512nbt-s8326818144-d384117
~ 94±6.5
1400+ (200 total):
2025-03-12 07:02:05+0800: Candidate won match, score 100.500 to 93.500 in 194 games, accepting candidate kata1-b28c512nbt-s8326877824-d407968
2025-03-12 07:02:07+0800: Moving G:\Projects\KataGo\Training\BaseDir\modelstobetested\kata1-b28c512nbt-s8326877824-d407968 to G:/Projects/KataGo/Training/BaseDir/models//kata1-b28c512nbt-s8326877824-d407968
~ 97±3.5
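For context on how noisy these match results are, here is a rough back-of-envelope sketch (my own estimate, not the gatekeeper's actual statistics) of the standard error on a match score, treating each game as an independent trial:

```python
import math

def match_score_stderr(score, games):
    """Rough standard error on a match score (wins + 0.5 * draws),
    treating each game as an independent trial with p = score / games."""
    p = score / games
    return math.sqrt(games * p * (1.0 - p))

# e.g. the "900" run: the candidate scored 74.5 of 150 games
print(match_score_stderr(74.5, 150))  # ~6.1
```

By this estimate, a 74.5-vs-75.5 split is well within one standard error, so a rejection at this sample size says very little about the candidate's true strength.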
"Retraining" used the draw/+0.5 games' .npz from contributors. The first three models were trained using some .sgfs as loadable resources. Statistics:
2025-02-16 ~ 2025-02-23 (self-play games generated by the strongest; each line gives the total, then the individual counts)
Draw: 1319; 64, 150, 145, 174, 230, 128, 130, 145, 153
7.5Komi:
B+0.5: 361; 22, 39, 44, 51, 51, 42, 44, 36, 32
W+0.5: 927; 40, 111, 87, 93, 84, 151, 115, 118, 128
6.5Komi:
W+0.5: 598; 27, 74, 73, 91, 98, 55, 78, 53, 49
B+0.5: 1214; 63, 128, 134, 137, 122, 206, 149, 133, 142
=> Draw: 1319; B+0.5 (7.5 komi) + W+0.5 (6.5 komi): 959; W+0.5 (7.5 komi) + B+0.5 (6.5 komi): 2141; total: 4419
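A quick arithmetic check that the individual counts above add up to the stated totals:

```python
draws     = [64, 150, 145, 174, 230, 128, 130, 145, 153]
b_half_75 = [22, 39, 44, 51, 51, 42, 44, 36, 32]          # 7.5 komi, B+0.5
w_half_75 = [40, 111, 87, 93, 84, 151, 115, 118, 128]     # 7.5 komi, W+0.5
w_half_65 = [27, 74, 73, 91, 98, 55, 78, 53, 49]          # 6.5 komi, W+0.5
b_half_65 = [63, 128, 134, 137, 122, 206, 149, 133, 142]  # 6.5 komi, B+0.5

assert sum(draws) == 1319
assert sum(b_half_75) == 361 and sum(w_half_75) == 927
assert sum(w_half_65) == 598 and sum(b_half_65) == 1214
assert sum(b_half_75) + sum(w_half_65) == 959
assert sum(w_half_75) + sum(b_half_65) == 2141
assert 1319 + 959 + 2141 == 4419
```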
The 900 happened to defeat the strongest brutally at 10s per move.
Normally, on my computer, 30s per move would be more objective. Further tests showed that 900 wasn't very stable at 10s per move (whereas the strongest is). However, it clearly hit a blind spot of the strongest.
Since 900 didn't pass the gatekeeper, I used the 500 at 30s per move in further tests. Under that time limit, a model can generally reach an average of 100k search calculations per move on my computer. numSearchThreads was optimized for all models used during testing.
I've attached the .sgf files below. The model files are too big for GitHub to upload.
I'm not sure whether these are small improvements or substantial ones. If there is a way for me to upload a model, you may want to run more tests on your own.
- 900 as Black (7Komi, 10s/move):
25_3_Custom(B)_25_2_Strongest.zip
- 500 as Black (7.5Komi, 30s/move, W+R; B+1.5):
25_3_Custom(B)-25_2_Strongest.zip