Please give out a Language file
title. My friends and I are non-native English speakers (Chinese, actually), so we really need a user-friendly translation for AGE, including the UI and so on. I think this will help more non-native English speakers use AGE and create more wonderful mods. Please take it into consideration, thank you. :-)
Hi, Chinese VERSION download link: https://pan.baidu.com/s/1ouDGUzf3GubZ9nrrD0pBkA?pwd=v749 code: v749 Much better than the official one!
This would be a lot of work, but a nice idea.
This would be a lot of work, but a nice idea.
yes. thinking about the people all around the world: French, Germans, Russians, Arabs... they all just need to copy and modify one single language file -- JSON or XML maybe -- to fit their own language.
你可以将代码中的字面量导出到Excel表格第一列, 从长到短排序, 然后第二列用AI翻译文档翻译成译文, 然后根据Excel文档替换代码中的字面量然后编译即可. 我已经成功编译了一份AGE中文版.
I created a script that can export the literal text in the code to Excel, then use paid AI to automatically translate it, and finally the script replaces the corresponding content according to the translation. The effect is very good, and in theory it can be translated into any language. (But pay attention to the encoding format of the save)
I wrote a column about how to compile AGE easily. The link is https://www.bilibili.com/opus/1064303891098632195
I wrote a column about how to compile AGE easily. The link is https://www.bilibili.com/opus/1064303891098632195
非常感谢。翻译的部分我自己能做,主要是没有办法把UI的字符串提取出来。有你的脚本应该没问题了。要是能发Github上以飨众人就再好不过了。:3 Thanks a lot. I can do the translation job myself. The only problem for me is that I can't get the UI strings. With your script I think it will be solved. I wish you could publish it on GitHub for people around the world. :3
That works, and is the way the Chinese version has been maintained, but it would be much simpler if there was proper language swapping support. I've started to implement that a couple of times, but I've had to put it away for fixing other things.
Are you sure you want to translate manually? There are so many entries, it's quite painful. This is a prototype version, you can try it out. This is the extraction script:
#################################################
import os
import pandas as pd
from tqdm import tqdm
from collections import Counter
def extract_strings_from_cpp_files(folder_path, output_excel="extracted_strings.xlsx"):
    """Recursively scan *folder_path* for the target C++ source files, extract
    their string literals and export them to an Excel sheet for translation.

    Args:
        folder_path: Root folder to scan recursively.
        output_excel: Name of the Excel workbook to write.

    The sheet contains one row per unique string (first-seen order before
    sorting), an empty translation column to fill in, the total occurrence
    count and the file the string was first seen in. Rows are sorted by
    descending string length so that, during replacement, long strings are
    handled before any shorter strings they might contain.
    """
    print(f"开始递归扫描文件夹: {folder_path}")
    # Only these source files contain user-visible UI strings.
    target_files = [
        "Units.cpp",
        "UnitLine.cpp",
        "Terrains.cpp",
        "TerrainRestrictions.cpp",
        "TerrainBorders.cpp",
        "TechTrees.cpp",
        "Techs.cpp",
        "Sounds.cpp",
        "SaveDialog.cpp",
        "Research.cpp",
        "PlayerColors.cpp",
        "Other.cpp",
        "OpenSaveDialog.cpp",
        "OpenDialog.cpp",
        "Lists.cpp",
        "Graphics.cpp",
        "General.cpp",
        "CustomTextControls.cpp",
        "Civs.cpp",
        "Animation.cpp",
    ]
    # Recursively collect the full paths of the target files.
    cpp_files = []
    print("正在递归搜索指定文件...")
    for root, dirs, files in os.walk(folder_path):
        for file in files:
            if file in target_files:
                full_path = os.path.join(root, file)
                cpp_files.append(full_path)
                print(f"找到文件: {full_path}")
    if not cpp_files:
        print(f"在 {folder_path} 及其子文件夹中未找到指定的文件")
        return
    print(f"找到 {len(cpp_files)} 个指定的C++文件,开始提取字符串...")
    # Unique strings in first-seen order.
    all_strings = []
    # Total number of occurrences of each string across all files.
    string_occurrences = Counter()
    # First file each string was seen in.
    string_sources = {}
    for cpp_file in tqdm(cpp_files, desc="处理文件"):
        try:
            with open(cpp_file, "r", encoding="utf-8", errors="ignore") as file:
                content = file.read()
            # Strip comments first so commented-out strings are not extracted.
            content = remove_cpp_comments(content)
            # Extract all string literals (escaped quotes are handled).
            strings = extract_cpp_strings(content)
            for s in strings:
                # Skip empty strings.
                if not s:
                    continue
                string_occurrences[s] += 1
                # BUGFIX: record each string only once so the sheet really
                # contains unique rows; previously every occurrence produced
                # a duplicate row although the summary claimed "unique".
                if s not in string_sources:
                    string_sources[s] = os.path.basename(cpp_file)
                    all_strings.append(s)
        except Exception as e:
            print(f"处理文件 {cpp_file} 时出错: {e}")
    if not all_strings:
        print("未提取到任何字符串")
        return
    # Build the DataFrame. The Chinese column labels are part of the
    # workbook format shared with the replace script (which reads columns
    # by position, not by label).
    data = {
        "原文": all_strings,
        "原文长度": [len(s) for s in all_strings],  # helper column for sorting
        "译文": [""] * len(all_strings),  # to be filled with translations
        "出现次数": [string_occurrences[s] for s in all_strings],
        "首次出现文件": [string_sources[s] for s in all_strings],
    }
    df = pd.DataFrame(data)
    # Primary sort: string length descending; secondary: occurrences descending.
    df = df.sort_values(by=["原文长度", "出现次数"], ascending=[False, False])
    # Drop the helper length column before saving.
    df = df.drop(columns=["原文长度"])
    try:
        df.to_excel(output_excel, index=False)
        print(f"\n提取完成!共提取了 {len(all_strings)} 个唯一字符串")
        print(f"结果已保存到: {output_excel}")
    except Exception as e:
        print(f"保存Excel文件时出错: {e}")
def remove_cpp_comments(code):
    """Strip all C++ comments (``/* ... */`` and ``// ...``) from *code*.

    Comment markers inside double-quoted string literals are left alone.
    Newlines that terminate a ``//`` comment are preserved so line numbers
    stay stable. NOTE(review): the escape check only looks one character
    back, so a literal ending in ``\\\\"`` would confuse the string state —
    acceptable for this tool's inputs.
    """
    in_string = False
    in_multiline_comment = False
    in_singleline_comment = False
    result = []
    i = 0
    while i < len(code):
        # Track string state only outside comments; an unescaped quote
        # toggles it.
        if not in_multiline_comment and not in_singleline_comment:
            if code[i : i + 1] == '"' and (i == 0 or code[i - 1 : i] != "\\"):
                in_string = not in_string
        if not in_string:
            if code[i : i + 2] == "/*" and not in_singleline_comment:
                # Block comment opens: skip both marker characters.
                in_multiline_comment = True
                i += 2
                continue
            elif code[i : i + 2] == "*/" and in_multiline_comment:
                # Block comment closes. BUGFIX: skip BOTH characters of the
                # closing marker; the old code advanced only one position and
                # leaked the trailing '/' into the output.
                in_multiline_comment = False
                i += 2
                continue
            elif code[i : i + 2] == "//" and not in_multiline_comment:
                # Line comment opens: skip both marker characters.
                in_singleline_comment = True
                i += 2
                continue
            elif code[i : i + 1] == "\n" and in_singleline_comment:
                # Newline ends a line comment (and is kept below).
                in_singleline_comment = False
        # Keep the character only when we are outside any comment.
        if not in_multiline_comment and not in_singleline_comment:
            result.append(code[i])
        i += 1
    return "".join(result)
def extract_cpp_strings(code):
    """Extract double-quoted string literals from C++ source *code*.

    Escape sequences (e.g. ``\\"`` or ``\\n``) are kept verbatim so the
    literal can later be written back to the source unchanged. Empty
    literals (``""``) are skipped. Returns the literals in order of
    appearance.
    """
    strings = []
    i = 0
    while i < len(code):
        # An opening double quote starts a literal.
        if code[i] == '"':
            i += 1  # skip the opening quote
            current_str = []
            # Collect characters until the matching closing quote.
            while i < len(code):
                if code[i] == "\\" and i + 1 < len(code):
                    # Keep the backslash together with the escaped character,
                    # so an escaped quote does not terminate the literal.
                    current_str.append(code[i : i + 2])
                    i += 2
                    continue
                elif code[i] == '"':
                    break
                current_str.append(code[i])
                i += 1
            # Skip empty literals; the caller only cares about visible text.
            if current_str:
                strings.append("".join(current_str))
        i += 1
    return strings
if __name__ == "__main__":
    # Interactive entry point: pick the folder to scan and the name of the
    # output workbook, then run the extraction.
    script_dir = os.path.dirname(os.path.abspath(__file__))
    print("C++字符串提取工具 - 递归搜索版")
    print("----------------------------")
    print(f"默认将扫描脚本所在目录: {script_dir}")
    # Default to the directory the script lives in unless told otherwise.
    choice = input("是否扫描此目录? (y/n, 默认为y): ").strip().lower()
    if choice == "n":
        folder_path = input("请输入要扫描的文件夹路径: ").strip()
        if not os.path.isdir(folder_path):
            print(f"错误: {folder_path} 不是有效的文件夹路径")
            exit(1)
    else:
        folder_path = script_dir
    # Ask for the output name and normalize the extension.
    output_file = input(
        "请输入输出Excel文件名 (默认为extracted_strings.xlsx): "
    ).strip()
    if not output_file:
        output_file = "extracted_strings.xlsx"
    elif not output_file.endswith(".xlsx"):
        output_file += ".xlsx"
    # Run the extraction.
    extract_strings_from_cpp_files(folder_path, output_file)
##########################################################
This is the replace script:
##########################################################
import os
import pandas as pd
import re
def replace_strings_in_files(root_dir=".", debug_mode=True):
    """Recursively scan *root_dir* and replace quoted strings in the target
    C++ files with their translations from ``extracted_strings.xlsx``.

    Strings in ``excluded_strings`` (e.g. ``"None"``) are never replaced, to
    avoid breaking code structure. Modified files are written back in
    GB 2312 encoding (falling back to replacement characters when a
    character cannot be encoded).

    Args:
        root_dir (str): Root directory to scan; defaults to the current one.
        debug_mode (bool): When True, print extra diagnostic information.
    """
    # Files that contain translatable UI strings.
    target_files = [
        "Units.cpp",
        "UnitLine.cpp",
        "Terrains.cpp",
        "TerrainRestrictions.cpp",
        "TerrainBorders.cpp",
        "TechTrees.cpp",
        "Techs.cpp",
        "Sounds.cpp",
        "SaveDialog.cpp",
        "Research.cpp",
        "PlayerColors.cpp",
        "Other.cpp",
        "OpenSaveDialog.cpp",
        "OpenDialog.cpp",
        "Lists.cpp",
        "Graphics.cpp",
        "General.cpp",
        "CustomTextControls.cpp",
        "Civs.cpp",
        "Animation.cpp",
        "CustomWidgets.cpp",  # added CustomWidgets.cpp
    ]
    # Strings that must never be replaced (extend as needed).
    excluded_strings = ["None"]
    # Load the translation workbook.
    excel_path = os.path.join(root_dir, "extracted_strings.xlsx")
    if not os.path.exists(excel_path):
        print(f"错误:找不到Excel文件 {excel_path}")
        return
    try:
        df = pd.read_excel(excel_path)
        # The sheet must have at least the original and translation columns.
        if len(df.columns) < 2:
            print("错误:Excel文件应至少包含两列(原文和译文)")
            return
        # Build the replacement tables, skipping empty translations and the
        # excluded strings.
        exact_replacement_dict = {}  # exact match (all whitespace kept)
        normalized_replacement_dict = {}  # leading/trailing whitespace stripped
        skipped_count = 0
        excluded_count = 0
        space_prefixed_count = 0
        for idx, row in df.iterrows():
            # BUGFIX: access columns positionally via .iloc — plain integer
            # indexing on a labelled Series is deprecated/removed in recent
            # pandas versions.
            # Skip rows with an empty original column.
            if pd.isna(row.iloc[0]) or str(row.iloc[0]) == "":
                continue
            # Keep both a raw and a whitespace-normalized version.
            original_raw = str(row.iloc[0])
            original_normalized = original_raw.strip()
            # Never load the excluded strings into the tables.
            if original_normalized in excluded_strings:
                excluded_count += 1
                if debug_mode:
                    print(f"排除特定字符串: '{original_normalized}'")
                continue
            # Flag entries with leading/trailing spaces — easy to lose in
            # machine translation.
            has_leading_space = original_raw.startswith(" ")
            has_trailing_space = original_raw.endswith(" ")
            if has_leading_space or has_trailing_space:
                space_prefixed_count += 1
                if debug_mode:
                    print(
                        f"条目 #{idx + 1} 有{'前导' if has_leading_space else ''}{'和' if has_leading_space and has_trailing_space else ''}{'尾随' if has_trailing_space else ''}空格: '{original_raw}'"
                    )
            # Strictly check that the translation column is non-empty.
            if (
                pd.isna(row.iloc[1])
                or str(row.iloc[1]).strip() == ""
                or str(row.iloc[1]).lower() in ["nan", "none"]
            ):
                skipped_count += 1
                continue
            # Keep both versions of the translation as well.
            translation_raw = str(row.iloc[1])
            translation_normalized = translation_raw.strip()
            # Only store pairs that actually differ.
            if original_normalized != translation_normalized:
                # Turn real newlines (from Excel cells) into literal \n / \r
                # escapes so the C++ string literal stays on one line.
                translation_raw = translation_raw.replace("\n", "\\n").replace(
                    "\r", "\\r"
                )
                translation_normalized = translation_normalized.replace(
                    "\n", "\\n"
                ).replace("\r", "\\r")
                exact_replacement_dict[original_raw] = translation_raw
                normalized_replacement_dict[original_normalized] = (
                    translation_normalized
                )
        print(f"共加载了 {len(exact_replacement_dict)} 条有效翻译")
        print(f"跳过了 {skipped_count} 条空译文条目")
        print(f"排除了 {excluded_count} 条特定字符串")
        print(f"发现 {space_prefixed_count} 条带有前导或尾随空格的条目")
        # Show the first few replacement pairs for a quick sanity check.
        if len(exact_replacement_dict) > 0 and debug_mode:
            print("\n样本替换项(前10个,显示空格):")
            for i, (orig, trans) in enumerate(
                list(exact_replacement_dict.items())[:10]
            ):
                # Visualize spaces with a middle dot.
                orig_vis = orig.replace(" ", "·")
                trans_vis = trans.replace(" ", "·")
                print(f"{i + 1}. 原文: '{orig_vis}' -> 译文: '{trans_vis}'")
        # Replacement statistics.
        files_processed = 0
        files_modified = 0
        replacements_made = 0
        # Walk the tree and patch every target file found.
        for root, _, files in os.walk(root_dir):
            for file in files:
                if file in target_files:
                    file_path = os.path.join(root, file)
                    files_processed += 1
                    # Read the file; fall back to latin1 for non-UTF-8 files.
                    try:
                        with open(file_path, "r", encoding="utf-8") as f:
                            content = f.read()
                    except UnicodeDecodeError:
                        try:
                            with open(file_path, "r", encoding="latin1") as f:
                                content = f.read()
                        except Exception as e:
                            print(f"无法读取文件 {file_path}: {str(e)}")
                            continue
                    new_content = content
                    file_replacements = 0
                    file_replacements_log = []
                    # Work line by line so a match never spans lines.
                    lines = new_content.split("\n")
                    for i, line in enumerate(lines):
                        position = 0
                        new_line = line
                        while position < len(new_line):
                            # Match one pair of quotes and their content.
                            match = re.search(r'"([^"]*)"', new_line[position:])
                            if not match:
                                break
                            quoted_text = match.group(1)
                            quoted_text_normalized = quoted_text.strip()
                            # Never touch the excluded strings.
                            if quoted_text_normalized in excluded_strings:
                                position += match.end()
                                continue
                            # Exact match wins over the normalized one.
                            replacement = None
                            if quoted_text in exact_replacement_dict:
                                replacement = exact_replacement_dict[quoted_text]
                            elif quoted_text_normalized in normalized_replacement_dict:
                                replacement = normalized_replacement_dict[
                                    quoted_text_normalized
                                ]
                            if replacement:
                                # Splice the translation into the line.
                                start_pos = position + match.start()
                                end_pos = position + match.end()
                                new_line = (
                                    new_line[:start_pos]
                                    + f'"{replacement}"'
                                    + new_line[end_pos:]
                                )
                                file_replacements += 1
                                file_replacements_log.append(
                                    f"'{quoted_text}' -> '{replacement}'"
                                )
                                # Continue scanning after the inserted text.
                                position = start_pos + len(f'"{replacement}"')
                            else:
                                # No replacement: move past this match.
                                position += match.end()
                        lines[i] = new_line
                    new_content = "\n".join(lines)
                    # File-specific patch (CustomWidgets.cpp): hoist the brush
                    # into a named static variable.
                    if file == "CustomWidgets.cpp":
                        brush_pattern = r"dc\.SetBrush\(static wxBrush\(wxSystemSettings::GetColour\(wxSYS_COLOUR_HIGHLIGHT\), wxBRUSHSTYLE_SOLID\)\);"
                        brush_replacement = r"static wxBrush highlightBrush(wxSystemSettings::GetColour(wxSYS_COLOUR_HIGHLIGHT), wxBRUSHSTYLE_SOLID);\ndc.SetBrush(highlightBrush);"
                        if re.search(brush_pattern, new_content):
                            new_content = re.sub(
                                brush_pattern, brush_replacement, new_content
                            )
                            print(f"{file_path}: 完成特定代码修改")
                    # LangCharset patch in Other.cpp.
                    if file == "Other.cpp":
                        # Only whole-word LangCharset not already followed by
                        # .c_str(), so other identifiers containing the
                        # substring stay untouched.
                        lang_charset_pattern = r"\bLangCharset\b(?!\.c_str\(\))"
                        lang_charset_replacement = r"LangCharset.c_str()"
                        if re.search(lang_charset_pattern, new_content):
                            count = len(re.findall(lang_charset_pattern, new_content))
                            new_content = re.sub(
                                lang_charset_pattern,
                                lang_charset_replacement,
                                new_content,
                            )
                            print(
                                f"{file_path}: 替换了 {count} 处 LangCharset -> LangCharset.c_str()"
                            )
                    # Write the file back in GB 2312 if anything changed.
                    if new_content != content:
                        try:
                            with open(file_path, "w", encoding="gb2312") as f:
                                f.write(new_content)
                            files_modified += 1
                            replacements_made += file_replacements
                            print(
                                f"{file_path}: 完成了 {file_replacements} 处替换,已保存为GB 2312编码"
                            )
                            # Detailed log (capped at 10 entries).
                            if debug_mode and file_replacements_log:
                                print(" 替换详情:")
                                for log_entry in file_replacements_log[:10]:
                                    print(f" - {log_entry}")
                                if len(file_replacements_log) > 10:
                                    print(
                                        f" - ...等共 {len(file_replacements_log)} 处替换"
                                    )
                        except UnicodeEncodeError:
                            print(
                                f"警告:{file_path} 无法以GB 2312编码保存,尝试使用替代方案"
                            )
                            # Substitute unencodable characters instead of
                            # failing. NOTE(review): this path does not bump
                            # files_modified/replacements_made, matching the
                            # original behavior.
                            with open(
                                file_path, "w", encoding="gb2312", errors="replace"
                            ) as f:
                                f.write(new_content)
                            print(" - 已用替代字符保存文件")
        # Final statistics.
        print("\n替换完成!")
        print(f"扫描文件数: {files_processed}")
        print(f"修改文件数: {files_modified}")
        print(f"总替换次数: {replacements_made}")
    except Exception as e:
        print(f"发生错误: {str(e)}")
        import traceback
        print(traceback.format_exc())
if __name__ == "__main__":
    # Run against the current working directory with debug output enabled.
    replace_strings_in_files()
##########################################################
Put it in the root directory of AGE, extract it first, fill in the translation and then replace it. !!! Note: Due to the writing operation, please back up the folder in advance. Also, this script will save the replaced code files in the GB 2312 character set.
Sure, but the idea is to extract all of the text into one file for the app to read from and only translate that.
You only need to prepare one document for each language. After each update, if the string exists in the previous table, the old translation text is used, that is, a differential update is adopted.
The translation platform I use is "otranslator", and the terminology translations are quite accurate.
Are you sure you want to translate manually? There are so many entries, it's quite painful. This is a prototype version, you can try it out. This is the extraction script:
#################################################
import os
import pandas as pd
from tqdm import tqdm
from collections import Counter


def extract_strings_from_cpp_files(folder_path, output_excel="extracted_strings.xlsx"):
    """Recursively scan *folder_path* for the listed C++ files, extract their
    string literals and export them to an Excel sheet for translation."""
    print(f"开始递归扫描文件夹: {folder_path}")
    # Source files to harvest strings from.
    target_files = [
        "Units.cpp",
        "UnitLine.cpp",
        "Terrains.cpp",
        "TerrainRestrictions.cpp",
        "TerrainBorders.cpp",
        "TechTrees.cpp",
        "Techs.cpp",
        "Sounds.cpp",
        "SaveDialog.cpp",
        "Research.cpp",
        "PlayerColors.cpp",
        "Other.cpp",
        "OpenSaveDialog.cpp",
        "OpenDialog.cpp",
        "Lists.cpp",
        "Graphics.cpp",
        "General.cpp",
        "CustomTextControls.cpp",
        "Civs.cpp",
        "Animation.cpp",
    ]
    # Collect full paths of the target files, recursively.
    cpp_files = []
    print("正在递归搜索指定文件...")
    for root, dirs, files in os.walk(folder_path):
        for file in files:
            if file in target_files:
                full_path = os.path.join(root, file)
                cpp_files.append(full_path)
                print(f"找到文件: {full_path}")
    if not cpp_files:
        print(f"在 {folder_path} 及其子文件夹中未找到指定的文件")
        return
    print(f"找到 {len(cpp_files)} 个指定的C++文件,开始提取字符串...")
    all_strings = []  # every extracted non-empty string, in order
    string_occurrences = Counter()  # occurrence count per string
    string_sources = {}  # first file each string was seen in
    for cpp_file in tqdm(cpp_files, desc="处理文件"):
        try:
            with open(cpp_file, "r", encoding="utf-8", errors="ignore") as file:
                content = file.read()
            # Strip comments first so commented-out strings are ignored.
            content = remove_cpp_comments(content)
            strings = extract_cpp_strings(content)
            for s in strings:
                if not s:
                    continue
                string_occurrences[s] += 1
                if s not in string_sources:
                    string_sources[s] = os.path.basename(cpp_file)
                all_strings.append(s)
        except Exception as e:
            print(f"处理文件 {cpp_file} 时出错: {e}")
    if not all_strings:
        print("未提取到任何字符串")
        return
    # Build the sheet: original, helper length column, empty translation
    # column, occurrence count and first-seen file.
    data = {
        "原文": all_strings,
        "原文长度": [len(s) for s in all_strings],
        "译文": [""] * len(all_strings),
        "出现次数": [string_occurrences[s] for s in all_strings],
        "首次出现文件": [string_sources[s] for s in all_strings],
    }
    df = pd.DataFrame(data)
    # Longest strings first, then by occurrence count.
    df = df.sort_values(by=["原文长度", "出现次数"], ascending=[False, False])
    df = df.drop(columns=["原文长度"])
    try:
        df.to_excel(output_excel, index=False)
        print(f"\n提取完成!共提取了 {len(all_strings)} 个唯一字符串")
        print(f"结果已保存到: {output_excel}")
    except Exception as e:
        print(f"保存Excel文件时出错: {e}")


def remove_cpp_comments(code):
    """Strip /* ... */ and // ... comments from C++ code, leaving string
    literals untouched."""
    in_string = False
    in_multiline_comment = False
    in_singleline_comment = False
    result = []
    i = 0
    while i < len(code):
        # Toggle string state on an unescaped quote (outside comments).
        if not in_multiline_comment and not in_singleline_comment:
            if code[i : i + 1] == '"' and (i == 0 or code[i - 1 : i] != "\\"):
                in_string = not in_string
        if not in_string:
            if code[i : i + 2] == "/*" and not in_singleline_comment:
                in_multiline_comment = True
                i += 1
            elif code[i : i + 2] == "*/" and in_multiline_comment:
                in_multiline_comment = False
                i += 1
            elif code[i : i + 2] == "//" and not in_multiline_comment:
                in_singleline_comment = True
                i += 1
            elif code[i : i + 1] == "\n" and in_singleline_comment:
                in_singleline_comment = False
        # Keep the character when outside any comment.
        if not in_multiline_comment and not in_singleline_comment:
            result.append(code[i])
        i += 1
    return "".join(result)


def extract_cpp_strings(code):
    """Extract double-quoted string literals from C++ code, keeping escape
    sequences verbatim and skipping empty literals."""
    strings = []
    i = 0
    while i < len(code):
        if code[i] == '"':
            start = i + 1  # position just past the opening quote
            i += 1
            current_str = []
            # Collect until the matching (unescaped) closing quote.
            while i < len(code):
                if code[i] == "\\" and i + 1 < len(code):
                    current_str.append(code[i : i + 2])
                    i += 2
                    continue
                elif code[i] == '"':
                    break
                current_str.append(code[i])
                i += 1
            if current_str:
                strings.append("".join(current_str))
        i += 1
    return strings


if __name__ == "__main__":
    # Interactive entry point.
    script_dir = os.path.dirname(os.path.abspath(__file__))
    print("C++字符串提取工具 - 递归搜索版")
    print("----------------------------")
    print(f"默认将扫描脚本所在目录: {script_dir}")
    choice = input("是否扫描此目录? (y/n, 默认为y): ").strip().lower()
    if choice == "n":
        folder_path = input("请输入要扫描的文件夹路径: ").strip()
        if not os.path.isdir(folder_path):
            print(f"错误: {folder_path} 不是有效的文件夹路径")
            exit(1)
    else:
        folder_path = script_dir
    output_file = input(
        "请输入输出Excel文件名 (默认为extracted_strings.xlsx): "
    ).strip()
    if not output_file:
        output_file = "extracted_strings.xlsx"
    elif not output_file.endswith(".xlsx"):
        output_file += ".xlsx"
    extract_strings_from_cpp_files(folder_path, output_file)
This is the replace script:
##########################################################
import os
import pandas as pd
import re


def replace_strings_in_files(root_dir=".", debug_mode=True):
    """Recursively scan *root_dir* and replace quoted strings in the target
    files with translations from extracted_strings.xlsx, excluding "None"
    to keep code structure intact. Saves modified files in GB 2312."""
    # Files that contain translatable UI strings.
    target_files = [
        "Units.cpp",
        "UnitLine.cpp",
        "Terrains.cpp",
        "TerrainRestrictions.cpp",
        "TerrainBorders.cpp",
        "TechTrees.cpp",
        "Techs.cpp",
        "Sounds.cpp",
        "SaveDialog.cpp",
        "Research.cpp",
        "PlayerColors.cpp",
        "Other.cpp",
        "OpenSaveDialog.cpp",
        "OpenDialog.cpp",
        "Lists.cpp",
        "Graphics.cpp",
        "General.cpp",
        "CustomTextControls.cpp",
        "Civs.cpp",
        "Animation.cpp",
        "CustomWidgets.cpp",
    ]
    # Strings that must never be replaced.
    excluded_strings = ["None"]
    excel_path = os.path.join(root_dir, "extracted_strings.xlsx")
    if not os.path.exists(excel_path):
        print(f"错误:找不到Excel文件 {excel_path}")
        return
    try:
        df = pd.read_excel(excel_path)
        if len(df.columns) < 2:
            print("错误:Excel文件应至少包含两列(原文和译文)")
            return
        # Two lookup tables: exact text and whitespace-normalized text.
        exact_replacement_dict = {}
        normalized_replacement_dict = {}
        skipped_count = 0
        excluded_count = 0
        space_prefixed_count = 0
        for idx, row in df.iterrows():
            if pd.isna(row[0]) or str(row[0]) == "":
                continue
            original_raw = str(row[0])
            original_normalized = original_raw.strip()
            if original_normalized in excluded_strings:
                excluded_count += 1
                if debug_mode:
                    print(f"排除特定字符串: '{original_normalized}'")
                continue
            has_leading_space = original_raw.startswith(" ")
            has_trailing_space = original_raw.endswith(" ")
            if has_leading_space or has_trailing_space:
                space_prefixed_count += 1
                if debug_mode:
                    print(
                        f"条目 #{idx + 1} 有{'前导' if has_leading_space else ''}{'和' if has_leading_space and has_trailing_space else ''}{'尾随' if has_trailing_space else ''}空格: '{original_raw}'"
                    )
            # Skip rows whose translation column is effectively empty.
            if (
                pd.isna(row[1])
                or str(row[1]).strip() == ""
                or str(row[1]).lower() in ["nan", "none"]
            ):
                skipped_count += 1
                continue
            translation_raw = str(row[1])
            translation_normalized = translation_raw.strip()
            if original_normalized != translation_normalized:
                # Real newlines from Excel become literal escapes.
                translation_raw = translation_raw.replace("\n", "\\n").replace(
                    "\r", "\\r"
                )
                translation_normalized = translation_normalized.replace(
                    "\n", "\\n"
                ).replace("\r", "\\r")
                exact_replacement_dict[original_raw] = translation_raw
                normalized_replacement_dict[original_normalized] = (
                    translation_normalized
                )
        print(f"共加载了 {len(exact_replacement_dict)} 条有效翻译")
        print(f"跳过了 {skipped_count} 条空译文条目")
        print(f"排除了 {excluded_count} 条特定字符串")
        print(f"发现 {space_prefixed_count} 条带有前导或尾随空格的条目")
        # Preview a handful of pairs with spaces made visible.
        if len(exact_replacement_dict) > 0 and debug_mode:
            print("\n样本替换项(前10个,显示空格):")
            for i, (orig, trans) in enumerate(
                list(exact_replacement_dict.items())[:10]
            ):
                orig_vis = orig.replace(" ", "·")
                trans_vis = trans.replace(" ", "·")
                print(f"{i + 1}. 原文: '{orig_vis}' -> 译文: '{trans_vis}'")
        files_processed = 0
        files_modified = 0
        replacements_made = 0
        for root, _, files in os.walk(root_dir):
            for file in files:
                if file in target_files:
                    file_path = os.path.join(root, file)
                    files_processed += 1
                    try:
                        with open(file_path, "r", encoding="utf-8") as f:
                            content = f.read()
                    except UnicodeDecodeError:
                        try:
                            with open(file_path, "r", encoding="latin1") as f:
                                content = f.read()
                        except Exception as e:
                            print(f"无法读取文件 {file_path}: {str(e)}")
                            continue
                    new_content = content
                    file_replacements = 0
                    file_replacements_log = []
                    # Line-by-line so quotes never match across lines.
                    lines = new_content.split("\n")
                    for i, line in enumerate(lines):
                        position = 0
                        new_line = line
                        while position < len(new_line):
                            match = re.search(r'"([^"]*)"', new_line[position:])
                            if not match:
                                break
                            quoted_text = match.group(1)
                            quoted_text_normalized = quoted_text.strip()
                            if quoted_text_normalized in excluded_strings:
                                position += match.end()
                                continue
                            replacement = None
                            if quoted_text in exact_replacement_dict:
                                replacement = exact_replacement_dict[quoted_text]
                            elif quoted_text_normalized in normalized_replacement_dict:
                                replacement = normalized_replacement_dict[
                                    quoted_text_normalized
                                ]
                            if replacement:
                                start_pos = position + match.start()
                                end_pos = position + match.end()
                                new_line = (
                                    new_line[:start_pos]
                                    + f'"{replacement}"'
                                    + new_line[end_pos:]
                                )
                                file_replacements += 1
                                file_replacements_log.append(
                                    f"'{quoted_text}' -> '{replacement}'"
                                )
                                position = start_pos + len(f'"{replacement}"')
                            else:
                                position += match.end()
                        lines[i] = new_line
                    new_content = "\n".join(lines)
                    # Special-case patch for CustomWidgets.cpp.
                    if file == "CustomWidgets.cpp":
                        brush_pattern = r"dc\.SetBrush\(static wxBrush\(wxSystemSettings::GetColour\(wxSYS_COLOUR_HIGHLIGHT\), wxBRUSHSTYLE_SOLID\)\);"
                        brush_replacement = r"static wxBrush highlightBrush(wxSystemSettings::GetColour(wxSYS_COLOUR_HIGHLIGHT), wxBRUSHSTYLE_SOLID);\ndc.SetBrush(highlightBrush);"
                        if re.search(brush_pattern, new_content):
                            new_content = re.sub(
                                brush_pattern, brush_replacement, new_content
                            )
                            print(f"{file_path}: 完成特定代码修改")
                    # Special-case patch for Other.cpp: whole-word LangCharset
                    # not already followed by .c_str().
                    if file == "Other.cpp":
                        lang_charset_pattern = r"\bLangCharset\b(?!\.c_str\(\))"
                        lang_charset_replacement = r"LangCharset.c_str()"
                        if re.search(lang_charset_pattern, new_content):
                            count = len(re.findall(lang_charset_pattern, new_content))
                            new_content = re.sub(
                                lang_charset_pattern,
                                lang_charset_replacement,
                                new_content,
                            )
                            print(
                                f"{file_path}: 替换了 {count} 处 LangCharset -> LangCharset.c_str()"
                            )
                    # Persist changes in GB 2312.
                    if new_content != content:
                        try:
                            with open(file_path, "w", encoding="gb2312") as f:
                                f.write(new_content)
                            files_modified += 1
                            replacements_made += file_replacements
                            print(
                                f"{file_path}: 完成了 {file_replacements} 处替换,已保存为GB 2312编码"
                            )
                            if debug_mode and file_replacements_log:
                                print(" 替换详情:")
                                for log_entry in file_replacements_log[:10]:
                                    print(f" - {log_entry}")
                                if len(file_replacements_log) > 10:
                                    print(
                                        f" - ...等共 {len(file_replacements_log)} 处替换"
                                    )
                        except UnicodeEncodeError:
                            print(
                                f"警告:{file_path} 无法以GB 2312编码保存,尝试使用替代方案"
                            )
                            with open(
                                file_path, "w", encoding="gb2312", errors="replace"
                            ) as f:
                                f.write(new_content)
                            print(f" - 已用替代字符保存文件")
        print("\n替换完成!")
        print(f"扫描文件数: {files_processed}")
        print(f"修改文件数: {files_modified}")
        print(f"总替换次数: {replacements_made}")
    except Exception as e:
        print(f"发生错误: {str(e)}")
        import traceback
        print(traceback.format_exc())


if __name__ == "__main__":
    replace_strings_in_files()
Put it in the root directory of AGE, extract it first, fill in the translation and then replace it. !!! Note: Due to the writing operation, please back up the folder in advance. Also, this script will save the replaced code files in the GB 2312 character set.
er, I mean you can start your own project and put this in it...
I made this in my spare time at work. I'm too lazy to fork it. I'll make it complete when I have time, or wait for others to do it.
This is my Excel doc for Chinese:
多谢,现在我也编译出了中文版本的编辑器了(然而数据仍然是英文的,比如单位名称是Heavy Camel而不是“重装骆驼兵”,不过这个可能暂时没什么办法了)
This is my Excel doc for Chinese: extracted_strings.xlsx
多谢,现在我也编译出了中文版本的编辑器了(然而数据仍然是英文的,比如单位名称是Heavy Camel而不是“重装骆驼兵”,不过这个可能暂时没什么办法了)
一个合适的解决方法是,通过游戏本体的语言文件获取对应版本的文本,并通过修改UI的方式,将对应文本展示在界面上,辅助理解。然而这相当于为本工具设计多语言功能,时间、精力消耗都会比较大
A solution is: use language file from game to get text of all languages, and change UI form, put language text onto UI to help up understand them. However, it is almost equal to add i18n feature to this tool, and it will cost much time and thought.