semantic-code-search icon indicating copy to clipboard operation
semantic-code-search copied to clipboard

support utf8 and other encoding type

Open bankThanabat opened this issue 2 years ago • 0 comments

fix #24 This pull request adds support for an optional encoding type argument in the embed.py module. The previous version of the module did not properly handle UTF-8 encoded files when used with the do_embed() function. This issue was caused by the parser.parse() method in the _get_repo_functions() function

To resolve this issue, i added support for an optional encoding type argument using the -en or --encoding argument. Users can now specify an encoding type when running the script, and the script will use the specified encoding type when processing files. If no encoding type is specified, the default encoding type 'utf-8' will be used.

The following changes were made to the embed.py module:

  • Added the -en or --encoding argument to the parser.add_argument() method.
  • Modified the _get_repo_functions() function to open files using open(fp, 'r', encoding=encoding) instead of open(fp, 'r').
  • Modified the do_embed() function to encode the file content using file_content.encode(encoding) instead

These changes should resolve the issue with UTF-8 encoded files and make the script more flexible

bankThanabat avatar May 05 '23 06:05 bankThanabat