
multi folders

Open baiyu0083 opened this issue 2 years ago • 1 comments

I have about a hundred folders, with about six hundred PDF files in each. How can I modify the SOURCE_DOCUMENTS directory to handle this? Thanks

baiyu0083 avatar Jun 02 '23 08:06 baiyu0083

@baiyu0083 that is something I will look into. By Monday you should have that feature, if possible.

Allaye avatar Jun 02 '23 10:06 Allaye

I added support locally for recursive folders by updating the beginning of load_documents (in ingest.py) to look like this. It's not cleaned up or anything, but it works for me.

import os  # Document and DOCUMENT_MAP are already available in ingest.py

def load_documents(source_dir: str) -> list[Document]:
    # Loads all documents from the source documents directory, including subfolders
    paths = []
    # os.walk visits source_dir and every folder below it; dirpath already starts with source_dir
    for dirpath, dirs, files in os.walk(source_dir):
        for filename in files:
            file_extension = os.path.splitext(filename)[1]
            # Join with dirpath only: prepending source_dir again duplicates it when source_dir is relative
            source_file_path = os.path.join(dirpath, filename)
            if file_extension in DOCUMENT_MAP:
                paths.append(source_file_path)

StrikeNP avatar Aug 17 '23 18:08 StrikeNP
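
For anyone who wants to try the recursive walk outside of ingest.py, here is a minimal, self-contained sketch of the same idea. It only collects file paths and does not load anything. The names collect_document_paths and SUPPORTED_EXTENSIONS are hypothetical: the extension set is a stand-in for the keys of localGPT's DOCUMENT_MAP, so adjust it to whatever your copy of the project actually supports.

```python
import os

# Hypothetical stand-in for the keys of localGPT's DOCUMENT_MAP.
# Only the file extensions matter for this sketch.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".csv"}


def collect_document_paths(source_dir: str) -> list[str]:
    """Return the paths of all supported files under source_dir, recursively."""
    paths = []
    # os.walk yields (dirpath, dirnames, filenames) for source_dir and
    # every directory below it; dirpath already includes source_dir.
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        for filename in filenames:
            extension = os.path.splitext(filename)[1]
            if extension in SUPPORTED_EXTENSIONS:
                paths.append(os.path.join(dirpath, filename))
    # Sort so the result does not depend on filesystem ordering.
    return sorted(paths)
```

Calling collect_document_paths("SOURCE_DOCUMENTS") should then return every matching file in that folder and in all of its subfolders, which you can pass on to whatever loader you use. Note that the extension check is case-sensitive here, so a file named REPORT.PDF would be skipped unless you normalize the extension first.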