LeanDojo icon indicating copy to clipboard operation
LeanDojo copied to clipboard

get_traced_theorems() method for repo doesn't return repo theorems

Open AG161 opened this issue 1 year ago • 0 comments

Description After tracing a mathlib4 repo, the get_traced_theorems() method on the traced repo fails to return all mathlib4 theorems, seems to return base Lean 4 theorems instead.

Logs in Debug Mode

Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] on linux                                  
Type "help", "copyright", "credits" or "license" for more information.                              
>>> from lean_dojo import *                                                                         
2025-04-30 18:43:24.365 | DEBUG    | lean_dojo.data_extraction.lean:<module>:41 - Using GitHub perso
nal access token for authentication                                                                 
>>> repo = LeanGitRepo("https://github.com/leanprover-community/mathlib4","203d7e7e6b62d6c349f20f375
df9b69ad3d384eb")                                                                                   
>>> tr = trace(repo)                                                                                
2025-04-30 18:43:37.821 | DEBUG    | lean_dojo.data_extraction.trace:get_traced_repo_path:219 - The 
traced repo is available in the cache.                                                              
2025-04-30 18:43:37.821 | INFO     | lean_dojo.data_extraction.trace:trace:248 - Loading the traced 
repo from /home/agittis/.cache/lean_dojo/leanprover-community-mathlib4-203d7e7e6b62d6c349f20f375df9b
69ad3d384eb/mathlib4                                                                                
2025-04-30 18:43:38.048 | DEBUG    | lean_dojo.data_extraction.traced_data:load_from_disk:1187 - Loa
ding 1441 traced XML files from /home/agittis/.cache/lean_dojo/leanprover-community-mathlib4-203d7e7
e6b62d6c349f20f375df9b69ad3d384eb/mathlib4 with 31 workers                                          
2025-04-30 18:43:39,823 INFO worker.py:1879 -- Started a local Ray instance. View the dashboard at 1
27.0.0.1:8265 
  0%|                                                                      | 0/1441 [00:00<?, ?it/s]
(pid=896035) 2025-04-30 18:43:41.162 | DEBUG    | lean_dojo.data_extraction.lean:<module>:41 - Using
 GitHub personal access token for authentication
100%|███████████████████████████████████████████████████████████| 1441/1441 [01:37<00:00, 14.82it/s]
(pid=896050) 2025-04-30 18:43:41.300 | DEBUG    | lean_dojo.data_extraction.lean:<module>:41 - Using
 GitHub personal access token for authentication [repeated 30x across cluster] (Ray deduplicates log
s by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/maste
r/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
2025-04-30 18:45:20.029 | DEBUG    | lean_dojo.data_extraction.lean:get_dependencies:652 - Querying 
the dependencies of LeanGitRepo(url='/home/agittis/.cache/lean_dojo/leanprover-community-mathlib4-20
3d7e7e6b62d6c349f20f375df9b69ad3d384eb/mathlib4', commit='203d7e7e6b62d6c349f20f375df9b69ad3d384eb')
2025-04-30 18:45:20.624 | DEBUG    | lean_dojo.data_extraction.lean:_to_commit_hash:195 - Querying t
he commit hash for lean4 v4.18.0
2025-04-30 18:45:34.320 | DEBUG    | lean_dojo.data_extraction.traced_data:check_sanity:1049 - Check
ing the sanity of TracedRepo(repo=LeanGitRepo(url='/home/agittis/.cache/lean_dojo/leanprover-communi
ty-mathlib4-203d7e7e6b62d6c349f20f375df9b69ad3d384eb/mathlib4', commit='203d7e7e6b62d6c349f20f375df9
b69ad3d384eb'), dependencies={'lean4': LeanGitRepo(url='https://github.com/leanprover/lean4', commit
='11ccbced796476be020459a83c599b301a765d3e'), 'plausible': LeanGitRepo(url='https://github.com/leanp
rover-community/plausible', commit='115177554fb2f67d583586886a9cfe1db2899945'), 'LeanSearchClient': 
LeanGitRepo(url='https://github.com/leanprover-community/LeanSearchClient', commit='8d29bc2c3ebe1f86
3c2f02df816b4f3dd1b65226'), 'importGraph': LeanGitRepo(url='https://github.com/leanprover-community/
import-graph', commit='82480d38be7db5235f5407d5a5951b36b37f2584'), 'proofwidgets': LeanGitRepo(url='
https://github.com/leanprover-community/ProofWidgets4', commit='a602d13aca2913724c7d47b2d7df0353620c
4ee8'), 'aesop': LeanGitRepo(url='https://github.com/leanprover-community/aesop', commit='2bcdf2985d
be37cff63ca18346d8b26b8a448d3d'), 'Qq': LeanGitRepo(url='https://github.com/leanprover-community/quo
te4', commit='0e05c2f090b7dd7a2f530bdc48a26b546f4837c7'), 'batteries': LeanGitRepo(url='https://gith
ub.com/leanprover-community/batteries', commit='613510345e4d4b3ce3d8c129595e7241990d5b39'), 'Cli': L
eanGitRepo(url='https://github.com/leanprover/lean4-cli', commit='dd423cf2b153b5b14cb017ee4beae78856
5a3925')}, root_dir=PosixPath('/home/agittis/.cache/lean_dojo/leanprover-community-mathlib4-203d7e7e
6b62d6c349f20f375df9b69ad3d384eb/mathlib4'))
>>> thm_list = tr.get_traced_theorems()
>>> len(thm_list)
16017
>>> repos = list({thm.repo for thm in thm_list})
>>> repos
[LeanGitRepo(url='https://github.com/leanprover/lean4', commit='11ccbced796476be020459a83c599b301a765d3e')]
>>> 

Platform Information Ubuntu 22.04.5 LeanDojo 2.2.0

AG161 avatar May 01 '25 01:05 AG161