Error calling `_initialize_deepspeed_train`: attempted relative import beyond top-level package
I am using pytorch-lightning with deepspeed and getting the following error. Any ideas on how to fix it? Thanks a lot!
Relevant Versions
deepspeed==0.8.1
pytorch-lightning==1.9.4
ValueError: attempted relative import beyond top-level package
Traceback (most recent call last):
  File "main.py", line 127, in <module>
    run(args)
  File "main.py", line 91, in run
    trainer.fit(model, datamodule=datamodule, ckpt_path=args.ckpt_path)
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
    call._call_and_handle_interrupt(
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 88, in launch
    return function(*args, **kwargs)
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _fit_impl
    self._run(model, ckpt_path=self.ckpt_path)
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1093, in _run
    self.strategy.setup(self)
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/strategies/deepspeed.py", line 345, in setup
    self.init_deepspeed()
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/strategies/deepspeed.py", line 456, in init_deepspeed
    self._initialize_deepspeed_train(model)
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/strategies/deepspeed.py", line 493, in _initialize_deepspeed_train
    model, deepspeed_optimizer = self._setup_model_and_optimizer(model, optimizer, scheduler)
  File "/home/griffin/bhc/lib/python3.8/site-packages/pytorch_lightning/strategies/deepspeed.py", line 414, in _setup_model_and_optimizer
    deepspeed_engine, deepspeed_optimizer, _, _ = deepspeed.initialize(
  File "/home/griffin/bhc/lib/python3.8/site-packages/deepspeed/__init__.py", line 125, in initialize
    engine = DeepSpeedEngine(args=args,
  File "/home/griffin/bhc/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 336, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/home/griffin/bhc/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1292, in _configure_optimizer
    self.optimizer = self._configure_zero_optimizer(basic_optimizer)
  File "/home/griffin/bhc/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1542, in _configure_zero_optimizer
    optimizer = DeepSpeedZeroOptimizer(
  File "/home/griffin/bhc/lib/python3.8/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 165, in __init__
    util_ops = UtilsBuilder().load()
  File "/home/griffin/.local/lib/python3.8/site-packages/op_builder/builder.py", line 230, in load
    from ...git_version_info import installed_ops, torch_info
ValueError: attempted relative import beyond top-level package
Hi @griff4692, the layout of your deepspeed installation looks odd.
The frame that raises the error is:
File "/home/griffin/.local/lib/python3.8/site-packages/op_builder/builder.py", line 230, in load
Shouldn't it be in /home/griffin/.local/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py instead?
My guess is that your installation is corrupted, possibly because an old installation was not fully removed.
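One way to check this guess is to ask Python where it would load `deepspeed` and a top-level `op_builder` from. In the traceback, `deepspeed` comes from `/home/griffin/bhc/lib/python3.8/site-packages` while `op_builder` comes from `/home/griffin/.local/lib/python3.8/site-packages`, so two different site-packages directories are involved. The following is a minimal diagnostic sketch, not an official deepspeed tool; the helper names `module_origin` and `report` are hypothetical and only the standard library is used.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the path a top-level module would be imported from, or None."""
    try:
        # For a top-level name, find_spec locates the module without importing it.
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    if spec.origin:
        return spec.origin
    # Namespace packages have no origin; fall back to their first search location.
    locations = spec.submodule_search_locations
    return next(iter(locations), None) if locations else None


def report(names=("deepspeed", "op_builder")) -> None:
    """Print where each module resolves from (hypothetical helper)."""
    for name in names:
        print(f"{name}: {module_origin(name)}")


if __name__ == "__main__":
    report()
```

If `op_builder` resolves to a path outside the directory that `deepspeed` resolves to, that would be consistent with a leftover from an earlier installation. Uninstalling deepspeed from both environments and reinstalling it once is a reasonable next step.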
Hi @griff4692, I will close this issue for now. Feel free to re-open if you're still seeing it.