dbx sync fails after git rebase
Expected Behavior
I use dbx sync during development, and it works throughout the development cycle. At the end of the process, or partway through it, I perform a git rebase. The changes produced by the rebase should sync as well.
Current Behavior
During or after the rebase, dbx runs into the following error:
[dbx][2023-02-07 11:15:58.828] Putting /Repos/[email protected]/lotus-ml/notebooks/image_similarity/autoencoder/DeployModel.py
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/site-packages/dbx/commands/sync/sync.py:318 in │
│ repo │
│ │
│ 315 │ │
│ 316 │ client = ReposClient(user=user_name, repo_name=dest_repo, config=config) │
│ 317 │ │
│ ❱ 318 │ main_loop( │
│ 319 │ │ source=source, │
│ 320 │ │ matcher=matcher, │
│ 321 │ │ client=client, │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/site-packages/dbx/commands/sync/functions.py:1 │
│ 29 in main_loop │
│ │
│ 126 │ # Run the incremental copy and record how many operations were performed or would ha │
│ 127 │ # performed (if in dry run mode). An operation usually translates to an API call, s │
│ 128 │ # create a directory, put a file, etc. │
│ ❱ 129 │ op_count = syncer.incremental_copy() │
│ 130 │ │
│ 131 │ if not op_count: │
│ 132 │ │ dbx_echo("No changes found during initial copy") │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/site-packages/dbx/sync/__init__.py:449 in │
│ incremental_copy │
│ │
│ 446 │ │ │
│ 447 │ │ # Use the diff between current snapshot and previous snapshot to apply the same │
│ 448 │ │ # against the remote location. │
│ ❱ 449 │ │ op_count = asyncio.run(self._apply_snapshot_diff(diff)) │
│ 450 │ │ │
│ 451 │ │ self.last_snapshot = snapshot │
│ 452 │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/asyncio/runners.py:44 in run │
│ │
│ 41 │ │ events.set_event_loop(loop) │
│ 42 │ │ if debug is not None: │
│ 43 │ │ │ loop.set_debug(debug) │
│ ❱ 44 │ │ return loop.run_until_complete(main) │
│ 45 │ finally: │
│ 46 │ │ try: │
│ 47 │ │ │ _cancel_all_tasks(loop) │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/asyncio/base_events.py:649 in │
│ run_until_complete │
│ │
│ 646 │ │ if not future.done(): │
│ 647 │ │ │ raise RuntimeError('Event loop stopped before Future completed.') │
│ 648 │ │ │
│ ❱ 649 │ │ return future.result() │
│ 650 │ │
│ 651 │ def stop(self): │
│ 652 │ │ """Stop running the event loop. │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/site-packages/dbx/sync/__init__.py:243 in │
│ _apply_snapshot_diff │
│ │
│ 240 │ │ │ op_count += await self._apply_dirs_created(diff, session) │
│ 241 │ │ │ op_count += await self._apply_files_created(diff, session) │
│ 242 │ │ │ op_count += await self._apply_files_deleted(diff, session, deleted_dirs) │
│ ❱ 243 │ │ │ op_count += await self._apply_files_modified(diff, session) │
│ 244 │ │ │
│ 245 │ │ return op_count │
│ 246 │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/site-packages/dbx/sync/__init__.py:209 in │
│ _apply_files_modified │
│ │
│ 206 │ │ return await self._apply_file_puts(session, diff.files_created, "created") │
│ 207 │ │
│ 208 │ async def _apply_files_modified(self, diff: SnapshotDiff, session: aiohttp.ClientSes │
│ ❱ 209 │ │ return await self._apply_file_puts(session, diff.files_modified, "modified") │
│ 210 │ │
│ 211 │ async def _apply_files_deleted( │
│ 212 │ │ self, diff: SnapshotDiff, session: aiohttp.ClientSession, deleted_dirs: List[str │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/site-packages/dbx/sync/__init__.py:202 in │
│ _apply_file_puts │
│ │
│ 199 │ │ │ else: │
│ 200 │ │ │ │ dbx_echo(f"(noop) File {msg}: {path}") │
│ 201 │ │ if tasks: │
│ ❱ 202 │ │ │ await asyncio.gather(*tasks) │
│ 203 │ │ return op_count │
│ 204 │ │
│ 205 │ async def _apply_files_created(self, diff: SnapshotDiff, session: aiohttp.ClientSess │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/site-packages/dbx/sync/__init__.py:196 in task │
│ │
│ 193 │ │ │ │ │ # Files can be created in parallel, but we limit how many are opened │
│ 194 │ │ │ │ │ # so we don't use memory excessively. │
│ 195 │ │ │ │ │ async with sem: # noqa │
│ ❱ 196 │ │ │ │ │ │ await self.client.put(get_relative_path(self.source, p), p, sess │
│ 197 │ │ │ │ │
│ 198 │ │ │ │ tasks.append(task(path)) │
│ 199 │ │ │ else: │
│ │
│ /usr/local/Caskroom/miniconda/base/lib/python3.10/site-packages/dbx/sync/clients.py:273 in put │
│ │
│ 270 │ │ │ │ │ else: │
│ 271 │ │ │ │ │ │ txt = await resp.text() │
│ 272 │ │ │ │ │ │ dbx_echo(f"HTTP {resp.status}: {txt}") │
│ ❱ 273 │ │ │ │ │ │ raise ClientError(resp.status) │
│ 274 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ClientError: 404
Steps to Reproduce (for bugs)
Context
Your Environment
- Mac OSX Ventura 13.2
- dbx version used: 0.8.7
- Databricks Runtime version: 11.3
Hey, thanks for the report. I wonder if this is related to #280. A 404 on a put suggests that the parent directory does not exist for some reason. Does this happen consistently when you rebase, or is it sporadic?
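To make the hypothesis concrete, here is a minimal sketch of the suspected failure and one possible mitigation: if a put returns 404, create the parent directory and retry once. Everything in it is an assumption for illustration. `FakeReposClient`, its `mkdirs` and `put` methods, and `put_with_parent_retry` are hypothetical stand-ins and do not reflect the real dbx client signatures. The in-memory client only mimics the behavior I suspect the server has.

```python
import asyncio
import posixpath


class ClientError(Exception):
    """Stand-in for the ClientError in the traceback; carries an HTTP status."""

    def __init__(self, status):
        super().__init__(status)
        self.status = status


class FakeReposClient:
    """Hypothetical in-memory client; NOT the real dbx ReposClient API."""

    def __init__(self):
        self.dirs = {""}  # "" is the repo root, which always exists
        self.files = {}

    async def mkdirs(self, path):
        # Create the directory and any missing ancestors.
        while path and path not in self.dirs:
            self.dirs.add(path)
            path = posixpath.dirname(path)

    async def put(self, path, content):
        # Assumed server behavior: 404 when the parent directory is missing.
        if posixpath.dirname(path) not in self.dirs:
            raise ClientError(404)
        self.files[path] = content


async def put_with_parent_retry(client, path, content):
    """Put a file; on a 404, create the parent directory and retry once."""
    try:
        await client.put(path, content)
    except ClientError as err:
        if err.status != 404:
            raise  # Other errors are not something a mkdirs can fix.
        await client.mkdirs(posixpath.dirname(path))
        await client.put(path, content)


if __name__ == "__main__":
    client = FakeReposClient()
    asyncio.run(
        put_with_parent_retry(client, "notebooks/autoencoder/DeployModel.py", "# code")
    )
    print(sorted(client.files))
```

If the real cause is that a rebase removes and recreates directories between snapshots, a retry like this would only mask the ordering problem, so it is a diagnostic aid rather than a proposed fix.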