databricks-cli

Please support wildcards when doing dbfs cp

Open arvindshmicrosoft opened this issue 8 years ago • 1 comment

dbfs cp does not seem to support wildcards. Are there any plans to support them, especially for files on the remote DBFS?

arvindshmicrosoft, Nov 30 '17 21:11

A bit late to the party, but this is still relevant. I ran into the same limitation with dbfs rm, so I wrote a utility script as a workaround. The same approach works for dbfs cp.

  1. Put the following in a bash script (e.g. clean_dbfs.sh)
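#!/bin/bash
# Remove entries under dbfs:/FileStore whose names match a grep pattern.
# Note: grep uses regular expressions, not shell globs (e.g. '\.csv$', not '*.csv').
echo "----------------- Utility script to clean up remote dbfs using wildcards ------------------"
# Refuse to run without a pattern; an empty pattern would match every file.
pattern="${1:?usage: clean_dbfs.sh PATTERN}"
echo "Pattern: $pattern"
# Read one name per line so names containing spaces are not split.
dbfs ls dbfs:/FileStore | grep -- "$pattern" | while IFS= read -r name
do
    # Report the removal only if dbfs rm succeeded.
    dbfs rm "dbfs:/FileStore/$name" && echo "Removed $name"
done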
  2. Make the file executable: $ chmod +x clean_dbfs.sh
  3. Add an alias to your ~/.bashrc: alias 'clean-dbfs'='~/clean_dbfs.sh'. Then source your ~/.bashrc.
  4. Run it: $ clean-dbfs pattern
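
For dbfs cp, which is what the issue asks about, the same idea might look roughly like the sketch below. It assumes that dbfs ls prints one name per line and that dbfs cp takes a source and a destination path. The function name dbfs_cp_matching and its argument order are made up for illustration and are not part of the CLI.

```shell
#!/bin/bash
# Sketch: copy every entry in a remote DBFS directory whose name matches a
# grep pattern into a destination directory.
# dbfs_cp_matching is a hypothetical helper name, not a databricks-cli command.
dbfs_cp_matching() {
    local src_dir="${1:?usage: dbfs_cp_matching SRC_DIR PATTERN DEST_DIR}"
    local pattern="${2:?missing PATTERN}"
    local dest_dir="${3:?missing DEST_DIR}"
    local name

    # grep treats the pattern as a regular expression, not a shell glob,
    # so pass '\.csv$' rather than '*.csv'.
    dbfs ls "$src_dir" | grep -- "$pattern" | while IFS= read -r name
    do
        # Quote the paths so names containing spaces stay intact.
        dbfs cp "$src_dir/$name" "$dest_dir/$name" && echo "Copied $name"
    done
}
```

After sourcing the file, a call such as dbfs_cp_matching dbfs:/FileStore '\.csv$' ./downloads would attempt to copy each matching file. Matching directories would probably need the recursive option of dbfs cp, which this sketch does not handle.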

Native wildcard support would obviously be nicer, so +1 from me.

aglohmeijer, Mar 08 '21 12:03