
spark-submit: throw an error when duplicate argument is provided

Open sybernatus opened this issue 3 years ago • 1 comments

What changes were proposed in this pull request?

I propose this PR to warn users when they use the spark-submit CLI incorrectly. The idea is to add a check that rejects arguments passed more than once when they should only appear once.

As I don't know the project very well, I've only added this rule for the following arguments:

  • --py-files
  • --files

I can add more arguments to the list of non-repeatable arguments if needed.

For example, the --files argument can only appear once, with a comma-separated list of files as its value, as follows:

spark-submit --files my-file,my-second-file

But if someone tries to use it like this instead:

spark-submit --files my-file --files my-second-file

Passing the --files argument multiple times works without error, but only one of the files is provided to Spark. It can take a user a long time to understand why the job is accepted but does not behave as expected.
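To make the proposed check concrete, here is a minimal sketch in Python. It is not the code from this PR (the actual change lives in the Spark launcher project), and the names `NON_REPEATABLE`, `DuplicateArgumentError`, and `check_duplicates` are hypothetical. It assumes that `--opt=value` should be treated the same as `--opt value`, and it does not try to stop at the application jar, so application arguments would be scanned too.

```python
from typing import Iterable, Set

# Hypothetical list for illustration: the two options named in this PR.
NON_REPEATABLE = frozenset({"--py-files", "--files"})


class DuplicateArgumentError(ValueError):
    """Raised when a non-repeatable option appears more than once."""


def check_duplicates(args: Iterable[str],
                     non_repeatable: Iterable[str] = NON_REPEATABLE) -> None:
    """Raise DuplicateArgumentError on the second occurrence of an option."""
    restricted = set(non_repeatable)
    seen: Set[str] = set()
    for arg in args:
        # Assumption: "--files=a,b" counts as the option "--files".
        name = arg.split("=", 1)[0]
        if name not in restricted:
            continue
        if name in seen:
            raise DuplicateArgumentError(
                f"Argument {name} was provided more than once; "
                "pass a single comma-separated list instead."
            )
        seen.add(name)


if __name__ == "__main__":
    # Accepted: each restricted option appears once.
    check_duplicates(["--files", "my-file,my-second-file"])

    # Rejected: --files is repeated.
    try:
        check_duplicates(["--files", "my-file", "--files", "my-second-file"])
    except DuplicateArgumentError as err:
        print(f"error: {err}")
```

Failing fast on the second occurrence keeps the behaviour simple to explain: the command is rejected before anything is submitted, and the message points at the comma-separated form.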

Why are the changes needed?

This change is meant to make Spark easier to use, since it is already a heavy project to work with.

Does this PR introduce any user-facing change?

Yes. Users who pass duplicate arguments will now receive an error instead of submitting to Spark.

How was this patch tested?

A test case has been added to the Spark launcher project.

:zap:

sybernatus, Jul 23 '22 09:07

Can one of the admins verify this patch?

AmplabJenkins, Jul 23 '22 21:07

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions[bot], Nov 02 '22 00:11

/reopen please 🙂

sybernatus, Nov 03 '22 06:11