fuzzyjoin
Confusion regarding by vs. multi_by and match_fun vs. multi_match_fun
In addition to the need for examples of `match_fun` (#22), it looks like `match_fun` gets used like `multi_match_fun` when a single function is given and there are multiple columns in the `by` argument? The docs say: "If only one function is given it is used on all column pairs."
If so, then `multi_by` and `multi_match_fun` seem confusing and redundant to me.
I see the note "Note that as of now, you cannot give both match_fun and multi_match_fun- you can either compare each column individually or compare all of them." Perhaps multi_by and multi_match_fun should be removed in the future?
Basically, the following definitions seem redundant, and I can't tell what the differences are:
- `by`: Columns of each to join
- `match_fun`: Vectorized function given two columns, returning TRUE or FALSE as to whether they are a match. Can be a list of functions one for each pair of columns specified in by (if a named list, it uses the names in x). If only one function is given it is used on all column pairs.
- `multi_by`: Columns to join, where all columns will be used to test matches together
- `multi_match_fun`: Function to use for testing matches, performed on all columns in each data frame simultaneously
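
To make the question concrete, here is a minimal sketch of the two calling styles as I currently read the definitions. The data frames and the column names `lon` and `lat` are made up for illustration. It also assumes that `multi_match_fun` receives two row-aligned numeric matrices (one row per candidate pair) and may return a plain logical vector, which is my reading of the docs rather than something they state outright.

```r
library(fuzzyjoin)

# Hypothetical toy data: two points in each data frame
x <- data.frame(lon = c(0, 10), lat = c(0, 10))
y <- data.frame(lon = c(0.9, 10.1), lat = c(0.9, 10.1))

# Style 1: by + match_fun.
# The single function is applied to each column pair separately
# (lon vs. lon, then lat vs. lat); a row pair is kept only if
# every column pair matches.
per_column <- fuzzy_join(
  x, y,
  by = c("lon", "lat"),
  match_fun = function(a, b) abs(a - b) <= 1,
  mode = "inner"
)

# Style 2: multi_by + multi_match_fun.
# One function sees all join columns at once, so it can combine them,
# here as a Euclidean distance (assumes xm and ym are numeric matrices
# with one row per candidate pair).
joint <- fuzzy_join(
  x, y,
  multi_by = c("lon", "lat"),
  multi_match_fun = function(xm, ym) sqrt(rowSums((xm - ym)^2)) <= 1,
  mode = "inner"
)

nrow(per_column)
nrow(joint)
```

If that reading is right, the two joins should differ on this data: the first point pair is within 1 on each axis separately but farther than 1 in Euclidean distance, so I would expect the per-column join to keep it and the joint join to drop it. That would be a real difference between the two argument pairs, but it is not obvious from the definitions above.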