Output Column Naming and Append vs replace
This is most likely very subjective but I always "struggle" with the configuration of the RDKit nodes namely the output column name of the changed molecule. The actual cases where I do NOT want to simply replace the exiting column are rare. So I end up having to check "Remove Source Column" and then Change the output column name back to the original name. This still leaves the issue that now the column order needs to be fixed afterwards. It's just cumbersome and not very neat/clean.
I much prefer the way the indigo nodes handle this with the Append column check box.

I simply need to uncheck the box and it's done (ideally it should be unchecked by default). It also actually replaces the column and keeps column order intact.
The suggestion hence is do it like the indigo nodes but have "append column" unchecked by default. This would in my opinion greatly increase the ease of use of RDKit nodes for common use-cases.
Thinking in depth about it showed that implementing this as suggested would break backward compatibility of all RDKit calculation nodes. The current settings controlling the behavior are "New column name" (String) and "Replace source column" (bool). The new proposed settings are "Append" (bool) and "New column name" (String).
The following mapping could be done (and NOT be done):
Old: New column name != source column, Remove source column True New: Append column True, New column name defined (!= source column) BUT: Source column would remain!!! Bad! => This is the current default main use case and in my opinion a showstopper for the suggested approach
Old: New column name == source column, Remove source column True New: Append column False (Replaces source column) Works
Old: New column name != source column, Remove source column False New: Append column True, New column name defined (!= source column) Works
Not applicable (invalid): Old: New column name == source column, Remove source column False New: Append column True, New column name defined (== source column)
As it would be very confusing for RDKit node users to have suddenly the default use case changed or not supported anymore, two different alternatives come to mind:
-
Introduce beside Append (bool) and New Column Name also the Remove Source Column (bool) option, which would make only sense if Append = true to generate a new column. => This makes things more complex. Not my favorite.
-
Keep all settings as they are, but change defaults:
- New column name: Default will be the same as input column name (determined by auto-guessing as before) (was containing a function specific postfix in the past)
- Remove source column: Default will be true (was false in the past)
- Change node logic that keeps result column at the same index as input column if the new column has the same name and the remove source column option is true.
- Write a note in the node documentation that it will replace the input column at the same index, if new column name is set to the same name as input column name and the remove source column option is enabled Old nodes will continue to function as before. New nodes will work as anticipated – no clicks anymore, no renaming if one would like to replace the input column with the result column
I would propose to implement this second alternative, which is fully backward compatible and quite easy to change in the code.
Agree. 2. makes sense.