New README proposal
I worry that this duplicates too much information from the website. Duplicated information will be hard to maintain and will fall out of sync.
Apart from the "When is it useful?" part, which is too detailed, I find it great, as it's much clearer. Maybe you can shorten and merge the "When is it useful?" and "What dirty_cat does not" parts? In any case, you link to the website every time and for every encoder, so people will tend to click through for more information.
Okay, since these changes are a point of contention, we should delay the PR, so we can release on Monday, and we can discuss it together on Wednesday, during the sprint :)
I've moved some parts to the website, which I agree is more suitable. Please check it out and let me know what you think!
I like what you added to the website!
For the "What can dirty_cat do?" part, I fear it will have to change often. We recently added fuzzy_join and will soon add deduplication, so it will evolve fast.
The rest are all good improvements to me.
Great! Maybe it would be cool to add one line on the SuperVectorizer in "What can dirty-cat do?"? It doesn't really fit in the rest of the description and it's a really cool feature.
I get what you mean, but I feel like this is true for both the SuperVectorizer and the fuzzy join. Without completely reverting what I did earlier, I added some very brief mentions of the most important tools and encoders, in order to guide users.
LGTM
Alright, thanks all for the reviews, merging!