Plugin search - Diacritical marks, accents
Make plugin search aware of diacritical marks, accents.
Enhancement Summary
Make letters with diacritical marks turn up in search results, and sort, as if they were the plain base letter. Possibly make this sorting optional (a setting), since in some languages diacritics affect sorting and in others they don't. We would probably need to roll our own as QString::localeAwareCompare() may not work: https://doc.qt.io/qt-5/qstring.html#localeAwareCompare-1 https://stackoverflow.com/questions/286921/efficiently-replace-all-accented-characters-in-a-string/9667817#9667817
Justification
Few plugins use letters outside the English alphabet in their names, but when they do the result is quite confusing. The only example I've come across so far is Rézonateur: https://github.com/jpcima/rezonateur
LV2 plugins support localization of the name, but where this is not implemented or has been missed it would be good to catch the odd letters. The plugin above does not turn up in a search for 'rezonateur', and if you scan the plugin list it does not turn up around re* either, but further down. You may then fail to notice that the plugin installation succeeded.

We would probably need to roll our own as QString::localeAwareCompare() may not work
What makes you say that?
What makes you say that?
This comment by @tresf in discord/dev-only
I was hoping QString's localCompare would work, but from what I'm reading it doesn't.... Instead, I think we need to roll our own... Here's a decent thread: https://stackoverflow.com/a/9667817/3196753
https://stackoverflow.com/questions/14009522/how-to-remove-accents-diacritic-marks-from-a-string-in-qt https://stackoverflow.com/questions/12278448/removing-accents-from-a-qstring
Seems to be possible to convert é to e but not œ to oe for example
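For what "rolling our own" could look like, here is a minimal sketch. It assumes names arrive as UTF-8 in a std::string (the real dialog uses QString), and foldDiacritics and its table are hypothetical names made up for illustration, not Qt or LMMS API. The table is deliberately tiny; the point is only that a lookup table can cover œ and æ, which normalization leaves alone.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not Qt or LMMS API: replaces a few accented letters
// (given as UTF-8 byte sequences) with plain ASCII stand-ins.
// The table is illustrative only and far from complete.
std::string foldDiacritics(const std::string& utf8)
{
	static const std::vector<std::pair<std::string, std::string>> table = {
		{"\xC3\xA9", "e"},  // é
		{"\xC3\x89", "E"},  // É
		{"\xC3\xB6", "o"},  // ö
		{"\xC3\x96", "O"},  // Ö
		{"\xC5\x93", "oe"}, // œ
		{"\xC3\xA6", "ae"}, // æ
	};

	std::string out;
	std::size_t i = 0;
	while (i < utf8.size())
	{
		bool replaced = false;
		for (const auto& entry : table)
		{
			// Does the input contain this table key at position i?
			if (utf8.compare(i, entry.first.size(), entry.first) == 0)
			{
				out += entry.second;
				i += entry.first.size();
				replaced = true;
				break;
			}
		}
		if (!replaced)
		{
			// Anything not in the table is copied through byte for byte.
			out += utf8[i];
			++i;
		}
	}
	return out;
}
```

Characters that are not in the table, such as the μ in μ-law, pass through unchanged, so anything the table misses simply keeps today's behaviour.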
OK. I'm all for a partial/pragmatic fix that is easier to implement.
I fooled around with it a bit and the sorting part was easy: add a hidden column with the name stripped of accents and other nonsense. I don't know about the search though. Grab da code...

diff --git a/src/gui/modals/EffectSelectDialog.cpp b/src/gui/modals/EffectSelectDialog.cpp
index 2b5c0c58d..324d85459 100644
--- a/src/gui/modals/EffectSelectDialog.cpp
+++ b/src/gui/modals/EffectSelectDialog.cpp
@@ -73,30 +73,38 @@ EffectSelectDialog::EffectSelectDialog( QWidget * _parent ) :
// and fill our source model
m_sourceModel.setHorizontalHeaderItem( 0, new QStandardItem( tr( "Name" ) ) );
m_sourceModel.setHorizontalHeaderItem( 1, new QStandardItem( tr( "Type" ) ) );
+ m_sourceModel.setHorizontalHeaderItem( 2, new QStandardItem( tr( "namestripped" ) ) );
int row = 0;
for( EffectKeyList::ConstIterator it = m_effectKeys.begin();
it != m_effectKeys.end(); ++it )
{
QString name;
QString type;
+ QString namestripped;
if( it->desc->subPluginFeatures )
{
name = it->displayName();
type = it->desc->displayName;
+ namestripped = name.normalized(QString::NormalizationForm_KD);
}
else
{
name = it->desc->displayName;
type = "LMMS";
+ namestripped = name;
}
m_sourceModel.setItem( row, 0, new QStandardItem( name ) );
m_sourceModel.setItem( row, 1, new QStandardItem( type ) );
+ m_sourceModel.setItem( row, 2, new QStandardItem( namestripped ) );
++row;
}
// setup filtering
m_model.setSourceModel( &m_sourceModel );
m_model.setFilterCaseSensitivity( Qt::CaseInsensitive );
connect( ui->filterEdit, SIGNAL( textChanged( const QString& ) ),
&m_model, SLOT( setFilterFixedString( const QString& ) ) );
@@ -128,7 +136,8 @@ EffectSelectDialog::EffectSelectDialog( QWidget * _parent ) :
QHeaderView::Stretch );
ui->pluginList->horizontalHeader()->setSectionResizeMode( 1,
QHeaderView::ResizeToContents );
- ui->pluginList->sortByColumn( 0, Qt::AscendingOrder );
+ ui->pluginList->sortByColumn( 2, Qt::AscendingOrder );
+ ui->pluginList->setColumnHidden(2, true);
updateSelection();
show();
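The hidden-column idea in the patch above boils down to sorting on one string while showing another. A minimal sketch of just that, outside Qt, assuming the stripped key has already been computed somehow; PluginRow and sortByHiddenKey are hypothetical names used only for illustration:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for one row of the plugin list.
struct PluginRow
{
	std::string name;    // shown to the user
	std::string sortKey; // hidden, accent-stripped copy of the name
};

// Orders rows by the hidden key, falling back to the visible name on ties.
// Comparison is plain byte-wise and case-sensitive, which is enough to show
// the idea but is not what a locale-aware sort would do.
void sortByHiddenKey(std::vector<PluginRow>& rows)
{
	std::stable_sort(rows.begin(), rows.end(),
		[](const PluginRow& a, const PluginRow& b)
		{
			if (a.sortKey != b.sortKey) { return a.sortKey < b.sortKey; }
			return a.name < b.name;
		});
}
```

With a key of "Rezonateur", the row for Rézonateur lands between "Reverb" and "Ring Mod" instead of after every plain-ASCII name starting with R.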
Would characters like æ need to match ae, a, and e, rather than just one of them? When thinking of æ, a person might type any of those.
We should really use QString::localeAwareCompare() for sorting, as characters like ñ, ø and ä are actual letters in other alphabets, in contrast to é, which is just an e with an accent. Normalizing can be used for search, though.
I don't know if we should even bother to make a custom function to convert double letters like æ. First find a plugin with æ in its name.
If we decide to do it anyway, just go with ae. For better matching we should have fuzzy search to correct spelling errors; that would make for a much better user experience than just handling æ.
Would characters like æ need to match ae, a, and e, rather than just one of them? When thinking of æ, a person might type any of those.
æ is usually substituted with ae.
I don't know if we should even bother to make a custom function to convert double letters like æ. First find a plugin with æ in its name.
I wouldn't go to great lengths to fix that. I've found two plugins that stick out in the crowd: Rézonateur with its accent, and μ-law compressor. I'm pretty sure whoever named μ-law knew it would turn up last in the list. If someone from Scandinavia sticks an æ in there, they did it just for the hell of it and expect trouble like this.
We should really use QString::localeAwareCompare() for sorting.
I'm pretty sure I started fiddling with localeAwareCompare() the other day but I couldn't get it to work. What does work, however, is QSortFilterProxyModel::setSortLocaleAware().
This line achieves what the patch above did but in one line. It still doesn't affect search.
m_model.setFilterCaseSensitivity( Qt::CaseInsensitive );
+ m_model.setSortLocaleAware(true);
æ is usually substituted with ae.
And if you substitute it like that it will pop up in the wrong place in the sort from a Scandinavian point of view. The letters Å, Ä, Æ, Ö and Ø are all sorted at the end of their respective alphabets, so from a Scandinavian point of view they are sorted more correctly by not touching them. If you're not from Scandinavia and want to test your new plugin 'Ångest', you may think of Å as sorting like A, and this is what setSortLocaleAware() does. I don't think there is a good way to solve this from our side. Fixing only accents is reasonable. Ligatures? Not so much.
This problem, if it arises, is probably more of an upstream matter than anything else. Give plugins a proper, English-sortable working title and call them whatever you want in the GUI, in the manual, and on your homepage. That should work for most people.
And if you substitute it like that it will pop up in the wrong place in the sort from a Scandinavian point of view.
Sort != Search. Quoting the Discord conversation:
Naturally, you'd need to support the original/diacritical versions as well
Sorting is weight-based and UIs have ways of rolling our own. AFAIR, this topic is more about searching than it is about sorting.
AFAIR, this topic is more about searching than it is about sorting.
Well, when I made the issue I thought the code behind the sort and search was more or less the same. I no longer hold that belief.
This line achieves what the patch above did but in one line. It still doesn't affect search.
Since zonkmachine has found the solution I don't see why we're still discussing that.
Here's the other missing part:
m_sourceModel.setItem(row, 2, new QStandardItem(name + name.normalized(QString::NormalizationForm_KD)));
m_model.setFilterKeyColumn(2);
Here's the other missing part:
m_sourceModel.setItem(row, 2, new QStandardItem(name + name.normalized(QString::NormalizationForm_KD))); m_model.setFilterKeyColumn(2);
Cool! But I couldn't make that work over here. Do you think you could pull something and add me as a reviewer?
Ah, according to the Stack Overflow link you also need to remove the special characters. Tested:
for( EffectKeyList::ConstIterator it = m_effectKeys.begin();
it != m_effectKeys.end(); ++it )
{
QString name;
QString type;
if( it->desc->subPluginFeatures )
{
name = it->displayName();
type = it->desc->displayName;
}
else
{
name = it->desc->displayName;
type = "LMMS";
}
m_sourceModel.setItem( row, 0, new QStandardItem( name ) );
m_sourceModel.setItem( row, 1, new QStandardItem( type ) );
+ QString normalized = name.normalized(QString::NormalizationForm_KD);
+ normalized.remove(QRegExp("[^a-zA-Z\\s]"));
+ m_sourceModel.setItem( row, 2, new QStandardItem(normalized));
++row;
}
// setup filtering
m_model.setSourceModel( &m_sourceModel );
m_model.setFilterCaseSensitivity( Qt::CaseInsensitive );
+ m_model.setFilterKeyColumn(2);
connect( ui->filterEdit, SIGNAL( textChanged( const QString& ) ),
&m_model, SLOT( setFilterFixedString( const QString& ) ) );
connect( ui->filterEdit, SIGNAL( textChanged( const QString& ) ),
this, SLOT(updateSelection()));
connect( ui->filterEdit, SIGNAL( textChanged( const QString& ) ),
SLOT(sortAgain()));
ui->pluginList->setModel( &m_model );
+ ui->pluginList->setColumnHidden(2, true);
Getting there, but now you can't search for Rézonateur with an accent.
I guess you need to also normalize the search string.
but now you can't search for Rézonateur with an accent
If you combined both the normalized and original string in the hidden column you could search for both
I guess you need to also normalize the search string
You could absolutely, but personally, if I typed an Ö I wouldn't expect it to match O
If you combined both the normalized and original string in the hidden column you could search for both
m_sourceModel.setItem( row, 2, new QStandardItem(normalized + name));
Oh, it's basic string concatenation. Got it!
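To spell out why the concatenation lets both spellings match, here is a small sketch of the same idea outside Qt. The helper names are hypothetical, and the newline separator is my own assumption (the snippet above concatenates directly); it is only there so that a query cannot match across the seam between the two copies.

```cpp
#include <string>

// ASCII-only lowercasing; the bytes of multi-byte UTF-8 characters
// pass through unchanged.
std::string asciiLower(std::string s)
{
	for (char& c : s)
	{
		if (c >= 'A' && c <= 'Z') { c = static_cast<char>(c - 'A' + 'a'); }
	}
	return s;
}

// Hypothetical helper: builds the hidden-column text from the stripped
// and the original name, separated by a newline (an assumption, see above).
std::string makeSearchKey(const std::string& stripped, const std::string& original)
{
	return stripped + '\n' + original;
}

// Case-insensitive (ASCII only) substring match against the combined key.
bool keyMatches(const std::string& key, const std::string& query)
{
	return asciiLower(key).find(asciiLower(query)) != std::string::npos;
}
```

A key built from "Rezonateur" and "Rézonateur" then matches both 'rezonateur' and 'Rézo', because each query is found in one of the two copies. It does not make 'rézonateur' match a name written without the accent; that would need the query to be normalized as well.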
You could absolutely, but personally, if I typed an Ö I wouldn't expect it to match O
No, but you wouldn't expect O to match Ö either. The accents on French vowels you can (probably) just discard in the search, but not the Swedish ones. It's a pretty tricky problem. I find it tempting to just let it be and forward it upstream: change the name of the plugin in the LV2 .ttl file, should a problem of this type arise.
No, but you wouldn't expect O to match Ö either.
Please allow me to humbly disagree. If there's a plugin collection called "björn collective", I'd expect this to match "bjorn". I'm not sure how much sense this makes in the native language, but for languages that don't use umlauts, it's quite common, e.g. https://www.google.com/search?q=bjork
Please allow me to humbly disagree. If there's a plugin collection called "björn collective", I'd expect this to match "bjorn". I'm not sure how much sense this makes in the native language,
Yeah, maybe. The more I've looked into this, the more unsure I've become about what needs to be done. PS. I'm not going to pull anything from here because I don't feel on top of the issue.