lyrics: In Genius backend, tolerate artist disambiguation markers
Problem
Importing Genius lyrics for specific artists does not work.
The reason is that these artists are not known on Genius by just their band name (Psychonaut or Brutus); because multiple bands share the same name, they are known as <
Running this command in verbose (-vv) mode:
$ beet -vv lyrics violate consensus reality all your gods have gone
Led to this problem:
user configuration: /Media/home/.config/beets/config.yaml
data directory: /Media/home/.config/beets
plugin paths:
Sending event: pluginload
library database: /Media/home/.config/beets/library.db
library directory: /Media/Music
Sending event: library_opened
lyrics: Genius failed to find a matching artist for 'Psychonaut'
lyrics: failed to fetch: https://www.musixmatch.com/lyrics/Psychonaut/All-Your-Gods-Have-Gone (404)
lyrics: lyrics not found: Psychonaut - Violate Consensus Reality - All Your Gods Have Gone
Sending event: cli_exit
Here's a link to the music files that trigger the bug (if relevant):
Setup
- OS: alpine 3.17.3
- Python version: 3.10.11
- beets version: 1.6.0
- Turning off plugins made problem go away (no):
My configuration (output of beet config) is:
lyrics:
bing_lang_from: []
google_API_key: REDACTED
google_engine_ID: REDACTED
fallback: ''
sources: genius musixmatch
auto: yes
bing_client_secret: REDACTED
bing_lang_to:
genius_api_key: REDACTED
force: no
local: no
directory: /Media/Music
library: /Media/home/.config/beets/library.db
import:
copy: no
write: yes
ignore: ['?eaDir*']
incremental: yes
genres: yes
ui:
color: yes
paths:
default: $albumartist/$albumartist - $year - $album/$albumartist - $album - $track - $title
plugins: web discogs fetchart mbsync duplicates info missing lyrics
web:
host: 0.0.0.0
readonly: no
include_paths: yes
port: 8337
cors: ''
cors_supports_credentials: no
reverse_proxy: no
fetchart:
auto: yes
cover_names: cover front art album folder
sources: coverart itunes amazon albumart
minwidth: 0
maxwidth: 0
quality: 0
max_filesize: 0
enforce_ratio: no
cautious: no
google_key: REDACTED
google_engine: 001442825323518660753:hrh5ch1gjzm
fanarttv_key: REDACTED
lastfm_key: REDACTED
store_source: no
high_resolution: no
deinterlace: no
cover_format:
discogs:
index_tracks: yes
apikey: REDACTED
apisecret: REDACTED
tokenfile: discogs_token.json
source_weight: 0.5
user_token: REDACTED
separator: ', '
missing:
count: no
total: no
album: no
duplicates:
album: no
checksum: ''
copy: ''
count: no
delete: no
format: ''
full: no
keys: []
merge: no
move: ''
path: no
tiebreak: {}
strict: no
tag: ''
This sounds annoying! It would be helpful to experiment with different ways of resolving the ambiguity. For example, does simply dropping the last two-letter word always work, or does that ever introduce ambiguity with a different artist?
Here's where to start when tweaking the matching heuristic: https://github.com/beetbox/beets/blob/9527a07767629c1ceb99c2cd681b78172a7272a0/beetsplug/lyrics.py#L361
If I use a regex to strip the [<2-letter country code>] marker from the artist name at line 359, it seems to work:
old
hit_artist = hit["result"]["primary_artist"]["name"]
new
hit_artist = re.sub(r'.[\(\[]..[\)\]]','',hit["result"]["primary_artist"]["name"])
Nice, that seems like a good step! An eventual PR should try both (the original and truncated name, if any) to make sure we don't miss artists that happen to look like this.
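A minimal sketch of the "try both" idea, in case it helps whoever picks this up. It is not the plugin's actual code: `candidate_artist_names` and `artist_matches` are hypothetical helper names, "Example Band (XX)" is a made-up artist, and the case-insensitive comparison is a stand-in that does not reproduce how lyrics.py compares names. It also assumes the marker is always a trailing two-character tag in parentheses or square brackets.

```python
import re

# Assumption: the disambiguation marker is a trailing two-character tag in
# () or [], e.g. "Example Band (XX)". Anchored to the end of the name so a
# bracketed pair in the middle of a name is left alone.
_DISAMBIGUATION_RE = re.compile(r"\s*[\(\[][^\)\]]{2}[\)\]]\s*$")


def candidate_artist_names(genius_name):
    """Return the names to compare against: the original first, then the
    version with the trailing marker removed (if that differs)."""
    candidates = [genius_name]
    stripped = _DISAMBIGUATION_RE.sub("", genius_name)
    if stripped and stripped != genius_name:
        candidates.append(stripped)
    return candidates


def artist_matches(wanted, genius_name):
    """True if `wanted` equals either candidate, ignoring case and
    surrounding whitespace (a simplified, hypothetical comparison)."""
    wanted_key = wanted.strip().casefold()
    return any(
        wanted_key == candidate.strip().casefold()
        for candidate in candidate_artist_names(genius_name)
    )
```

Because the original name is always tried first, an artist whose real name happens to end in something like "(XX)" still matches when the library tag includes that suffix; the stripped name only widens the match, it never replaces the exact one.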