Infobox inconsistent attribute names.
check out http://awoiaf.westeros.org/index.php/Stannis_Baratheon
The scraper did not pick up a House affiliation for Stannis because the attribute name in the infobox is "Royal House" and not "House". The scraper needs to be reconfigured to handle this.
I can fix this pretty quick...
The houses are not the only issue. There are several inconsistencies in the characters' infoboxes. For example "Titles" != "Aliases" or "Book(s)" != "books" != "Books". I am actually reimplementing the whole scraper, because the regex stuff from theo is slow and not very maintainable... Sorry. This is not a biggie, I am almost done, and now I am trying to get even more information.
Again @Adiolis do what you can and feel free to delegate to someone. Family is more important, always.
Yeah. I know. It is just a fix of one line.
But guys, there are more problems than just the houses: "Titles" != "Aliases" or "Book(s)" != "books" != "Books" and so on.
Feel free to fix that.
@togiberlin @boriside @kordianbruck
How about using http://www.w3schools.com/jsref/jsref_tolowercase.asp for all the fields to normalize these things somewhat?
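Something like this is what I have in mind. It is only a rough sketch: `normalizeKey` is a made-up helper name, not something that exists in the scraper today, and I am assuming the attribute names arrive as plain strings.

```javascript
// Hypothetical helper (not part of the current scraper):
// trims whitespace and lowercases an infobox attribute name.
function normalizeKey(rawKey) {
  return rawKey.trim().toLowerCase();
}

const normalized = ["Books", "books", "Book(s)"].map(normalizeKey);
console.log(normalized); // [ 'books', 'books', 'book(s)' ]
```

"Books" and "books" collapse into one key, but "Book(s)" still comes out different, so lowercasing only solves part of the problem.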
Yep. Extra fixes for "Titles" != "Aliases" and so on are still necessary. First someone needs to make a list of the possible attribute names.
Correct. You should have an array of synonyms for a given field.
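A minimal sketch of that idea, assuming a hand-maintained table. The names `FIELD_SYNONYMS` and `canonicalField` are made up for illustration, and the entries only cover the variants mentioned in this thread; the real list still has to be collected from the wiki.

```javascript
// Hypothetical synonym table: canonical field -> known variants (all lowercase).
// The entries are illustrative, not a complete list.
const FIELD_SYNONYMS = {
  house: ["house", "royal house"],
  books: ["books", "book(s)"],
};

// Build a reverse lookup once: variant -> canonical field name.
const CANONICAL_BY_VARIANT = new Map();
for (const [canonical, variants] of Object.entries(FIELD_SYNONYMS)) {
  for (const variant of variants) {
    CANONICAL_BY_VARIANT.set(variant, canonical);
  }
}

// Returns the canonical field name, or null if the attribute is unknown.
function canonicalField(rawKey) {
  const key = rawKey.trim().toLowerCase();
  return CANONICAL_BY_VARIANT.has(key) ? CANONICAL_BY_VARIANT.get(key) : null;
}

console.log(canonicalField("Royal House")); // house
console.log(canonicalField("Book(s)"));     // books
```

Unknown attributes come back as `null`, so the scraper could log them and we would see which names are still missing from the table. Whether "Aliases" should be folded into "Titles" or kept as its own field is something the list maker would have to decide.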
Or you can machine learn what goes where :D I think the static approach is easier :D
Any volunteers? :laughing:
@kordianbruck @boriside @togiberlin @theocheslerean @docjag @alschm