extraction-framework

Field Sampling in mapping server

Open VladimirAlexiev opened this issue 10 years ago • 6 comments

Enhancement to the mapping server:

  • On the template stats page (eg this)
  • for every field, there is a blue number of occurrences
  • make that a hyperlink to show some of the occurrences (say 100)

Why:

  • Extremely useful to understand the meaning of some fields, since many fields are not well documented in the wiki templates
  • And to see whether they're links, text, or both (re Object/DataProp Dichotomy)

What:

  • Make two columns: page, field.
  • Would be great if you output HTML, but TDV would also work fine.
  • First col is a link to the page in Edit mode
  • Eg if I click on "venue":
page field "venue"
Alpine skiing at the 2002 Winter Olympics `[[Snowbasin]] (downhill, super-G, combined),
[[Park City Mountain Resort
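
A minimal sketch of the requested output, written in Python for illustration rather than in the framework's own code. It assumes the occurrences are already available as (page title, raw field value) pairs, which is exactly the part still unresolved in this thread. The function names, the sample size default, and the English Wikipedia base URL are all hypothetical; the sample row is shortened from the "venue" example above.

```python
import random
from typing import Iterable, List, Tuple
from urllib.parse import quote

Row = Tuple[str, str]  # (page title, raw field value); hypothetical input shape


def sample_occurrences(occurrences: Iterable[Row], limit: int = 100,
                       seed: int = 0) -> List[Row]:
    """Return at most `limit` occurrences, chosen uniformly at random."""
    items = list(occurrences)
    if len(items) <= limit:
        return items
    return random.Random(seed).sample(items, limit)


def edit_url(title: str, wiki: str = "https://en.wikipedia.org") -> str:
    """Build a MediaWiki URL that opens the page in Edit mode."""
    encoded = quote(title.replace(" ", "_"))
    return f"{wiki}/w/index.php?title={encoded}&action=edit"


def to_tsv(field: str, rows: Iterable[Row]) -> str:
    """Render two tab-delimited columns: page (as edit link) and field value."""
    lines = [f'page\tfield "{field}"']
    for title, value in rows:
        # Collapse tabs and newlines so each occurrence stays on one line.
        flat_value = " ".join(value.split())
        lines.append(f"{edit_url(title)}\t{flat_value}")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = [("Alpine skiing at the 2002 Winter Olympics",
             "[[Snowbasin]] (downhill, super-G, combined)")]
    print(to_tsv("venue", sample_occurrences(rows)))
```

Tab-delimited output keeps the sketch short; an HTML table would wrap the same two columns, with the first one rendered as a hyperlink.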

VladimirAlexiev, Feb 11 '15 08:02

@VladimirAlexiev I am working on this task. I added a new page where we will list the field occurrences: link to my code

To proceed further, I need guidance on how to get all occurrences of a field.

ujjwalwahi, Mar 06 '15 12:03

Hi @ujjwalwahi! I don't know where to get them from, I'm sure @jimkont can help you. Where is the code of server/templatestatistics that shows the number of occurrences of each field?

VladimirAlexiev, Mar 06 '15 13:03

@VladimirAlexiev Code of server/templatestatistics is here

ujjwalwahi, Mar 06 '15 14:03

I did a bit of tracing. counter is derived from count, which is obtained from sortedProps, which comes from getMappingStats(), which gets them from mappedStatistics. Search shows it in MappingStatsHolder. The closest to what we need is propertyUseCount. Again, search shows where propertyUseCount is summed. It gets it from val properties, which is a map (name, (count, mapped)). Search shows that's constructed here and is obtained from wikiStats.templates. This seems to be loaded here from some file...
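
If the stats really are just a map of (name, (count, mapped)), a count cannot be turned back into the pages it came from, so the samples would have to be recorded wherever the counts are produced. A rough Python sketch of that idea, not the framework's actual code: the function name, the (page, field, value) input shape, and the `keep` parameter are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def collect(occurrences: Iterable[Tuple[str, str, str]], keep: int = 100):
    """Count every field and keep the first `keep` (page, value) pairs per field.

    `occurrences` yields hypothetical (page title, field name, raw value) triples.
    """
    counts: Dict[str, int] = defaultdict(int)
    samples: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for page, field, value in occurrences:
        counts[field] += 1
        if len(samples[field]) < keep:
            samples[field].append((page, value))
    return dict(counts), dict(samples)
```

This keeps the first occurrences seen rather than a random sample; reservoir sampling could be swapped in if a uniform sample matters.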

This is as far as I got. @jimkont or @jcsahnwaldt can help better.

VladimirAlexiev, Mar 06 '15 17:03

@VladimirAlexiev I think, they're loaded from here. Yes, that's the same place as ignorelists. I guess you've already been there recently :) There's even a hint just one line above the last one you linked to.

Nono314, Mar 06 '15 18:03

So @ujjwalwahi this is a bit of a dead end: the extractor loads field mapping stats from files, but I still don't know HOW these files are produced. Maybe @jimkont or @jcsahnwaldt can help?

VladimirAlexiev, Mar 21 '15 11:03