
[ml] rate of issues that machine learning is handling

Open karlcow opened this issue 5 years ago • 8 comments

Looking at the issues that are in needstriage, without action-needsmoderation, and anonymous, I only see a couple that have been handled by the machine learning bot.

We need to better understand on which occasions the ML process kicks in and whether it is supposed to label all issues or not, and whether this behaviour is normal or a series of misses.

(Screenshot: 2020-03-09 06:59:47)

karlcow · Mar 08 '20 22:03

This morning we moderated 144 issues and the ML bot closed about 30 of them.


cipriansv · Mar 09 '20 07:03

@cipriansv how does it compare with before?

karlcow · Mar 09 '20 07:03

It took me about 45 minutes to moderate the issues, and by the time I was finished with the process, the ML bot had done its job.

From our point of view, it seems to be working fine, just as before the incident happened.

cipriansv · Mar 09 '20 07:03

From our point of view, it seems to be working fine, just as before the incident happened.

Two questions here:

  • Why are all anonymous issues not being processed by the ML bot? (What are the criteria? @miketaylr)
  • What was the rationale for not processing non-anonymous issues?

I have the feeling that all issues should be processed by the ML engine, at least as an advisory mechanism if we do not want to close non-anonymous ones automatically. But I'm asking because I don't yet know what the original expectations/constraints were.

karlcow · Mar 10 '20 00:03

  • Why are all anonymous issues not being processed by the ML bot? (What are the criteria? @miketaylr)

Can you clarify what you mean by not being processed, @karlcow? My understanding is that the bot classifies all issues, and only tags/closes issues that have a 95% confidence level of being invalid. Something with a confidence level lower than that will be left to humans.

(If we find it useful, we could close all issues below 90%)
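
For illustration, that rule boils down to something like the sketch below. This is only a rough reading of it: the Issue type, the handle_prediction function, and the action-ml-autoclosed label are hypothetical placeholders; only the 0.95 threshold (and the possible 0.90 alternative) come from the description above.

```python
from dataclasses import dataclass, field

# 0.95 is the confidence level described above; 0.90 is the suggested alternative.
INVALID_THRESHOLD = 0.95


@dataclass
class Issue:
    """Stand-in for a webcompat report (hypothetical, for illustration only)."""
    number: int
    labels: list = field(default_factory=list)
    state: str = "open"


def handle_prediction(issue: Issue, invalid_probability: float,
                      threshold: float = INVALID_THRESHOLD) -> None:
    """Tag and close an issue only when the model is confident it is invalid."""
    if invalid_probability >= threshold:
        issue.labels.append("action-ml-autoclosed")  # hypothetical label name
        issue.state = "closed"
    # Anything below the threshold is left untouched for human triage.


# Example: a report scored 0.97 gets closed, one scored 0.60 stays open.
report = Issue(number=12345)
handle_prediction(report, invalid_probability=0.97)
```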

What was the rationale for not processing non-anonymous issues?

I'm not sure we have a strong rationale; it was just a decision that was made. I think the assumption was that anonymous reports tend to be less valid than authed reports, so let's start there. It would be interesting to run it for all reports, IMO.

miketaylr · Mar 12 '20 17:03

Can you clarify what you mean by not being processed, @karlcow? My understanding is that the bot classifies all issues, and only tags/closes issues that have a 95% confidence level of being invalid. Something with a confidence level lower than that will be left to humans.

@miketaylr In the current design I was not sure whether an issue had been processed or not, but you confirmed that is the case. I wonder if a label such as action-ml-done would be appropriate. Maybe not.
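
If such a label were introduced, a minimal sketch of how a bot could set it through the GitHub REST API might look like the following; the action-ml-done name is only the suggestion above, and the token handling is an assumption rather than how the webcompat bot actually works.

```python
import os

import requests

GITHUB_API = "https://api.github.com"
REPO = "webcompat/web-bugs"


def mark_ml_done(issue_number: int) -> None:
    """Add a hypothetical action-ml-done label to record that the ML bot classified the issue."""
    url = f"{GITHUB_API}/repos/{REPO}/issues/{issue_number}/labels"
    headers = {
        "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github.v3+json",
    }
    response = requests.post(url, json={"labels": ["action-ml-done"]}, headers=headers)
    response.raise_for_status()
```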

karlcow · Apr 21 '20 05:04

After monitoring the ML bot's activity, we noticed that lately it does not close some issues that are obviously invalid: they are scam or phishing sites and contain the words "scam" or "phishing" in the issue description.

Examples:

  • https://github.com/webcompat/web-bugs/issues/52740

  • https://github.com/webcompat/web-bugs/issues/52739

  • https://github.com/webcompat/web-bugs/issues/52683

cipriansv · May 11 '20 12:05

it does not close some issues that are obviously invalid,

This will probably depend on how the model was trained. It follows a pattern of what was invalid in the past (which may not have had those words or patterns inside of it). I think as we continue to re-train the model, it should get better in the future.

(This should be possible, we just don't really know how to do it right now. :))
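
As a very rough illustration of what such a re-training step could look like, assuming human-moderated issues were exported with their final valid/invalid outcome; the CSV layout, features, and model choice are assumptions, not the actual webcompat-ml pipeline:

```python
import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def retrain(moderated_issues_csv: str, model_path: str = "invalid-classifier.joblib"):
    """Re-fit an invalid/valid text classifier on the latest human-moderated issues."""
    data = pd.read_csv(moderated_issues_csv)  # assumes "body" and "label" columns
    pipeline = make_pipeline(
        TfidfVectorizer(lowercase=True, stop_words="english"),
        LogisticRegression(max_iter=1000),
    )
    # Freshly moderated scam/phishing reports that humans marked invalid would
    # teach the model the vocabulary it was missing before.
    pipeline.fit(data["body"], data["label"])
    joblib.dump(pipeline, model_path)
    return pipeline
```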

miketaylr · May 11 '20 18:05