Add units to all stats
We currently don't expose the units for each stat that we publish under /_node/<node>/_stats. We should.
I bet this is a pretty straightforward update to all of the stats_descriptions.cfg files for someone who knows what all of the stats are.
Hi @wohali.
I was looking at this issue, and the referenced file. It seems like there are units referenced in the description:
{[couchdb, auth_cache_hits], [
{type, counter},
{desc, <<"number of authentication cache hits">>}
]}.
Or were you thinking something like
{[couchdb, auth_cache_hits], [
{type, counter},
{unit, Integer},
{desc, <<"number of authentication cache hits">>}
]}.
Hey @tabeth , thanks for looking at this.
There's a lot of files:
./src/couch/priv/stats_descriptions.cfg
./src/dreyfus/priv/stats_descriptions.cfg
./src/couch_log/priv/stats_descriptions.cfg
./src/mem3/priv/stats_descriptions.cfg
./src/couch_replicator/priv/stats_descriptions.cfg
./src/ddoc_cache/priv/stats_descriptions.cfg
./src/chttpd/priv/stats_descriptions.cfg
./src/rexi/priv/stats_descriptions.cfg
./src/couch_mrview/priv/stats_descriptions.cfg
./src/global_changes/priv/stats_descriptions.cfg
./src/fabric/priv/stats_descriptions.cfg
It's a concern primarily for anything that isn't a counter - for instance, looking at the histograms for length of time, are those in milliseconds? microseconds?
Ah, I see @wohali
So this is more investigative, where we want (for all of the stats_descriptions.cfg files)
This:
{[dreyfus, rpc, search], [
{type, histogram},
{desc, <<"length of a search RPC worker">>}
]}.
To be something more like this:
{[dreyfus, rpc, search], [
{type, histogram},
{desc, <<"length (milliseconds) of a search RPC worker ">>}
]}.
So to successfully complete this we need to traverse each file and then figure out what the true values are and update the descriptions. Is that about right?
You got it! Anything that isn't a counter should be reviewed and ensure the description contains the correct unit.
@wohali OK. I'll take a look at the files and start trying to update the descriptions.
So for example (to make sure I'm doing this right): We have this:
{[dreyfus, index, search], [
{type, histogram},
{desc, <<"length of an dreyfus_index search request">>}
]}.
Which seems to be implemented here. dreyfus_util:time/2 seems to call timer:now_diff/2 here, which returns microseconds. However it's divided by 1000, so the answer to the original question posed in the beginning should be milliseconds.
So our original example can be
{[dreyfus, index, search], [
{type, histogram},
{desc, <<"length (in milliseconds) of an dreyfus_index search request">>}
]}.
Is that right?
@tabeth Careful. On Windows, erlang:system_time/0 will return time only in ms, not us. You'll need to check the code to be sure this does the right thing on that platform. We may need a source code change here if it does the wrong thing, similar to https://github.com/apache/couchdb/commit/24e2c8bf61a4e82504ea6e6ba2caa779980b9597#diff-7470c79716cb3e7f1b3d572732277d94R349
/cc @rnewson
But yes, that's the right approach.