twosamples

Weighting schemes

Open cdowd opened this issue 6 years ago • 3 comments

Should be able to incorporate observation weights with ease. This has been requested by several users.

cdowd, Dec 10 '19 19:12

Okay, so weights on observations are a bit funky actually. For the ECDF the implication is easy -- it merely adjusts heights. I actually build that weight vector anyhow, and it would be easy to adjust.
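To make the "it merely adjusts heights" point concrete, here is a minimal Python sketch of a weighted ECDF. It is an illustration only, not the package's actual code; the name `weighted_ecdf` is hypothetical, and weights are assumed to be non-negative with a positive total.

```python
from bisect import bisect_right

def weighted_ecdf(values, weights=None):
    """Return a function F(x) that evaluates the weighted ECDF of `values`.

    Hypothetical helper for illustration. With weights=None every
    observation gets the same height, which is the usual ECDF.
    """
    if weights is None:
        weights = [1.0] * len(values)
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")

    # Sort observations, carrying each weight along with its value.
    pairs = sorted(zip(values, weights))
    xs = [v for v, _ in pairs]
    total = sum(w for _, w in pairs)

    # Cumulative normalized weight: the step height after each observation.
    heights = []
    running = 0.0
    for _, w in pairs:
        running += w
        heights.append(running / total)

    def F(x):
        # Number of observations <= x; the ECDF is the height at that step.
        k = bisect_right(xs, x)
        return heights[k - 1] if k > 0 else 0.0

    return F

F = weighted_ecdf([1, 2, 3], [1, 1, 2])
print(F(1), F(2), F(3))
```

The only difference from the unweighted case is the size of each step: an observation with weight `w` raises the curve by `w / total` instead of `1 / n`.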

But the weights may also affect the resampling probabilities. The chance that I observed a given individual affects the sampling variation associated with drawing them, while the portion of the larger population they represent affects the ECDF height; survey designs are capable of tying or splitting those values. Typically these weights are related (though they don't need to be), so you need to build in two different weight options, set a reasonable default, and explain which is which in a clear, consistent, and correct manner.

The good news (I think) is that only the ECDF height weights need to be passed down to the C++; the other weights only affect the sampling level.
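One way to picture the split between the two kinds of weights is the Python sketch below. It is only an assumption about how such an interface could look, not the package's design: `resample_with_weights`, `height_weights`, and `draw_weights` are hypothetical names. The draw weights are used only when choosing which observations to resample, and the height weights simply travel with the chosen observations so they can be handed to whatever computes the ECDFs.

```python
import random

def resample_with_weights(values, height_weights, draw_weights, k, rng):
    """Draw k observations with replacement (hypothetical helper).

    draw_weights   -- relative chance of drawing each observation
                      (affects the sampling level only)
    height_weights -- ECDF height of each observation; not used for
                      drawing, just carried along with the value
    """
    idx = rng.choices(range(len(values)), weights=draw_weights, k=k)
    new_values = [values[i] for i in idx]
    new_heights = [height_weights[i] for i in idx]
    return new_values, new_heights

rng = random.Random(0)
vals, heights = resample_with_weights(
    values=[10, 20, 30],
    height_weights=[1.0, 2.0, 3.0],
    draw_weights=[1.0, 1.0, 2.0],
    k=5,
    rng=rng,
)
print(vals, heights)
```

Under this split, a reasonable default might be to set both weight vectors to the same values, but that choice is a guess here and would need to be justified for a given survey design.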

cdowd, Jan 13 '20 23:01

Worse: it seems likely that sampling weights break the exchangeability assumption.

It is easy enough to build each ECDF. It is much harder to figure out what the null hypothesis about the ECDFs is.

E.g. suppose the samples come from two different surveys... what then?

cdowd, Jun 13 '22 22:06

But if we are talking about comparing groups within a single sample, for example male vs. female in employment data, is the exchangeability assumption then OK?

Maybe weights could be implemented, with a note to users that this only works when the weights come from the same sampling framework?
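As a rough sketch of what that could look like (in Python, with hypothetical names `weighted_ks` and `weighted_permutation_test`, and not the package's API): each observation keeps its own weight, and only the group labels are permuted. This assumes that both groups come from the same sampling framework, so that (value, weight) pairs are exchangeable under the null. Whether that assumption actually holds is exactly the open question in this thread. Weights are assumed positive.

```python
import random

def weighted_ks(x, wx, y, wy):
    """Largest absolute gap between the weighted ECDFs of x and y."""
    sx, sy = sum(wx), sum(wy)
    # Each event is (value, step in F_x, step in F_y).
    events = [(v, w / sx, 0.0) for v, w in zip(x, wx)]
    events += [(v, 0.0, w / sy) for v, w in zip(y, wy)]
    events.sort()

    fx = fy = best = 0.0
    for i, (v, dx, dy) in enumerate(events):
        fx += dx
        fy += dy
        # Compare the curves only after all ties at this value are added.
        if i + 1 == len(events) or events[i + 1][0] != v:
            best = max(best, abs(fx - fy))
    return best

def weighted_permutation_test(x, wx, y, wy, nperm=1000, seed=0):
    """Permute group labels; each weight stays attached to its value."""
    rng = random.Random(seed)
    observed = weighted_ks(x, wx, y, wy)
    pooled = list(zip(x, wx)) + list(zip(y, wy))
    n = len(x)

    hits = 0
    for _ in range(nperm):
        rng.shuffle(pooled)
        px, py = pooled[:n], pooled[n:]
        stat = weighted_ks(
            [v for v, _ in px], [w for _, w in px],
            [v for v, _ in py], [w for _, w in py],
        )
        if stat >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (nperm + 1)

stat, p = weighted_permutation_test(
    [1.0, 2.0, 3.5, 4.0], [1.0, 2.0, 1.0, 1.5],
    [2.5, 3.0, 5.0, 6.0], [1.0, 1.0, 2.0, 1.0],
    nperm=200,
)
print(stat, p)
```

If the two groups came from different surveys, the pooled pairs would no longer be exchangeable and this procedure would not be justified.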

luisvalenzuelar, Nov 02 '23 21:11