chainladder-python Formatting output for Mack tests

I have some proposals for changing the output formatting of the three Mack tests from the first part of the development tutorial:

Valuation Correlation - All Origins: Currently output as a 1x1 pandas DataFrame containing a bool, with the row and column labels being remnants of the input data structures. I propose we switch to a plain bool, which would save the trouble of extracting the value from the DataFrame before using it in things like conditional statements.

Valuation Correlation - Individual Origins: Output as a 1-row chainladder Triangle filled with bools. The row label is the earliest origin period. I propose we relabel the row with something related to the hypothesis test. I'm thinking maybe "Reject", but that doesn't exactly match the True/False nature of the output. "Effect" would be another candidate, since True/False answers the question as to whether that diagonal has a calendar period effect.

Development Correlation: Currently output as a 1x1 NumPy ndarray containing a bool. I propose we switch to a plain bool here as well.
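
To illustrate the extraction step that a plain bool would remove, here is a minimal sketch using plain pandas and NumPy. The objects `df_result` and `arr_result` and their labels are hypothetical stand-ins for the current outputs, not the actual chainladder return values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the current 1x1 outputs (labels are made up).
df_result = pd.DataFrame([[False]], index=["row"], columns=["col"])
arr_result = np.array([[True]])

# A DataFrame cannot be used directly in a condition: pandas raises ValueError.
raised = False
try:
    if df_result:
        pass
except ValueError:
    raised = True

# Today the caller has to unwrap the single value first.
val_corr = bool(df_result.iloc[0, 0])   # 1x1 DataFrame -> bool
dev_corr = bool(arr_result.item())      # 1x1 ndarray -> bool

if not val_corr and dev_corr:
    print("unwrapped values can be used in conditions")
```

With a plain bool as the return type, the `iloc` and `item()` calls would no longer be needed.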

If you think these are worthwhile, I could add them to my upcoming PR concerning the other Mack stuff.

Oct 09 '22 13:10 genedan

I agree with #1 and #4. Are you missing #2?

For #3, I don't know the paper/method well, but if I recall correctly, this is the test for calendar year effects. So the first False is the test statistic comparing the correlation between 1981 and 1982, the second False is the test statistic comparing the correlation between 1981 and 1983, and so on...

I don't think I see anything wrong with the label. Can you elaborate on what you think is wrong with the current label of 1981?

Oct 12 '22 15:10 kennethshsu

Is there an update on this? @genedan what was merged?

(Trying to keep tickets clean 🥹)

Nov 17 '22 05:11 kennethshsu

@kennethshsu

I'm still working on it. I reviewed the paper and the valuation correlation test is carried out for each diagonal, so we should just display a NaN for the missing ones.

But I did run into an issue: the result is an ndarray of bools, which can't represent the NaNs (a NaN cast to bool becomes True instead of staying missing). I'm thinking of changing the output format to a pandas DataFrame, which should be able to handle mixed types.
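
A minimal sketch of the NaN problem, using plain NumPy and pandas rather than the actual chainladder output. The values and the column labels (1982 to 1984) are made up for illustration, and the object-dtype DataFrame is only one possible way to hold mixed bool/NaN values.

```python
import numpy as np
import pandas as pd

# Hypothetical per-diagonal results: 0 = False, 1 = True, NaN = not tested.
raw = np.array([0.0, 1.0, np.nan])

# Casting to a bool ndarray loses the missing value: NaN is nonzero, so it becomes True.
as_bool = raw.astype(bool)
print(as_bool.tolist())  # [False, True, True]

# An object-dtype DataFrame can hold bools and NaN side by side.
mixed = pd.DataFrame(
    [[False, True, np.nan]],
    columns=[1982, 1983, 1984],  # made-up diagonal labels
    dtype=object,
)

# The missing diagonal is still detectable as missing.
missing_mask = mixed.isna().iloc[0].tolist()
print(missing_mask)  # [False, False, True]
```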

Nov 18 '22 01:11 genedan