gate
Partition summary for embeddings
Interesting approach for drift detection! Can you please tell me whether the partition summary for embeddings is the same as the one described at https://dm4ml.github.io/gate/how-it-works/ (listed below), or whether you take other factors into account?

- coverage: The fraction of the column that has non-null values.
- mean: The mean of the column.
- p50: The median of the column.
- num_unique_values: The number of unique values in the column.
- occurrence_ratio: The count of the most frequent value divided by the total count.
- p95: The 95th percentile of the column.
The partition summary includes the summary statistics listed above, computed separately for each dimension of the embeddings!
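To make "for each dimension" concrete, here is a rough sketch of how such a per-dimension summary could be computed. This is not gate's actual code: `summarize_embeddings` and `percentile` are hypothetical helpers written for illustration, missing values are assumed to be represented as `None`, percentiles are assumed to use linear interpolation, and "total count" in occurrence_ratio is assumed to mean the number of rows in the partition, nulls included.

```python
import math
from collections import Counter
from statistics import fmean


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 1]) of a sorted, non-empty list."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def summarize_embeddings(embeddings):
    """Return one dict of summary statistics per embedding dimension.

    `embeddings` is a list of equal-length rows; None marks a missing value
    (an assumption of this sketch, not something the library prescribes).
    """
    n_rows = len(embeddings)
    summaries = []
    # zip(*rows) transposes the partition so each tuple is one dimension.
    for dim, column in enumerate(zip(*embeddings)):
        values = sorted(v for v in column if v is not None)
        if not values:
            # Nothing to summarize for an all-null dimension.
            summaries.append({"dimension": dim, "coverage": 0.0})
            continue
        top_count = Counter(values).most_common(1)[0][1]
        summaries.append({
            "dimension": dim,
            "coverage": len(values) / n_rows,
            "mean": fmean(values),
            "p50": percentile(values, 0.50),
            "num_unique_values": len(set(values)),
            # Assumption: the denominator is the row count, nulls included.
            "occurrence_ratio": top_count / n_rows,
            "p95": percentile(values, 0.95),
        })
    return summaries


# Toy partition: 4 embeddings with 2 dimensions, one missing value.
partition = [[1.0, 0.0], [1.0, 2.0], [3.0, None], [1.0, 4.0]]
summary = summarize_embeddings(partition)
```

With real embeddings most dimensions are continuous, so num_unique_values will usually be close to the row count and occurrence_ratio close to 1 divided by the row count.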