FeatureExtraction

CVX Codes Excluded from Covariates

Open cukarthik opened this issue 5 years ago • 6 comments

We've noticed that vaccines coded with CVX codes are not included as covariates. I'm still trying to understand the code, but I'm wondering if it is due to this query or this one.

cukarthik avatar Aug 17 '20 04:08 cukarthik

@schuemie , if you think this is correct, I can try to create a PR if you'd like.

cukarthik avatar Sep 10 '20 17:09 cukarthik

The first query you referenced is for emulating the HDPS algorithm, which I hope nobody uses outside of method evaluation. The second query is indeed the one used to construct the drug covariates, but I'm not sure what the issue is.

What specifically is the problem with the vaccines? Are they not standard concepts? Are the concepts not in the Drug domain? Are they not in the Drug_era table?
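
A minimal sketch of how those three checks could be run for a single concept. It uses an in-memory SQLite database with the OMOP CDM table names `concept` and `drug_era`; every concept ID, name, and row is made up, and `diagnose` is a hypothetical helper, not part of FeatureExtraction.

```python
import sqlite3

# Toy OMOP-style tables; all concept IDs, names, and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    domain_id TEXT,
    vocabulary_id TEXT,
    standard_concept TEXT
);
CREATE TABLE drug_era (
    drug_era_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    drug_concept_id INTEGER
);
INSERT INTO concept VALUES
    (1, 'toy HPV vaccine', 'Drug', 'CVX', 'S'),
    (2, 'toy ingredient', 'Drug', 'RxNorm', 'S');
-- Assumption for illustration: only the RxNorm ingredient has an era.
INSERT INTO drug_era VALUES (10, 100, 2);
""")


def diagnose(concept_id):
    """Answer the three questions for one concept (hypothetical helper)."""
    row = conn.execute(
        """
        SELECT COALESCE(c.standard_concept = 'S', 0),
               c.domain_id = 'Drug',
               EXISTS (SELECT 1 FROM drug_era de
                       WHERE de.drug_concept_id = c.concept_id)
        FROM concept c
        WHERE c.concept_id = ?
        """,
        (concept_id,),
    ).fetchone()
    if row is None:
        raise KeyError(f"concept {concept_id} not found")
    return {
        "is_standard": bool(row[0]),
        "in_drug_domain": bool(row[1]),
        "in_drug_era": bool(row[2]),
    }


cvx_report = diagnose(1)
rxnorm_report = diagnose(2)
print(cvx_report)
print(rxnorm_report)
```

Against a real CDM the same three questions would be asked of the site's own `concept` and `drug_era` tables; the toy rows above only show what a "standard, Drug domain, but no era" answer looks like.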

schuemie avatar Oct 27 '20 05:10 schuemie

The issue is that CVX codes are standard codes but are not part of the ATC hierarchy, so, based on my understanding of the query, none of the CVX-coded vaccines in the drug table are included as covariates in feature extraction. Most of our vaccines are coded with CVX codes, and I would imagine the same is true of many EHR-based databases in the US, at least for vaccines prior to 2018.
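
To illustrate the mechanism being described (not the actual FeatureExtraction query, which is not reproduced here), the sketch below shows how an inner join from exposures to ATC ancestors silently drops any concept that has no ATC ancestor. Table names follow the OMOP CDM; the concept IDs and rows are made up, and the assumption that the CVX concept has no row in `concept_ancestor` under an ATC class is exactly the premise stated above.

```python
import sqlite3

# Toy OMOP-style tables; every ID and row is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    vocabulary_id TEXT,
    standard_concept TEXT
);
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER,
    descendant_concept_id INTEGER
);
CREATE TABLE drug_exposure (
    drug_exposure_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    drug_concept_id INTEGER
);
INSERT INTO concept VALUES
    (1, 'CVX', 'S'),     -- toy vaccine concept
    (2, 'RxNorm', 'S'),  -- toy drug concept
    (3, 'ATC', 'C');     -- toy ATC class
-- Assumption: only the RxNorm concept rolls up to an ATC class.
INSERT INTO concept_ancestor VALUES (3, 2);
INSERT INTO drug_exposure VALUES (10, 100, 1), (11, 100, 2), (12, 101, 1);
""")

# Concepts that survive an inner join to their ATC ancestors.
kept = conn.execute("""
    SELECT DISTINCT de.drug_concept_id
    FROM drug_exposure de
    JOIN concept_ancestor ca
      ON ca.descendant_concept_id = de.drug_concept_id
    JOIN concept anc
      ON anc.concept_id = ca.ancestor_concept_id
    WHERE anc.vocabulary_id = 'ATC'
    ORDER BY de.drug_concept_id
""").fetchall()

# Concepts with exposures but no ATC ancestor, i.e. dropped by the join.
dropped = conn.execute("""
    SELECT DISTINCT de.drug_concept_id
    FROM drug_exposure de
    WHERE NOT EXISTS (
        SELECT 1
        FROM concept_ancestor ca
        JOIN concept anc
          ON anc.concept_id = ca.ancestor_concept_id
        WHERE ca.descendant_concept_id = de.drug_concept_id
          AND anc.vocabulary_id = 'ATC'
    )
    ORDER BY de.drug_concept_id
""").fetchall()

print("kept:", kept)
print("dropped:", dropped)
```

Running the second query against a real CDM would list the exposed concepts that cannot reach any ATC class, which is one way to confirm whether CVX concepts are the ones falling out.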

@cgreich can correct me on CVX not being under ATC.

cukarthik avatar Oct 28 '20 14:10 cukarthik

I would like to add that vaccine data in some claims databases are not being included because of this issue.

mattspotnitz avatar Oct 28 '20 15:10 mattspotnitz

Hi @schuemie ,

We did some digging and looked at the volume of CVX codes in our databases (both claims and EHR), as shown below. We found that CVX codes are used very frequently for vaccines (for RxNorm concepts, we searched for the word "vaccine"; thanks @aostropolets). Given how frequent CVX codes are in these databases, I would expect them to show up as covariates in the propensity model; however, in one of our studies HPV vaccines are not showing up as a covariate when we would expect them to. The only other place I see CVX codes potentially excluded is here, but I'm not sure whether that is used in the propensity model. Anyway, if you can point me to the query that needs to be adjusted, I can make the change and create a pull request, assuming it's a SQL issue.

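| record_count | vocabulary_id | database |
|-------------:|---------------|----------|
| 14,342,170 | RxNorm | CCAE |
| 165,986,759 | CVX | CCAE |
| 4,155,058 | RxNorm | MDCD |
| 67,567,343 | CVX | MDCD |
| 1,217,238 | RxNorm | MDCR |
| 8,339,518 | CVX | MDCR |
| 402,354 | RxNorm | 2018q4 |
| 3,297,009 | CVX | 2018q4 |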
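
Per-vocabulary counts like those in the table could be produced with a query along the following lines. This is a hedged sketch, not the query that was actually run: it assumes the counts come from `drug_exposure` joined to `concept`, counts all CVX records, and restricts RxNorm to concepts whose name contains "vaccine". All rows and concept IDs are toy data.

```python
import sqlite3

# Toy OMOP-style tables; rows and IDs are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    vocabulary_id TEXT
);
CREATE TABLE drug_exposure (
    drug_exposure_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    drug_concept_id INTEGER
);
INSERT INTO concept VALUES
    (1, 'toy HPV vaccine', 'CVX'),
    (2, 'toy vaccine product', 'RxNorm'),
    (3, 'toy aspirin tablet', 'RxNorm');
INSERT INTO drug_exposure VALUES
    (10, 100, 1), (11, 101, 1), (12, 102, 1),
    (13, 100, 2),
    (14, 100, 3);
""")

# Count exposure records per vocabulary: every CVX record, plus RxNorm
# records whose concept name mentions "vaccine" (assumed filter).
vaccine_counts = conn.execute("""
    SELECT c.vocabulary_id, COUNT(*) AS record_count
    FROM drug_exposure de
    JOIN concept c
      ON c.concept_id = de.drug_concept_id
    WHERE c.vocabulary_id = 'CVX'
       OR (c.vocabulary_id = 'RxNorm'
           AND LOWER(c.concept_name) LIKE '%vaccine%')
    GROUP BY c.vocabulary_id
    ORDER BY c.vocabulary_id
""").fetchall()

for vocabulary_id, record_count in vaccine_counts:
    print(f"{record_count:>12,} {vocabulary_id}")
```

Running the same query once per database and adding a database label would give the three-column layout used in the table.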

@pbr6cornell @aostropolets

cukarthik avatar Nov 10 '20 21:11 cukarthik

We're still working on the CVX hierarchy. We expect that all vaccines will roll up to some ATC code (through RxNorm or via a direct hierarchical relationship to ATC), so these ATC or RxNorm concepts would be treated as features.

dimshitc avatar Jan 20 '21 11:01 dimshitc