Drop table SOURCE_TO_CONCEPT_MAP
This table is a remnant of OMOP CDM Version 3, when we didn't have the notion of source concepts. All sources were codes. That created havoc, because many vocabularies had both source-only and standard content, and some standard content is also used in the source. In addition, the names of the source codes were not available anywhere. Finally, searching for source codes could only be done in the STCM table, which meant we had to add a ton of records without any mapping. So, it was a mess.
The solution was to have all codes as concepts and indicate their nature in the standard_concept field. That system, together with the "Maps to" CONCEPT_RELATIONSHIP records, is the comprehensive solution to the problem. STCM is now completely redundant and was only left in for convenience, so that not everybody had to change their mechanism for managing their mappings.
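To make the C/CR mechanism described above concrete, here is a minimal sketch of resolving a source code to its standard concept through a "Maps to" record. It uses an in-memory SQLite database and only a subset of the real CONCEPT and CONCEPT_RELATIONSHIP columns. The `MY_HOSPITAL` vocabulary, the `DM2` code, the helper names, and the concept IDs are illustrative assumptions, not content from an actual vocabulary release.

```python
import sqlite3


def build_demo_vocab() -> sqlite3.Connection:
    """Create a tiny in-memory vocabulary (column subset, illustrative rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE concept (
            concept_id       INTEGER PRIMARY KEY,
            concept_name     TEXT,
            vocabulary_id    TEXT,
            concept_code     TEXT,
            standard_concept TEXT      -- 'S' for standard, NULL for source-only
        );
        CREATE TABLE concept_relationship (
            concept_id_1    INTEGER,
            concept_id_2    INTEGER,
            relationship_id TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?, ?, ?, ?)",
        [
            # Hypothetical local source concept (not standard).
            (2000000001, "Diabetes type 2 (local code)", "MY_HOSPITAL", "DM2", None),
            # Illustrative standard concept.
            (201826, "Type 2 diabetes mellitus", "SNOMED", "44054006", "S"),
        ],
    )
    conn.execute(
        "INSERT INTO concept_relationship VALUES (?, ?, ?)",
        (2000000001, 201826, "Maps to"),
    )
    return conn


def map_source_code(conn: sqlite3.Connection, vocabulary_id: str, concept_code: str):
    """Return (source_id, source_name, standard_id, standard_name) rows.

    The source concept is found by vocabulary and code; the standard
    concept is reached through 'Maps to'. The standard columns are NULL
    when the source concept exists but has no standard mapping.
    """
    return conn.execute(
        """
        SELECT src.concept_id, src.concept_name, std.concept_id, std.concept_name
        FROM concept AS src
        LEFT JOIN concept_relationship AS cr
               ON cr.concept_id_1 = src.concept_id
              AND cr.relationship_id = 'Maps to'
        LEFT JOIN concept AS std
               ON std.concept_id = cr.concept_id_2
              AND std.standard_concept = 'S'
        WHERE src.vocabulary_id = ? AND src.concept_code = ?
        """,
        (vocabulary_id, concept_code),
    ).fetchall()


if __name__ == "__main__":
    demo = build_demo_vocab()
    print(map_source_code(demo, "MY_HOSPITAL", "DM2"))
```

Because the source code is itself a row in CONCEPT, its name is searchable even when no mapping exists, which is the gap the STCM could only fill with unmapped records.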
That was what? 7 years ago? It's time to move on. Codes in STCM are essentially hidden from any activity except ETLing, which could easily pivot to CONCEPT_RELATIONSHIP. ATLAS doesn't use it, and no other analytic uses it for any use case.
Agreed, this would be a step forward. There is no reason to have this as a standard table. Local teams can still choose to use a table like the STCM as a mapping table during the ETL (but we should discourage that).
Note that this also affects Usagi, where the STCM is the default export format (which is also not ideal). Linking relevant issue: https://github.com/OHDSI/Usagi/issues/132
And for reference, Melanie has provided a nice comparison of the methods, and why using concept_relationship is overall better: https://github.com/OHDSI/Usagi/issues/132.
Andrew: This table is used quite widely across the community, not just for manual mappings. It is also used for staging and other things. Potentially it could be separated from the other tables into a "utility" or "reference" schema. We could potentially make the same designation for cohort and cohort_definition in the CDM schema as well.
I do not think we should remove the STCM in the upcoming minor version as we know it is widely used in the community.
We need to know how people are using it. We should be aware of that before we propose a change.
We can create schemas or categories for working tables, results tables, evidence tables, ETL tables to describe these tables.
From Roger's 2024 OHDSI poster: The STCM still remains in wide use for several reasons:
- The Book of OHDSI recommends its use.
- It is relatively simple to implement.
- USAGI works with the STCM format.
- Local mapping can be maintained by a separate team unfamiliar with OMOP.
- There is no standard or recommended way to maintain the C/CR method.
- Moving from the STCM to C/CR can involve a lot of ETL code modification.
There are several reasons to stop using STCM:
- Codes mapped in STCM are not visible in ATLAS and other standard OHDSI tools.
- STCM is not flexible enough to map the more subtle relationships available with C/CR like “Maps To Value”.
- Hierarchies are not supported using STCM.
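As a rough illustration of what moving from the STCM to C/CR involves, the sketch below turns STCM-style rows into source concept rows plus "Maps to" relationship rows. It is a sketch under stated assumptions, not a recommended maintenance method: `StcmRow`, `stcm_to_c_cr`, and the sample rows are hypothetical. It assumes local concepts get IDs above 2,000,000,000, treats a `target_concept_id` of 0 as "unmapped", and fills only a subset of the CONCEPT and CONCEPT_RELATIONSHIP columns (no domain, concept class, or validity dates).

```python
from dataclasses import dataclass
from itertools import count

# Assumption: local concept IDs start just above 2 billion.
LOCAL_CONCEPT_ID_START = 2_000_000_001


@dataclass(frozen=True)
class StcmRow:
    """Subset of SOURCE_TO_CONCEPT_MAP columns used by this sketch."""
    source_code: str
    source_vocabulary_id: str
    source_code_description: str
    target_concept_id: int  # 0 is treated here as "no mapping"


def stcm_to_c_cr(stcm_rows, first_id=LOCAL_CONCEPT_ID_START):
    """Split STCM rows into (concept rows, concept_relationship rows).

    Each distinct (vocabulary, code) pair becomes one non-standard source
    concept. Each row with a non-zero target becomes one 'Maps to' record.
    """
    next_id = count(first_id)
    source_ids = {}      # (vocabulary_id, code) -> assigned concept_id
    concepts = []
    relationships = []
    for row in stcm_rows:
        key = (row.source_vocabulary_id, row.source_code)
        if key not in source_ids:
            source_ids[key] = next(next_id)
            concepts.append({
                "concept_id": source_ids[key],
                "concept_name": row.source_code_description,
                "vocabulary_id": row.source_vocabulary_id,
                "concept_code": row.source_code,
                "standard_concept": None,  # source concept, not standard
            })
        if row.target_concept_id != 0:
            relationships.append({
                "concept_id_1": source_ids[key],
                "concept_id_2": row.target_concept_id,
                "relationship_id": "Maps to",
            })
    return concepts, relationships


if __name__ == "__main__":
    rows = [
        StcmRow("DM2", "MY_HOSPITAL", "Diabetes type 2", 201826),
        StcmRow("XYZ", "MY_HOSPITAL", "Unmapped local code", 0),
    ]
    new_concepts, new_relationships = stcm_to_c_cr(rows)
    print(new_concepts)
    print(new_relationships)
```

An unmapped code still yields a concept row, so it stays searchable by name without needing a dummy mapping record. Relationships such as "Maps to value" would need an extra column in the input that the STCM structure does not have.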
BOTTOM LINE
- Keep STCM in v5.x
- Add the WIDE_MAPPING table to v5.x as alternative
- Create categories of tables like "working" or "ETL support" tables to encompass STCM, CDM.cohort and cohort_definition, WIDE_MAPPING (needs discussion)
- Update Usagi to output in the WIDE_MAPPING structure
- Create code to take data from STCM structure to WIDE_MAPPING structure
- More documentation around how to use these tables, reference implementations
- More documentation around the process to update the CDM to a minor and major version