GeneLab_Data_Processing

[Microarray Affymetrix] Probe IDs for AFFY HTA 2 0 probes have extra .1 suffix

Open cyouh95 opened this issue 1 year ago • 0 comments

Description

For one dataset using AFFY HTA 2 0 probes, the probe IDs carry a .hg.1 suffix, whereas the biomaRt data uses a .hg suffix, so the two cannot be merged. This results in the following error:

Error in `dplyr::mutate()`:
  ! Problem while computing `..1 = dplyr::across(affy_hta_2_0,
    as.character)`.
  Caused by error in `across()`:
  ! Can't subset columns that don't exist.
  x Column `affy_hta_2_0` doesn't exist.
  Backtrace:
    1. ... %>% ...
    6. dplyr:::mutate.data.frame(...)
    7. dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env())
    9. dplyr:::expand_across(dots[[i]])
   10. dplyr:::across_setup(...)

It is not clear whether this affects all AFFY HTA 2 0 datasets, but it does appear to be a known issue, as reported here.

Solution

Remove the extra .1 suffix from the probe IDs in the raw data.
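
As a rough illustration of the fix described above, the trailing .1 could be stripped with a regular expression before merging with the biomaRt annotations. This is only a sketch: `strip_probe_suffix` is a hypothetical helper name, the example IDs are made up for illustration, and it assumes the extra suffix always appears as `.hg.1` at the end of the ID.

```r
# Hypothetical helper (not part of the pipeline): turn a trailing ".hg.1"
# into ".hg" so probe IDs match the form used in the biomaRt data.
strip_probe_suffix <- function(probe_ids) {
  # Anchor at the end of the string so only the final ".1" after ".hg" is removed
  sub("\\.hg\\.1$", ".hg", probe_ids)
}

# Illustrative IDs only; the last one already has the expected form
raw_ids <- c("TC01000001.hg.1", "TC01000002.hg.1", "TC01000003.hg")
clean_ids <- strip_probe_suffix(raw_ids)
print(clean_ids)
```

Anchoring the pattern on `.hg.1` rather than on any `.1` keeps IDs that legitimately end in other digits untouched, and IDs that already end in `.hg` pass through unchanged.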

cyouh95 • Jul 29 '24 20:07