methylKit: how to assign the treatment variable when there is more than one experimental group

For the 'myobj' object created with the 'methRead' function: how should we identify different experimental groups in the 'treatment' variable? Should the numbers 0 (controls), 1 (exp condition 1), 2 (exp condition 2), 3 (exp condition 3), etc. be used? Or should all experimental groups, even if they are independent of each other, be designated by a "1"? I have one control group with 3 to 6 biological replicates, and 3 different experimental groups that are independent of each other (each experimental group has 3-6 biological replicates). All experimental groups will be compared to the control group. So far, I've been assigning all experimental groups (and their respective biological replicates) as "1" and using 'sample.id' to distinguish the different experimental conditions. Then I've been independently running pairwise comparisons of each experimental group vs. the control group for differentially methylated regions. Does this sound correct? Or should I be doing this differently? Thanks!

Oct 16 '25 19:10 ajneville

Hi @ajneville,

I usually use a different treatment number for each condition, but this is only relevant if you want to keep all samples in the same object. For comparing multiple groups against a single control, we follow a workflow similar to the one you describe. If you are storing all samples in one object, you can use the reorganize function to subset to the current group of interest (plus controls) and then proceed with your analysis.
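The subsetting step can be sketched in plain Python (not methylKit itself) to show the bookkeeping involved: keep the control samples plus one experimental group, and recode that group as 1 for a two-group comparison. The function name `subset_for_comparison` and the sample IDs are hypothetical; in methylKit the equivalent step would be done with `reorganize`, passing the selected sample IDs and the new treatment vector.

```python
def subset_for_comparison(sample_ids, treatment, group, control=0):
    """Keep control samples plus one experimental group, recoded as 0/1."""
    if len(sample_ids) != len(treatment):
        raise ValueError("sample_ids and treatment must have the same length")
    kept_ids, kept_treatment = [], []
    for sid, trt in zip(sample_ids, treatment):
        if trt == control:
            kept_ids.append(sid)
            kept_treatment.append(0)  # controls stay 0
        elif trt == group:
            kept_ids.append(sid)
            kept_treatment.append(1)  # the selected group becomes 1
    return kept_ids, kept_treatment


# Hypothetical design: 3 controls and 3 experimental groups (A, B, C), 3 replicates each.
sample_ids = [
    "control-1", "control-2", "control-3",
    "A-1", "A-2", "A-3",
    "B-1", "B-2", "B-3",
    "C-1", "C-2", "C-3",
]
treatment = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]

# Select controls plus group B (treatment code 2) for one pairwise comparison.
ids_b, trt_b = subset_for_comparison(sample_ids, treatment, group=2)
print(ids_b)
print(trt_b)
```

Running the same selection once per experimental group gives three independent control-vs-group comparisons from a single sample table.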

Best, Alex

Oct 20 '25 12:10 alexg9010

@alexg9010, thanks for the reply! I am curious whether the treatment coding has any effect on the unite function or on downstream analyses, especially considering how it may interact with parameters like min.per.group:

Is the control group (treatment = 0) viewed as a single group containing its 3 samples/bioreplicates (with their respective sample IDs, i.e., control-1, control-2, control-3), while all of the treatment groups (although very different) are "technically" viewed/analyzed as ONE single "group" and not independently? For example:
with 3 different treatment groups (consisting of 3+ biological reps each), is it looking at 3 bioreps x 3 exp treatments = 9 "treated" samples and applying min.per.group = 3 (if set to 3) across those 9 total samples (of which 6 are unrelated), even though the treatment groups, each consisting of 3 bioreps, are very much different and independent, with no expected correlation to each other? Maybe I am misunderstanding whether min.per.group applies to what the treatment parameter is designated as, or to the group or sample ID parameters in the unite function.
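The concern can be illustrated with a toy example in plain Python rather than methylKit. It assumes that a per-group minimum such as `min.per.group` counts covered samples within each group defined by the treatment vector; that assumption should be checked against the `unite` documentation. The function `passes_min_per_group` and the coverage pattern are hypothetical.

```python
from collections import Counter


def passes_min_per_group(covered, treatment, min_per_group):
    """True if every treatment group has at least min_per_group covered samples."""
    covered_counts = Counter()
    for is_covered, trt in zip(covered, treatment):
        covered_counts[trt] += int(is_covered)
    return all(covered_counts[g] >= min_per_group for g in set(treatment))


# One hypothetical site: covered in all 3 controls and all 3 replicates of
# treatment A, but in none of the replicates of treatments B and C.
covered = [True] * 6 + [False] * 6

lumped = [0, 0, 0] + [1] * 9                        # every experimental sample coded 1
separate = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]     # one code per experimental group

print(passes_min_per_group(covered, lumped, 3))     # True: 3 of the 9 "treated" samples suffice
print(passes_min_per_group(covered, separate, 3))   # False: groups 2 and 3 have no coverage
```

Under the lumped coding the site is retained even though two of the three experimental groups have no coverage at all, which is the situation the question describes.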

So you would actually recommend splitting the different treatment groups and listing them as:

0 = control group (consisting of 3 bio-reps/samples)
1 = exp treatment A group (with 3 bioreps/samples)
2 = exp treatment B group (with 3 bioreps/samples)
3 = exp treatment C group (with 3 bioreps/samples).

Then you would use the reorganize function, as you stated?

I am just curious whether, by not doing what you proposed and instead doing what I stated in my initial post, I could be running a faulty analysis, mostly with respect to the min.per.group parameter, as well as normalization.

Perhaps I mis-phrased the question, and it's really a question of whether designating all groups other than controls as "1" = "treated" affects the statistical model and therefore the DMRs/DMBs. Sorry if that is confusing; please feel free to ask for clarification if needed. Thanks for your time and help!

Best, Andrew

Oct 30 '25 02:10 ajneville