
Long formats for conditions and experiments (timecourses)

Open dilpath opened this issue 1 year ago • 3 comments

Follow-up to https://github.com/PEtab-dev/PEtab/issues/585 (no need to read that).

Here are specs for long formats of the conditions and experiments (timecourses) tables. Additional feedback is very welcome!

Conditions table

conditionId  inputId                  inputType                   inputValue
PETAB_ID     NON_ESTIMATED_ENTITY_ID  constant OR initial OR ...  PETAB_MATH
e.g.
cond1        rate1                    constant                    1
cond2        species1                 initial                     species1 + 5

Row and column ordering are arbitrary, although using the above column ordering may improve human readability.

Additional columns are allowed, for example, to specify a human-friendly name for the condition.

Other optional columns we could officially support include conditionName, but this might mean duplicating the same condition name across all rows with that condition ID...

Detailed field description

  • conditionId [PETAB_ID, REQUIRED] Unique identifier for the simulation/experimental condition, to be used in the experiments table.
  • inputId [NON_ESTIMATED_ENTITY_ID, REQUIRED] An entity that will be changed in this condition.
  • inputType [constant OR initial OR ..., REQUIRED] How the value inputValue changes the entity inputId.
    • constant The entity inputId is fixed to the value inputValue. The entity must be static in time while the condition is active, e.g. a model parameter.
    • initial The entity inputId is initialized to the value inputValue. The entity must be dynamic and defined in terms of time-derivative information, e.g. a model species involved in some reaction or specified by an ordinary differential equation.
    • rate/assignment/relativeRate/relativeAssignment These are currently not supported, until a tool implements them. However, they are reserved to mean changes equivalent to setting a new SBML rateRule or assignmentRule for the entity. relative indicates relative changes to pre-existing rates or assignments. edit: These can only be applied to entities (inputId) that are already governed by these kinds of dynamics. i.e. rate can only apply to entities that already have a rate rule in the original model. assignment/relativeAssignment can only apply to entities that already have an assignment rule in the original model. relativeRate can only apply to entities that already have either a rate rule or reactions.
  • inputValue [PETAB_MATH, REQUIRED] The value that will be used to change the entity inputId. If a PEtab math expression involves time-dependent entities, then they represent their values at the simulation time when the condition is activated (edit: or active, for time-varying inputTypes like rate), as defined in the experiments table.
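The field rules above could be checked with something like the following minimal sketch. It is not part of any PEtab library: the function name and the identifier pattern are assumptions made for illustration, and the reserved input types (rate, assignment, ...) are simply rejected because they are described as unsupported for now.

```python
import re

REQUIRED_COLUMNS = ("conditionId", "inputId", "inputType", "inputValue")
# Only the two types described as supported; the reserved ones are rejected.
SUPPORTED_INPUT_TYPES = {"constant", "initial"}
# Assumption: an ID is letters, digits and underscores, not starting with a digit.
PETAB_ID = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _is_blank(value):
    return value is None or str(value).strip() == ""


def validate_condition_rows(rows):
    """Return a list of human-readable problems found in long-format condition rows."""
    problems = []
    for index, row in enumerate(rows):
        missing = [col for col in REQUIRED_COLUMNS if _is_blank(row.get(col))]
        if missing:
            problems.append(f"row {index}: missing {', '.join(missing)}")
            continue
        if not PETAB_ID.match(row["conditionId"]):
            problems.append(f"row {index}: invalid conditionId {row['conditionId']!r}")
        if row["inputType"] not in SUPPORTED_INPUT_TYPES:
            problems.append(f"row {index}: unsupported inputType {row['inputType']!r}")
    return problems
```

The sketch does not parse inputValue as PEtab math, and it does not check that inputId is a non-estimated entity, since both need the model and the parameters table.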

Experiments table

experimentId  time             conditionId
PETAB_ID      NUMERIC OR -inf  conditionId
e.g.
timecourse1   -inf             cond1
timecourse1   0                cond2

Row and column ordering are arbitrary, although using the above column ordering may improve human readability.

Additional columns are allowed, for example, to specify a human-friendly name for the experiment.

Detailed field description

  • experimentId [PETAB_ID, REQUIRED] Unique identifier for the experiment, to be used in the measurements table.
  • time [NUMERIC OR -inf, REQUIRED] The time when the condition will become active, in the time unit specified in the model. -inf indicates pre-equilibration (e.g. for drug treatments, the model would be pre-equilibrated with the no-drug condition).
  • conditionId [conditionId, REQUIRED] A conditionId from the conditions table.
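Because row ordering is arbitrary, a tool would probably group and sort the rows before simulating. A minimal sketch of that step, with a hypothetical function name and rows represented as plain dictionaries:

```python
from collections import defaultdict


def build_periods(experiment_rows):
    """Map each experimentId to a time-sorted list of (start_time, [conditionIds])."""
    by_experiment = defaultdict(lambda: defaultdict(list))
    for row in experiment_rows:
        start = float(row["time"])  # float("-inf") parses the pre-equilibration marker
        by_experiment[row["experimentId"]][start].append(row["conditionId"])
    return {
        experiment_id: sorted(periods.items())
        for experiment_id, periods in by_experiment.items()
    }


rows = [
    {"experimentId": "timecourse1", "time": "0", "conditionId": "cond2"},
    {"experimentId": "timecourse1", "time": "-inf", "conditionId": "cond1"},
]
periods = build_periods(rows)
# periods["timecourse1"] lists the -inf (pre-equilibration) period first, then t=0.
```

Since -inf compares lower than every finite time, pre-equilibration naturally ends up as the first period without special handling.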

Measurements table

Only the required or changed columns are included here (other optional columns, e.g. noiseFormula, are still supported but irrelevant to this discussion).

observableId  [experimentId]  time            measurement
observableId  [experimentId]  NUMERIC OR inf  NUMERIC
e.g.
obs1          experiment1     5               2

Detailed field description

observableId and measurement are unchanged.

  • experimentId [experimentId, OPTIONAL] An experimentId from the experiments table. This replaces the preequilibrationConditionId and simulationConditionId in PEtab v1. If unspecified, then the simulation will be performed with the default parameters in the model.
  • time [NUMERIC OR inf, REQUIRED] Time point of the measurement in the time unit specified in the SBML model. inf (lower-case) indicates steady-state measurements. Cannot be lower than the lowest finite time in the experiments table.
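The constraint on measurement times could be checked roughly as follows. This is a sketch under one assumption that the text above leaves open: "the lowest finite time in the experiments table" is interpreted per experiment, i.e. only rows with the same experimentId are considered. The function names are hypothetical.

```python
import math


def earliest_finite_start(experiment_rows, experiment_id):
    """Lowest finite period start time for one experiment, or None if there is none."""
    finite = [
        float(row["time"])
        for row in experiment_rows
        if row["experimentId"] == experiment_id and math.isfinite(float(row["time"]))
    ]
    return min(finite) if finite else None


def measurement_time_is_valid(measurement_row, experiment_rows):
    """Check the 'not lower than the lowest finite time' rule for one measurement."""
    time = float(measurement_row["time"])  # "inf" marks a steady-state measurement
    if math.isnan(time) or time == -math.inf:
        return False
    experiment_id = measurement_row.get("experimentId")
    if not experiment_id:
        return True  # no experiment: simulated with the model's default parameters
    start = earliest_finite_start(experiment_rows, experiment_id)
    return start is None or time >= start
```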

Example

Conditions table

conditionId  inputId   inputValue  inputType  units
cond1        rate1     0           constant   mg/s
cond1        rate2     1           constant   m/s
cond2        species1  0           initial    mol
preeq_cond1  rate1     1           constant   g/s
switch_on    switch    1           constant   dimensionless
switch_off   switch    0           constant   dimensionless

Experiments table

experimentId     time  conditionId
timecourse1      -inf  preeq_cond1
timecourse1      0     cond1
timecourse1      10    cond2
experiment1      -5    cond1
experiment1      -5    cond2
switch_sequence  0     switch_on
switch_sequence  1     switch_off
switch_sequence  2     switch_on
switch_sequence  3     switch_off
switch_sequence  4     switch_on
switch_sequence  5     switch_off

timecourse1 has a PEtab v1 preequilibrationConditionId (preeq_cond1), a PEtab v1 simulationConditionId (cond1), and then a 3rd timecourse period at t=10 with condition cond2.

experiment1 is not a timecourse, rather a single-condition simulation starting at t=-5 where two conditions are applied simultaneously.

switch_sequence is a repeating timecourse, equivalent to a nested timecourse (see https://github.com/PEtab-dev/PEtab/issues/585).
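To make the reading of the example concrete, the two tables can be joined so that each period lists the changes it applies. The sketch below uses a subset of the example rows (without the units column and the switch conditions); the function name is hypothetical and the tuple layout is just a convenience for this illustration.

```python
# (conditionId, inputId, inputValue, inputType), copied from the example above
conditions = [
    ("cond1", "rate1", "0", "constant"),
    ("cond1", "rate2", "1", "constant"),
    ("cond2", "species1", "0", "initial"),
    ("preeq_cond1", "rate1", "1", "constant"),
]
# (experimentId, time, conditionId), copied from the example above
experiments = [
    ("timecourse1", "-inf", "preeq_cond1"),
    ("timecourse1", "0", "cond1"),
    ("timecourse1", "10", "cond2"),
    ("experiment1", "-5", "cond1"),
    ("experiment1", "-5", "cond2"),
]


def changes_by_period(experiment_id, conditions, experiments):
    """Return {start_time: [(inputId, inputType, inputValue), ...]} sorted by time."""
    result = {}
    for exp_id, time, condition_id in experiments:
        if exp_id != experiment_id:
            continue
        changes = result.setdefault(float(time), [])
        for cond_id, input_id, input_value, input_type in conditions:
            if cond_id == condition_id:
                changes.append((input_id, input_type, input_value))
    return dict(sorted(result.items()))
```

For experiment1 this yields a single period at t=-5 containing the changes of both cond1 and cond2, while timecourse1 yields three periods starting at -inf, 0 and 10.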

Open points

  1. There is currently some undefined behavior in the conditions table -- do we clarify that now or in a future PEtab v2.1 when the use cases are clearer? For example, what happens when a user specifies a parameter in the conditions table with inputType=constant, but then an SBML event affects the same parameter? We could simply disallow this for now.
  2. How are simultaneous conditions handled (e.g. experiment1 in the example)? We could decide that they are only allowed if they change different entities. Otherwise we would need to care about some ordering.
  3. I'm happy to change naming, e.g. inputId->targetId, or experimentId->timecourseId. Let me know what you prefer. experimentId was chosen because most users won't care about timecourses, but then would still need to use that table for their single-condition "timecourses".
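If open point 2 were resolved by only allowing simultaneous conditions that change different entities, a validator might look roughly like this. It is a sketch of that one option, not a decided rule; the function name and the dictionary-based row representation are assumptions.

```python
from collections import defaultdict


def find_simultaneous_conflicts(condition_rows, experiment_rows):
    """Return (experimentId, time, inputId, conditionIds) for entities set more than once."""
    targets = defaultdict(set)  # conditionId -> entities it changes
    for row in condition_rows:
        targets[row["conditionId"]].add(row["inputId"])

    writers = defaultdict(list)  # (experimentId, time, inputId) -> conditionIds
    for row in experiment_rows:
        for input_id in targets[row["conditionId"]]:
            key = (row["experimentId"], float(row["time"]), input_id)
            writers[key].append(row["conditionId"])

    return sorted(
        (exp, time, input_id, tuple(sorted(conds)))
        for (exp, time, input_id), conds in writers.items()
        if len(conds) > 1
    )
```

With the example tables, experiment1 passes this check because cond1 changes rate1 and rate2 while cond2 changes species1.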

dilpath avatar Jul 17 '24 15:07 dilpath

Thanks a lot Dilan for accommodating the suggestions from #585. 🙏 Looks good to me for the most part, but I have two questions:

rate/assignment/relativeRate/relativeAssignment These are currently not supported, until a tool implements them. However, they are reserved to mean changes equivalent to setting a new SBML rateRule or assignmentRule for the entity. relative indicates relative changes to pre-existing rates/reactions or assignments.

Can you give examples for the use of these? Cause to me, only the rate makes intuitive sense. I.e. rate means that the model is really perturbed. assignment would just mean that the current way of calculating the value of an SBML species/parameter/compartment is overridden with an assignment rule, right? But that would violate the SBML specs, which state "an assignment rule cannot be defined for a species that is created or destroyed in a reaction unless that species is defined as a boundary condition in the model." Not that PEtab has to stick to the SBML specs here, but still.

inputValue [PETAB_MATH, REQUIRED] The value that will be used to change the entity inputId. If a PEtab math expression involves time-dependent entities, then they represent their values at the simulation time when the condition is active, as defined in the experiments table.

Do you mean "is activated" instead of "is active"?

paulflang avatar Jul 17 '24 19:07 paulflang

Thanks a lot Dilan for accommodating the suggestions from #585. 🙏 Looks good to me for the most part, but I have two questions:

Sure! Thanks for the feedback.

rate/assignment/relativeRate/relativeAssignment These are currently not supported, until a tool implements them. However, they are reserved to mean changes equivalent to setting a new SBML rateRule or assignmentRule for the entity. relative indicates relative changes to pre-existing rates/reactions or assignments.

Can you give examples for the use of these? Cause to me, only the rate makes intuitive sense. I.e. rate means that the model is really perturbed.

Given an inputId=species1 and inputValue=5 and original rate rule for species1: d(species1)/dt = 2*species1

  • inputType=rate means d(species1)/dt = 5, i.e. simply change the whole dynamic of the species
  • inputType=relativeRate means d(species1)/dt = 2*species1 + 5, i.e. a modification of the original rate is made, which is somewhat like adding a new reaction for this species to the system

Given an original assignment rule for species1: species1(t) = 2*t + k1

  • inputType=assignment means species1(t) = 5
  • inputType=relativeAssignment means species1(t) = 2*t + k1 + 5
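The four cases above can be written down as a small sketch in which a rule is a Python function of (species1, t, k1). This only illustrates the intended semantics with a constant inputValue; these input types are reserved and not implemented by any tool, and the helper name is hypothetical.

```python
def apply_to_rule(input_type, original_rule, input_value):
    """Return the right-hand side that replaces original_rule under input_type.

    original_rule takes (species1, t, k1); input_value is a constant here.
    """
    if input_type in ("rate", "assignment"):
        # The original rule is discarded entirely.
        return lambda species1, t, k1: input_value
    if input_type in ("relativeRate", "relativeAssignment"):
        # The original rule is kept and input_value is added to it.
        return lambda species1, t, k1: original_rule(species1, t, k1) + input_value
    raise ValueError(f"unsupported inputType: {input_type}")


def rate_rule(species1, t, k1):
    return 2 * species1  # d(species1)/dt = 2*species1


def assignment_rule(species1, t, k1):
    return 2 * t + k1  # species1(t) = 2*t + k1
```

For example, with species1 = 3, apply_to_rule("relativeRate", rate_rule, 5) evaluates to 2*3 + 5 = 11, whereas apply_to_rule("rate", rate_rule, 5) always evaluates to 5.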

I was trying to capture all possibilities, including the "relative" and "isDelta" changes discussed in https://github.com/PEtab-dev/PEtab/issues/564 and https://github.com/PEtab-dev/PEtab/issues/585, and the "bolus" vs. "infusion" that you implemented in PumasQSP [1] via the duration column in that dosing table.

assignment would just mean that the current way of calculating the value of an SBML species/parameter/compartment is overridden with an assignment rule, right? But that would violate the SBML specs, which state "an assignment rule cannot be defined for a species that is created or destroyed in a reaction unless that species is defined as a boundary condition in the model." Not that PEtab has to stick to the SBML specs here, but still.

I agree, I'm not sure how to resolve this best. This is one reason why I limited constant to things that are already "constant" in the model, like parameters, and initial to things that are specified by time-derivative information. Similarly, I would limit rate/(relative)Assignment to things that are already defined by rate/assignment rules in the model. However, I think relativeRate can be interpreted as a new reaction for a species, so could apply to species defined by either reactions or rate rules. I clarified this with a bold edit in the first message now.

inputValue [PETAB_MATH, REQUIRED] The value that will be used to change the entity inputId. If a PEtab math expression involves time-dependent entities, then they represent their values at the simulation time when the condition is active, as defined in the experiments table.

Do you mean "is activated" instead of "is active"?

For the inputTypes constant and initial, this is equivalent to "is activated". But for e.g. rate, then "is active" is more accurate, since it will be evaluated "continuously" during the timecourse period with this condition. I can see this is confusing though... I clarified it with a bold edit in the first message now.

[1] https://help.juliahub.com/pumasqsp/stable/tutorials/petabimport_tutorial/#Detailed-field-description

dilpath avatar Jul 17 '24 23:07 dilpath

  • inputType [constant OR initial OR ..., REQUIRED] How the value inputValue changes the entity inputId.

    • constant The entity inputId is fixed to the value inputValue. The entity must be static in time while the condition is active, e.g. a model parameter.
    • initial The entity inputId is initialized to the value inputValue. The entity must be dynamic and defined in terms of time-derivative information, e.g. a model species involved in some reaction or specified by an ordinary differential equation.

If constant is only allowed for entities that are already constant, this could as well be replaced by initial, right? This would be coherent with initialAssignments in SBML.

  • rate/assignment/relativeRate/relativeAssignment These are currently not supported

Then I'd leave them out for now.

  • inputValue [PETAB_MATH, REQUIRED] The value that will be used to change the entity inputId. If a PEtab math expression involves time-dependent entities, then they represent their values at the simulation time when the condition is activated (edit: or active, for time-varying inputTypes like rate), as defined in the experiments table.

Related to previous discussions, we could introduce a priority column or something, which would potentially make simultaneous compartment size and concentration changes more intuitive. The interpretation of the entity symbol would be different then.

  • There is currently some undefined behavior in the conditions table -- do we clarify that now or in a future PEtab v2.1 when the use cases are clearer? For example, what happens when a user specifies a parameter in the conditions table with inputType=constant, but then an SBML event affects the same parameter? We could simply disallow this for now.

This could be clarified by replacing constant by initial, unless one really wants to disable events affecting a certain entity. Do we need that?

  • How are simultaneous conditions handled (e.g. experiment1 in the example)? We could decide that they are only allowed if they change different entities. Otherwise we would need to care about some ordering.

I'd go for specifying some priority as suggested for the conditions table.

  • I'm happy to change naming, e.g. inputId->targetId, or experimentId->timecourseId. Let me know what you prefer. experimentId was chosen because most users won't care about timecourses, but then would still need to use that table for their single-condition "timecourses".

experimentId is good. Slight preference for targetId, targetValue over input*.

dweindl avatar Aug 02 '24 07:08 dweindl

Brief update from some further discussions with @dilpath:

For the condition table, more appropriate column names might be

  • operationType instead of inputType, the respective values could be setRate, setAssignment, addToRate, ...
  • targetId instead of inputId
  • targetValue instead of inputValue

I would suggest to consolidate inputType=constant and inputType=initial and just have operationType=setCurrentValue (or similar), because I don't really see any added value in distinguishing those.

Another issue was: If I want to have pre-equilibration with the default model parameters and then switch to some other condition - how could I specify that? Previously, this could have been implemented by an all-NaN condition in the conditions table. This is no longer possible. After considering a couple of alternatives (e.g., empty conditionId in the experiment table; some conditionId in the experiment table that does not occur in the conditions table), the most reasonable one seemed to be introducing some kind of no-op operationType that would be the same as an all-NaN condition in the PEtab v1 condition table (i.e., using the model state from the previous period or, in case of the first period, using the model without any changes).

Feedback is welcome.

dweindl avatar Dec 19 '24 12:12 dweindl

Brief update from some further discussions with @dilpath:

For the condition table, more appropriate column names might be

  • operationType instead of inputType, the respective values could be setRate, setAssignment, addToRate, ...
  • targetId instead of inputId
  • targetValue instead of inputValue

Fully agreed.

I would suggest to consolidate inputType=constant and inputType=initial and just have operationType=setCurrentValue (or similar), because I don't really see any added value in distinguishing those.

Yes.

Another issue was: If I want to have pre-equilibration with the default model parameters and then switch to some other condition - how could I specify that? Previously, this could have been implemented by an all-NaN condition in the conditions table. This is no longer possible. After considering a couple of alternatives (e.g., empty conditionId in the experiment table; some conditionId in the experiment table that does not occur in the conditions table), the most reasonable one seemed to be introducing some kind of no-op operationType that would be the same as an all-NaN condition in the PEtab v1 condition table (i.e., using the model state from the previous period or, in case of the first period, using the model without any changes).

For me, the most natural thing would be the described alternative of conditionId that appears in the experiment table but not the conditions table. I see how this could make linting/validation more difficult and typos are more dangerous, but wouldn't making sure that all conditions in the condition table appear in the experiments table be enough to catch the worst errors?

FFroehlich avatar Dec 19 '24 12:12 FFroehlich

For me, the most natural thing would be the described alternative of conditionId that appears in the experiment table but not the conditions table. I see how this could make linting/validation more difficult and typos are more dangerous, but wouldn't making sure that all conditions in the condition table appear in the experiments table be enough to catch the worst errors?

That typo issue seemed relevant to me and I thought the proposed no-op makes the intent more explicit. I'd have some preference for explicitness, but I could live with either.

Whether unused conditionIds should be considered illegal, or just optionally trigger some warning, is another question that should be clarified (same as, for example, unused observables -- I don't think there is anything in the specs). But even if we consider it illegal, we still wouldn't know if some conditionId in the experiment table was left undefined on purpose or not. Maybe that argument isn't that strong, given that we have a number of optional fields, and allow empty experimentIds in the measurement table for trivial timecourses...
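The cross-check discussed here can be sketched in a few lines. Whether each finding is an error or a warning is exactly the open question, so the hypothetical helper below only reports both sets and leaves the policy to the caller.

```python
def cross_check_condition_ids(condition_rows, experiment_rows):
    """Return (undefined, unused) conditionIds as sorted lists.

    undefined: referenced in the experiments table but absent from the conditions table.
    unused: defined in the conditions table but never referenced by an experiment.
    """
    defined = {row["conditionId"] for row in condition_rows}
    referenced = {row["conditionId"] for row in experiment_rows if row["conditionId"]}
    return sorted(referenced - defined), sorted(defined - referenced)
```

If undefined IDs were allowed to mean "no changes", a typo such as fooo for foo would show up twice: fooo as undefined and foo as unused. An intentionally undefined ID such as preeq would only show up as undefined, which is why the unused check catches the typo but cannot confirm that an undefined ID was left out on purpose.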

dweindl avatar Dec 19 '24 12:12 dweindl

For me, the most natural thing would be the described alternative of conditionId that appears in the experiment table but not the conditions table. I see how this could make linting/validation more difficult and typos are more dangerous, but wouldn't making sure that all conditions in the condition table appear in the experiments table be enough to catch the worst errors?

That typo issue seemed relevant to me and I thought the proposed no-op makes the intent more explicit. I'd have some preference for explicitness, but I could live with either.

Fair point

Whether unused conditionIds should be considered illegal, or just optionally trigger some warning, is another question that should be clarified (same as, for example, unused observables -- I don't think there is anything in the specs). But even if we consider it illegal, we still wouldn't know if some conditionId in the experiment table was left undefined on purpose or not. Maybe that argument isn't that strong, given that we have a number of optional fields, and allow empty experimentIds in the measurement table for trivial timecourses...

Also good point, I generally would prefer warnings only as it makes it a bit easier to reuse tables across problems.

FFroehlich avatar Dec 19 '24 15:12 FFroehlich

Brief update from some further discussions with @dilpath: ...

I support the renaming.

On the pre-equilibration issue: I'm generally in favour of one explicit way of formulating this. Would it be an option to permit empty conditionIds in the experiment table where the model would be simulated with the default model parameters?

m-philipps avatar Jan 13 '25 20:01 m-philipps

Would it be an option to permit empty conditionIds in the experiment table where the model would be simulated with the default model parameters?

Just so we're talking about the same thing, I'll take an example from Daniel: the current suggestion is to have a conditions and experiments table like

conditionId  operationType  targetId  targetValue
foo          setValue       p         1
preeq        no-op

experimentId  conditionId  time
e1            preeq        -inf
e1            foo          0

preeq is some explicit, interpretable label that describes the preequilibration with default model parameters.

Your question suggests an experiments table like

experimentId  conditionId  time
e1                         -inf
e1            foo          0

which is a single way of formulating it, but I don't see it as explicit: did the user intend to omit a condition ID there, or just forget? I would consider a single explicit formulation to instead be some reserved conditionId NOOP. Also fine for me, but less interpretable than a user-defined ID like preeq.
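As an illustration of the explicit no-op variant, here is a minimal sketch of how the periods of one experiment might be resolved into parameter values. It assumes the column is called operationType (following the renaming proposed earlier in the thread), that targetValue is a plain number rather than a general PEtab math expression, and that only setValue and no-op exist. The function name is hypothetical.

```python
def parameters_per_period(defaults, condition_rows, experiment_rows):
    """Return [(start_time, parameters)] for one experiment, carrying state forward."""
    current = dict(defaults)
    timeline = []
    for row in sorted(experiment_rows, key=lambda r: float(r["time"])):
        for cond in condition_rows:
            if cond["conditionId"] != row["conditionId"]:
                continue
            if cond["operationType"] == "setValue":
                current[cond["targetId"]] = float(cond["targetValue"])
            # "no-op" rows change nothing: the previous state (or the defaults) is kept
        timeline.append((float(row["time"]), dict(current)))
    return timeline


conditions = [
    {"conditionId": "foo", "operationType": "setValue", "targetId": "p", "targetValue": "1"},
    {"conditionId": "preeq", "operationType": "no-op", "targetId": "", "targetValue": ""},
]
experiments = [
    {"experimentId": "e1", "conditionId": "preeq", "time": "-inf"},
    {"experimentId": "e1", "conditionId": "foo", "time": "0"},
]
timeline = parameters_per_period({"p": 0.5}, conditions, experiments)
```

With a default of p = 0.5 (an arbitrary value chosen for this example), the pre-equilibration period keeps p = 0.5 and the period starting at t=0 sets p = 1.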

dilpath avatar Jan 13 '25 21:01 dilpath

Your question suggests an experiments table like

experimentId  conditionId  time
e1                         -inf
e1            foo          0

which is a single way of formulating it, but I don't see it as explicit: did the user intend to omit a condition ID there, or just forget? I would consider a single explicit formulation to instead be some reserved conditionId NOOP. Also fine for me, but less interpretable than a user-defined ID like preeq.

Yes, that's it. No, it's not an explicit solution, I just wanted to get your opinion.

In principle, I would prefer something like NOOP, but I don't see the advantage of defining a condition and making it no-op over having a DEFAULT or NOOP option for the experiment table conditionId.

m-philipps avatar Jan 15 '25 09:01 m-philipps

Addressed by https://github.com/PEtab-dev/PEtab/pull/581

dweindl avatar Mar 25 '25 09:03 dweindl