Study thematic controlled vocabulary?
If I'm not mistaken, BrAPI studies can be used to represent phenotyping studies or genotyping studies (and maybe even other kinds in the future).
The problem is that for now there is no definite way to distinguish between the two.
The "studyType" field could be used, but most current BrAPI implementations use it to describe the type of phenotyping study ("field study", "greenhouse study"), not the study thematic (phenotyping vs genotyping).
Maybe it would be best to add a new string field like "studyThematic" or "thematicType" that must respect a controlled vocabulary defined by BrAPI, containing at least "phenotyping" and "genotyping" (and maybe also "genomic", "transcriptomic", "proteomic").
It would be even better if these terms came from ontologies, but it's difficult to find the right one and reference it correctly in BrAPI (without JSON-LD).
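To make the proposal concrete, here is a minimal sketch of how a client or server could check such a field against a controlled vocabulary. The field name "studyThematic" and the list of terms are only the ones floated above; none of this is part of BrAPI, and `parse_study_thematic` is a hypothetical helper.

```python
from enum import Enum


class StudyThematic(str, Enum):
    """Hypothetical controlled vocabulary, using the terms suggested in this issue."""

    PHENOTYPING = "phenotyping"
    GENOTYPING = "genotyping"
    GENOMIC = "genomic"
    TRANSCRIPTOMIC = "transcriptomic"
    PROTEOMIC = "proteomic"


def parse_study_thematic(study: dict) -> StudyThematic:
    """Return the thematic of a study record, rejecting values outside the vocabulary."""
    raw = study.get("studyThematic")  # proposed field name, not an existing BrAPI field
    try:
        return StudyThematic(raw)
    except ValueError:
        allowed = ", ".join(t.value for t in StudyThematic)
        raise ValueError(f"studyThematic must be one of: {allowed}; got {raw!r}") from None


# studyType keeps its current meaning; the thematic is carried separately.
study = {"studyDbId": "1001", "studyType": "field study", "studyThematic": "phenotyping"}
thematic = parse_study_thematic(study)
```

Keeping the thematic in its own field means existing "studyType" values such as "field study" would not need to change.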
I've thought about this too. I'd want to see what the differences in data needs are between a phenotyping study and a genotyping study. Does a genotyping study need its own data model? Right now BrAPI studies have a list of observation units with phenotype observations. Does a genotyping study need this list of observation units, or should it have lists of samples and markers instead? Are there studies that analyze both phenotypic and genotypic traits?
I started a topic on the BrAPI forum about a better genotyping data model (https://forum.brapi.org/t/finding-a-better-genotype-model/15/3), though I haven't had time to really work on it yet. I think this issue needs more than just an additional field linked to an ontology.
In BrAPI-speak, in cassavabase, the "samples" that are in a "plate" can be linked to "observationUnits" (plots or plants) from a "study"; the "markerprofiles" will thereby be linked to the same "samples" in the "plate". In this way we can have direct linkage between phenotypes and genotypes. Does this map to what BrAPI intended to happen?
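The chain described above (observationUnit → sample → markerprofile) can be sketched as a simple join. This is only an illustration under assumed record shapes: the class and attribute names below are hypothetical and are not the BrAPI or Cassavabase schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    sample_id: str
    plate_id: str
    observation_unit_id: str  # the plot or plant the sample was taken from


@dataclass(frozen=True)
class MarkerProfile:
    markerprofile_id: str
    sample_id: str


def markerprofiles_by_observation_unit(samples, profiles):
    """Map each observation unit to the marker profiles reachable through its samples."""
    unit_of_sample = {s.sample_id: s.observation_unit_id for s in samples}
    linked = {}
    for profile in profiles:
        unit = unit_of_sample.get(profile.sample_id)
        if unit is not None:  # skip profiles whose sample has no known observation unit
            linked.setdefault(unit, []).append(profile.markerprofile_id)
    return linked


samples = [
    Sample("s1", "plate1", "plot1"),
    Sample("s2", "plate1", "plot2"),
]
profiles = [
    MarkerProfile("mp1", "s1"),
    MarkerProfile("mp2", "s2"),
    MarkerProfile("mp3", "s1"),
]
links = markerprofiles_by_observation_unit(samples, profiles)
```

With the phenotype observations already keyed by observation unit, the resulting mapping is what ties a phenotype to its genotype data.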
In T3 we have both genotype and phenotype studies. A few fields are shared between them (program, year, etc.), and others are specific to each type: phenotype_study (location, planting_data, harvest_date) and genotype_study (marker_type, platform, analysis_software). So I have been using the studyType field to indicate the type (genotype, phenotype). I like the idea of a controlled vocabulary when it is possible.
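As a rough sketch of that split, the shared fields can sit on a base record and the type-specific ones on two subtypes. The field names are a subset of those listed above; the class names and the `to_record` helper are assumptions made for illustration, not T3's actual schema.

```python
from dataclasses import asdict, dataclass


@dataclass
class Study:
    # Fields shared by both kinds of study.
    program: str
    year: int


@dataclass
class PhenotypeStudy(Study):
    location: str
    harvest_date: str
    study_type: str = "phenotype"


@dataclass
class GenotypeStudy(Study):
    marker_type: str
    platform: str
    analysis_software: str
    study_type: str = "genotype"


def to_record(study: Study) -> dict:
    """Flatten a study into a dict, exposing the type under a studyType key."""
    record = asdict(study)
    record["studyType"] = record.pop("study_type")
    return record


pheno = PhenotypeStudy(program="ProgramA", year=2017, location="LocationA", harvest_date="2017-07-15")
geno = GenotypeStudy(program="ProgramA", year=2017, marker_type="SNP",
                     platform="PlatformX", analysis_software="SoftwareY")
records = [to_record(pheno), to_record(geno)]
```

A consumer can then branch on "studyType" to decide which of the type-specific fields to expect.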
@nickmorales yea that chain of data makes sense to me. And in that case in Cassavabase, you would consider a "study" entity to be both a "Phenotype Study" and a "Genotype Study". This works because you have all the data in one system.
Other breeding management systems don't have access to marker or sample data, so they only have enough data to return a "Phenotype study".
On the flip side, GOBii has the concept of a "Genotype study" in their database, but no way to return it in BrAPI because they don't have the observation unit data.
@ClayBirkett Thanks for the info, that helps. I think using studyType to differentiate like you are is a fine solution until a more robust one is available. Are the two types of studies somehow linked if someone is performing analysis which requires both types of data?
I admit that we do not have a good way to link phenotype studies to genotype studies. They use the same germplasm identifiers, but there is no simple way to determine which is the best genotype data set for a specific phenotype study. One way to solve this problem is to use consensus genotype data (combining all genotypes for a marker), but this only works for array-based genotyping, not GBS. We may also want to support imputed genotype data, but I won’t suggest that until next year.
Clay
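For readers unfamiliar with the consensus idea mentioned above, here is one possible reading of it as a sketch: a majority vote over all calls for the same germplasm and marker across genotype studies. The voting rule, the tie handling, and the input shape are assumptions made for illustration, not a description of how T3 computes consensus genotypes.

```python
from collections import Counter, defaultdict


def consensus_genotypes(calls):
    """Majority-vote consensus per (germplasm, marker).

    `calls` is an iterable of (germplasm_id, marker_id, allele_call) tuples
    pooled from several genotype studies; a call of None means missing data.
    Ties are reported as None rather than resolved arbitrarily.
    """
    votes = defaultdict(Counter)
    for germplasm, marker, call in calls:
        if call is not None:
            votes[(germplasm, marker)][call] += 1

    consensus = {}
    for key, counter in votes.items():
        ranked = counter.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            consensus[key] = None  # no clear majority
        else:
            consensus[key] = ranked[0][0]
    return consensus


calls = [
    ("g1", "m1", "AA"),
    ("g1", "m1", "AA"),
    ("g1", "m1", "AB"),
    ("g1", "m2", "BB"),
    ("g1", "m2", "AB"),
    ("g2", "m1", None),
    ("g2", "m1", "BB"),
]
result = consensus_genotypes(calls)
```

Because the consensus is keyed by germplasm identifier, a phenotype study could look up genotypes for its germplasm without having to pick one genotype study.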