peppy icon indicating copy to clipboard operation
peppy copied to clipboard

Merge Eido into Peppy

Open ClaudeHu opened this issue 3 months ago • 4 comments

Likely solved

merging eido

merging

eido is relocated and is now a module within peppy.

from peppy.eido.validation import validate_project
from peppy import Project

pep_proj = Project("path/to/project/config.yaml")
validate_project(pep_proj, "path/to/schema/file.yaml")

peppy didn't have CLI before relocation. After the relocation, the command is peppy:

$ peppy inspect config.yaml
$ peppy validate config.yaml -s schema.yaml
$ peppy convert config.yaml --format csv

pre-processd PEP validation

Config validation is extended. It can directly validate given path of config file against a schema.

validate the original samples

New function that validate a sample table against a given schema.

Relative path in schema to import

If the schema path is given, peppy.schema.read_schema parses relative paths if the initial schema path is the input.

Repeated warning

The warning message is now printed at the end. For an unpopulated environment variable, the relative paths under it are accumulated and printed once together. For example:

WARNING - Not all environment variables were populated in derived attribute source: $PROJECT/{frog_0h.fastq, pig_1h.fastq, pig_0h.fastq, frog_1h.fastq}

Confusing error

Corrected as suggested in issue.

To discuss

Integer type in PEP

Default dtype is str when a sample csv table is parsed. Would like to hear everyone's thought on that.

Other minor changes

In argparser for CLI relocated from eido, default args related to sample table indices is changed to None. Previous default values can interfere with project config when running CLI, like the attached PEP in integer type issue.

ClaudeHu avatar Oct 30 '25 22:10 ClaudeHu

Its so much code.... hard to get to all of it I tried. Theres a lot of missing type annotations. I think the co-pilot review is good tooo

True...really not many type annotations in the original eido (a few in docstrings)

ClaudeHu avatar Oct 31 '25 20:10 ClaudeHu

Good first start. In my opinion there few things in structure that needs to be done.

  1. peppy should be also in subfolder, similar how we are doing in gtars. Eido and peppy are two different projects, so it's better if each of them will live in separate folder.

I disagree -- I would like eido to be a sub-component or module within peppy, not a standalone project. So I think the way it's done here is correct.

  1. Tests: same as previous point: tests/data - folder should be divided on 2 or 3 folder: eido, peppy, common (if we have any common files)
  2. I don't see eido git history code. Probably we need to merge 2 branches into one, so we can preserve history. Unfortunately it is pretty a big project, and I would highly recommend of preserving git history. To do it we should merge 2 repositories into one.

I think it's probably fine to just leave it like this. preserving history would have been better, but I think it's fine to just go for it. Eido was relatively small anyway.

  1. CLI file should live in eido, not peppy. + We want to add pephubclient too, so Everything should be in separate folders (hmmm, maybe I am wrong - but here we returning to the first point)

Actually I think everything should move under the peppy CLI . So I think this is actually the right directly.

Good first start, but I think this things should be fixed. Looking forward to hear other people opinions

nsheff avatar Nov 06 '25 02:11 nsheff

I like @khoroshevskyi 's suggestion:

Tests: same as previous point: tests/data - folder should be divided on 2 or 3 folder: eido, peppy, common (if we have any common files)

I think it would be better to have the tests be more organized.

donaldcampbelljr avatar Nov 06 '25 16:11 donaldcampbelljr

Please ensure the exception handling is more polished for the areas raised by ClaudeAI.

some of this is addressed in my stacked PRs

nsheff avatar Nov 06 '25 16:11 nsheff