
expected output type 'Null', got 'Int64'; set `return_dtype` to the proper datatype

Open xychen233 opened this issue 4 months ago • 3 comments

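Hello, Professor! Thank you for releasing CPhasing. I have recently been using it to test a haplotype genome assembly, and I ran into an error that I do not know how to resolve. I would appreciate any suggestions. This is the command I ran: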

export PATH=/media/APP/miniconda3/bin/:$PATH
export PYTHONPATH=/media/APP/python3.12/site-packages:$PYTHONPATH
export PATH=/media/APP/CPhasing/bin:$PATH
export PYTHONPATH=/media/APP/CPhasing:$PYTHONPATH
cphasing pipeline -f genome.fasta -hic1 HiC_1.fq.gz -hic2 HiC_2.fq.gz -t 40 -n 47

The problem occurs at step 3:

                    #----------------------------------#
                    #  Running step 3. hyperpartition  #
                    #----------------------------------#
[19:25:43] INFO     Running hyperpartition with `basal(haploid)`     cli.py:3952
                    mode.
           INFO     Load raw hypergraph from pairs file              cli.py:4088
                    `../HiC.pairs.pqs`
           INFO     Extract edges from pairs.                  hypergraph.py:109
           INFO     Parsing pqs ...                                   pqs.py:669
           INFO     Filtered the data with mapq >= 1.                 pqs.py:347
[19:25:56] ERROR    expected output type 'Null', got 'Int64'; set    cli.py:1189
                    `return_dtype` to the proper datatype
                    _RemoteTraceback:
                    """
                    Traceback (most recent call last):
                      File
                    "/media/APP/python3.12/site
                    -packages/joblib/externals/loky/process_executor
                    .py", line 490, in _process_worker
                        r = call_item()
                      File
                    "/media/APP/python3.12/site
                    -packages/joblib/externals/loky/process_executor
                    .py", line 291, in __call__
                        return self.fn(*self.args, **self.kwargs)
                               ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
                      File
                    "/media/APP/python3.12/site
                    -packages/joblib/parallel.py", line 607, in
                    __call__
                        return [func(*args, **kwargs) for func,
                    args, kwargs in self.items]
                                ~~~~^^^^^^^^^^^^^^^^^
                      File
                    "/media/APP/CPhasing/cphasing/pqs.py",
                    line 1107, in process_chunk_hg
                        return chunk.collect()
                               ~~~~~~~~~~~~~^^
                      File
                    "/media/APP/python3.12/site
                    -packages/polars/_utils/deprecation.py", line
                    97, in wrapper
                        return function(*args, **kwargs)
                      File
                    "/media/APP/python3.12/site
                    -packages/polars/lazyframe/opt_flags.py", line
                    328, in wrapper
                        return function(*args, **kwargs)
                      File
                    "/media/APP/python3.12/site
                    -packages/polars/lazyframe/frame.py", line 2415,
                    in collect
                        return wrap_df(ldf.collect(engine,
                    callback))
                                       ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
                    polars.exceptions.SchemaError: expected output
                    type 'Null', got 'Int64'; set `return_dtype` to
                    the proper datatype
                    """

                    The above exception was the direct cause of the
                    following exception:

                    ╭───── Traceback (most recent call last) ──────╮
                    │ /media/APP/CPhasing/cphasing/cli.py:1 │
                    │ 121 in pipeline                              │
                    │                                              │
                    │   1118 │                                     │
                    │   1119 │   today = datetime.now().strftime(" │
                    │   1120 │   try:                              │
                    │ ❱ 1121 │   │   run(fasta,                    │
                    │   1122 │   │   │   ul_data,                  │
                    │   1123 │   │   │   porec_data,               │
                    │   1124 │   │   │   porectable, pairs,        │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/pipeline │
                    │ /pipeline.py:1071 in run                     │
                    │                                              │
                    │   1068 │   │   │   _out_sh.write("\n")       │
                    │   1069 │   │                                 │
                    │   1070 │   │   try:                          │
                    │ ❱ 1071 │   │   │   hyperpartition.main(args= │
                    │   1072 │   │   │   │   │   │   │   prog_name │
                    │   1073 │   │   except SystemExit as e:       │
                    │   1074 │   │   │   exc_info = sys.exc_info() │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/rich_click/rich_command.py:216  │
                    │ in main                                      │
                    │                                              │
                    │   213 │   │   try:                           │
                    │   214 │   │   │   try:                       │
                    │   215 │   │   │   │   with self.make_context │
                    │ ❱ 216 │   │   │   │   │   rv = self.invoke(c │
                    │   217 │   │   │   │   │   if not standalone_ │
                    │   218 │   │   │   │   │   │   return rv      │
                    │   219 │   │   │   │   │   # it's not safe to │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/click/core.py:1246 in invoke    │
                    │                                              │
                    │   1243 │   │   │   echo(style(message, fg="r │
                    │   1244 │   │                                 │
                    │   1245 │   │   if self.callback is not None: │
                    │ ❱ 1246 │   │   │   return ctx.invoke(self.ca │
                    │   1247 │                                     │
                    │   1248 │   def shell_complete(self, ctx: Con │
                    │   1249 │   │   """Return a list of completio │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/click/core.py:814 in invoke     │
                    │                                              │
                    │    811 │   │                                 │
                    │    812 │   │   with augment_usage_errors(sel │
                    │    813 │   │   │   with ctx:                 │
                    │ ❱  814 │   │   │   │   return callback(*args │
                    │    815 │                                     │
                    │    816 │   def forward(self, cmd: Command, / │
                    │    817 │   │   """Similar to :meth:`invoke`  │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/cli.py:4 │
                    │ 089 in hyperpartition                        │
                    │                                              │
                    │   4086 │   elif pairs:                       │
                    │   4087 │   │   if is_file_changed(hcr_bed) o │
                    │        Path(hypergraph_path).exists():       │
                    │   4088 │   │   │   logger.info(f"Load raw hy │
                    │ ❱ 4089 │   │   │   he = Extractor(hypergraph │
                    │   4090 │   │   │   │   │   │      min_qualit │
                    │        edge_length=edge_length,              │
                    │   4091 │   │   │   │   │   │      hcr_invert │
                    │   4092 │   │   │   he.save(hypergraph_path)  │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/hypergra │
                    │ ph.py:95 in __init__                         │
                    │                                              │
                    │     92 │   │   self.log_dir = Path(log_dir)  │
                    │     93 │   │   self.log_dir.mkdir(parents=Tr │
                    │     94 │   │                                 │
                    │ ❱   95 │   │   self.edges = self.generate_ed │
                    │     96 │                                     │
                    │     97 │   @staticmethod                     │
                    │     98 │   def _process_df(df, contig_idx, t │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/hypergra │
                    │ ph.py:159 in generate_edges                  │
                    │                                              │
                    │    156 │   │   │   │                         │
                    │    157 │   │   │   │   chunks = p.read(min_m │
                    │    158 │   │   │   │                         │
                    │ ❱  159 │   │   │   │   res = p.to_hg_df(chun │
                    │    160 │   │   │   │   │   │   │   │    edge │
                    │    161 │   │   │   │                         │
                    │    162 │   │   │   │   if Path(f"{pairs_pref │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/pqs.py:6 │
                    │ 75 in to_hg_df                               │
                    │                                              │
                    │    672 │   │   for chunk in chunks:          │
                    │    673 │   │   │   args.append((Path(chunk). │
                    │        min_mapq, edge_length))               │
                    │    674 │   │                                 │
                    │ ❱  675 │   │   results = Parallel(n_jobs=sel │
                    │    676 │   │   │   │   │   delayed(process_c │
                    │    677 │   │   │   │   )                     │
                    │    678                                       │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:2072 in      │
                    │ __call__                                     │
                    │                                              │
                    │   2069 │   │   # dispatch of the tasks to th │
                    │   2070 │   │   next(output)                  │
                    │   2071 │   │                                 │
                    │ ❱ 2072 │   │   return output if self.return_ │
                    │   2073 │                                     │
                    │   2074 │   def __repr__(self):               │
                    │   2075 │   │   return "%s(n_jobs=%s)" % (sel │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:1682 in      │
                    │ _get_outputs                                 │
                    │                                              │
                    │   1679 │   │   │   yield                     │
                    │   1680 │   │   │                             │
                    │   1681 │   │   │   with self._backend.retrie │
                    │ ❱ 1682 │   │   │   │   yield from self._retr │
                    │   1683 │   │                                 │
                    │   1684 │   │   except GeneratorExit:         │
                    │   1685 │   │   │   # The generator has been  │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:1784 in      │
                    │ _retrieve                                    │
                    │                                              │
                    │   1781 │   │   │   # exception (e.g. `Genera │
                    │   1782 │   │   │   # worker traceback.       │
                    │   1783 │   │   │   if self._aborting:        │
                    │ ❱ 1784 │   │   │   │   self._raise_error_fas │
                    │   1785 │   │   │   │   break                 │
                    │   1786 │   │   │                             │
                    │   1787 │   │   │   nb_jobs = len(self._jobs) │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:1859 in      │
                    │ _raise_error_fast                            │
                    │                                              │
                    │   1856 │   │   # calling get_result. This jo │
                    │   1857 │   │   # called directly or if the g │
                    │   1858 │   │   if error_job is not None:     │
                    │ ❱ 1859 │   │   │   error_job.get_result(self │
                    │   1860 │                                     │
                    │   1861 │   def _warn_exit_early(self):       │
                    │   1862 │   │   """Warn the user if the gener │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:758 in       │
                    │ get_result                                   │
                    │                                              │
                    │    755 │   │   │   # We assume that the resu │
                    │    756 │   │   │   # callback thread, and is │
                    │    757 │   │   │   # be returned.            │
                    │ ❱  758 │   │   │   return self._return_or_ra │
                    │    759 │   │                                 │
                    │    760 │   │   # For other backends, the mai │
                    │    761 │   │   try:                          │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:773 in       │
                    │ _return_or_raise                             │
                    │                                              │
                    │    770 │   def _return_or_raise(self):       │
                    │    771 │   │   try:                          │
                    │    772 │   │   │   if self.status == TASK_ER │
                    │ ❱  773 │   │   │   │   raise self._result    │
                    │    774 │   │   │   return self._result       │
                    │    775 │   │   finally:                      │
                    │    776 │   │   │   del self._result          │
                    ╰──────────────────────────────────────────────╯
                    SchemaError: expected output type 'Null', got
                    'Int64'; set `return_dtype` to the proper
                    datatype

I hope you can offer some suggestions for resolving this error. Many thanks!

xychen233, Oct 13 '25 13:10
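The error text itself asks for a `return_dtype`, which suggests that a Python function applied inside a Polars lazy query produced a dtype other than the one Polars expected. The thread does not show the relevant code in `pqs.py`, so the following is only a generic sketch of declaring the output dtype explicitly, not CPhasing's actual code. The function name `add_one_with_dtype` and the column names `x` and `y` are made up for illustration.

```python
try:
    import polars as pl  # optional: the sketch returns None if Polars is absent
except ImportError:
    pl = None


def add_one_with_dtype(values):
    """Apply a Python function in a lazy query, declaring its output dtype."""
    if pl is None:
        return None
    lf = pl.LazyFrame({"x": values})
    out = lf.with_columns(
        pl.col("x")
        # Stating return_dtype means Polars does not have to guess the
        # output type of the Python function before the query runs.
        .map_elements(lambda v: v + 1, return_dtype=pl.Int64)
        .alias("y")
    ).collect()
    return out["y"].to_list()


print(add_one_with_dtype([1, 2, 3]))
```

Whether this is the pattern behind the failure in `process_chunk_hg` cannot be confirmed from the traceback alone.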

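Hello, I am very sorry that I had not checked this GitHub page for a long time and did not reply promptly. I have attempted to fix this bug and released a new version, v0.2.7.beta.r304. I hope it helps when you use the software again.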

wangyibin, Nov 20 '25 13:11

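> Hello, I am very sorry that I had not checked this GitHub page for a long time and did not reply promptly. I have attempted to fix this bug and released a new version, v0.2.7.beta.r304. I hope it helps when you use the software again.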

Thank you for the reply! I re-ran with v0.2.7.beta.r304 and still get the same error:

[10:34:28] INFO     Running hyperpartition with `basal(haploid)` mode.    cli.py:4080
           INFO     Contig length stats: Max=1.25Mb, 95th-percentile=1.01Mb, Median=263.71Kb.    utilities.py:1514
           INFO     Contig length distribution is relatively uniform. Deactivating splitting.    utilities.py:1526
           INFO     Load raw hypergraph from pairs file `../HiC.pairs.pqs`    cli.py:4242
           INFO     Extract edges from pairs.    hypergraph.py:110
           INFO     Parsing pqs ...    pqs.py:670
           INFO     Filtered the data with mapq >= 1.    pqs.py:347
[10:35:23] ERROR    expected output type 'Null', got 'Int64'; set `return_dtype` to the proper datatype    cli.py:1258
                    _RemoteTraceback:

Could you please take another look at what the problem might be? Thank you very much!

xychen233, Nov 21 '25 02:11

Professor, the problem is solved. It was caused by the Polars version being too new.
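For anyone hitting the same error, a quick first step is to check which Polars version the interpreter actually picks up, since the traceback shows Polars being loaded from `/media/APP/python3.12/site-packages`. The sketch below uses only the standard library. The thread does not say which Polars version works, so `ASSUMED_CEILING` is a placeholder to replace with whatever version CPhasing's own requirements specify, and the helper names are hypothetical.

```python
from importlib import metadata

# Placeholder only: the thread does not name a known-good Polars version.
ASSUMED_CEILING = "1.0.0"


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3), ignoring any non-numeric suffix."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_newer_than(package, ceiling):
    """True if the installed package is newer than `ceiling`, None if absent."""
    found = installed_version(package)
    if found is None:
        return None
    return parse_version(found) > parse_version(ceiling)


print("polars installed:", installed_version("polars"))
print("newer than assumed ceiling:", is_newer_than("polars", ASSUMED_CEILING))
```

If the installed version turns out to be newer than the one CPhasing was developed against, installing an older Polars release into the same environment is the fix reported in this thread.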

xychen233, Nov 22 '25 09:11