GenGraph

set mauve scratch path

Open jambler24 opened this issue 6 years ago • 21 comments

--scratch-path-1 is hard-coded. This needs to become relative.
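
One way to avoid a hard-coded scratch location is to derive it from the output path at run time. The sketch below is only an illustration and not the GenGraph implementation: the function name build_mauve_command and its parameters are hypothetical. The --output, --scratch-path-1 and --scratch-path-2 options are the ones listed in the progressiveMauve usage text.

```python
import os
import tempfile


def build_mauve_command(seq_paths, out_file, scratch_root=None):
    """Build a progressiveMauve argument list with run-specific scratch dirs.

    Hypothetical helper: the scratch directories are created next to the
    output file unless scratch_root is given, so no path is hard-coded.
    """
    if scratch_root is None:
        scratch_root = os.path.dirname(os.path.abspath(out_file))
    # mkdtemp creates a fresh, uniquely named directory under scratch_root.
    scratch_1 = tempfile.mkdtemp(prefix="mauve_scratch_1_", dir=scratch_root)
    scratch_2 = tempfile.mkdtemp(prefix="mauve_scratch_2_", dir=scratch_root)
    return [
        "progressiveMauve",
        "--output=" + out_file,
        "--scratch-path-1=" + scratch_1,
        "--scratch-path-2=" + scratch_2,
    ] + list(seq_paths)
```

The returned list could be passed to subprocess.run; the scratch directories would still need to be removed once the alignment finishes.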

jambler24 avatar May 29 '19 09:05 jambler24

Hi, one question: why does the program get stuck when I start running it?

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name Documents/output
Conducting progressiveMauve progressiveMauve

It got stuck.

I am using a Mac:
Processor: 2.7 GHz Intel Core i7
Memory: 16 GB
Input: two sequences, 4.5 MB each

Thank you for your precious help.
Devina

Devinaseeruttun avatar Jun 11 '20 03:06 Devinaseeruttun

Hi Devina,

Thank you for raising this issue, I'll be happy to take a look at it.

Firstly which version are you running? The latest that is in the repository?

Let's see if we can pin down the issue

jambler24 avatar Jun 11 '20 08:06 jambler24

Hi,

Yes, I am using the latest version in the repository.
However, since I am using a Mac, I installed the alignment software using the following commands:

1. # Install an MSA tool (Muscle)

curl -fksSL http://drive5.com/muscle/downloads3.8.31/muscle3.8.31_i86linux64.tar.gz | tar xz && \
mv muscle3.8.31_i86linux64 /usr/local/bin/muscle3.8.31_i86darwin64

curl -fksSL http://drive5.com/muscle/downloads3.8.31/muscle3.8.31_i86darwin64.tar.gz | tar xz && \
mv muscle3.8.31_i86darwin64 /usr/local/bin/muscle3.8.31_i86darwin64

2. git clone https://github.com/jambler24/GenGraph

3. # Install MAUVE

RUN curl -fksSL http://darlinglab.org/mauve/snapshots/2015/2015-02-13/linux-x64/mauve_linux_snapshot_2015-02-13.tar.gz | tar xz && \
cp mauve_snapshot_2015-02-13/linux-x64/progressiveMauve /usr/local

I downloaded MAUVE and copied it to the Applications folder on my Mac, then ran the following command:

sudo cp /Applications/Mauve.app/Contents/MacOS/progressiveMauve /usr/local/bin/

Thank you. Cheers, Devina

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/jambler24/GenGraph/issues/9#issuecomment-642483994, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3QNWTV7LFAQGVNTGYLX2DRWCF3PANCNFSM4HQKWEYQ.

-- http://www.uom.ac.mu/index.php/email-disclaimer

Devinaseeruttun avatar Jun 11 '20 08:06 Devinaseeruttun

Thanks Devina,

OK, I'm going to do some testing, add some checks in the code, and push an update, hopefully by tonight.

I'll check back here when they are out :)

jambler24 avatar Jun 11 '20 08:06 jambler24

Hi Devina,

I have pushed some new code with a bit more detail in the log file.

In testing, it seems to be running but probably not finding progressiveMauve on your side for some reason.

I think the "sudo cp /Applications/Mauve.app/Contents/MacOS/progressiveMauve /usr/local/bin/" step is the issue. Can you try running:

/usr/local/bin/progressiveMauve

and check that it is working?

Also, pull the new code and have a look at the .log file; it should have some clues.
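
The same check can be made from Python before the alignment step starts. This is a minimal sketch under stated assumptions, not code from the repository: find_executable is a hypothetical helper, and /usr/local/bin is used as a fallback only because it is the directory mentioned above.

```python
import os
import shutil


def find_executable(name="progressiveMauve", extra_dirs=("/usr/local/bin",)):
    """Return the full path to an executable, or None if it is not found.

    Looks on PATH first, then in extra_dirs (an assumed fallback list).
    """
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        candidate = os.path.join(directory, name)
        # The file must exist and carry the executable bit.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

If find_executable() returns None, the tool could stop with a clear message instead of appearing to hang.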

jambler24 avatar Jun 11 '20 13:06 jambler24

Thank you. I have run /usr/local/bin/progressiveMauve:

(base) devina@Devinas-MacBook-Pro ~ % /usr/local/bin/progressiveMauve
progressiveMauve usage:

When each genome resides in a separate file:
/usr/local/bin/progressiveMauve [options] ... <seqN filename>

When all genomes are in a single file:
/usr/local/bin/progressiveMauve [options]

Options:
--island-gap-size= Alignment gaps above this size in nucleotides are considered to be islands [20]
--profile= (Not yet implemented) Read an existing sequence alignment in XMFA format and align it to other sequences or alignments
--apply-backbone= Read an existing sequence alignment in XMFA format and apply backbone statistics to it
--disable-backbone Disable backbone detection
--mums Find MUMs only, do not attempt to determine locally collinear blocks (LCBs)
--seed-weight= Use the specified seed weight for calculating initial anchors
--output= Output file name. Prints to screen by default
--backbone-output= Backbone output file name (optional).
--match-input= Use specified match file instead of searching for matches
--input-id-matrix= An identity matrix describing similarity among all pairs of input sequences/alignments
--max-gapped-aligner-length= Maximum number of base pairs to attempt aligning with the gapped aligner
--input-guide-tree= A phylogenetic guide tree in NEWICK format that describes the order in which sequences will be aligned
--output-guide-tree= Write out the guide tree used for alignment to a file
--version Display software version information
--debug Run in debug mode (perform internal consistency checks--very slow)
--scratch-path-1= Designate a path that can be used for temporary data storage. Two or more paths should be specified.
--scratch-path-2= Designate a path that can be used for temporary data storage. Two or more paths should be specified.
--collinear Assume that input sequences are collinear--they have no rearrangements
--scoring-scheme=<ancestral|sp_ancestral|sp> Selects the anchoring score function. Default is extant sum-of-pairs (sp).
--no-weight-scaling Don't scale LCB weights by conservation distance and breakpoint distance
--max-breakpoint-distance-scale=<number [0,1]> Set the maximum weight scaling by breakpoint distance. Defaults to 0.5
--conservation-distance-scale=<number [0,1]> Scale conservation distances by this amount. Defaults to 0.5
--muscle-args= Additional command-line options for MUSCLE. Any quotes should be escaped with a backslas…

So it is working.

I will refer to the log file.

Thank you again. Cheers, Devina

Devinaseeruttun avatar Jun 11 '20 14:06 Devinaseeruttun

Hi, what is the new code? I was testing GenGraph with your reference WGSs. The log file:

INFO:root:({'seq0': 'H37Ra', 'seq1': 'F11'}, {'H37Ra': 'Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'F11': 'Documents/genomes/F11/sequence.fasta/F11.fas'}, ['Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'Documents/genomes/F11/sequence.fasta/F11.fas'], {'H37Ra': 'NA', 'F11': 'NA'})
INFO:root:{'seq_name': 'H37Ra', 'aln_name': 'seq0', 'seq_path': 'Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'annotation_path': 'NA'}
INFO:root:{'seq_name': 'F11', 'aln_name': 'seq1', 'seq_path': 'Documents/genomes/F11/sequence.fasta/F11.fas', 'annotation_path': 'NA'}
INFO:root:({'seq0': 'H37Ra', 'seq1': 'F11'}, {'H37Ra': 'Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'F11': 'Documents/genomes/F11/sequence.fasta/F11.fas'}, ['Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'Documents/genomes/F11/sequence.fasta/F11.fas'], {'H37Ra': 'NA', 'F11': 'NA'})

Console system log:

Jun 11 19:24:02 Devinas-MacBook-Pro login[4869]: DEAD_PROCESS: 4869 ttys000
Jun 11 19:24:09 Devinas-MacBook-Pro login[60192]: USER_PROCESS: 60192 ttys000
Jun 11 19:24:40 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.04000000-0100-0000-0000-000000000000[60191]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:25:06 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.AddressBook.abd): Service only ran for 1 seconds. Pushing respawn out by 9 seconds.
Jun 11 19:25:31 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.04000000-0200-0000-0000-000000000000[60209]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:25:53 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.AddressBook.abd): Service only ran for 1 seconds. Pushing respawn out by 9 seconds.
Jun 11 19:25:59 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.10000000-0500-0000-0000-000000000000[60210]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:25:59 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.03000000-0100-0000-0000-000000000000[60214]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:20 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.09000000-0100-0000-0000-000000000000[60208]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:40 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.04000000-0300-0000-0000-000000000000[60218]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:47 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0D000000-0400-0000-0000-000000000000[60215]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:47 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.03000000-0200-0000-0000-000000000000[60222]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:47 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.10000000-0600-0000-0000-000000000000[60223]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:27:35 Devinas-MacBook-Pro syncdefaultsd[60232]: objc[60232]: Class SYDClient is implemented in both /System/Library/PrivateFrameworks/SyncedDefaults.framework/Versions/A/SyncedDefaults and /System/Library/PrivateFrameworks/SyncedDefaults.framework/Support/syncdefaultsd. One of the two will be use

It is still not running. Please help. Thank you, cheers,

Devina

Devinaseeruttun avatar Jun 11 '20 15:06 Devinaseeruttun

Aaah, ok the problem is the file extension I think.

Try changing the file extensions to .fa instead of .fas.

Also maybe change the "sequence.fasta" directory to "sequence_fasta", as this can cause some confusion.
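
A pre-flight check along these lines could flag such paths before the run starts. It is a sketch only: check_sequence_paths is a hypothetical helper, and the set of accepted extensions is an assumption based solely on the suggestion above, not on what the parser actually supports.

```python
import os

# Assumption for illustration: only ".fa" is known to work from this thread.
ASSUMED_OK_EXTENSIONS = {".fa"}


def check_sequence_paths(paths, ok_extensions=ASSUMED_OK_EXTENSIONS):
    """Return (path, problem) tuples for paths likely to confuse parsing."""
    problems = []
    for path in paths:
        directory, filename = os.path.split(path)
        extension = os.path.splitext(filename)[1].lower()
        if extension not in ok_extensions:
            problems.append((path, "unexpected extension " + repr(extension)))
        # A dot inside a directory name (e.g. "sequence.fasta") can be
        # mistaken for a file extension by naive string splitting.
        parts = directory.replace("\\", "/").split("/")
        if any("." in part.strip(".") for part in parts):
            problems.append((path, "directory name contains a dot"))
    return problems
```

For the paths in the log above, this would report both the .fas extension and the dotted "sequence.fasta" directory.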

jambler24 avatar Jun 12 '20 08:06 jambler24

Thank you. I’ll make the requested changes. Now I have the following error:

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name genoutput
Conducting progressiveMauve progressiveMauve
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 109, in <module>
    genome_aln_graph = bbone_to_initGraph(bbone_file, parsed_input_dict)
  File "/Users/devina/GenGraph/gengraph.py", line 1577, in bbone_to_initGraph
    iso_length = len(input_parser(input_dict[1][iso])[0]['DNA_seq'])
TypeError: 'NoneType' object is not subscriptable
(base) devina@Devinas-MacBook-Pro ~ %

Thank you for your precious help. Cheers

Devinaseeruttun avatar Jun 12 '20 09:06 Devinaseeruttun

Can you pull the latest version of the code by doing: git pull in the GenGraph directory, and then run it again, and have a look at the genoutput.log file?

jambler24 avatar Jun 12 '20 11:06 jambler24

Done. When I ran it, I got the following error:

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name output
Conducting progressiveMauve progressiveMauve Complete
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 117, in <module>
    refine_initGraph(genome_aln_graph)
  File "/Users/devina/GenGraph/gengraph.py", line 1520, in refine_initGraph
    presorted_list.append((a_node, abs(a_graph.node[a_node][isolate + '_leftend']), abs(a_graph.node[a_node][isolate + '_rightend'])))
AttributeError: 'MultiDiGraph' object has no attribute 'node'
(base) devina@Devinas-MacBook-Pro ~ %

Please help.

Thank you. Cheers

Devinaseeruttun avatar Jun 12 '20 15:06 Devinaseeruttun

Hi, I have networkx 2.4 installed. Do you think that is why I am having the following problem? AttributeError: 'MultiDiGraph' object has no attribute 'node'.

Please help.

Thank you. Cheers, Devina

Devinaseeruttun avatar Jun 15 '20 16:06 Devinaseeruttun

Yes, networkx 2.4 was the problem. Now I am stuck.

The KeyError is '_leftend':

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name output
Conducting progressiveMauve progressiveMauve Complete
Conducting local node realignment
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 138, in <module>
    add_graph_data(genome_aln_graph)
  File "/Users/devina/GenGraph/gengraph.py", line 2606, in add_graph_data
    if abs(int(data[an_isolate + '_leftend'])) == 1:
KeyError: '_leftend'

Please help. Thank you. Cheers

Devinaseeruttun avatar Jun 15 '20 17:06 Devinaseeruttun

Hi Devina,

Thanks for spotting the networkx 2.4 problem. I will add a check for that too.

I'm taking a look at the code to see what could be causing the new error and will get back to you ASAP.
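
For context on the networkx 2.4 problem: the G.node attribute that the earlier traceback points at was removed in networkx 2.4, and G.nodes is its replacement. A small compatibility helper, sketched below, would let the same code run on either side of that change. The name node_attrs is hypothetical and this is not the fix that was pushed to the repository.

```python
def node_attrs(graph, node):
    """Return a node's attribute dict on both old and new networkx versions.

    Hypothetical helper: prefers graph.nodes when it supports indexing and
    falls back to the older graph.node mapping otherwise.
    """
    view = getattr(graph, "nodes", None)
    # On very old versions graph.nodes is a plain method and cannot be
    # indexed, so fall back to the graph.node dictionary in that case.
    if not hasattr(view, "__getitem__"):
        view = graph.node
    return view[node]
```

A call such as a_graph.node[a_node][isolate + '_leftend'] would then become node_attrs(a_graph, a_node)[isolate + '_leftend'].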

jambler24 avatar Jun 17 '20 10:06 jambler24

Hi, I installed your package on Ubuntu 20.04 LTS and I had to make a couple of edits to the input_parser to get it to read my input. However, I'm stuck with this error message:

File "/home/pradeep/py_vienna/lib/python3.8/site-packages/networkx/readwrite/graphml.py", line 466, in add_data
    raise nx.NetworkXError(msg % element_type)
networkx.exception.NetworkXError: GraphML writer does not support <class 'numpy.str_'> as data values.

I'm not sure how to fix this, since I have already downgraded to networkx==2.3.
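
The message suggests that some node or edge attribute holds a NumPy string rather than a built-in str. One possible workaround, sketched here as an assumption and not as a confirmed fix, is to convert NumPy scalars to built-in types before writing the graph. The helper names to_native and clean_attributes are hypothetical.

```python
def to_native(value):
    """Convert a NumPy scalar to the matching built-in Python type.

    NumPy scalars expose .item(); values without it are returned unchanged.
    Intended for scalars only: arrays with more than one element would
    raise an error from .item().
    """
    item = getattr(value, "item", None)
    return item() if callable(item) else value


def clean_attributes(attr_dict):
    """Return a copy of an attribute dict with NumPy scalars converted."""
    return {key: to_native(val) for key, val in attr_dict.items()}
```

Each attribute dictionary in the graph could be passed through clean_attributes before the GraphML file is written.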

pramesh-cfh11 avatar Jun 20 '20 17:06 pramesh-cfh11

Thank you. Devina

Devinaseeruttun avatar Jun 21 '20 13:06 Devinaseeruttun

Hi all,

I have been doing some testing with docker images, and it looks like networkx v2.4 is the problem. Downgrading to 2.3 works, and I am updating the docker image.

Not sure about the error you are seeing, @pramesh-cfh11. What edits were made to the input_parser?

jambler24 avatar Jun 22 '20 11:06 jambler24

Hi @jambler24. First off, thanks for being so prompt about this. Your paper and the related codebase are exactly what I've been looking for in my research. I'm listing the steps in order so you get a sense of the changes I made after initially downloading the latest git repository and running it.

I am using a Python 3.8 virtual environment on Ubuntu 20.04.

  1. First error: 'numpy' not found. Fix: I went through the gengraph.py file and saw that you had import numpy as np, but in a downstream block you were still calling numpy.array. I removed the alias and changed it to import numpy.

  2. Second error: an indexing and parsing issue with the input file (numpy only accepts integer slices). Fix: I saw that your code accepts a csv file, but when I prepared a file with the four columns, the outputs of parse_seq_file didn't look correct, so I modified this function to use hard-coded column numbers (see below) corresponding to the column names. I had to remove the header from the csv file, and then the code started to run before returning the error that I originally contacted you with.

  3. The code didn't run with networkx==2.4, and I got the same error as @Devinaseeruttun.

def parse_seq_file(path_to_seq_file):

    seq_file_dict = input_parser(path_to_seq_file)

    A_seq_label_dict = {}
    A_input_path_dict = {}
    ordered_paths_list = []
    anno_path_dict = {}

    for a_seq_file in seq_file_dict:
        logging.info(a_seq_file)
        A_seq_label_dict[a_seq_file[1]] = a_seq_file[0]
        A_input_path_dict[a_seq_file[0]] = a_seq_file[2]
        ordered_paths_list.append(a_seq_file[2])
        anno_path_dict[a_seq_file[0]] = a_seq_file[3]

    return A_seq_label_dict, A_input_path_dict, ordered_paths_list, anno_path_dict
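
As an alternative to hard-coded column numbers, the columns could be read by name, which keeps the header row in the file. This is a sketch and not GenGraph's parser: parse_seq_file_by_name is a hypothetical name, the column names are the keys that appear in the log output earlier in this thread, and the comma delimiter is an assumption.

```python
import csv


def parse_seq_file_by_name(path_to_seq_file, delimiter=","):
    """Parse the sequence sheet using column names instead of positions."""
    seq_label_dict = {}
    input_path_dict = {}
    ordered_paths_list = []
    anno_path_dict = {}
    with open(path_to_seq_file, newline="") as handle:
        # DictReader takes the field names from the header row.
        for row in csv.DictReader(handle, delimiter=delimiter):
            seq_label_dict[row["aln_name"]] = row["seq_name"]
            input_path_dict[row["seq_name"]] = row["seq_path"]
            ordered_paths_list.append(row["seq_path"])
            anno_path_dict[row["seq_name"]] = row["annotation_path"]
    return seq_label_dict, input_path_dict, ordered_paths_list, anno_path_dict
```

Reading by name means the column order in the file no longer matters, and a missing column fails with a KeyError that names it.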

pramesh-cfh11 avatar Jun 22 '20 13:06 pramesh-cfh11

Just an update, the code now supports networkx 2.4.

The code works in testing, so I will have to test with Python 3.8 in a container and see if there is something going wrong there.

jambler24 avatar Jun 23 '20 09:06 jambler24

Great. Thank you. Cheers, Devina

Devinaseeruttun avatar Jun 23 '20 09:06 Devinaseeruttun

Awesome, thanks @jambler24 - awaiting your testing.

pramesh-cfh11 avatar Jun 23 '20 13:06 pramesh-cfh11