variable event support

Open PeterVenhuizen opened this issue 8 years ago • 4 comments

Will variable SUPPA events be supported?

PeterVenhuizen, Jan 09 '18 15:01

Hi Peter,

Variable SUPPA events are supported. We did some benchmarking with RT-PCR experiments in plants and they work better than using the strict boundaries. JC might be able to tell you more about it.

Please let me know if you see any issues with the variable events.

Thanks

Eduardo

-- Dr E Eyras

ICREA Research Professor Universitat Pompeu Fabra PRBB, Dr Aiguader 88 Tel: +34 93 316 0502 (ext 1502) E08003 Barcelona, Spain Fax: +34 93 316 0550

http://scholar.google.com/citations?user=LiojlGoAAAAJ http://www.researcherid.com/rid/L-1053-2014 http://regulatorygenomics.upf.edu/

EduEyras, Jan 09 '18 15:01

Hi Eduardo,

As of now I cannot use the suppa_to_bed.py script to generate the bed files for the RI events, because it expects 4 coordinates in the event_id, whereas the variable RI event_ids contain only the coordinates of the retained intron. Running suppa_to_bed.py with variable events thus gives me IndexErrors. I assume that I could run MoSEA if I generate the bed files myself, but I was wondering whether suppa_to_bed.py will be updated to support variable events.
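
To make the mismatch concrete, here is a minimal sketch of how the two event_id forms differ once they are split on ':'. The ids are illustrative: the variable one is an assumption, built by dropping the two outer coordinates from a strict id. The helpers `event_fields`, `is_variable_ri` and `strict_strand` are hypothetical and not part of suppa_to_bed.py; `strict_strand` only mimics what a strict-only parser might do.

```python
# Illustrative ids; the variable one is an assumption built by dropping
# the outer coordinates (s1, e2) from the strict one.
STRICT_ID = "TIAL1|7073;RI:chr10:121336123:121336262-121336592:121336715:-"
VARIABLE_ID = "TIAL1|7073;RI:chr10:121336262-121336592:-"


def event_fields(event_id):
    """Return the colon-separated fields after the 'gene;' prefix."""
    return event_id.split(";")[1].split(":")


def is_variable_ri(event_id):
    """True when an RI id carries only the intron (4 fields, not 6)."""
    return len(event_fields(event_id)) == 4


def strict_strand(event_id):
    """Hypothetical strict-only lookup: strand assumed at index 5."""
    return event_fields(event_id)[5]


print(is_variable_ri(STRICT_ID))    # False
print(is_variable_ri(VARIABLE_ID))  # True
try:
    strict_strand(VARIABLE_ID)
except IndexError as err:
    # A 4-field id has no index 5, hence the IndexError.
    print("IndexError:", err)
```

Counting the fields first would let a script pick the right parsing branch instead of failing on the index.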

Best Peter

PeterVenhuizen, Jan 10 '18 07:01

Thanks for pointing this out. We did not develop this part yet.

Yes, you can generate bed files from the event coordinates to run MoSEA. MoSEA reads the standard events but not yet the variable-boundary notation; it can, however, read any bed file.
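
As a rough sketch of the do-it-yourself route, the snippet below turns one variable RI event id into a single six-column bed line for the retained intron. Several things here are assumptions for illustration: the id format (four colon-separated fields after the gene prefix), the column order (chrom, start, end, name, score, strand) and the helper name `ri_intron_bed_line`. The coordinates are copied unchanged, so any 0-based/1-based adjustment that may be needed is not handled.

```python
def ri_intron_bed_line(event_id, score=0):
    """Build one bed line covering the retained intron of a variable RI event.

    Assumes ids of the form 'gene;RI:chrom:start-end:strand'.
    """
    fields = event_id.split(";")[1].split(":")
    if len(fields) != 4:
        raise ValueError("not a variable RI event id: {}".format(event_id))
    _ev_type, chrom, span, strand = fields
    start, end = (int(x) for x in span.split("-"))
    # Coordinates are written as given; no 0-based shift is applied.
    return "\t".join([chrom, str(start), str(end), event_id, str(score), strand])


line = ri_intron_bed_line("TIAL1|7073;RI:chr10:121336262-121336592:-")
print(line)
```

Flanking regions could be added in the same way by emitting further lines with shifted start and end values.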

Thanks

Eduardo

EduEyras, Jan 10 '18 10:01

I've written an updated version of the fun_RI_bedfile function that supports variable events. I have not tested it extensively yet, but it is able to generate the V1, V2, and V3 coordinates from the variable RI events. However, I think it will break if a variable RI event is given without len_ext; I have not tested this case yet.

The updated function is below. Feel free to use it.

import sys #needed for sys.exit() below

def fun_RI_bedfile(in_file, out_file, len_ext, event, mediandiff):
	'''
	Append the V1, V2 and V3 bed lines of one RI event to out_file.
	Handles both strict and variable event ids.
	##ref: see Fig.2 suppa documentation of RI e1 & s1..descriptions
	#https://bitbucket.org/regulatorygenomicsupf/suppa
	#Example event id (strict):
	#TIAL1|7073;RI:chr10:121336123:121336262-121336592:121336715:-
	'''

	ev_all = event.split(';')[1].split(':')
	if len(ev_all) == 6: #strict: type:chr:s1:e1-s2:e2:strand
		ev_type, ev_chr, s1, e1_s2, e2, ev_strand = ev_all
		s1 = int(s1)
		e2 = int(e2)
		variable = False
	else: #variable: type:chr:e1-s2:strand (retained intron only)
		ev_type, ev_chr, e1_s2, ev_strand = ev_all
		variable = True

	e1, s2 = map(int, e1_s2.split('-'))

	#variable events have no s1/e2, so V1 and V3 are always built from len_ext
	if ev_strand == '+':
		V2 = "{}\t{}".format(e1, s2)
		if len_ext or variable:
			V1 = "{}\t{}".format(e1 - len_ext, e1)
			V3 = "{}\t{}".format(s2, s2 + len_ext)
		else:
			V1 = "{}\t{}".format(s1, e1)
			V3 = "{}\t{}".format(s2, e2)

	elif ev_strand == "-":
		V2 = "{}\t{}".format(e1, s2)
		s2 = s2 - 1 #for 0-based correction

		if len_ext or variable:
			V1 = "{}\t{}".format(s2, s2 + len_ext)
			V3 = "{}\t{}".format(e1 - len_ext, e1)
		else:
			V1 = "{}\t{}".format(s2, e2)
			V3 = "{}\t{}".format(s1, e1)

	else:
		print("No strand information in file, for event: {}".format(event))
		sys.exit()

	#open the output only once all three regions are known
	with open(out_file, "a") as fo:
		fo.write("{}\t{}\t{};V1\t{}\t{}\n".format(ev_chr, V1, event, mediandiff, ev_strand))
		fo.write("{}\t{}\t{};V2\t{}\t{}\n".format(ev_chr, V2, event, mediandiff, ev_strand))
		fo.write("{}\t{}\t{};V3\t{}\t{}\n".format(ev_chr, V3, event, mediandiff, ev_strand))
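
For the untested case mentioned above (a variable RI event with no len_ext), one option would be to check the inputs before calling the function. This is only a sketch: `check_len_ext` is a hypothetical helper, not part of MoSEA, and it assumes that variable ids have four colon-separated fields, as in the function above.

```python
def check_len_ext(event, len_ext):
    """Raise ValueError if a variable RI event is given without len_ext.

    Variable ids lack s1 and e2, so V1 and V3 can only be derived
    from a non-zero flank length. Returns True for variable events.
    """
    n_fields = len(event.split(";")[1].split(":"))
    is_variable = n_fields == 4
    if is_variable and not len_ext:
        raise ValueError(
            "len_ext is required for variable event: {}".format(event)
        )
    return is_variable


# Strict events may omit len_ext; variable events may not.
strict_id = "TIAL1|7073;RI:chr10:121336123:121336262-121336592:121336715:-"
variable_id = "TIAL1|7073;RI:chr10:121336262-121336592:-"  # assumed form
print(check_len_ext(strict_id, None))   # False
print(check_len_ext(variable_id, 200))  # True
```

Raising an exception rather than exiting leaves it to the calling script to decide whether to skip the event or stop.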

PeterVenhuizen, Jan 10 '18 10:01