gzip - Reading a gzipped text file line-by-line for processing in Python 3.2.6
I'm a complete newbie when it comes to Python, and I've been tasked with getting a piece of code running on a machine that has a different version of Python (3.2.6) from the one the code was built for.
I've come across an issue with reading in a gzipped text file line-by-line (and processing each line depending on its first character). The code (which was written for Python > 3.2.6) is
    for line in gzip.open(input[0], 'rt'):
        if line[:1] != '>':
            out.write(line)
            continue
        chromname = match2chrom(line[1:-1])
        seqname = line[1:].split()[0]
        print('>{}'.format(chromname), file=out)
        print('{}\t{}'.format(seqname, chromname), file=mappingout)
(For those who know, this splits gzipped FASTA genome files into headers (with ">" at the start) and sequences, and processes the lines into two different files depending on this.)
I have found https://bugs.python.org/issue13989, which states that mode 'rt' cannot be used with gzip.open in Python 3.2 and suggests using something along the lines of:
    import io
    with io.TextIOWrapper(gzip.open(input[0], "r")) as fin:
        for line in fin:
            if line[:1] != '>':
                out.write(line)
                continue
            chromname = match2chrom(line[1:-1])
            seqname = line[1:].split()[0]
            print('>{}'.format(chromname), file=out)
            print('{}\t{}'.format(seqname, chromname), file=mappingout)
But the above code does not work:
    UnsupportedOperation in line <4> of /path/to/python_file.py: read1
How can I rewrite this routine to give what I want: reading the gzip file line-by-line into the variable "line" and processing each line based on its first character?
Edit: the traceback for the first version of the routine (Python 3.2.6) is:
    mode rt not supported
    File "/path/to/python_file.py", line 79, in __process_genome_sequences
    File "/opt/python-3.2.6/lib/python3.2/gzip.py", line 46, in open
    File "/opt/python-3.2.6/lib/python3.2/gzip.py", line 157, in __init__
The traceback for the second version is:
    UnsupportedOperation in line 81 of /path/to/python_file.py: read1
    File "/path/to/python_file.py", line 81, in __process_genome_sequences
with no further traceback. (The two extra lines in the line count are the "import io" and "with io.TextIOWrapper(gzip.open(input[0], "r")) as fin:" lines.)
I appear to have solved the problem.
In the end I had to use shell("gunzip {input[0]}") to ensure that the gunzipped file could be read in text mode, and then read in the resulting file using
    for line in open(' *< resulting file >* ', 'r'):
        if line[:1] != '>':
            out.write(line)
            continue
        chromname = match2chrom(line[1:-1])
        seqname = line[1:].split()[0]
        print('>{}'.format(chromname), file=out)
        print('{}\t{}'.format(seqname, chromname), file=mappingout)
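For anyone who would rather not decompress to disk, one possible alternative is to open the gzip file in binary mode and decode each line by hand, which should sidestep both the unsupported 'rt' mode and the missing read1. This is only a sketch: the helper names iter_text_lines and split_fasta are hypothetical, the ASCII encoding is an assumption about the FASTA files, and match2chrom is passed in as a parameter because its definition is not shown here.

```python
import gzip


def iter_text_lines(path, encoding="ascii"):
    """Yield decoded text lines from a gzip file opened in binary mode.

    Hypothetical helper; the encoding is an assumption about the input.
    """
    fin = gzip.open(path, "rb")
    try:
        for raw in fin:
            # Binary mode yields bytes, so decode each line explicitly.
            yield raw.decode(encoding)
    finally:
        fin.close()


def split_fasta(path, out, mappingout, match2chrom):
    """Same per-line logic as the routine above, fed from iter_text_lines."""
    for line in iter_text_lines(path):
        if line[:1] != '>':
            # Sequence line: copy through unchanged.
            out.write(line)
            continue
        # Header line: rename it and record the mapping.
        chromname = match2chrom(line[1:-1])
        seqname = line[1:].split()[0]
        print('>{}'.format(chromname), file=out)
        print('{}\t{}'.format(seqname, chromname), file=mappingout)
```

It would be called as split_fasta(input[0], out, mappingout, match2chrom) in place of the gunzip step and the loop above, with out and mappingout being the already-open output files.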