gzip - Reading a gzipped text file line-by-line for processing in Python 3.2.6

- February 15, 2014

I'm a complete newbie when it comes to Python, but I've been tasked with getting a piece of code running on a machine that has a different version of Python (3.2.6) from the one the code was built for.

I've come across an issue with reading in a gzipped text file line-by-line (and processing it depending on the first character). The code (which was written for Python > 3.2.6) is:

    for line in gzip.open(input[0], 'rt'):
        if line[:1] != '>':
            out.write(line)
            continue

        chromname = match2chrom(line[1:-1])
        seqname = line[1:].split()[0]

        print('>{}'.format(chromname), file=out)
        print('{}\t{}'.format(seqname, chromname), file=mappingout)

(For those who know, this splits gzipped FASTA genome files into headers (with ">" at the start) and sequences, and writes the lines to two different files depending on which they are.)

I have found https://bugs.python.org/issue13989, which states that mode 'rt' cannot be used with gzip.open in Python 3.2, and suggests using something along the lines of:

    import io

    with io.TextIOWrapper(gzip.open(input[0], "r")) as fin:
        for line in fin:
            if line[:1] != '>':
                out.write(line)
                continue

            chromname = match2chrom(line[1:-1])
            seqname = line[1:].split()[0]

            print('>{}'.format(chromname), file=out)
            print('{}\t{}'.format(seqname, chromname), file=mappingout)

But the above code does not work:

    UnsupportedOperation in line <4> of /path/to/python_file.py: read1

How can I rewrite this routine to give what I want: reading a gzip file line-by-line into the variable "line" and processing each line based on its first character?

Edit: the traceback for the first version of the routine (Python 3.2.6) is:

    mode rt not supported
      File "/path/to/python_file.py", line 79, in __process_genome_sequences
      File "/opt/python-3.2.6/lib/python3.2/gzip.py", line 46, in open
      File "/opt/python-3.2.6/lib/python3.2/gzip.py", line 157, in __init__

The traceback for the second version is:

    UnsupportedOperation in line 81 of /path/to/python_file.py: read1
      File "/path/to/python_file.py", line 81, in __process_genome_sequences

with no further traceback. (The two extra lines in the line count are the import io and with io.TextIOWrapper(gzip.open(input[0], "r")) as fin: lines.)

I appear to have solved the problem.

In the end I had to use shell("gunzip {input[0]}") to decompress the file first, so that it could be read in text mode, and then read in the resulting file using:

    for line in open(' *< resulting file >* ', 'r'):
        if line[:1] != '>':
            out.write(line)
            continue

        chromname = match2chrom(line[1:-1])
        seqname = line[1:].split()[0]

        print('>{}'.format(chromname), file=out)
        print('{}\t{}'.format(seqname, chromname), file=mappingout)
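
For anyone who would rather not decompress the file to disk, one possible alternative is to open the gzip file in binary mode (which does not need the 'rt' mode or read1) and decode each line by hand. This is only a sketch and has not been tested on 3.2.6 itself. The names iter_gzip_text_lines and split_fasta are made up for illustration, the ASCII encoding is an assumption about the FASTA files, and match2chrom is passed in as a parameter because its definition is not shown above.

```python
import gzip


def iter_gzip_text_lines(path, encoding="ascii"):
    """Yield decoded text lines from a gzip file opened in binary mode.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    fin = gzip.open(path, "rb")  # binary mode, so no 'rt' is needed
    try:
        for raw in fin:  # iterating a binary file yields bytes lines
            yield raw.decode(encoding)
    finally:
        fin.close()


def split_fasta(path, out, mappingout, match2chrom):
    """Same loop as in the question, fed by the helper above.

    match2chrom is whatever function the original code already uses.
    """
    for line in iter_gzip_text_lines(path):
        if line[:1] != ">":
            # Sequence line: copy it through unchanged.
            out.write(line)
            continue

        # Header line: rename it and record the mapping.
        chromname = match2chrom(line[1:-1])
        seqname = line[1:].split()[0]

        print(">{}".format(chromname), file=out)
        print("{}\t{}".format(seqname, chromname), file=mappingout)
```

Inside the original routine this would be called as split_fasta(input[0], out, mappingout, match2chrom). Note that binary iteration only splits on "\n", so files with Windows line endings would keep a trailing "\r" on each line.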
