Using csv.DictReader

Date: 2016-01-15 09:55:33

Tags: python csv

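I want to use csv.DictReader to read one column and, depending on the value in the following row, print the difference between the corresponding values in a different column: the value below minus the value above.

I wrote this script: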

import csv

next=None
last = None
test_file = 'data.tsv'
csv_file = csv.DictReader(open(test_file, newline=''), delimiter='\t')
for row in csv_file:
    if row['GT'] == "0/1":
        genotype = row['GT']
        if next is not None:
            if next == "0/1":
                position = int(row['pos'])
                if last is not None:
                    print  (position - last)
                last = position
        next = genotype

When I run it on data.tsv (see below), it does what it is supposed to do, namely print 80: in the GT column a 0/1 is followed by another 0/1, and 832398 - 832318 = 80.

pos GT
815069  0/0
825069  0/1
825410  ./.
830181  1/1
832318  0/1
832398  0/1
832756  0/0

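However, when I set

if next == "0/0":   (i.e. if the first GT is 0/1 and the next GT is 0/0, print the difference between the corresponding values in the pos column: 832756 - 832398 = 358)

it prints nothing! The same happens when I change it to

if next == "./.":

It does nothing either.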

import csv

next=None
last = None
test_file = 'data.tsv'
csv_file = csv.DictReader(open(test_file, newline=''), delimiter='\t')
for row in csv_file:
    if row['GT'] == "0/1":
        genotype = row['GT']
        if next is not None:
            if next == "0/0":  # the only line changed from the first script
                position = int(row['pos'])
                if last is not None:
                    print  (position - last)
                last = position
        next = genotype

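Why does this happen? Any help is appreciated! Let me know if I should clarify the description of the problem (Python beginner).

Regards, Joanna

1 Answer:

Answer 0 (score: 1)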

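The variable next in your first script is confusing: it does not actually hold the next GT, but the current one. The script only works by accident, because both GTs are the same (so the order does not matter). Because next is only ever assigned inside the block guarded by row['GT'] == "0/1", it is always either None or "0/1", so comparing it with "0/0" or "./." never succeeds.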
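To make this concrete, here is a small sketch that replays only the assignments to next from the question's loop, using the GT column of the sample data. The helper name values_seen_by_next and the stand-in variable nxt are made up for this illustration; they are not part of the original script.

```python
# GT column of the sample data.tsv, in file order
gts = ["0/0", "0/1", "./.", "1/1", "0/1", "0/1", "0/0"]

def values_seen_by_next(gts):
    """Replay the question's loop and record what `next` holds at each comparison."""
    nxt = None   # stands in for the question's variable `next`
    seen = []
    for gt in gts:
        if gt == "0/1":        # same guard as in the question
            seen.append(nxt)   # the value `next` has when it is compared
            nxt = gt           # `next` is only ever assigned inside the guard
    return seen

print(values_seen_by_next(gts))  # [None, '0/1', '0/1']
```

The recorded values never include "0/0" or "./.", which is why the modified comparisons can never be true.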

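When you iterate over a file line by line, you can hardly look ahead. Instead, you can look back and compare the current GT with the previous one: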

import csv

last_gt = None
last_pos = None
test_file = 'data.tsv'
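# newline='' is what the csv module expects in Python 3; the old 'rU' mode is rejected by recent releases
with open(test_file, newline='') as f:
    csv_file = csv.DictReader(f, delimiter='\t')
    for row in csv_file:
        curr_gt = row['GT']
        curr_pos = int(row['pos'])
        # previous row was 0/1 and this row is 0/0: print the position difference
        if (curr_gt == "0/0") and (last_gt == "0/1"):  # EDIT: 'and' instead of '&'
            print(curr_pos - last_pos)
        # always remember the current row for the next iteration
        last_pos = curr_pos                            # EDIT: delete 'else' statement
        last_gt = curr_gt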
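Since the question tries several GT combinations, the same look-back idea can be wrapped in a function that takes the two GT values as parameters. This is only a sketch under some assumptions: the function name position_diffs and the in-memory SAMPLE string (a copy of the data shown in the question, read through io.StringIO instead of data.tsv) are invented for the example.

```python
import csv
import io

# Sample data from the question, tab-separated, kept in memory for the demo
SAMPLE = (
    "pos\tGT\n"
    "815069\t0/0\n"
    "825069\t0/1\n"
    "825410\t./.\n"
    "830181\t1/1\n"
    "832318\t0/1\n"
    "832398\t0/1\n"
    "832756\t0/0\n"
)

def position_diffs(lines, prev_gt, curr_gt):
    """Collect pos differences for rows with GT == curr_gt whose preceding row has GT == prev_gt."""
    diffs = []
    last_gt = None
    last_pos = None
    for row in csv.DictReader(lines, delimiter="\t"):
        gt = row["GT"]
        pos = int(row["pos"])
        if last_gt == prev_gt and gt == curr_gt:
            diffs.append(pos - last_pos)
        # remember this row so the next iteration can look back at it
        last_gt, last_pos = gt, pos
    return diffs

print(position_diffs(io.StringIO(SAMPLE), "0/1", "0/1"))  # [80]
print(position_diffs(io.StringIO(SAMPLE), "0/1", "0/0"))  # [358]
print(position_diffs(io.StringIO(SAMPLE), "0/1", "./."))  # [341]
```

To run it on the real file, an open file object can be passed in place of the io.StringIO object, for example the f from with open('data.tsv', newline='') as f.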