I have a list of lists, and I know the type of each element: [Str, Str, Str, Int, Int, Int, Str, Int]. I have a conversion function that guesses the type:
def convert(val):
    constructors = [int, str]
    for c in constructors:
        try:
            return c(val)
        except ValueError:
            pass
How can I replace the convert function, given that I already know the type of each element (see the full code below)?
from __future__ import absolute_import, division, print_function
from itertools import groupby

DATA = [["Test", "A", "B01", 828288, 1, 7, 'C', 5],
        ["Test", "A", "B01", 828288, 1, 7, 'T', 6],
        ["Test", "A", "B01", 171878, 3, 7, 'C', 5],
        ["Test", "A", "B01", 171878, 3, 7, 'T', 6],
        ["Test", "A", "B01", 871963, 3, 9, 'A', 5],
        ["Test", "A", "B01", 871963, 3, 9, 'G', 6],
        ["Test", "A", "B01", 1932523, 1, 10, 'T', 4],
        ["Test", "A", "B01", 1932523, 1, 10, 'A', 5],
        ["Test", "A", "B01", 1932523, 1, 10, 'X', 6],
        ["Test", "A", "B01", 667214, 1, 14, 'T', 4],
        ["Test", "A", "B01", 667214, 1, 14, 'G', 5],
        ["Test", "A", "B01", 667214, 1, 14, 'G', 6]]

def convert(val):
    constructors = [int, str]
    for c in constructors:
        try:
            return c(val)
        except ValueError:
            pass

def main():
    with open("/home/mic/tmp/test.txt") as f:
        for line in f:
            try:
                data = [convert(part.strip()) for part in line.split(',')]
                print(data)
            except IndexError:
                continue
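Update: Thanks to everyone whose replies gave me new ideas. I have modified the code accordingly (methods 1 to 4), but none of them works at the moment: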
#!/usr/bin/env python
from __future__ import absolute_import, division, print_function
from itertools import groupby
import csv

parts = [["Test", "A", "B01", 828288, 1, 7, 'C', 5],
         ["Test", "A", "B01", 828288, 1, 7, 'T', 6],
         ["Test", "A", "B01", 171878, 3, 7, 'C', 5],
         ["Test", "A", "B01", 171878, 3, 7, 'T', 6],
         ["Test", "A", "B01", 871963, 3, 9, 'A', 5],
         ["Test", "A", "B01", 871963, 3, 9, 'G', 6],
         ["Test", "A", "B01", 1932523, 1, 10, 'T', 4],
         ["Test", "A", "B01", 1932523, 1, 10, 'A', 5],
         ["Test", "A", "B01", 1932523, 1, 10, 'X', 6],
         ["Test", "A", "B01", 667214, 1, 14, 'T', 4],
         ["Test", "A", "B01", 667214, 1, 14, 'G', 5],
         ["Test", "A", "B01", 667214, 1, 14, 'G', 6]]

def iter_something(rows):
    key_names = ['type', 'name', 'sub_name', 'pos', 's_type', 'x_type']
    chr_key_names = ['letter', 'no']
    for keys, group in groupby(rows, lambda row: row[:6]):
        result = dict(zip(key_names, keys))
        result['chr'] = [dict(zip(chr_key_names, row[6:])) for row in group]
        yield result

def main():
    # Method 1
    converters = [str, str, str, int, int, int, str, int]
    with open("/home/mic/tmp/test.txt") as f:
        parts = (line.strip().split(',') for line in f)
        column = (con(part) for con, part in zip(converters, parts))
        for object_ in iter_something(column):
            print(object_)

    # Method 2
    with open("/home/mic/tmp/test.txt") as f:
        parts = (line.strip().split(',') for line in f)
        parts[3], parts[4], parts[5], parts[7] = int(parts[3]),\
            int(parts[4]),\
            int(parts[5]),\
            int(parts[7])
        column = (con(part) for con, part in zip(converters, parts))
        for object_ in iter_something(column):
            print(object_)

    # Method 3
    converters = [str, str, str, int, int, int, str, int]
    with open("/home/mic/tmp/test.txt", 'rb') as f:
        reader = csv.reader(f, skipinitialspace=True)
        for object_ in iter_something(reader):
            print(object_)

    # Method 4
    with open("/home/mic/tmp/test.txt", 'rb') as f:
        reader = csv.reader(f, skipinitialspace=True)
        reader[3], reader[4], reader[5], reader[7] = int(reader[3]),\
            int(reader[4]),\
            int(reader[5]),\
            int(reader[7])
        for object_ in iter_something(reader):
            print(object_)

if __name__ == '__main__':
    main()
Answer 0 (score: 2)
You can use zip() to pair the types with the columns:
converters = [str, str, str, int, int, int, str, int]
for line in f:
    data = [convert(part.strip())
            for convert, part in zip(converters, line.split(','))]
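In your update you make the same mistake again as in your other question: you are confusing rows with columns, and you applied the technique to whole rows. Apply the converters to the columns of each row instead: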
parts = (line.strip().split(',') for line in f)
column = ([con(col) for con, col in zip(converters, row)] for row in parts)
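To make the row/column distinction concrete, here is a minimal sketch that runs both variants on in-memory data. The `lines` list is a hypothetical stand-in for the contents of test.txt, and the names `wrong` and `right` are introduced only for illustration.

```python
# Hypothetical sample lines standing in for the contents of test.txt.
lines = [
    "Test, A, B01, 828288, 1, 7, C, 5\n",
    "Test, A, B01, 828288, 1, 7, T, 6\n",
]
converters = [str, str, str, int, int, int, str, int]

# Split every line into a row of stripped string fields.
rows = [[part.strip() for part in line.split(",")] for line in lines]

# Mistake: zip pairs each converter with a whole row,
# so str() is called on a list rather than on a single field.
wrong = [con(row) for con, row in zip(converters, rows)]

# Fix: within each row, pair each converter with one column value.
right = [[con(col) for con, col in zip(converters, row)] for row in rows]

print(type(wrong[0]))
print(right[0])
```

With only two rows, the mistaken version happens to pair both of them with `str`, so it raises no error and simply yields the string form of each list. With four or more rows it would reach `int` and fail with a TypeError.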
May I once again suggest using the csv module, as I did for your previous question? You are reinventing the CSV-parsing wheel here:
with open("/home/mic/tmp/test.txt") as f:
    reader = csv.reader(f, skipinitialspace=True)
    converted = ([conv(col) for conv, col in zip(converters, row)] for row in reader)
Answer 1 (score: 2)
Given the constructors list as you describe it at the start of your question, you can do this:
reader = csv.reader(f)
data = [[con(val) for con, val in zip(constructors, line)] for line in reader]
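That gives you a two-dimensional list which, judging by the code you provided, is the structure you want.
Edit: I modified the solution to use the csv module, which you need to import at the top. The code above would of course go inside your with statement.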
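As a self-contained sketch of the csv-based approach, the following runs on an in-memory buffer instead of the real file. It assumes Python 3 (text-mode input rather than the 'rb' mode used in the question) and assumes that `constructors` holds one type per column, as in the list at the top of the question. The `sample` buffer is hypothetical data.

```python
import csv
import io

# Hypothetical in-memory stand-in for the file opened in the question.
sample = io.StringIO(
    "Test, A, B01, 828288, 1, 7, C, 5\n"
    "Test, A, B01, 171878, 3, 7, T, 6\n"
)

# One constructor per column, in column order (an assumption based on the question).
constructors = [str, str, str, int, int, int, str, int]

# skipinitialspace=True drops the blank that follows each comma.
reader = csv.reader(sample, skipinitialspace=True)

# Build the two-dimensional list: one converted list per CSV row.
data = [[con(val) for con, val in zip(constructors, row)] for row in reader]

for row in data:
    print(row)
```

Because zip() stops at the shorter input, a row with fewer than eight fields is converted without error and simply comes out shorter, so a length check may be worth adding if the input is untrusted.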
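Answer 2 (score: 1)
I'll try to answer the question as you asked it.
Since the elements are already strings, you only need to cast the ones that should be integers: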
data = ["Test", "A", "B01", "667214", "1", "14", "G", "6"]
data[3], data[4], data[5], data[7] = int(data[3]), int(data[4]), int(data[5]), int(data[7])
So your main would look like this:
def main():
    with open("/home/mic/tmp/test.txt") as f:
        for line in f:
            try:
                data = [part.strip() for part in line.split(',')]
                data[3], data[4], data[5], data[7] = int(data[3]), int(data[4]), int(data[5]), int(data[7])
            except IndexError:
                continue
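But since you are already using try, it is better to catch ValueError as well:
except (IndexError, ValueError):
You don't need the convert function. Casting in your main is enough, and there is no point in converting strings that are already strings.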
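A minimal sketch of this in-place casting, wrapped in a helper so it can be tried without the file. The names `INT_COLUMNS` and `parse_line` are hypothetical, introduced here only for illustration, and the sample lines are made-up data.

```python
# Positions of the integer fields, taken from the type list in the question.
INT_COLUMNS = (3, 4, 5, 7)

def parse_line(line):
    """Return the converted row, or None if the line is malformed."""
    data = [part.strip() for part in line.split(",")]
    try:
        for i in INT_COLUMNS:
            data[i] = int(data[i])
    except (IndexError, ValueError):
        # Too few fields, or a field that is not a valid integer.
        return None
    return data

# Hypothetical sample lines: one valid, one too short, one with a bad integer.
lines = [
    "Test, A, B01, 667214, 1, 14, G, 6",
    "Test, A, B01",
    "Test, A, B01, x, 1, 14, G, 6",
]
parsed = [parse_line(line) for line in lines]
print(parsed)
```

Returning None for malformed lines plays the role of the `continue` in the loop above; the caller can filter those entries out.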