I am trying to create a list for each column of my data in Python. The data looks like this:
399.75833 561.572000000 399.75833 561.572000000 a_Fe I 399.73920 nm
399.78316 523.227000000 399.78316 523.227000000
399.80799 455.923000000 399.80799 455.923000000 a_Fe I 401.45340 nm
399.83282 389.436000000 399.83282 389.436000000
399.85765 289.804000000 399.85765 289.804000000
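The problem is that the lines of my data have different lengths. Is there any way to pad the shorter lines with spaces so that all lines have the same length?
I would like my data to end up in the following form: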
list one= [399.75833, 399.78316, 399.80799, 399.83282, 399.85765]
list two= [561.572000000, 523.227000000, 455.923000000, 389.436000000, 289.804000000]
list three= [a_Fe, " ", a_Fe, " ", " "]
This is the code I use to import the data into Python:
fh = open('help.bsp').read()
the_list = []
for line in fh.split('\n'):
    print line.strip()
    splits = line.split()
    if len(splits) == 1 and splits[0] == line.strip():
        splits = line.strip().split(',')
    if splits:
        the_list.append(splits)
Answer 0 (score: 1)
You need izip_longest to build the column lists, because the standard zip only runs as far as the shortest of the lists it is given.
from itertools import izip_longest  # Python 2 name; in Python 3 it is itertools.zip_longest
with open('workfile', 'r') as f:
    fh = f.readlines()
# Process all the rows line by line
rows = [line.strip().split() for line in fh]
# Use izip_longest to get all columns, with None filled in for the blank spots
cols = [col for col in izip_longest(*rows)]
# Then run your type conversions for your final data lists
list_one = [float(i) for i in cols[2]]
list_two = [float(i) for i in cols[3]]
# Since you want " " instead of None for blanks
list_three = [i if i else " " for i in cols[4]]
Output:
>>> print list_one
[399.75833, 399.78316, 399.80799, 399.83282, 399.85765]
>>> print list_two
[561.572, 523.227, 455.923, 389.436, 289.804]
>>> print list_three
['a_Fe', ' ', 'a_Fe', ' ', ' ']
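In Python 3 the same function is named itertools.zip_longest, and its fillvalue argument can supply the " " placeholder directly instead of None. A minimal sketch, assuming the sample rows from the question are held in a string (the name sample is only illustrative) rather than read from a file:

```python
from itertools import zip_longest

# Sample rows copied from the question; in practice they would come from the file.
sample = """\
399.75833 561.572000000 399.75833 561.572000000 a_Fe I 399.73920 nm
399.78316 523.227000000 399.78316 523.227000000
399.80799 455.923000000 399.80799 455.923000000 a_Fe I 401.45340 nm
399.83282 389.436000000 399.83282 389.436000000
399.85765 289.804000000 399.85765 289.804000000
"""

# Split every non-blank line on whitespace.
rows = [line.split() for line in sample.splitlines() if line.strip()]

# Transpose rows into columns; short rows are padded with " " instead of None.
cols = list(zip_longest(*rows, fillvalue=" "))

list_one = [float(x) for x in cols[0]]
list_two = [float(x) for x in cols[1]]
list_three = list(cols[4])

print(list_three)  # ['a_Fe', ' ', 'a_Fe', ' ', ' ']
```

Because the padding happens during the transpose, no separate pass is needed to replace None afterwards.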
Answer 1 (score: 0)
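So, your lines are either whitespace-separated or comma-separated, and if they are comma-separated, the line contains no whitespace? (Note that if len(splits)==1 is true, then splits[0]==line.strip() is also true, so the second test is redundant.) That matches neither the data you show nor the data you describe.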
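The redundancy noted in the parenthesis can be checked with a small Python 3 sketch. The helper name comma_fallback_triggers and the test lines are hypothetical, chosen only to illustrate when the comma branch in the question's code would run:

```python
def comma_fallback_triggers(line):
    """Return True when the question's comma-split branch would run for this line."""
    splits = line.split()
    # With exactly one whitespace-separated token, that token is the stripped
    # line itself, so the extra splits[0] == line.strip() test adds nothing.
    return len(splits) == 1

for line in ["1,2,3", "  1,2,3  ", "1, 2, 3", "399.75833 561.572"]:
    splits = line.split()
    if len(splits) == 1:
        # Holds for every single-token line.
        assert splits[0] == line.strip()
    print(repr(line), comma_fallback_triggers(line))
```

A comma-separated line that also contains a space, such as "1, 2, 3", never reaches the comma branch, which is why the original condition only handles comma-separated lines without whitespace.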
To get the lists you want from the data you show:
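with open('help.bsp') as h:
    # Split each line on whitespace; skip blank lines so d[0] below cannot fail
    the_list = [line.strip().split() for line in h.readlines() if line.strip()]
# The values stay as strings here; wrap them in float() if you need numbers
list_one = [d[0] for d in the_list]
list_two = [d[1] for d in the_list]
# Rows without a fifth field get ' ' as a placeholder
list_three = [d[4] if len(d) > 4 else ' ' for d in the_list]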
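If you are reading a comma-separated (or similarly delimited) file, I always recommend the csv module: it handles many edge cases you may not have considered.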
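As a rough illustration of one such edge case, here is a Python 3 sketch using a hypothetical comma-separated version of the data (the question's file is whitespace-separated, so the text below is invented for the example). A quoted field containing a comma would be broken apart by line.split(','), but csv.reader keeps it whole:

```python
import csv
import io

# Hypothetical comma-separated data; the first row has a quoted field
# that itself contains a comma, the second row has an empty label.
text = '399.75833,561.572,"a_Fe I, 399.73920 nm"\n399.78316,523.227,\n'

# io.StringIO stands in for a file opened with open(path, newline='').
rows = list(csv.reader(io.StringIO(text)))

wavelengths = [float(r[0]) for r in rows]
# Use " " as the placeholder when the label field is missing or empty.
labels = [r[2] if len(r) > 2 and r[2] else " " for r in rows]

print(labels)  # ['a_Fe I, 399.73920 nm', ' ']
```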