Splitting a list of formatted data

Asked: 2014-07-01 13:39:30

Tags: python string list split

I have a list of data formatted as follows:

['[ 0.93913063  0.28020878  0.2769496 ]',
'[ 0.21672141  0.29633945  0.19763641]',
'[ 0.74718183  0.33466203  0.13866566]',
'[ 0.1067503   0.20448574  0.16817043]',
'[ 0.1223612   0.11653754  0.13288494]',
'[ 0.48761208  0.78240743  0.38697977]',
'[ 0.4300345   0.50380231  0.48102237]']

I want to split this data into a list like this:

[(0.93913063, 0.28020878, 0.2769496),
 (0.21672141, 0.29633945, 0.19763641),
 (0.74718183, 0.33466203, 0.13866566),
 (0.1067503, 0.20448574, 0.16817043),
 (0.1223612, 0.11653754, 0.13288494),
 (0.48761208, 0.78240743, 0.38697977),
 (0.4300345, 0.50380231, 0.48102237)]

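The problem I'm running into is that I split each string and then piece the values back together in the format I want, but the spacing between the values inside the brackets is not consistent. This is what I'm doing: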

def removefront(s):
    return s[2:]
def removeend(s):
    return s[:-2]

valuelist = []
i = 0
for x in xrange(0,len(data)):
    print data[i]
    a,b,d = data[i].split('  ')
    p1 = removefront(a)
    p3 = removeend(d)
    p1 = float(p1)
    p2 = float(b)
    p3 = float(p3)
    coord = (p1, p2, p3)
    i += 1
    valuelist.append(coord)

Any help is greatly appreciated. Thanks!
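For reference, here is a small sketch (written for Python 3, using two of the rows above) of how this approach seems to behave. It does not raise an error, but removeend always drops two characters, so a row ending in `1]` rather than `6 ]` silently loses its last digit. The name parse_fixed is illustrative and not part of the original code.

```python
def removefront(s):
    return s[2:]


def removeend(s):
    return s[:-2]


def parse_fixed(row):
    # Same steps as the loop above: split on exactly two spaces,
    # then slice a fixed number of characters off each end.
    a, b, d = row.split('  ')
    return (float(removefront(a)), float(b), float(removeend(d)))


with_space = parse_fixed('[ 0.93913063  0.28020878  0.2769496 ]')
without_space = parse_fixed('[ 0.21672141  0.29633945  0.19763641]')
print(with_space)     # last value intact
print(without_space)  # last value has lost its final digit
```

Fixed-width slicing only works when every row has exactly the same padding, which is why whitespace-agnostic parsing is the safer route.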

3 answers:

Answer 0 (score: 2)

You can use a list comprehension and a few string methods here:

>>> s = ['[ 0.93913063  0.28020878  0.2769496 ]', '[ 0.21672141  0.29633945  0.19763641]', '[ 0.74718183  0.33466203  0.13866566]', '[ 0.1067503   0.20448574  0.16817043]', '[ 0.1223612   0.11653754  0.13288494]', '[ 0.48761208  0.78240743  0.38697977]', '[ 0.4300345   0.50380231  0.48102237]']
>>> [map(float, x.strip('[]').split()) for x in s]
[[0.93913063, 0.28020878, 0.2769496], [0.21672141, 0.29633945, 0.19763641], [0.74718183, 0.33466203, 0.13866566], [0.1067503, 0.20448574, 0.16817043], [0.1223612, 0.11653754, 0.13288494], [0.48761208, 0.78240743, 0.38697977], [0.4300345, 0.50380231, 0.48102237]]

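Here str.strip('[]') removes the [ and ] from both ends of the string, str.split() then splits what is left on whitespace, and finally float() is applied to each item.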
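As a minimal sketch of the same strip/split/float chain for Python 3, where map() returns an iterator rather than a list, the steps can be wrapped in a helper that returns tuples as the question asks. The name parse_row is hypothetical and not from the answer.

```python
def parse_row(row):
    """Turn a string such as '[ 0.1  0.2 0.3 ]' into a tuple of floats."""
    # strip('[]') removes the brackets from both ends; split() with no
    # argument splits on any run of whitespace, so uneven spacing is fine.
    return tuple(float(token) for token in row.strip('[]').split())


rows = ['[ 0.93913063  0.28020878  0.2769496 ]',
        '[ 0.1067503   0.20448574  0.16817043]']
values = [parse_row(row) for row in rows]
print(values)
```

Because split() is called without an argument, it does not matter whether two or three spaces separate the values.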

Another option is to use ast.literal_eval together with a regular expression:

>>> import re
>>> from ast import literal_eval
>>> r = re.compile(r'(\d)\s')
>>> [literal_eval(r.sub(r'\1,', x)) for x in s]
[[0.93913063, 0.28020878, 0.2769496], [0.21672141, 0.29633945, 0.19763641], [0.74718183, 0.33466203, 0.13866566], [0.1067503, 0.20448574, 0.16817043], [0.1223612, 0.11653754, 0.13288494], [0.48761208, 0.78240743, 0.38697977], [0.4300345, 0.50380231, 0.48102237]]
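To see why this works, it may help to look at the intermediate string. In the sketch below (Python 3, with illustrative variable names), the substitution puts a comma after every digit that is followed by a whitespace character, which turns the row into a valid Python list literal. A trailing comma before the closing bracket is allowed.

```python
import re
from ast import literal_eval

pattern = re.compile(r'(\d)\s')
row = '[ 0.93913063  0.28020878  0.2769496 ]'

# Each "digit + one whitespace character" becomes "digit + comma".
as_literal = pattern.sub(r'\1,', row)
print(as_literal)

# literal_eval safely evaluates the resulting list literal.
parsed = literal_eval(as_literal)
print(parsed)
```

Note that literal_eval returns lists here, so each result would need wrapping in tuple() if tuples are required.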

Answer 1 (score: 1)

This list comprehension feels a bit Rube Goldberg-ish, but here is my stab at it:

>>> l = ['[ 0.93913063  0.28020878  0.2769496 ]',
...      '[ 0.21672141  0.29633945  0.19763641]',
...      '[ 0.74718183  0.33466203  0.13866566]',
...      '[ 0.1067503   0.20448574  0.16817043]',
...      '[ 0.1223612   0.11653754  0.13288494]',
...      '[ 0.48761208  0.78240743  0.38697977]',
...      '[ 0.4300345   0.50380231  0.48102237]']

>>> [tuple(map(float,i[2:-1].split())) for i in l]

Output:

[(0.93913063, 0.28020878, 0.2769496),
 (0.21672141, 0.29633945, 0.19763641),
 (0.74718183, 0.33466203, 0.13866566),
 (0.1067503, 0.20448574, 0.16817043),
 (0.1223612, 0.11653754, 0.13288494),
 (0.48761208, 0.78240743, 0.38697977),
 (0.4300345, 0.50380231, 0.48102237)]

Answer 2 (score: 0)

This is a good fit for a regular expression:

>>> import re
>>> data = '[ 0.93913063  0.28020878  0.2769496 ]'
>>> tuple(map(float, re.findall(r"([\d\.]+)", data)))
(0.93913063, 0.28020878, 0.2769496)

This ignores any whitespace and extracts every run of digits and decimal points.
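One caveat: the pattern [\d\.]+ matches only digits and dots, so it would drop a leading minus sign and break a value written in scientific notation such as 1e-05 into two numbers. Below is a hedged sketch of a stricter pattern for Python 3, assuming such values could appear in the data (the sample data contains none). FLOAT_RE and extract_floats are hypothetical names.

```python
import re

# Optional sign, digits with an optional fraction (or a bare fraction),
# and an optional exponent part.
FLOAT_RE = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')


def extract_floats(row):
    """Extract every float-looking token from row as a tuple of floats."""
    return tuple(float(token) for token in FLOAT_RE.findall(row))


print(extract_floats('[ 0.93913063  0.28020878  0.2769496 ]'))
print(extract_floats('[ -1.5e-05   2.   .25]'))
```

For data that only ever contains plain non-negative decimals, the simpler pattern in the answer is sufficient.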