I have a text file containing a 2D (nested) list:
[[1406], [1408], [1402], [1394, 102462], [1393], [20388], [20387, 20386], [1386], [1443, 1446, 766], [1432, 1438, 1430, 1416], [1442], [1434], [1430, 1416, 1417, 1419, 3446], [1429], [20011], [20015], [4435], [4441], [4443], [4444], [4448], [2433, 1413, 1418], [4450], [3444], [2478, 823, 3447], [3447], [2481, 1425, 942, 2476, 4449], [2482, 120, 3444], [13512], [3446], [13528]]
Is there a way to read this file into Python? I tried:
import numpy as np
from numpy import genfromtxt
con2 = genfromtxt('muti.txt', delimiter=',')
con2 = con2.astype(np.int64)
The result looks like this:
nan
nan
nan
nan
nan
nan
nan
1413.0
nan
nan
nan
nan
823.0
nan
nan
nan
1425.0
942.0
2476.0
nan
nan
120.0
nan
nan
nan
nan
The array contains a lot of nan values. Can anyone help me?
Answer 0 (score: 3)
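I don't know of a numpy function for this, but since your text file happens to be valid JSON, you can load it as JSON, flatten it, and then convert the result to a numpy array.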
>>> import json
>>> import numpy as np
>>> with open('muti.txt', 'r') as f: arr = json.load(f)
>>> np_arr = np.array([n for subarr in arr for n in subarr]).astype(np.int64)
>>> np_arr
array([ 1406, 1408, 1402, 1394, 102462, 1393, 20388, 20387,
20386, 1386, 1443, 1446, 766, 1432, 1438, 1430,
1416, 1442, 1434, 1430, 1416, 1417, 1419, 3446,
1429, 20011, 20015, 4435, 4441, 4443, 4444, 4448,
2433, 1413, 1418, 4450, 3444, 2478, 823, 3447,
3447, 2481, 1425, 942, 2476, 4449, 2482, 120,
3444, 13512, 3446, 13528], dtype=int64)
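Flattening discards the information about which numbers shared a row. If that grouping matters, here is a minimal sketch that keeps both forms. It assumes the file is valid JSON as above; the helper name `load_nested` is made up for illustration and is not a numpy or json API.

```python
import json
import numpy as np

def load_nested(path):
    """Hypothetical helper: read a JSON nested list, return (rows, flat)."""
    with open(path, "r") as f:
        nested = json.load(f)
    # One int64 array per row. The rows have different lengths, so they
    # cannot be stacked into a single rectangular 2D array.
    rows = [np.array(row, dtype=np.int64) for row in nested]
    # All values in file order, as a single 1D array.
    flat = np.concatenate(rows) if rows else np.array([], dtype=np.int64)
    return rows, flat
```

Calling `rows, flat = load_nested('muti.txt')` would give a list of per-row arrays in `rows` and the same flattened array as above in `flat`.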
Answer 1 (score: 1)
If this is just text, simply strip the brackets, split on commas, and convert each piece to int.
txt = "[[1406], [1408], [1402], [1394, 102462], [1393], [20388], [20387, 20386], [1386], [1443, 1446, 766], [1432, 1438, 1430, 1416], [1442], [1434], [1430, 1416, 1417, 1419, 3446], [1429], [20011], [20015], [4435], [4441], [4443], [4444], [4448], [2433, 1413, 1418], [4450], [3444], [2478, 823, 3447], [3447], [2481, 1425, 942, 2476, 4449], [2482, 120, 3444], [13512], [3446], [13528]]"
data = [int(t) for t in txt.replace("]","").replace("[","").split(',')]
[1406,
1408,
1402,
1394,
...
13512,
3446,
13528]
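The same idea can be applied straight to the file instead of a pasted string. The sketch below swaps the replace/split chain for a regular expression, which is a different technique from the one above, and it assumes the file contains only integers and brackets. The names `extract_ints` and `read_ints` are hypothetical.

```python
import re

def extract_ints(text):
    """Return every (optionally negative) integer found in text, in order."""
    return [int(tok) for tok in re.findall(r"-?\d+", text)]

def read_ints(path):
    """Read a whole text file and extract its integers."""
    with open(path, "r") as f:
        return extract_ints(f.read())
```

Like the replace/split version, this ignores the row structure and returns one flat list, so `read_ints('muti.txt')` should produce the same values as `data` above.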