I'm trying to open a .txt file in Python as an array so that I can operate on its elements. The .txt file (abc.txt) looks like this:
AL192012, TONY, 20,
20121021, 1800, , LO, 20.1N, 50.8W, 25, 1011,
20121022, 0000, , LO, 20.4N, 51.2W, 25, 1011,
20121022, 0600, , LO, 20.8N, 51.5W, 25, 1010,
20121022, 1200, , LO, 21.3N, 51.7W, 30, 1009,
AL182012, SANDY, 45,
20121021, 1800, , LO, 14.3N, 77.4W, 25, 1006,
20121022, 0000, , LO, 13.9N, 77.8W, 25, 1005,
20121022, 0600, , LO, 13.5N, 78.2W, 25, 1003,
20121022, 1200, , TD, 13.1N, 78.6W, 30, 1002,
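I tried pd.read_csv('abc.txt'), loadtxt("abc.txt"), and genfromtxt("abc.txt"), but they only produced arrays with three columns, probably because the first line has only three columns. I want the result to have the same eight columns as the .txt file. Is that possible? Thanks!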
Answer 0 (score: 2)
Try something like this:
data = []
with open("filename") as f:
    for line in f:
        data.append(line.split(","))
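This gives you a 2D list (a list of rows) that you can manipulate. Note that each cell keeps its surrounding whitespace, and the last cell of each line keeps the newline, so call strip() on the cells if that matters.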
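If you want to transpose it, you can't just use plain zip, because zip stops at the shortest row. Use itertools.zip_longest instead (named itertools.izip_longest in Python 2).
So you transpose it like this:
import itertools
data = list(itertools.zip_longest(*data))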
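As a self-contained sketch of the transpose step in Python 3 (where the function is named itertools.zip_longest), the following may help. The two sample rows and the choice of fillvalue="" are illustrative assumptions, not part of the original answer.

```python
import itertools

# Two ragged rows standing in for the parsed file (illustrative sample only).
rows = [
    ["AL192012", "TONY", "20"],
    ["20121021", "1800", "", "LO", "20.1N", "50.8W", "25", "1011"],
]

# zip_longest pads the shorter row with fillvalue instead of truncating
# to the shortest row the way plain zip does.
columns = list(itertools.zip_longest(*rows, fillvalue=""))

print(len(columns))   # 8
print(columns[0])     # ('AL192012', '20121021')
print(columns[7])     # ('', '1011')
```

Without fillvalue, the missing cells would be None rather than empty strings.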
Answer 1 (score: 1)
>>> with open(filename) as f:
...     data = [[cell.strip() for cell in row.strip().rstrip(',').split(',')] for row in f]
...
>>> for row in data:
...     print(row)
...
['AL192012', 'TONY', '20']
['20121021', '1800', '', 'LO', '20.1N', '50.8W', '25', '1011']
['20121022', '0000', '', 'LO', '20.4N', '51.2W', '25', '1011']
['20121022', '0600', '', 'LO', '20.8N', '51.5W', '25', '1010']
['20121022', '1200', '', 'LO', '21.3N', '51.7W', '30', '1009']
['AL182012', 'SANDY', '45']
['20121021', '1800', '', 'LO', '14.3N', '77.4W', '25', '1006']
['20121022', '0000', '', 'LO', '13.9N', '77.8W', '25', '1005']
['20121022', '0600', '', 'LO', '13.5N', '78.2W', '25', '1003']
['20121022', '1200', '', 'TD', '13.1N', '78.6W', '30', '1002']
If you want to fix up the indices of the short rows, you can do so explicitly:
>>> data = [row if len(row) == 8 else row[0:1] + [''] * 3 + row[1:3] + [''] * 2 for row in data]
>>> for row in data:
...     print(row)
...
['AL192012', '', '', '', 'TONY', '20', '', '']
['20121021', '1800', '', 'LO', '20.1N', '50.8W', '25', '1011']
['20121022', '0000', '', 'LO', '20.4N', '51.2W', '25', '1011']
['20121022', '0600', '', 'LO', '20.8N', '51.5W', '25', '1010']
['20121022', '1200', '', 'LO', '21.3N', '51.7W', '30', '1009']
['AL182012', '', '', '', 'SANDY', '45', '', '']
['20121021', '1800', '', 'LO', '14.3N', '77.4W', '25', '1006']
['20121022', '0000', '', 'LO', '13.9N', '77.8W', '25', '1005']
['20121022', '0600', '', 'LO', '13.5N', '78.2W', '25', '1003']
['20121022', '1200', '', 'TD', '13.1N', '78.6W', '30', '1002']
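If the header rows only need to be as wide as the data rows, and it does not matter which column each header field lands in, right-padding is a simpler alternative to the explicit slicing. This is a minimal sketch: pad_rows is a hypothetical helper name, and the sample rows are illustrative.

```python
def pad_rows(rows, width, fill=""):
    """Return copies of rows, right-padded with fill up to width cells."""
    # [fill] * n is an empty list when n <= 0, so full-width rows are unchanged.
    return [row + [fill] * (width - len(row)) for row in rows]


rows = [
    ["AL192012", "TONY", "20"],
    ["20121021", "1800", "", "LO", "20.1N", "50.8W", "25", "1011"],
]

# Use the longest row to decide the target width.
width = max(len(row) for row in rows)
padded = pad_rows(rows, width)

print(padded[0])
# ['AL192012', 'TONY', '20', '', '', '', '', '']
```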
Answer 2 (score: 0)
Here's a snippet:
#!/usr/bin/python
import sys

# Read every line of the file named on the command line.
with open(sys.argv[1], 'r') as f:
    content = f.readlines()

for w in content:
    print(w)
    # split and loop again -> w.split(',')
f.readlines() returns a list of lines. Each w is a string, and w.split(',') turns it into a list of fields.
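The split step can also be handed to the standard csv module. The sketch below assumes the file has the layout shown in the question; io.StringIO stands in for the opened file, and dropping the empty trailing field is an assumption about the desired result.

```python
import csv
import io

# Stand-in for open("abc.txt"); two lines copied from the question.
sample = io.StringIO(
    "AL192012, TONY, 20,\n"
    "20121021, 1800, , LO, 20.1N, 50.8W, 25, 1011,\n"
)

rows = []
# skipinitialspace=True drops the blank that follows each comma.
for fields in csv.reader(sample, skipinitialspace=True):
    # Each line ends with a comma, which yields one empty trailing field.
    if fields and fields[-1] == "":
        fields = fields[:-1]
    rows.append(fields)

print(rows[0])       # ['AL192012', 'TONY', '20']
print(len(rows[1]))  # 8
```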