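I'm new to Python and this is my first post here, so please bear with me. I'm having a lot of trouble reading a csv file into the format I need. My file has 132 columns, and the head of the file looks like this: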
['10520', ' 386681375.82149398', ' 85.25775430', ' -56.07840500', ' 173', ' 153', ' 151', ' 161', ' 180', ' 167', ' 189', ' 171', ' 173', ' 171', ' 207', ' 169', ' 173', ' 168', ' 184', ' 168', ' 201', ' 197', ' 204', ' 201', ' 210', ' 239', ' 211', ' 227', ' 247', ' 248', ' 266', ' 276', ' 322', ' 336', ' 331', ' 381', ' 358', ' 483', ' 532', ' 709', ' 841', ' 1004', ' 1128', ' 1540', ' 1945', ' 2747', ' 3718', ' 5378', ' 6273', ' 8415', ' 12727', ' 18248', ' 24103', ' 33688', ' 40744', ' 52821', ' 65535', ' 59114', ' 55225', ' 49919', ' 51894', ' 58381', ' 50376', ' 48315', ' 42337', ' 30577', ' 24078', ' 24337', ' 22432', ' 20191', ' 19999', ' 17674', ' 22519', ' 22542', ' 22644', ' 23966', ' 21033', ' 21326', ' 20257', ' 20441', ' 21859', ' 26976', ' 32514', ' 34732', ' 45555', ' 48416', ' 34952', ' 28511', ' 24611', ' 18843', ' 17081', ' 14592', ' 13550', ' 13011', ' 15370', ' 15827', ' 15232', ' 16054', ' 14823', ' 14538', ' 12544', ' 11865', ' 11442', ' 10089', ' 10340', ' 11269', ' 11336', ' 11873', ' 10012', ' 9824', ' 9488', ' 7696', ' 9273', ' 9502', ' 8752', ' 8341', ' 8192', ' 8293', ' 8067', ' 8402', ' 9258', ' 9290', ' 8144', ' 8009', ' 7660', ' 6772', ' 6008', ' 6792', ' 6993', ' 6662', ' 7047', ' 6662 ']
['10520', ' 386681375.86699998', ' 85.25527360', ' -56.09263480', ' 113', ' 102', ' 120', ' 124', ' 117', ' 127', ' 124', ' 118', ' 128', ' 120', ' 125', ' 120', ' 140', ' 135', ' 144', ' 127', ' 143', ' 148', ' 141', ' 153', ' 142', ' 142', ' 149', ' 152', ' 168', ' 180', ' 196', ' 188', ' 196', ' 246', ' 259', ' 270', ' 337', ' 360', ' 506', ' 540', ' 625', ' 887', ' 1122', ' 1251', ' 2007', ' 2883', ' 3238', ' 4370', ' 6240', ' 9164', ' 10751', ' 16656', ' 20996', ' 27753', ' 37774', ' 35377', ' 38637', ' 39265', ' 35183', ' 38830', ' 32149', ' 25455', ' 27272', ' 24488', ' 21036', ' 20931', ' 17166', ' 17019', ' 18196', ' 15450', ' 15120', ' 15934', ' 15021', ' 14936', ' 16253', ' 16457', ' 15873', ' 19667', ' 23150', ' 26140', ' 35761', ' 42594', ' 61758', ' 65535', ' 42354', ' 28672', ' 25173', ' 20344', ' 15883', ' 14432', ' 10575', ' 11342', ' 12348', ' 13229', ' 19632', ' 23456', ' 18102', ' 15600', ' 13425', ' 9962', ' 8281', ' 7609', ' 6948', ' 7391', ' 8878', ' 10006', ' 11295', ' 10073', ' 9410', ' 10354', ' 10667', ' 10054', ' 9011', ' 8793', ' 9055', ' 7463', ' 6692', ' 8051', ' 8330', ' 7369', ' 6612', ' 6328', ' 6545', ' 6235', ' 5895', ' 5085', ' 4876', ' 5154', ' 4649', ' 5226', ' 6137', ' 5354 ']
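I'm interested in getting the first four columns (orbit, time, lat, lon) as four lists, and the last 128 columns (the waveforms) as an array.
So far my code looks like this: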
import sys, math, numpy
from numpy import *
from scipy import *
import csv
try:
    ifile = sys.argv[1]
    #; ofile = sys.argv[2]
except:
    print "Usage:", sys.argv[0], "ifile"; sys.exit(1)
# Open and read file from std, and assign first four (orbit, time, lat, lon) columns to four lists, and last 128 columns (waveforms) to an array.
ifile = open(ifile)
orbit = []
time = []
lat = []
lon = []
#wvf= [[],[]]
try:
    reader = csv.reader(ifile, delimiter=',')
    for row in reader:
        orbit.append(row[0])
        time.append(row[1])
        lat.append(row[2])
        lon.append(row[3])
        # wvf = [row[4:132] for row in reader] row[0:128] for col in len(reader)]
        wvf = [row[4:132]],[row[1:128]]
finally:
    ifile.close()
...and now do something with data...
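I have thought about splitting all the rows first and then collecting the last 128 columns into an array, but I haven't managed to get that working.
I hope this gives you an idea of what I'm trying to accomplish and that you can help me. Thanks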
Answer 0 (score: 2)
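You can use np.genfromtxt to load the file into a numpy array. One advantage of doing it this way is that the data goes straight from the file into a space-efficient numpy array. If you use the csv module and store the data in Python lists, your data will consume more memory.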
import sys
import numpy as np
try:
    ifile = sys.argv[1]
    #; ofile = sys.argv[2]
except IndexError:
    print("Usage:", sys.argv[0], "ifile"); sys.exit(1)
# Read the file named on the command line. The first four columns (orbit, time,
# lat, lon) and the last 128 columns (waveforms) are sliced out of one array.
def remove_bracket(field):
    # Depending on the NumPy version, converters may receive bytes or str.
    if isinstance(field, bytes):
        field = field.decode()
    # Strip brackets, quotes and spaces before converting to float.
    return float(field.strip("][ '"))
data = np.genfromtxt(ifile, delimiter=',',
                     dtype='float',
                     converters={i: remove_bracket for i in range(132)})
orbit = data[:, 0]
time = data[:, 1]
lat = data[:, 2]
lon = data[:, 3]
wvf = data[:, 4:132]  # columns 4..131, i.e. the 128 waveform samples
print(wvf)
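Note that the variables orbit, time, etc. are "views" of data. They are not copies of data, so they require no additional memory. This also means that modifying orbit also affects data, and vice versa.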
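To make the view behaviour concrete, here is a small sketch. It assumes a hypothetical two-row sample held in memory, with plain comma-separated numbers (no brackets or quotes) and only three waveform columns instead of 128, so no converters are needed. The sample values and the name orbit_copy are made up for illustration.

```python
import io
import numpy as np

# Hypothetical sample: 4 leading columns plus 3 waveform samples per row
# (the real file would have 128 waveform columns).
sample = io.StringIO(
    "10520, 386681375.82, 85.257, -56.078, 173, 153, 151\n"
    "10520, 386681375.86, 85.255, -56.092, 113, 102, 120\n"
)

data = np.genfromtxt(sample, delimiter=",", dtype=float)

orbit = data[:, 0]   # basic slicing returns a view, not a copy
wvf = data[:, 4:]    # every column after the first four

print(np.shares_memory(orbit, data))  # the view and data share one buffer

orbit[0] = -1.0      # writing through the view ...
print(data[0, 0])    # ... changes data as well

orbit_copy = data[:, 0].copy()  # an independent copy, if one is needed
orbit_copy[1] = 99.0
print(data[1, 0])    # data is untouched by changes to the copy
```

If you want to modify a column without touching data, take an explicit .copy() as in the last lines.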
Answer 1 (score: 0)
Simply:
wvf = []
try:
    reader = csv.reader(ifile, delimiter=',')
    for row in reader:
        # ...
        wvf.append(row[4:132])
finally:
    ifile.close()
Initialize wvf as an empty list, like the other columns, and then append one sublist (a slice) for each row of data.
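A self-contained sketch of this approach might look like the following. It assumes a hypothetical in-memory sample with only three waveform columns, and it also converts the fields to numbers, since csv.reader returns every field as a string. The sample values are made up for illustration.

```python
import csv
import io

# Hypothetical sample: 4 leading columns plus 3 waveform samples per row
# (the real file would have 128 waveform columns).
sample = io.StringIO(
    "10520, 386681375.82, 85.257, -56.078, 173, 153, 151\n"
    "10520, 386681375.86, 85.255, -56.092, 113, 102, 120\n"
)

orbit, time, lat, lon, wvf = [], [], [], [], []

reader = csv.reader(sample, delimiter=",")
for row in reader:
    # csv.reader yields strings; int() and float() ignore surrounding spaces.
    orbit.append(int(row[0]))
    time.append(float(row[1]))
    lat.append(float(row[2]))
    lon.append(float(row[3]))
    # One sublist of waveform samples per row.
    wvf.append([int(value) for value in row[4:]])

print(wvf)  # [[173, 153, 151], [113, 102, 120]]
```

With the real file you would pass the opened file object to csv.reader instead of the in-memory sample, and row[4:132] and row[4:] select the same columns when every row has 132 fields.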
(In case your data is very large and you want to optimize memory usage: the array module provides efficient storage.)
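As a rough sketch of what that could look like: the example below assumes the waveform samples are integers between 0 and 65535, which is what the rows shown in the question suggest, so the unsigned short type code "H" is used. The rows variable is a made-up stand-in for the parsed data.

```python
from array import array

# Hypothetical parsed waveform rows (the real rows have 128 samples each).
rows = [[173, 153, 151], [113, 102, 120]]

# Type code "H" is an unsigned short, at least 2 bytes per item,
# instead of a full Python int object per sample.
wvf = [array("H", row) for row in rows]

print(wvf[0].tolist())   # [173, 153, 151]
print(wvf[0].itemsize)   # bytes used per sample

# Values outside the range of the type code are rejected.
try:
    wvf[0].append(70000)
except OverflowError:
    print("70000 does not fit in an unsigned short")
```

If any sample could exceed 65535 or be negative, a wider type code such as "l" or "d" would be the safer choice.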