I saved an Excel worksheet in CSV format. I then imported the data into Python with this code:
import csv
with open('45deg_marbles.csv', 'r') as f:
    reader = csv.reader(f, dialect='excel')
    basis = []
    for row in reader:
        print(row)
Output:
['1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16']
['0.001;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363']
['0.002;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363;11.00127363']
['0.003;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283;10.94525283']
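Basically it has 16 columns and 1399 rows. I realized that each row contains a single long string, so I replaced every ';' with ',' in the hope that this would help turn the column of strings into a matrix I can use to manipulate the data. Instead of a matrix, I now end up with a single list containing all the strings. This is what I have so far, code and output respectively: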
import csv
with open('45deg_marbles.csv', 'r') as f:
    reader = csv.reader(f, dialect='excel')
    basis = []
    for row in reader:
        #print(row)
        for i in range(len(row)):
            new_row = (row[i].replace(';', ','))
            basis.append(new_row)
print(basis)
>> ['1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16', '0.001,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363', '0.002,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363', '0.003,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283', '0.004,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283', '0.005,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283', '0.006,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283', '0.007,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283', '0.008,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283', '0.009,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283', '0.01,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283', ... 
, '1.396,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0', '1.397,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0', '1.398,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0']
But this is the form I want, a matrix equal to:
[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],[0.001,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363],[0.002,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363,11.00127363], [0.003,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283,10.94525283]]
so that I can operate on the data.
I would really appreciate any help. Thanks in advance.
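Answer 0 (score: 2)
Change the delimiter to a semicolon. The default is a comma, which does not work here because the input data is separated by semicolons. (I think you can omit the dialect='excel' part.)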
import csv
with open('45deg_marbles.csv', 'r') as f:
    reader = csv.reader(f, dialect='excel', delimiter=";")
    basis = list(reader)
Now basis is a list of rows that contain the data as text.
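To see the effect of the delimiter without needing the file, here is a minimal sketch that feeds two sample lines through csv.reader via io.StringIO. The `sample` string (shortened to three columns) and the `read_rows` helper are illustrative assumptions, not part of the original code.

```python
import csv
import io

# Two sample lines in the question's format, shortened to three columns.
sample = "1;2;3\n0.001;11.00127363;11.00127363\n"

def read_rows(text, delimiter):
    """Parse CSV text with the given delimiter and return a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# Default comma delimiter: each line stays one long string.
print(read_rows(sample, ","))
# Semicolon delimiter: each line is split into separate fields (still text).
print(read_rows(sample, ";"))
```

With the comma delimiter every row is a one-element list, which is exactly the symptom described in the question; with the semicolon delimiter the fields come out separately but are still strings.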
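But you want them as integers/floats, so you need some more post-processing: a list comprehension that converts a value to int if it is an integer (negative integers work too) and to float otherwise. (Of course, if there were alphanumeric rows you would need to add another test, but that is not the case here.)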
import csv, re
intre = re.compile(r"-?\d+$")
with open('45deg_marbles.csv', 'r') as f:
    reader = csv.reader(f, dialect='excel', delimiter=";")
    basis = []
    for row in reader:
        basis.append([int(x) if intre.match(x) else float(x) for x in row])
print(basis)
Result:
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], [0.001, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363], [0.002, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363, 11.00127363], [0.003, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283, 10.94525283]]
Note that there is a variant if the integers are guaranteed to be positive (no sign). It saves the regular expression evaluation:
basis.append([int(x) if x.isdigit() else float(x) for x in row])
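As another possible variant, the conversion can try int() first and fall back to float(), which needs neither the regular expression nor isdigit(). This is only a sketch: the helper name `to_number` is hypothetical, and the sample rows are shortened stand-ins for what csv.reader would return.

```python
def to_number(text):
    """Return int(text) when possible, otherwise float(text)."""
    try:
        return int(text)
    except ValueError:
        # Not an integer literal; float() raises ValueError itself
        # if the text is not numeric at all.
        return float(text)

# Shortened sample rows, as strings, in the shape csv.reader produces.
rows = [["1", "2", "3"], ["0.001", "11.00127363", "-4"]]
basis = [[to_number(x) for x in row] for row in rows]
print(basis)
```

A non-numeric field still raises ValueError here, so alphanumeric rows would need extra handling, as noted above.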
Answer 1 (score: -2)
What you need to do is
for row in reader:
    # row is a one-element list here, so split the string inside it
    basis.append(row[0].split(';'))
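What you did wrong is replacing ';' with ','. That does not produce a list from the string; it only swaps one character for another inside the same string. You should split the string into elements instead.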
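The difference between the two string methods can be seen in a short sketch. The `line` value is a shortened sample in the question's format, used here only for illustration.

```python
line = "0.001;11.00127363;11.00127363"

replaced = line.replace(";", ",")  # still one single string
fields = line.split(";")           # a list of three strings

print(type(replaced).__name__, replaced)
print(type(fields).__name__, fields)
```

Note that the elements of `fields` are still text, so they need a conversion with int() or float() before any arithmetic.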