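I have a matching algorithm that links students to projects. The algorithm itself works, but I am having trouble exporting the data to a csv file. There are 200 values to export, yet it only takes the last value and exports just that one.
When I do manage to get the whole data set out, each character is exported as a separate value: instead of the three digits that make up one number staying together, they are split into three columns. I've attached a picture below. Any help would be appreciated.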
#Imports for Pandas
import pandas as pd
from pandas import DataFrame
SPA()
for m in M:
    s = m['student']
    l = m['lecturer']
    Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
    id = m['projectid']
    p = Project[id]['title']
    c = Project[id]['sourceid']
    r = str(getRank("Single_Projects1copy.csv",s,c))
    print(s+","+l+","+p+","+c+","+r)
    dataPack = (s+","+l+","+p+","+c+","+r)
    df = pd.DataFrame.from_records([dataPack])
    df.to_csv('try.csv')
Answer 0 (score: 1)
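You keep overwriting the file on every pass through the loop, so you only get the last bit of data. You need to either append to the csv with df.to_csv('try.csv', mode="a", header=False),
or build up a single DataFrame and write it once, outside the loop, something like: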
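rows = []
for m in M:
    s = m['student']
    l = m['lecturer']
    Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
    id = m['projectid']
    p = Project[id]['title']
    c = Project[id]['sourceid']
    r = str(getRank("Single_Projects1copy.csv", s, c))
    print(s + "," + l + "," + p + "," + c + "," + r)
    # keep the fields as a tuple so that each one becomes its own column
    dataPack = (s, l, p, c, r)
    # DataFrame.append returned a new frame rather than modifying df in place
    # (and was removed in pandas 2.0), so collect the rows in a list instead
    rows.append(dataPack)
df = pd.DataFrame(rows)
df.to_csv('try.csv') # write all data once outside the loop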
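The other option mentioned above, appending on each iteration with mode="a", could look roughly like the sketch below. The rows in matches are made-up stand-ins for the values produced by the loop over M, and the temporary output path is only there to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in rows; in the question these come from the loop over M.
matches = [
    ("alice", "dr_smith", "Robot Arm", "P300", "1"),
    ("bob", "dr_jones", "Compiler", "P127", "3"),
]

# Assumption: write into a fresh temp directory so the file starts out empty.
out_path = os.path.join(tempfile.mkdtemp(), "try.csv")

for row in matches:
    # Wrap the tuple in a list so it becomes one row with five columns.
    df = pd.DataFrame([row])
    # mode="a" appends instead of overwriting; header and index are
    # suppressed so each line holds only the five fields.
    df.to_csv(out_path, mode="a", header=False, index=False)

with open(out_path) as f:
    lines = f.read().splitlines()

print(lines)
```

Note that mode="a" also appends to whatever is already in the file from a previous run, so the file may need to be removed first if a clean export is wanted.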
A better option would be to open the file once and pass that file object to to_csv:
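# newline='' lets the csv writer control line endings itself
with open('try.csv', 'w', newline='') as f:
    for m in M:
        s = m['student']
        l = m['lecturer']
        Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
        id = m['projectid']
        p = Project[id]['title']
        c = Project[id]['sourceid']
        r = str(getRank("Single_Projects1copy.csv", s, c))
        print(s + "," + l + "," + p + "," + c + "," + r)
        # a tuple (not a joined string) so each field gets its own column
        dataPack = (s, l, p, c, r)
        # index=False avoids writing a leading 0 on every row
        pd.DataFrame([dataPack]).to_csv(f, header=False, index=False)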
You get individual characters because the original code passes a string, dataPack, to from_records as the value, so it iterates over the characters:
In [18]: df = pd.DataFrame.from_records(["foobar,"+"bar"])
In [19]: df
Out[19]:
   0  1  2  3  4  5  6  7  8  9
0  f  o  o  b  a  r  ,  b  a  r
In [20]: df = pd.DataFrame(["foobar,"+"bar"])
In [21]: df
Out[21]:
            0
0  foobar,bar
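I think you basically want to keep the values as a tuple, dataPack = (s, l, p, c, r), and use pd.DataFrame([dataPack]), wrapping the tuple in a list so that it becomes one row with five columns rather than five rows. That said, you don't really need pandas at all here: the csv lib will do all of this for you without creating DataFrames.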
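A minimal sketch of the csv-module route, assuming the five fields are already available as tuples. The sample rows, the header names, and the temporary output path are illustrative and do not come from the question.

```python
import csv
import os
import tempfile

# Hypothetical stand-in rows; in the question these come from the loop over M.
matches = [
    ("alice", "dr_smith", "Robot Arm", "P300", "1"),
    ("bob", "dr_jones", "Compiler", "P127", "3"),
]

# Assumption: a temp directory keeps the example from touching real files.
out_path = os.path.join(tempfile.mkdtemp(), "try.csv")

# newline='' is what the csv module documentation recommends for files
# handed to csv.writer, so that line endings are not translated twice.
with open(out_path, "w", newline="") as f:
    writer = csv.writer(f)
    # Illustrative header row; the column names are an assumption.
    writer.writerow(["student", "lecturer", "title", "sourceid", "rank"])
    for row in matches:
        # Each tuple is written as one line, one field per column.
        writer.writerow(row)

with open(out_path, newline="") as f:
    written = list(csv.reader(f))

print(written)
```

Because the file is opened once and every row goes through the same writer, nothing is overwritten, and multi-digit values stay in a single column.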