Exporting multiple rows of data to csv with Pandas

Time: 2016-03-26 16:56:44

Tags: python csv pandas

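I have a matching algorithm that links students to projects. It works, but I cannot export the data to a csv file correctly. There are 200 values to export, yet only the last one is taken and exported.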

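When I want to get the whole value, the exported data treats each digit as a separate value: instead of the three digits forming a single value, they are split across three columns. I have attached images below. Any help would be appreciated.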

What it looks like

What it should look like

#Imports for Pandas

import pandas as pd
from pandas import DataFrame 

SPA()
for m in M:
   s = m['student']
   l = m['lecturer']
   Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
   id = m['projectid']
   p = Project[id]['title']
   c = Project[id]['sourceid']
   r = str(getRank("Single_Projects1copy.csv",s,c))


   print(s+","+l+","+p+","+c+","+r)

   dataPack = (s+","+l+","+p+","+c+","+r)

   df = pd.DataFrame.from_records([dataPack])
   df.to_csv('try.csv')

1 answer:

Answer 0 (score: 1)

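You keep overwriting the file inside the loop, so you only get the last piece of data. You need to either append to the csv with df.to_csv('try.csv',mode="a",header=False), or gather the rows into a single df and write it once outside the loop, something like: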

rows = []
for m in M:
   s = m['student']
   l = m['lecturer']
   Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
   id = m['projectid']
   p = Project[id]['title']
   c = Project[id]['sourceid']
   r = str(getRank("Single_Projects1copy.csv",s,c))

   print(s+","+l+","+p+","+c+","+r)

   # keep the fields separate so each one becomes its own column
   dataPack = (s, l, p, c, r)
   rows.append(dataPack)

# DataFrame.append returns a new frame (and was removed in pandas 2.0),
# so collect the rows in a list and build the frame once
df = pd.DataFrame(rows)
df.to_csv('try.csv') # write all data once outside the loop
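The append-mode alternative mentioned above (mode="a") can be sketched as follows. This is only an illustration under stated assumptions: the sample rows, the column names, and the file name try_append.csv are all hypothetical stand-ins for the values computed in the loop.

```python
import os
import pandas as pd

# Hypothetical sample rows standing in for the (s, l, p, c, r) values.
rows = [
    ("s001", "Dr Smith", "Graph Search", "c17", "1"),
    ("s002", "Dr Jones", "Compilers", "c03", "2"),
]
columns = ["student", "lecturer", "project", "source", "rank"]  # assumed names
out_path = "try_append.csv"  # hypothetical output path

# Start from a clean file so that reruns do not pile up old rows.
if os.path.exists(out_path):
    os.remove(out_path)

for row in rows:
    # Write the header only for the first row, when the file does not exist yet.
    write_header = not os.path.exists(out_path)
    pd.DataFrame([row], columns=columns).to_csv(
        out_path, mode="a", header=write_header, index=False
    )
```

Appending one row at a time reopens the file on every iteration, which is why writing once outside the loop is usually the simpler choice.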

A better option is to open the file once and pass that file object to to_csv:

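# newline='' is what pandas recommends when passing an open file object
with open('try.csv', 'w', newline='') as f:
    for m in M:
       s = m['student']
       l = m['lecturer']
       Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
       id = m['projectid']
       p = Project[id]['title']
       c = Project[id]['sourceid']
       r = str(getRank("Single_Projects1copy.csv",s,c))
       print(s+","+l+","+p+","+c+","+r)

       # keep the fields as a tuple so each one becomes its own column
       dataPack = (s, l, p, c, r)
       # index=False drops the row index, which would be 0 on every line
       pd.DataFrame([dataPack]).to_csv(f, header=False, index=False)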

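In your original code you get individual characters because you pass the string dataPack as a record to from_records, so it iterates over the characters: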

In [18]: df = pd.DataFrame.from_records(["foobar,"+"bar"])

In [19]: df
Out[19]: 
   0  1  2  3  4  5  6  7  8  9
0  f  o  o  b  a  r  ,  b  a  r

In [20]: df = pd.DataFrame(["foobar,"+"bar"])

In [21]: df
Out[21]: 
            0
0  foobar,bar
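For comparison, here is a minimal sketch of what happens when the fields are kept separate instead of being joined into one string. The values are hypothetical stand-ins for s, l, p, c and r.

```python
import pandas as pd

# Hypothetical values standing in for s, l, p, c and r.
dataPack = ("s001", "Dr Smith", "Graph Search", "c17", "1")

# Wrapping the tuple in a list makes it a single row with one column per field.
df = pd.DataFrame([dataPack])
print(df.shape)  # (1, 5)
```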

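I think you basically want to leave it as a tuple, dataPack = (s, l, p, c, r), and use pd.DataFrame([dataPack]), wrapped in a list so that the tuple becomes one row. You don't really need pandas at all; the csv lib would do all of this for you without creating DataFrames.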
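A minimal sketch of the csv-module route, assuming the loop yields one tuple per match. The sample rows, the header names, and the file name try_csv_module.csv are hypothetical.

```python
import csv

# Hypothetical rows standing in for the (s, l, p, c, r) tuples built in the loop.
rows = [
    ("s001", "Dr Smith", "Graphs, Trees", "c17", "1"),
    ("s002", "Dr Jones", "Compilers", "c03", "2"),
]
out_path = "try_csv_module.csv"  # hypothetical output path

# newline="" is what the csv module documentation recommends for output files.
with open(out_path, "w", newline="") as f:
    writer = csv.writer(f)
    # Assumed header names; adjust or drop this line as needed.
    writer.writerow(["student", "lecturer", "project", "source", "rank"])
    for row in rows:
        # Fields containing commas are quoted automatically.
        writer.writerow(row)
```

Because csv.writer quotes fields that contain commas, a project title such as "Graphs, Trees" stays in one column, which joining the fields by hand with "," would not guarantee.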