Iteratively creating CSV files per series in Pandas

Time: 2016-04-11 16:03:43

Tags: python csv pandas

I have a DataFrame df that looks like this:

Site    Roadname    Count   id  Count_norm
9   A316 Twickenham Rd, Richmond    1474    9SOUTHBOUND 1428
9   A316 Twickenham Rd, Richmond    1375    9SOUTHBOUND 1329
9   A316 Twickenham Rd, Richmond    1052    9SOUTHBOUND 1006
9   A316 Twickenham Rd, Richmond    986     9SOUTHBOUND 940
9   A316 Twickenham Rd, Richmond    1071    9SOUTHBOUND 1025
9   A316 Twickenham Rd, Richmond    1206    9SOUTHBOUND 1160
9   A316 Twickenham Rd, Richmond    1474    9NORTHBOUND 1428
9   A316 Twickenham Rd, Richmond    1375    9NORTHBOUND 1329
9   A316 Twickenham Rd, Richmond    1052    9NORTHBOUND 1006
9   A316 Twickenham Rd, Richmond    986     9NORTHBOUND 940
9   A316 Twickenham Rd, Richmond    1071    9NORTHBOUND 1025
9   A316 Twickenham Rd, Richmond    1206    9NORTHBOUND 1160

I can create an individual CSV like this:

series_11n = results[results.id == "11NORTHBOUND"]
series_11n.to_csv('./11NORTHBOUND.csv')

However, this requires me to define the name (id) of each series by hand.

How can I loop over the df DataFrame and export one CSV per id?

I can see the name and row count of each id with:

[in] id_count = results.groupby(["id"]).size()
print(id_count)

[out]
id
11NORTHBOUND    467
11SOUTHBOUND    467
15NORTHBOUND    467

1 answer:

Answer 0 (score: 2)

Here is one approach that works:

import pandas as pd
from io import StringIO  # Python 3 location; the Python 2 `StringIO` module no longer exists

st = """
Site|Roadname|Count|id|Count_norm
9|A316 Twickenham Rd, Richmond|1474|9SOUTHBOUND|1428
9|A316 Twickenham Rd, Richmond|1375|9SOUTHBOUND|1329
9|A316 Twickenham Rd, Richmond|1052|9SOUTHBOUND|1006
9|A316 Twickenham Rd, Richmond|986|9SOUTHBOUND|940
9|A316 Twickenham Rd, Richmond|1071|9SOUTHBOUND|1025
9|A316 Twickenham Rd, Richmond|1206|9SOUTHBOUND|1160
9|A316 Twickenham Rd, Richmond|1474|9NORTHBOUND|1428
9|A316 Twickenham Rd, Richmond|1375|9NORTHBOUND|1329
9|A316 Twickenham Rd, Richmond|1052|9NORTHBOUND|1006
9|A316 Twickenham Rd, Richmond|986|9NORTHBOUND|940
9|A316 Twickenham Rd, Richmond|1071|9NORTHBOUND|1025
9|A316 Twickenham Rd, Richmond|1206|9NORTHBOUND|1160
""" 

# error_bad_lines was removed in pandas 2.0; the sample above is well-formed, so it is not needed
data = pd.read_csv(StringIO(st), delimiter="|")

# get the unique ids, in order of first appearance
ids = data["id"].unique()

grouped_data = data.groupby("id")
for group_id in ids:  # avoid shadowing the built-in id()
    # get the DataFrame for the current id
    df = grouped_data.get_group(group_id)
    # export the current id's DataFrame to a CSV file named after the id
    df.to_csv(str(group_id) + ".csv", sep="|", index=False)
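
As a variation on the loop above, a GroupBy object can also be iterated directly, which yields each id together with its sub-DataFrame and removes the need for a separate list of unique ids. The following is only a minimal sketch: the helper name `export_groups`, the `out_dir` parameter, and the three-row `sample` frame are illustrative assumptions, not part of the original question or answer.

```python
from pathlib import Path
import tempfile

import pandas as pd


def export_groups(frame, key, out_dir):
    """Write one CSV per distinct value of the column `key`; return the paths written.

    `export_groups` is a hypothetical helper; `key` is assumed to be a single column name.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Iterating a GroupBy yields (group value, sub-DataFrame) pairs.
    for value, group in frame.groupby(key):
        path = out_dir / f"{value}.csv"
        group.to_csv(path, index=False)
        written.append(path)
    return written


# Small illustrative frame using a few of the question's columns and values.
sample = pd.DataFrame({
    "Site": [9, 9, 9],
    "Count": [1474, 1375, 1052],
    "id": ["9SOUTHBOUND", "9SOUTHBOUND", "9NORTHBOUND"],
})

# Write into a temporary directory so the sketch does not touch the working directory.
out_root = tempfile.mkdtemp()
paths = export_groups(sample, "id", out_root)
names = sorted(p.name for p in paths)
```

With the question's data, this would be called as `export_groups(results, "id", ".")`, assuming `results` is the full DataFrame. Note that the id values are used as file names unchanged, so ids containing path separators would need sanitizing first.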