I have a DataFrame df that looks like this:
Site Roadname Count id Count_norm
9 A316 Twickenham Rd, Richmond 1474 9SOUTHBOUND 1428
9 A316 Twickenham Rd, Richmond 1375 9SOUTHBOUND 1329
9 A316 Twickenham Rd, Richmond 1052 9SOUTHBOUND 1006
9 A316 Twickenham Rd, Richmond 986 9SOUTHBOUND 940
9 A316 Twickenham Rd, Richmond 1071 9SOUTHBOUND 1025
9 A316 Twickenham Rd, Richmond 1206 9SOUTHBOUND 1160
9 A316 Twickenham Rd, Richmond 1474 9NORTHBOUND 1428
9 A316 Twickenham Rd, Richmond 1375 9NORTHBOUND 1329
9 A316 Twickenham Rd, Richmond 1052 9NORTHBOUND 1006
9 A316 Twickenham Rd, Richmond 986 9NORTHBOUND 940
9 A316 Twickenham Rd, Richmond 1071 9NORTHBOUND 1025
9 A316 Twickenham Rd, Richmond 1206 9NORTHBOUND 1160
I can create an individual CSV like this:
# A Python identifier cannot start with a digit, so the variable is named series_11n
series_11n = results[results.id == "11NORTHBOUND"]
series_11n.to_csv('./11NORTHBOUND.csv')
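However, this requires me to spell out the name (id) of each series by hand.
How can I iterate over the df DataFrame and export one CSV per id?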
I can view the name and row count of each id with:
[in] id_count = results.groupby(["id"]).size()
print(id_count)
[out]
id
11NORTHBOUND 467
11SOUTHBOUND 467
15NORTHBOUND 467
Answer 0 (score: 2)
Here is one approach that works:
import pandas as pd
from io import StringIO  # Python 3; the StringIO module only exists in Python 2
st = """
Site|Roadname|Count|id|Count_norm
9|A316 Twickenham Rd, Richmond|1474|9SOUTHBOUND|1428
9|A316 Twickenham Rd, Richmond|1375|9SOUTHBOUND|1329
9|A316 Twickenham Rd, Richmond|1052|9SOUTHBOUND|1006
9|A316 Twickenham Rd, Richmond|986|9SOUTHBOUND|940
9|A316 Twickenham Rd, Richmond|1071|9SOUTHBOUND|1025
9|A316 Twickenham Rd, Richmond|1206|9SOUTHBOUND|1160
9|A316 Twickenham Rd, Richmond|1474|9NORTHBOUND|1428
9|A316 Twickenham Rd, Richmond|1375|9NORTHBOUND|1329
9|A316 Twickenham Rd, Richmond|1052|9NORTHBOUND|1006
9|A316 Twickenham Rd, Richmond|986|9NORTHBOUND|940
9|A316 Twickenham Rd, Richmond|1071|9NORTHBOUND|1025
9|A316 Twickenham Rd, Richmond|1206|9NORTHBOUND|1160
"""
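# on_bad_lines="skip" replaces error_bad_lines=False, which was deprecated in pandas 1.3 and removed in 2.0
data = pd.read_csv(StringIO(st), delimiter="|", on_bad_lines="skip")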
# Get the unique ids
ids = data["id"].unique()
grouped_data = data.groupby("id")
for group_id in ids:
    # Get the DataFrame for the current id
    group_df = grouped_data.get_group(group_id)
    # Export the current id's DataFrame to a CSV file named after the id
    group_df.to_csv(str(group_id) + ".csv", sep="|", index=False)
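
As a variation on the same idea, a GroupBy object can be iterated directly, which yields each id together with its rows and avoids the separate list of unique ids. The sketch below is only illustrative: the helper name export_by_id, its out_dir parameter, and the default comma separator are assumptions, not part of the original answer.

```python
from pathlib import Path

import pandas as pd


def export_by_id(frame, out_dir, key="id"):
    """Write one CSV per distinct value of `key` and return the written paths.

    Hypothetical helper for illustration; the name and parameters are assumptions.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    # Iterating a GroupBy yields (group key, sub-DataFrame) pairs,
    # sorted by key by default
    for group_key, group_df in frame.groupby(key):
        target = out_path / f"{group_key}.csv"
        group_df.to_csv(target, index=False)
        written.append(target)
    return written
```

Calling export_by_id(data, "csv_out") on the frame above would be expected to produce one file for 9NORTHBOUND and one for 9SOUTHBOUND inside csv_out. If an id could contain characters that are not valid in a file name, it would need sanitising first.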