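I am trying to create a modified CSV file from several small CSV files. field1.csv and field2.csv have one column in common. The final file, final.csv, should contain column["NAME"] and column["ACC"] from field1.csv, plus column["SCORE"] and column["TEAM"] from field2.csv, for rows where column["ID"] in field2.csv matches column["ID"] in field1.csv. If there is no value, the field should be blank. I am using Python pandas.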
field1.csv: -
"ID","NAME","ACC","POINT"
"123","TRR","OOP","64"
"124","DEE","OOP","78"
"125","EWR","PLO","98"
field2.csv: -
"ID","SCORE","TEAM","END"
"111","92","BCC","0"
"121","80","CSS","1"
"123","87","BCC","0"
final.csv: -
The Python code I am trying:
Answer 0 (score: 0)
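I think you need the index_col parameter to turn the first column into the index, usecols to filter the columns, and then join, which performs a left join by default: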
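import pandas as pd

# Read only the needed columns and use ID (the first column) as the index
df1 = pd.read_csv("field1.csv", index_col=[0], usecols=["ID","NAME","ACC"])
df2 = pd.read_csv("field2.csv", index_col=[0], usecols=["ID","SCORE","TEAM"])
# Left join on the index: every row of df1 is kept
finaldf = df1.join(df2)
print(finaldf)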
    NAME  ACC  SCORE TEAM
ID
123  TRR  OOP   87.0  BCC
124  DEE  OOP    NaN  NaN
125  EWR  PLO    NaN  NaN
Another possible solution is to filter the columns by subsetting before the join:
df1 = pd.read_csv("field1.csv", index_col=[0])
df2 = pd.read_csv("field2.csv", index_col=[0])
finaldf = df1[["NAME","ACC"]].join(df2[["SCORE","TEAM"]])
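If you would rather keep ID as an ordinary column instead of an index, a left merge on "ID" should give the same rows. This is a different technique from the join above, shown only as a minimal sketch. The names FIELD1, FIELD2, and merged are made up for illustration, and the sample data from the question is inlined with io.StringIO so that nothing has to exist on disk:

```python
import io

import pandas as pd

# Inline copies of the sample data from the question (illustrative stand-ins
# for field1.csv and field2.csv).
FIELD1 = '''"ID","NAME","ACC","POINT"
"123","TRR","OOP","64"
"124","DEE","OOP","78"
"125","EWR","PLO","98"
'''
FIELD2 = '''"ID","SCORE","TEAM","END"
"111","92","BCC","0"
"121","80","CSS","1"
"123","87","BCC","0"
'''

# No index_col here: ID stays a regular column in both frames.
df1 = pd.read_csv(io.StringIO(FIELD1), usecols=["ID", "NAME", "ACC"])
df2 = pd.read_csv(io.StringIO(FIELD2), usecols=["ID", "SCORE", "TEAM"])

# how="left" keeps every row of df1; IDs that are absent from df2 get NaN
# in SCORE and TEAM.
merged = df1.merge(df2, on="ID", how="left")
print(merged)
```

With this variant ID is part of the data, so it is still written out if you later call to_csv with index=False.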
Finally, write to file without the index:
finaldf.to_csv('final.csv', index=False)
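On the "blank if there is no value" requirement: to_csv writes missing values as empty fields by default (na_rep defaults to an empty string), so the NaN cells should already come out blank. One side effect of the NaN values is that SCORE becomes a float column (87.0). The sketch below, which is an optional extra and not part of the answer above, casts SCORE to the nullable "Int64" dtype before writing. FIELD1, FIELD2, and csv_text are illustrative names, and the data is inlined so the snippet runs without files:

```python
import io

import pandas as pd

# Inline copies of the sample data from the question (illustrative stand-ins
# for field1.csv and field2.csv).
FIELD1 = '''"ID","NAME","ACC","POINT"
"123","TRR","OOP","64"
"124","DEE","OOP","78"
"125","EWR","PLO","98"
'''
FIELD2 = '''"ID","SCORE","TEAM","END"
"111","92","BCC","0"
"121","80","CSS","1"
"123","87","BCC","0"
'''

df1 = pd.read_csv(io.StringIO(FIELD1), index_col=[0], usecols=["ID", "NAME", "ACC"])
df2 = pd.read_csv(io.StringIO(FIELD2), index_col=[0], usecols=["ID", "SCORE", "TEAM"])
finaldf = df1.join(df2)

# Optional: the nullable integer dtype keeps 87 as an integer while still
# allowing missing values in the same column.
finaldf["SCORE"] = finaldf["SCORE"].astype("Int64")

# With no path argument, to_csv returns the CSV text instead of writing a file.
# Missing values become empty fields because na_rep defaults to "".
csv_text = finaldf.to_csv(index=False)
print(csv_text)
```

To write the file itself, pass a path as in the line above, for example finaldf.to_csv('final.csv', index=False).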