AttributeError: Cannot access callable attribute 'to_csv' of 'DataFrameGroupBy' objects, try using the 'apply' method

Time: 2017-09-03 09:56:55

Tags: python python-3.x pandas csv

I have several CSV files that share a set of columns, for example:

match_id,   start_time, win,  leaguename,     team,    opposing, min
2992096687, 1486840800, True, Captains Draft, 3729377, 2642171,  1453382256
2992217489, 1486845476, True, Captains Draft, 3729377, 2642171,  1453382256
2659805546, 1474478411, False,The BTS,        55     , 2642171,  1454281287
2760844196, 1478440750, True, ESL One 2016,   1883502, 2642171,  1459782261
...and so on

I want to combine all the CSVs, sort the result by 'min' while grouping it by 'leaguename', and remove duplicate matches.

I tried to do this with the following code:

import pandas as pd

af = pd.read_csv('af.csv',keep_default_na=False,na_values=[""])
dc = pd.read_csv('dc.csv',keep_default_na=False,na_values=[""])
eg = pd.read_csv('eg.csv',keep_default_na=False,na_values=[""])
ehome = pd.read_csv('ehome.csv',keep_default_na=False,na_values=[""])
fnatic = pd.read_csv('fnatic.csv',keep_default_na=False,na_values=[""])
ig = pd.read_csv('ig.csv',keep_default_na=False,na_values=[""])
lgd = pd.read_csv('lgd.csv',keep_default_na=False,na_values=[""])
liquid= pd.read_csv('liquid.csv',keep_default_na=False,na_values=[""])
mvp = pd.read_csv('mvp.csv',keep_default_na=False,na_values=[""])
newbee = pd.read_csv('newbee.csv',keep_default_na=False,na_values=[""])
og = pd.read_csv('og.csv',keep_default_na=False,na_values=[""])
secret = pd.read_csv('secret.csv',keep_default_na=False,na_values=[""])
vp = pd.read_csv('vp.csv',keep_default_na=False,na_values=[""])
wings = pd.read_csv('wings.csv',keep_default_na=False,na_values=[""])

df = pd.concat([af, dc, eg, ehome, fnatic, ig, lgd, liquid, mvp, newbee, og, secret, vp, wings],axis=0).drop_duplicates()
df2 = df.sort_values("min").groupby("leaguename", as_index=False)
df2.to_csv('out.csv')

It returns:

AttributeError: Cannot access callable attribute 'to_csv' of 'DataFrameGroupBy' objects, try using the 'apply' method

How can I fix this?
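
For context, `groupby` returns a `DataFrameGroupBy` object rather than a DataFrame, so `to_csv` is not available on it. Below is a minimal sketch of one possible workaround, assuming the goal is only to have the rows ordered by league and then by 'min' in the output file: sort on both columns and write the resulting DataFrame. The small inline frame is made-up sample data standing in for the concatenated CSVs, and the in-memory buffer stands in for 'out.csv'.

```python
import io
import pandas as pd

# Made-up sample rows standing in for the concatenated CSV data (assumption).
df = pd.DataFrame({
    "match_id": [2992096687, 2992217489, 2659805546, 2760844196],
    "leaguename": ["Captains Draft", "Captains Draft", "The BTS", "ESL One 2016"],
    "min": [1453382256, 1453382256, 1454281287, 1459782261],
})

# Sorting by both columns keeps each league's rows together, ordered by 'min'.
# The result is an ordinary DataFrame, so to_csv is available on it.
out = df.sort_values(["leaguename", "min"])

# In-memory target for illustration; a file path such as 'out.csv' works the same way.
buffer = io.StringIO()
out.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Whether a plain sort is enough depends on what "grouping" is meant to achieve here; if an aggregate per league is wanted instead, a different approach would be needed.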

Edit: I tried using

df2 = df.apply(pd.DataFrame.sort_values, 'min').groupby("leaguename", as_index=False)

which returned a different error:

ValueError: No axis named min for object type <class 'pandas.core.frame.DataFrame'>

Then I tried

df2 = df.apply(pd.DataFrame.sort_values, 'min',axis=0).groupby("leaguename", as_index=False)

which returned

TypeError: apply() got multiple values for argument 'axis'
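
A likely reading of these two errors: the second positional argument of `DataFrame.apply` is `axis`, so 'min' was taken as an axis name, and adding `axis=0` then supplied that argument twice. The 'apply' in the original message most likely refers to the groupby object, not to `DataFrame.apply`. As a rough sketch on made-up data (the column values below are assumptions), `sort_values` can be called directly on the DataFrame, and the groupby can then be iterated to get one sub-DataFrame per league:

```python
import pandas as pd

# Made-up sample data; only the column names follow the question.
df = pd.DataFrame({
    "match_id": [1, 2, 3, 4],
    "leaguename": ["B", "A", "B", "A"],
    "min": [40, 30, 20, 10],
})

# sort_values is a DataFrame method; call it directly rather than through apply.
sorted_df = df.sort_values("min")

# Iterating over a groupby yields (key, sub-DataFrame) pairs.
# Rows inside each group keep the order they had in sorted_df.
groups = {name: group for name, group in sorted_df.groupby("leaguename")}
```

Each sub-DataFrame in `groups` is a regular DataFrame, so it could be written out with `to_csv` if one file per league were wanted.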

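I also have a small question about duplicates. How can I remove duplicates that are not exact duplicates (after merging the CSVs)? For example: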

2992217489, 1486845476, True, Captains Draft, 3729377, 2642171,  1453382256
2992217489, 1486845476, False,Captains Draft, 2642171, 3729377,  1453382256

In the data above, rows 1 and 2 are duplicates because they describe the same match (same match_id), but 'team', 'opposing' and 'win' hold different data.
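
One possible approach, sketched here under the assumption that any single row per match_id is acceptable: `drop_duplicates` accepts a `subset` argument, so rows can be compared on match_id alone. The frame below is made-up sample data modeled on the two rows above plus one unrelated match.

```python
import pandas as pd

# Made-up sample: the first two rows are the same match seen from each team's side.
df = pd.DataFrame({
    "match_id": [2992217489, 2992217489, 2760844196],
    "win": [True, False, True],
    "team": [3729377, 2642171, 1883502],
    "opposing": [2642171, 3729377, 2642171],
})

# Compare rows on match_id only; keep="first" retains the first row seen per match.
deduped = df.drop_duplicates(subset="match_id", keep="first")
```

Which of the two rows survives depends on row order, so sorting first (or choosing `keep="last"`) changes which team's perspective is kept.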

0 answers:

No answers