Pandas .apply() function with multiple args

Time: 2017-09-07 11:43:44

Tags: python pandas dataframe apply

df = rocksfile

[image: snapshot of my dataframe]

Problem: write a function that takes one row of the DataFrame and prints the song, the artist, and whether the release date is < 1970.

Defining my function:

def release_info(row):
    """Checks if song is released before or after 1970."""
    if rocksfile.loc[row, 'Release_Year'] < 1970:
        print str(rocksfile.loc[row,'Song_Clean']) + " by " + str(rocksfile.loc[row,'Artist_Clean']) \
            + " was released before 1970."
    else:
        print str(rocksfile.loc[row,'Song_Clean']) + " by " + str(rocksfile.loc[row,'Artist_Clean']) \
            + " was released after 1970."

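Using the .apply() function, apply the function you wrote to the first four rows of the DataFrame. You need to tell apply to operate row by row: setting the keyword argument axis=1 means the function is applied to each row individually.

Using .apply: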

rocksfile.apply(release_info, axis = 1, row=1)

Error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-61-fe0405b4d1e8> in <module>()
  1 #a = [1]
  2 
----> 3 rocksfile.apply(release_info, axis = 1, row=1)


TypeError: ("release_info() got multiple values for keyword argument 'row'", u'occurred at index 0')

release_info(1)

3 answers:

Answer 0: (score: 1)

When working with arrays (Series, DataFrames) in pandas, vectorized pandas or numpy functions are the better choice; here numpy.where fits best:

import numpy as np

# condition: True where the song was released before 1970
m = rocksfile['Release_Year'] < 1970
# concatenate columns together
a = rocksfile['Song_Clean'] + " by " + rocksfile['Artist_Clean']
# add a different ending for each case
b = a + " was released before 1970."
c = a + " was released after 1970."

# take b where the condition holds, otherwise c
rocksfile['new'] = np.where(m, b, c)
print(rocksfile)
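To check which branch np.where picks, here is a small self-contained sketch. The three sample rows are made up for illustration (the real rocksfile data is not shown in the question), and the column names are taken from the question's code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows standing in for the asker's rocksfile DataFrame.
rocksfile = pd.DataFrame({
    "Song_Clean": ["Song A", "Song B", "Song C"],
    "Artist_Clean": ["Artist 1", "Artist 2", "Artist 3"],
    "Release_Year": [1965, 1975, 1969],
})

# Boolean mask: True for rows released before 1970.
before_1970 = rocksfile["Release_Year"] < 1970
title = rocksfile["Song_Clean"] + " by " + rocksfile["Artist_Clean"]

# np.where(condition, x, y) takes x where the mask is True and y elsewhere.
rocksfile["new"] = np.where(
    before_1970,
    title + " was released before 1970.",
    title + " was released after 1970.",
)

for line in rocksfile["new"]:
    print(line)
```

The second argument is used for rows where the mask is True and the third for the rest, so the "before" text has to come first.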

Answer 1: (score: 0)

You can use np.where and reduce it to a single expression.

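import numpy as np
import pandas as pd

# Parentheses let the expression span several lines; index= keeps the
# before/after labels aligned with rocksfile's own index.
s = (rocksfile['Song_Clean']
     + ' was released by '
     + rocksfile['Artist_Clean']
     + pd.Series(np.where(rocksfile['Release_Year'] < 1970, ' before', ' after'),
                 index=rocksfile.index)
     + ' 1970')

rocksfile['new'] = s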

Answer 2: (score: 0)

Here:

rocksfile.apply(release_info, axis = 1, row=1)

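row is not one of the parameters DataFrame.apply() expects, so it gets passed on as a keyword arg to release_info(), in addition to the first positional argument (the row itself, which apply supplies). The function therefore receives row twice, which is exactly what the TypeError reports, because it ends up being called as:

release_info(<Series for the current row>, row=1)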
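One way around this, as a minimal sketch: let the function accept the row Series that apply hands over, and restrict the DataFrame to the first four rows before applying. The sample data and the cutoff parameter are assumptions added for illustration; the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's rocksfile DataFrame.
rocksfile = pd.DataFrame({
    "Song_Clean": ["Song A", "Song B", "Song C", "Song D", "Song E"],
    "Artist_Clean": ["Artist 1", "Artist 2", "Artist 3", "Artist 4", "Artist 5"],
    "Release_Year": [1965, 1975, 1969, 1981, 1990],
})


def release_info(row, cutoff=1970):
    """Describe whether the song in `row` (a Series) came out before `cutoff`."""
    when = "before" if row["Release_Year"] < cutoff else "after"
    return "{} by {} was released {} {}.".format(
        row["Song_Clean"], row["Artist_Clean"], when, cutoff
    )


# axis=1 passes each row to release_info as a Series (the first positional
# argument); the extra keyword cutoff=1970 is forwarded to the function as is.
messages = rocksfile.head(4).apply(release_info, axis=1, cutoff=1970)

for message in messages:
    print(message)
```

Any extra keyword given to apply must not reuse the name of the function's first parameter, since apply already fills that one positionally.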