Question

Problem: write a function that takes one row of a DataFrame and prints the song, the artist, and whether the release date is < 1970.

Defining my function:

def release_info(row):
    """Checks if the song at index label `row` was released before or after 1970."""
    # Look the values up in the global rocksfile DataFrame by index label
    song = str(rocksfile.loc[row, 'Song_Clean'])
    artist = str(rocksfile.loc[row, 'Artist_Clean'])
    if rocksfile.loc[row, 'Release_Year'] < 1970:
        print(song + " by " + artist + " was released before 1970.")
    else:
        print(song + " by " + artist + " was released after 1970.")

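Use the .apply() function to apply the function you wrote to the first four rows of the DataFrame. You need to tell apply to operate row by row. Setting the keyword argument axis = 1 means the function is applied to each row individually.

Using .apply: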

rocksfile.apply(release_info, axis = 1, row=1)

Error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-61-fe0405b4d1e8> in <module>()
  1 #a = [1]
  2 
----> 3 rocksfile.apply(release_info, axis = 1, row=1)


TypeError: ("release_info() got multiple values for keyword argument 'row'", u'occurred at index 0')

release_info(1)

Answer 1

When working with arrays (Series, DataFrames) in pandas, it is better to use vectorized pandas or numpy functions. Here np.where is a good fit:

import numpy as np

# condition: True where the song was released before 1970
m = rocksfile['Release_Year'] < 1970
# concatenate the columns together
a = rocksfile['Song_Clean'] + " by " + rocksfile['Artist_Clean']
# add a different ending for each case
b = a + " was released before 1970."
c = a + " was released after 1970."

# take b where the condition holds, otherwise c
rocksfile['new'] = np.where(m, b, c)
print(rocksfile)
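
To see what this produces, here is a minimal sketch run on made-up data. The `sample` DataFrame and its two rows are hypothetical stand-ins for rocksfile; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for the real rocksfile DataFrame
sample = pd.DataFrame({
    'Song_Clean': ['Song A', 'Song B'],
    'Artist_Clean': ['Artist 1', 'Artist 2'],
    'Release_Year': [1965, 1975],
})

mask = sample['Release_Year'] < 1970
base = sample['Song_Clean'] + ' by ' + sample['Artist_Clean']

# np.where(condition, value_if_true, value_if_false) works element-wise
sample['new'] = np.where(mask,
                         base + ' was released before 1970.',
                         base + ' was released after 1970.')

print(sample['new'].tolist())
# ['Song A by Artist 1 was released before 1970.',
#  'Song B by Artist 2 was released after 1970.']
```

The whole column is computed in one pass, with no Python-level loop over the rows.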

Answer 2

You can use np.where and reduce it to a single expression.

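import numpy as np
import pandas as pd

# Parentheses let the expression span several lines; index=rocksfile.index
# keeps the before/after Series aligned with the DataFrame's rows
s = (rocksfile['Song_Clean']
     + ' was released by '
     + rocksfile['Artist_Clean']
     + ' '
     + pd.Series(np.where(rocksfile['Release_Year'] < 1970, 'before', 'after'),
                 index=rocksfile.index)
     + ' 1970')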

rocksfile['new'] = s

Answer 3

In the following:

rocksfile.apply(release_info, axis = 1, row=1)

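row is not one of the parameters that DataFrame.apply() expects, so it gets passed through as a keyword argument to release_info(), in addition to the first positional argument (the row itself). The function therefore receives two values for row, which is what the TypeError reports, in the call to: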

release_info()
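
One possible fix, sketched under assumptions: rewrite the function so that it accepts the row that apply supplies when axis = 1 (a Series), and give any extra keyword a name other than row. The sample data and the `cutoff` parameter below are hypothetical, added only for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the real rocksfile DataFrame
rocksfile = pd.DataFrame({
    'Song_Clean': ['Song A', 'Song B', 'Song C', 'Song D', 'Song E'],
    'Artist_Clean': ['Artist 1', 'Artist 2', 'Artist 3', 'Artist 4', 'Artist 5'],
    'Release_Year': [1965, 1975, 1969, 1981, 1971],
})

def release_info(row, cutoff=1970):
    """Describe one row (a Series when axis=1) relative to `cutoff`."""
    when = 'before' if row['Release_Year'] < cutoff else 'after'
    return '{} by {} was released {} {}.'.format(
        row['Song_Clean'], row['Artist_Clean'], when, cutoff)

# apply passes each row as the first positional argument; extra keywords
# such as cutoff are forwarded to the function unchanged.
messages = rocksfile.head(4).apply(release_info, axis=1, cutoff=1970)
for message in messages:
    print(message)

# Passing row=... as well collides with that positional argument
# and reproduces the TypeError from the question.
try:
    rocksfile.head(4).apply(release_info, axis=1, row=1)
except TypeError as exc:
    error_text = str(exc)
```

Returning the string instead of printing inside the function keeps the result available as a Series; printing happens in the loop afterwards. `head(4)` restricts the call to the first four rows, as the exercise asks.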

Pandas .apply() function with multiple args
