df = rocksfile (a snapshot of my DataFrame)
Question: write a function that takes a row of the DataFrame and prints the song, the artist, and whether the release date is < 1970.
Defining my function:
def release_info(row):
    """Checks if song is released before or after 1970."""
    if rocksfile.loc[row, 'Release_Year'] < 1970:
        print str(rocksfile.loc[row,'Song_Clean']) + " by " + \
            str(rocksfile.loc[row,'Artist_Clean']) \
            + " was released before 1970."
    else:
        print str(rocksfile.loc[row,'Song_Clean']) + " by " + str(rocksfile.loc[row,'Artist_Clean']) \
            + " was released after 1970."
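Use the .apply() function to apply the function you wrote to the first four rows of the DataFrame. You need to tell apply to operate row by row: setting the keyword argument axis = 1 means the function is applied to each row individually.
Using .apply: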
rocksfile.apply(release_info, axis = 1, row=1)
Error message:
TypeError Traceback (most recent call last)
<ipython-input-61-fe0405b4d1e8> in <module>()
1 #a = [1]
2
----> 3 rocksfile.apply(release_info, axis = 1, row=1)
TypeError: ("release_info() got multiple values for keyword argument 'row'", u'occurred at index 0')
release_info(1)
Answer 0 (score: 1)
When working with arrays (Series, DataFrames) in pandas, it is better to use vectorized pandas or numpy functions; here numpy.where is the best fit:
import numpy as np
# condition: True where the song was released before 1970
m = rocksfile['Release_Year'] < 1970
# concatenate the columns together
a = rocksfile['Song_Clean'] + " by " + rocksfile['Artist_Clean']
# add a different string to the end
b = a + " was released before 1970."
c = a + " was released after 1970."
# pick b where the condition holds, otherwise c
rocksfile['new'] = np.where(m, b, c)
print (rocksfile)
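To see the vectorized approach end to end, here is a minimal sketch written for Python 3. The three sample rows are invented for illustration; only the column names come from the question. Note that a song released in exactly 1970 falls into the "after" branch, just as it does in the original if/else.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
rocksfile = pd.DataFrame({
    "Song_Clean": ["Song A", "Song B", "Song C"],
    "Artist_Clean": ["Artist X", "Artist Y", "Artist Z"],
    "Release_Year": [1965, 1970, 1982],
})

# Boolean mask: True where the release year is strictly before 1970.
before_1970 = rocksfile["Release_Year"] < 1970

# Shared start of every sentence, built column-wise without a loop.
prefix = rocksfile["Song_Clean"] + " by " + rocksfile["Artist_Clean"]

# np.where takes the first choice where the mask is True, else the second.
rocksfile["new"] = np.where(
    before_1970,
    prefix + " was released before 1970.",
    prefix + " was released after 1970.",
)

for line in rocksfile["new"]:
    print(line)
```

The whole column is computed in one pass, so no row-by-row function call is needed.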
Answer 1 (score: 0)
You can use np.where and reduce it to one statement.
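# parentheses let the expression span several lines
s = (rocksfile['Song_Clean']
     + ' was released by '
     + rocksfile['Artist_Clean']
     # reuse rocksfile's index so the pieces align row by row
     + pd.Series(np.where(rocksfile['Release_Year'] < 1970, ' before', ' after'),
                 index=rocksfile.index)
     + ' 1970')
rocksfile['new'] = s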
Answer 2 (score: 0)
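In the following call:
rocksfile.apply(release_info, axis = 1, row=1)
row is not one of the parameters DataFrame.apply() expects, so it gets passed as a keyword arg to release_info(), in addition to the first positional argument (the row that apply supplies). As a result, release_info() receives two values for row, which is exactly what the TypeError reports.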
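One way to avoid the clash is to let apply supply the row and pass no extra row keyword at all. The sketch below is written for Python 3 and uses invented sample rows; only the column names come from the question. It also departs from the question's function in two ways: the function reads from the row Series it receives rather than looking up rocksfile.loc, and it returns the sentence instead of printing it, so the result can be inspected.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
rocksfile = pd.DataFrame({
    "Song_Clean": ["Song A", "Song B", "Song C", "Song D", "Song E"],
    "Artist_Clean": ["Artist V", "Artist W", "Artist X", "Artist Y", "Artist Z"],
    "Release_Year": [1965, 1971, 1969, 1980, 1990],
})


def release_info(row):
    """Return a sentence saying whether the song in `row` predates 1970."""
    # With axis=1, apply passes each row as a Series indexed by column name.
    when = "before" if row["Release_Year"] < 1970 else "after"
    return "{} by {} was released {} 1970.".format(
        row["Song_Clean"], row["Artist_Clean"], when
    )


# head(4) restricts the call to the first four rows; no `row=` keyword is
# passed, so release_info receives exactly one value for its parameter.
messages = rocksfile.head(4).apply(release_info, axis=1)

for message in messages:
    print(message)
```

If the function really does need extra arguments, give them names that differ from the first parameter; apply forwards any keyword it does not recognize to the function.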