Question

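So friends is a column in which every row holds a list, e.g. df['friends'][0] = [id1, id2, ..., idn]. I am trying to compute the number of friends in a separate column, e.g. df['friend_counts'][0] = n.

I did the following. I have used this code on other datasets, but for some reason it takes forever here, even though the dataset has only 300,000 rows.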

df_user['friend_counts'] = df_user['friends'].apply(lambda x: len(df_user.friends[x]))
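
For reference, inside the lambda above `x` is already the row's list, so `df_user.friends[x]` appears to index the whole column with every friend id as a label instead of measuring the list. A minimal sketch of the intended count, assuming each cell of `friends` really is a Python list (the toy data below is hypothetical):

```python
import pandas as pd

# Hypothetical toy data: each row of 'friends' holds a Python list of ids.
df_user = pd.DataFrame({"friends": [["a", "b", "c"], ["d"], []]})

# x is already the row's list, so its length is the friend count.
df_user["friend_counts"] = df_user["friends"].apply(len)

# Equivalent spelling: .str.len() also measures list-valued cells.
assert (df_user["friends"].str.len() == df_user["friend_counts"]).all()

print(df_user)
```

If the lists are actually stored as strings (for example after reading a CSV), they would need to be parsed first; that case is not covered here.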

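Also, for some reason the following code creates a season column but does not populate it, i.e. it contains only blanks. This is troubling because I ran exactly the same code on every other dataset. Did they change the .apply() method?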

# Convert 'date' to a datetime object
df_reviews["date"] = pd.to_datetime(df_reviews["date"])
# Splitting up 'release_date' -> 'release_weekday', 'release_month', 'release_year'
df_reviews["weekday"] = df_reviews["date"].dt.weekday_name
df_reviews["month"] = df_reviews["date"].dt.month
df_reviews["year"] = df_reviews["date"].dt.year
### Helper function
def season_converter(month_name):
    """ Returns the season a particular month is in """
    season = ""
    # Winter
    if month_name in ['Jan', 'Feb', 'Dec']:
        season = "Winter"
    # Spring
    if month_name in ['Mar', 'Apr', 'May']:
        season = "Spring"
    # Summer
    if month_name in ['Jun', 'Jul', 'Aug']:
        season = "Summer"
    # Fall
    if month_name in ['Sep', 'Oct', 'Nov']:
        season = "Fall"
    # Other
    if month_name == "NA":
        season = "NA"
    return season
# Create a new column that holds seasonal information
df_reviews['season'] = df_reviews['month'].apply(lambda x: season_converter(x))

Answer 1

I suggest using map with a dictionary for better performance:

d = {1:'Winter', 2:'Winter', 12:'Winter', 3: 'Spring', .... np.nan:'NA', 'NA':'NA'}
df_reviews['season'] = df_reviews['month'].map(d)
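
A likely reason the original column came out blank is that `.dt.month` returns month numbers (1 to 12), while `season_converter` compares against names such as 'Jan', so no branch matches and the empty string is returned. Here is a minimal, self-contained sketch of the dictionary approach keyed by month number. The sample dates and the name `season_by_month` are assumptions for illustration, and the season grouping is taken from the question:

```python
import pandas as pd

# Hypothetical sample; the last entry is a missing date.
df_reviews = pd.DataFrame(
    {"date": pd.to_datetime(
        ["2018-01-15", "2018-04-02", "2018-07-30", "2018-11-11", None]
    )}
)
df_reviews["month"] = df_reviews["date"].dt.month  # numbers, not names

# Month number -> season, using the grouping from the question.
season_by_month = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}

# Values with no key in the dict (including NaN) map to NaN; label them "NA".
df_reviews["season"] = df_reviews["month"].map(season_by_month).fillna("NA")

print(df_reviews[["month", "season"]])
```

Using `fillna("NA")` after the lookup is one way to handle missing months without putting NaN into the dictionary itself.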

Another solution, if numeric seasons are acceptable:

df_reviews['season'] = (df_reviews['month'] % 12 + 3) // 3
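
To see what the arithmetic produces, this small sketch applies the same expression to all twelve month numbers. The `labels` mapping back to names is an assumption added for illustration, not part of the formula:

```python
import pandas as pd

months = pd.Series(range(1, 13), name="month")

# December wraps to 0 via % 12, so Dec, Jan, Feb all land in bucket 1.
season_number = (months % 12 + 3) // 3

# Hypothetical mapping from the numeric code back to a season name.
labels = {1: "Winter", 2: "Spring", 3: "Summer", 4: "Fall"}
season_name = season_number.map(labels)

print(pd.DataFrame({"month": months, "code": season_number, "season": season_name}))
```

With this encoding 1 is Winter, 2 is Spring, 3 is Summer and 4 is Fall.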
