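I have a variable (distributor, format = factor) in a data.frame of movies. I want to replace every distributor name that occurs fewer than 10 times with "small company". I was able to produce a list and a count using
import pandas as pd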
file = "file.csv"
df = pd.read_csv(file)
pd.options.display.max_columns = len(df.columns)
print(df)
but I am not able to do the replacement in my data.frame.
Answer 0: (score: 1)
Here is a solution using dplyr.
library(dplyr)
## make some dummy data
df <- tribble(
~distributor, ~something,
"dist1", 89,
"dist2", 92,
"dist3", 29,
"dist1", 89
)
df %>%
group_by(distributor) %>%
## this counts the number of occurrences of each distributor
mutate(occurrences = n()) %>%
ungroup() %>%
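## change the name of the distributor if the occurrences are less than 2
## (use 10 for the real data); as.character() makes ifelse() return the
## labels rather than the integer codes when distributor is a factor
mutate(distributor = ifelse(occurrences < 2, "small company", as.character(distributor)))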
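Since the question says distributor is stored as a factor, the same replacement can also be done in base R by renaming the rare levels. The following is only a sketch under assumptions: the data frame name movies, its contents, and the column something are made up for illustration, and the threshold of 10 is taken from the question.

```r
## hypothetical example data: dist1 occurs 12 times, dist2 3 times, dist3 once
movies <- data.frame(
  distributor = factor(c(rep("dist1", 12), rep("dist2", 3), "dist3")),
  something = seq_len(16)
)

## count how often each distributor occurs
counts <- table(movies$distributor)

## names of the distributors that occur fewer than 10 times
rare <- names(counts)[counts < 10]

## give all rare levels the same label; R merges the duplicated levels into one
levels(movies$distributor)[levels(movies$distributor) %in% rare] <- "small company"

## check the result
table(movies$distributor)
```

Because the column stays a factor throughout, no conversion to character is needed, and the merged level shows up as a single "small company" category in later tables or plots.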