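I have an Excel (xlsx) table in which, in the "PLAYERS" column, the names of European players are wrapped in asterisks while those of South Americans are not. Something like this: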
PLAYERS
Neymar
*Bale*
Messi
*Ronaldo*
*Benzema*
*Iniesta*
DiMaria
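Is there a way, using R (or Excel itself), to split this dataset into one containing the Europeans (with asterisks) and another containing the South Americans? The dataset of course contains other columns, such as "SALARY", "SCORED GOALS", "OFFSITE", "AGE", and so on.
Thanks.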
Answer 0 (score: 1)
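You can check whether a player's name contains a "*" and write "European" or "South American" into a new column. If needed, you can then split the data frame into a list of two data frames, one with the Europeans and one with the South Americans: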
df <- data.frame(PLAYERS = c("Neymar", "*Ronaldo*", "Messi"), SALARY = 5:7)
df
# PLAYERS SALARY
#1 Neymar 5
#2 *Ronaldo* 6
#3 Messi 7
# check if there's a * in the PLAYERS column
df$Location <- ifelse(grepl("\\*", df$PLAYERS), "European", "South American")
df
# PLAYERS SALARY Location
#1 Neymar 5 South American
#2 *Ronaldo* 6 European
#3 Messi 7 South American
#split the data based on location:
dflist <- split(df, df$Location)
dflist
#$European
# PLAYERS SALARY Location
#2 *Ronaldo* 6 European
#
#$`South American`
# PLAYERS SALARY Location
#1 Neymar 5 South American
#3 Messi 7 South American
Now you can access each list element (which is a data.frame) by typing:
dflist[["European"]] # or "South American" instead
# PLAYERS SALARY Location
#2 *Ronaldo* 6 European
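If the asterisks are no longer needed once the Location column exists, they can be stripped from the names. A minimal sketch in base R, rebuilding the toy df from above; the column name PLAYER_CLEAN is a hypothetical name chosen for illustration:

```r
# Toy data mirroring the example above
df <- data.frame(PLAYERS = c("Neymar", "*Ronaldo*", "Messi"), SALARY = 5:7,
                 stringsAsFactors = FALSE)

# Flag rows first, while the asterisks are still present
df$Location <- ifelse(grepl("*", df$PLAYERS, fixed = TRUE),
                      "European", "South American")

# Remove every literal asterisk; fixed = TRUE treats "*" as plain text
# PLAYER_CLEAN is a hypothetical column name, not part of the original data
df$PLAYER_CLEAN <- gsub("*", "", df$PLAYERS, fixed = TRUE)

# Split as before; each list element now carries the cleaned names too
dflist <- split(df, df$Location)
```

Using fixed = TRUE avoids having to escape the asterisk as "\\*" or "[*]".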
Answer 1 (score: 1)
You can split on this particular column using split and setNames:
dat <- structure(list(PLAYERS = structure(c(6L, 1L, 5L, 7L, 2L, 4L, 3L),
.Label = c("*Bale*", "*Benzema*", "DiMaria", "*Iniesta*",
"Messi", "Neymar", "*Ronaldo*"), class = "factor")),
.Names = "PLAYERS", class = "data.frame", row.names = c(NA,-7L))
# split() orders a logical grouping as FALSE, TRUE, so the group
# without an asterisk (South American) comes first
setNames(split(dat, grepl("[*]", dat$PLAYERS)), nm = c("SoAm", "Euro"))
#$SoAm
# PLAYERS
# 1 Neymar
# 3 Messi
# 7 DiMaria
#
# $Euro
# PLAYERS
# 2 *Bale*
# 4 *Ronaldo*
# 5 *Benzema*
# 6 *Iniesta*
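Because split orders a logical grouping as FALSE then TRUE, the names passed to setNames are easy to get backwards. A sketch that sidesteps the ordering by using logical indexing instead; the variable names is_euro, euro and soam are chosen here for illustration:

```r
# Same players as above, stored as plain character strings
dat <- data.frame(PLAYERS = c("Neymar", "*Bale*", "Messi", "*Ronaldo*",
                              "*Benzema*", "*Iniesta*", "DiMaria"),
                  stringsAsFactors = FALSE)

# TRUE where the name contains a literal asterisk
is_euro <- grepl("*", dat$PLAYERS, fixed = TRUE)

# drop = FALSE keeps the result a data.frame even with a single column
euro <- dat[is_euro, , drop = FALSE]
soam <- dat[!is_euro, , drop = FALSE]
```

Each subset is named explicitly, so the result does not depend on the order in which the groups are returned.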
Answer 2 (score: 0)
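Create a PivotTable from your source data with PLAYERS for ROWS. Filter with Label Filters, Contains... ~* (the tilde escapes the asterisk, which is otherwise a wildcard), then click Grand Total. Return to the PT, select Does Not Contain... and click Grand Total again.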