I have a large data set whose first two columns look something like this:
team year
Arizona 2006
Arizona 2006
Arizona 2011
Oregon 2011
Oklahoma 2008
Colorado 2005
Colorado 2005
Colorado 2011
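I want to create another variable, based on team and year, holding the conference the team was in. I was thinking of something like if(data$team="Arizona|Oregon|Colorado" & year=2011){data$conf='Pac-12'}, but that doesn't work, because there will usually be multiple rows with the team and year I'm looking for. Does that make sense?
Thanks!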
Answer 0 (score: 4)
Maybe you need ifelse?
teams <- c("Arizona", "Oregon", "Colorado")
data$conf <- ifelse(data$team %in% teams & data$year == 2011,
"Pac-12", "something else")
Edit
You can also change the contents by subsetting, as @Simplefish has already shown you, if you would rather not use if, although your question did ask for if.
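As a side note on why a plain if does not fit here: the test on team and year produces one TRUE/FALSE per row, whereas if() is meant for a single condition. A minimal sketch, assuming a data frame named data rebuilt from the rows shown in the question:

```r
# Sample data mirroring the rows shown in the question
data <- data.frame(
  team = c("Arizona", "Arizona", "Arizona", "Oregon",
           "Oklahoma", "Colorado", "Colorado", "Colorado"),
  year = c(2006, 2006, 2011, 2011, 2008, 2005, 2005, 2011),
  stringsAsFactors = FALSE
)

teams <- c("Arizona", "Oregon", "Colorado")

# The condition is a logical vector with one element per row, not a single value
cond <- data$team %in% teams & data$year == 2011

# ifelse() chooses a value for every element of cond
data$conf <- ifelse(cond, "Pac-12", "something else")
```

Here cond has eight elements, so ifelse() fills in conf row by row, which is what the multiple matching rows in the question require.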
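Another approach is to make all the changes in a single step, so that later assignments don't overwrite earlier answers. You can nest ifelse calls like this: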
ifelse(data$team %in% teams & data$year == 2011,
"Pac-12", ifelse(data$team %in% "Oklahoma" & data$year == 2008,
"second answer", "third answer"))
But with many conditions this gets cumbersome, so maybe you want:
reference <- matrix(c(rep("Pac-12",3),rep("third answer",4),
"Second Answer",rep("fourth answer",8)),
4, 4,
dimnames=list(c("Arizona","Oregon", "Colorado", "Oklahoma"),
c("2011","2008","2006","2005") )
)
#> reference
# 2011 2008 2006 2005
#Arizona "Pac-12" "third answer" "fourth answer" "fourth answer"
#Oregon "Pac-12" "third answer" "fourth answer" "fourth answer"
#Colorado "Pac-12" "third answer" "fourth answer" "fourth answer"
#Oklahoma "third answer" "Second Answer" "fourth answer" "fourth answer"
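# as.character() guards against team being a factor, which cbind() would reduce to integer codes
data$conf <- with(data, reference[cbind(as.character(team), as.character(year))])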
# > data
# team year conf
#1 Arizona 2006 fourth answer
#2 Arizona 2006 fourth answer
#3 Arizona 2011 Pac-12
#4 Oregon 2011 Pac-12
#5 Oklahoma 2008 Second Answer
#6 Colorado 2005 fourth answer
#7 Colorado 2005 fourth answer
#8 Colorado 2011 Pac-12
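The lookup above relies on indexing a matrix with a two-column character matrix, where each row of the index is a (row name, column name) pair. A toy illustration, using a made-up matrix m that is not part of the original data:

```r
# Small named matrix, filled column by column:
#   x y
# a 1 3
# b 2 4
m <- matrix(1:4, nrow = 2, ncol = 2,
            dimnames = list(c("a", "b"), c("x", "y")))

# Each row of idx names one cell: (row name, column name)
idx <- cbind(c("a", "b", "a"),
             c("y", "x", "x"))

# Returns one value per row of idx
res <- m[idx]
```

Both columns of the index must be character for the names to be matched, which is why the years are used as column names like "2011" in the reference matrix.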
The last approach is to merge with a data.frame version of the reference ..... I'm sure someone else can demonstrate that.
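A minimal sketch of what the merge approach might look like. The lookup table below is hypothetical: it lists only the (team, year) pairs with a known label and reuses the placeholder labels from this answer.

```r
# Sample data mirroring the rows shown in the question
data <- data.frame(
  team = c("Arizona", "Arizona", "Arizona", "Oregon",
           "Oklahoma", "Colorado", "Colorado", "Colorado"),
  year = c(2006, 2006, 2011, 2011, 2008, 2005, 2005, 2011),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per (team, year) pair
lookup <- data.frame(
  team = c("Arizona", "Oregon", "Colorado", "Oklahoma"),
  year = c(2011, 2011, 2011, 2008),
  conf = c("Pac-12", "Pac-12", "Pac-12", "Second Answer"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of data; pairs missing from lookup get NA in conf
merged <- merge(data, lookup, by = c("team", "year"), all.x = TRUE)
```

Note that merge() sorts the result by the key columns by default, so the row order of merged may differ from that of data.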
Answer 1 (score: 1)
Or you can do it by subsetting directly:
data$conf <- rep(NA,nrow(data))
data$conf[(data$team == 'Arizona' | data$team == 'Oregon' | data$team == 'Colorado') & data$year == 2011]='PAC-12'
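The same idea can be written a little more compactly with %in%, and repeated once per rule. This is only a sketch: the data frame is rebuilt from the question's rows, and the "second answer" label for Oklahoma is a placeholder, not a real conference name.

```r
# Sample data mirroring the rows shown in the question
data <- data.frame(
  team = c("Arizona", "Arizona", "Arizona", "Oregon",
           "Oklahoma", "Colorado", "Colorado", "Colorado"),
  year = c(2006, 2006, 2011, 2011, 2008, 2005, 2005, 2011),
  stringsAsFactors = FALSE
)

# Start with a character column of missing values
data$conf <- NA_character_

# One assignment per rule; rows that match no rule stay NA
pac12 <- c("Arizona", "Oregon", "Colorado")
data$conf[data$team %in% pac12 & data$year == 2011] <- "PAC-12"
data$conf[data$team == "Oklahoma" & data$year == 2008] <- "second answer"  # placeholder label
```

Because each assignment only touches the rows selected by its own condition, the rules do not overwrite one another as long as their conditions do not overlap.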