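I need to differentiate the Id column (by appending letters) when rows that share an Id differ in another variable ("Art" in this case). Like this: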
Id<-c("RoLu1976", "RoLu1976", "AlBlKyFy1989", "ThSa1996", "AlBlKyFy1989","ThSa1996")
Art<-c("Econometric Policy Evaluation", "Policy Right", "Rules", "Expectations", "Nonneutrality of money","Expectations")
Yr<-c(1976, 1976, 1989, 1996, 1989, 1996)
df<-data.frame(Id,Art,Yr)
For the data above, the Ids should be:
Id             Art                            Yr
RoLu1976a      Econometric Policy Evaluation  1976
RoLu1976b      Policy Right                   1976
AlBlKyFy1989a  Rules                          1989
ThSa1996       Expectations                   1996
AlBlKyFy1989b  Nonneutrality of money         1989
ThSa1996       Expectations                   1996
Here the Id column is the same for some rows (e.g. RoLu1976), but those rows differ in the "Art" column.
Answer 0 (score: 4)
Using the dplyr package:
library(dplyr)
df %>%
arrange(Id, Art) %>%
group_by(Id) %>%
mutate(Id2 = if(length(unique(Art)) > 1) paste0(Id, "_", letters[as.numeric(factor(Art))]) else as.character(Id)) %>%
ungroup %>%
select(Id=Id2, everything(), -Id)
  Id              Art                            Yr
1 AlBlKyFy1989_a  Nonneutrality of money         1989
2 AlBlKyFy1989_b  Rules                          1989
3 RoLu1976_a      Econometric Policy Evaluation  1976
4 RoLu1976_b      Policy Right                   1976
5 ThSa1996        Expectations                   1996
6 ThSa1996        Expectations                   1996
Answer 1 (score: 1)
Using dplyr:
df%>%group_by(Id)%>%
mutate(nb_art=length(unique(Art)))%>%
mutate(lettre=letters[seq(nb_art)])%>%
mutate(Id_letters=paste0(Id,ifelse(nb_art>1,lettre,"")))%>%
ungroup()%>%
mutate(Id=Id_letters)%>%
select(Id,Art,Yr)
This could be shortened, but written this way it is easy to read (I hope).
# A tibble: 6 x 3
Id Art Yr
<chr> <fctr> <dbl>
1 RoLu1976a Econometric Policy Evaluation 1976
2 RoLu1976b Policy Right 1976
3 AlBlKyFy1989a Rules 1989
4 ThSa1996 Expectations 1996
5 AlBlKyFy1989b Nonneutrality of money 1989
6 ThSa1996 Expectations 1996
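The pipeline above picks each letter by row position within the Id group, which works for this data but would hand different letters to two rows that repeat the same title under an Id that also has other titles. A minimal sketch of a variant that ties the letter to the title itself, using match() against the distinct titles of each Id (the order of first appearance decides which title gets "a"; this is an assumption, not something the question specifies):

```r
library(dplyr)

# Same data as in the question, kept as character columns
df <- data.frame(
  Id  = c("RoLu1976", "RoLu1976", "AlBlKyFy1989", "ThSa1996", "AlBlKyFy1989", "ThSa1996"),
  Art = c("Econometric Policy Evaluation", "Policy Right", "Rules",
          "Expectations", "Nonneutrality of money", "Expectations"),
  Yr  = c(1976, 1976, 1989, 1996, 1989, 1996),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(Id) %>%
  mutate(
    nb_art = n_distinct(Art),
    # position of each title among the distinct titles of this Id
    lettre = letters[match(Art, unique(Art))],
    # only append a letter when the Id covers more than one title
    Id_letters = paste0(Id, ifelse(nb_art > 1, lettre, ""))
  ) %>%
  ungroup() %>%
  mutate(Id = Id_letters) %>%
  select(Id, Art, Yr)

print(res)
```

Because letters holds only 26 values, an Id with more than 26 distinct titles would get NA suffixes under this sketch.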
Answer 2 (score: 1)
A data.table solution:
library(data.table)
setDT(df)
df[, tmp := seq(uniqueN(Art)), by = Id]
df[, addition := ifelse(.N>1, "",letters[tmp]), by = .(Id, Art)]
df[, Id := paste0(Id, addition)]
df[, c("tmp", "addition") := NULL]
Answer 3 (score: 1)
Using a for loop:
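A minimal base-R sketch of what a loop-based approach could look like for the question's data; it is an illustration under stated assumptions, not necessarily the code this answer had in mind. The helper variable names (new_id, rows, titles) are hypothetical, and letters are assigned in order of first appearance of each title within an Id.

```r
# Same data as in the question, kept as character columns
df <- data.frame(
  Id  = c("RoLu1976", "RoLu1976", "AlBlKyFy1989", "ThSa1996", "AlBlKyFy1989", "ThSa1996"),
  Art = c("Econometric Policy Evaluation", "Policy Right", "Rules",
          "Expectations", "Nonneutrality of money", "Expectations"),
  Yr  = c(1976, 1976, 1989, 1996, 1989, 1996),
  stringsAsFactors = FALSE
)

# Work on a copy so the comparisons below always see the original Ids
new_id <- df$Id

for (id in unique(df$Id)) {
  rows   <- which(df$Id == id)      # all rows sharing this Id
  titles <- unique(df$Art[rows])    # distinct titles under this Id
  if (length(titles) > 1) {
    # same title -> same letter; first title seen gets "a"
    new_id[rows] <- paste0(id, letters[match(df$Art[rows], titles)])
  }
}

df$Id <- new_id
print(df)
```

Ids whose rows all share one title (ThSa1996 here) are left unchanged, matching the expected output in the question.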