R: replace duplicate values in factor levels

Time: 2015-01-29 17:42:06

Tags: r replace

Here is my attempt at creating reproducible data for my problem:

df<-as.data.frame(cbind("value"=rnorm(29),"both.dates"=c(
"July July","July August","July October","July November","July December",
"August August","August October","August November","August December",
"September August","September September", "September October",
"September November","September December","October August",
"October September", "October October","October November",
"October December","November August","November September", 
"November October", "November November","November December",
"December August", "December September", "December October", 
"December November","December December")))
df$value<-as.numeric(df$value)
head(df)
> head(df)
value    both.dates
1     2     July July
2     8   July August
3    22  July October
4     3 July November
5    12 July December
6    17 August August

My data looks like the column "both.dates". "August December" and "December August" are the same thing. I want to replace all occurrences of "December August" with "August December".

I tried replace(), but that does not work on factors. Thanks.

1 answer:

Answer 0 (score: 4)

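Assuming you want to replace every duplicate, you could split each string into its two month names with strsplit(), then sort() them after converting to a factor whose levels are month.name. Sorting on those levels puts the two names in calendar order, so "December August" and "August December" both become "August December". Note that the result is a character vector; wrap it in factor() if you need a factor again.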

df$both.dates <- sapply(strsplit(as.character(df$both.dates), ' '),
       function(x) paste(sort(factor(x, levels= month.name)),
                 collapse=' '))
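
To see what the sort step does, here is a minimal sketch on a small toy vector. The input values and the helper name normalize_pair are made up for illustration and are not part of the original answer; the technique (sorting a factor with levels = month.name) is the same.

```r
# Toy input (hypothetical values, not the asker's data)
pairs <- factor(c("December August", "August December",
                  "July July", "October September"))

# Hypothetical helper: put the two month names of one label in calendar order
normalize_pair <- function(label) {
  months <- strsplit(label, " ", fixed = TRUE)[[1]]
  # Factor levels follow month.name, so sort() orders by calendar month
  ordered <- sort(factor(months, levels = month.name))
  paste(ordered, collapse = " ")
}

normalized <- vapply(as.character(pairs), normalize_pair,
                     character(1), USE.NAMES = FALSE)

# Convert back to a factor; duplicate labels now share a single level
normalized_factor <- factor(normalized)
print(levels(normalized_factor))
```

Here vapply() is used instead of sapply() only so that the return type is checked; either should work for this case.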