How do I create a new column from a combination of existing character and numeric columns?

Time: 2016-07-04 03:43:46

Tags: r

My data are structured as follows:

  Name   Drill Movement Repetition DV
1 RUTH 90_Turn   Sprint          1 10
2 RUTH 90_Turn   Sprint          1 12
2 RUTH 90_Turn   Sprint          2 12
2 RUTH 90_Turn   Sprint          2  9
3 RUTH 90_Turn   Sprint          3 14
3 RUTH 90_Turn   Sprint          3 12
4 RUTH 90_Turn     Walk          1 13
4 RUTH 90_Turn     Walk          1 17
5 RUTH 90_Turn     Walk          2 11
5 RUTH 90_Turn     Walk          2 15

I would like to add a Trial column containing a code for each unique combination of Movement and Repetition, for example:

  Name   Drill Movement Repetition DV       Trial
1 RUTH 90_Turn   Sprint          1 10 D90_Sprint1
2 RUTH 90_Turn   Sprint          1 12 D90_Sprint1
2 RUTH 90_Turn   Sprint          2 12 D90_Sprint2
2 RUTH 90_Turn   Sprint          2  9 D90_Sprint2
3 RUTH 90_Turn   Sprint          3 14 D90_Sprint3
3 RUTH 90_Turn   Sprint          3 12 D90_Sprint3
4 RUTH 90_Turn     Walk          1 13   D90_Walk1
4 RUTH 90_Turn     Walk          1 17   D90_Walk1
5 RUTH 90_Turn     Walk          2 11   D90_Walk2
5 RUTH 90_Turn     Walk          2 15   D90_Walk2

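Note that Drill stays constant, as does Name: the data.frame contains only RUTH's data for this drill. DV is measured at least twice for each combination of Movement and Repetition.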

Is it possible to do this?

My data frame has 10140 observations, so a fast solution would be ideal. Thanks!

2 Answers:

Answer 0 (score: 1)

We can use paste0:

df1$Trial <- paste0("D90_", df1$Movement, df1$Repetition)

If the '90' should be taken from the 'Drill' column:

df1$Trial <- paste0("D", sub("_.*", "", df1$Drill), "_", df1$Movement, df1$Repetition)

Or with sprintf:

sprintf("D%s_%s%d", sub("_.*", "", df1$Drill), df1$Movement, df1$Repetition)
#[1] "D90_Sprint1" "D90_Sprint1" "D90_Sprint2" "D90_Sprint2" "D90_Sprint3" "D90_Sprint3" "D90_Walk1"   "D90_Walk1"   "D90_Walk2"  
#[10] "D90_Walk2" 
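To make the approach above easy to try, here is a minimal sketch that rebuilds the sample data from the question and applies the paste0 version. The column types are an assumption (Movement as character, Repetition as numeric); the real data may use factors or integers, which paste0 handles the same way. Because paste0 and sub are vectorized, this should comfortably handle a data frame of about 10140 rows without any loop.

```r
# Sample data rebuilt from the question; df1 is the name used in the answer.
# Column types are assumed, not taken from the original data.
df1 <- data.frame(
  Name       = "RUTH",
  Drill      = "90_Turn",
  Movement   = rep(c("Sprint", "Walk"), times = c(6, 4)),
  Repetition = c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2),
  DV         = c(10, 12, 12, 9, 14, 12, 13, 17, 11, 15),
  stringsAsFactors = FALSE
)

# Take the drill number from Drill, then append Movement and Repetition.
df1$Trial <- paste0("D", sub("_.*", "", df1$Drill), "_",
                    df1$Movement, df1$Repetition)

unique(df1$Trial)
# [1] "D90_Sprint1" "D90_Sprint2" "D90_Sprint3" "D90_Walk1"   "D90_Walk2"
```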

Answer 1 (score: 1)

Use paste0():

# "D" prefix added so the result matches the requested codes, e.g. "D90_Sprint1"
df$Trial <- paste0("D",
                   sub("_.*", "", df$Drill),
                   "_",
                   df$Movement,
                   df$Repetition)

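The call to sub() extracts the part of the Drill string (everything before the first underscore) that ends up in the final Trial value.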
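As a small illustration of what that sub() call does on its own, the sketch below runs it on a couple of strings. The pattern "_.*" matches the first underscore and everything after it, and replacing that match with "" leaves only the leading part. The values "45_Cut" and "90_Turn_Left" are made up for the example and do not appear in the question's data.

```r
# "90_Turn" is from the question; the other two values are hypothetical.
drill <- c("90_Turn", "45_Cut", "90_Turn_Left")

# Drop the first underscore and everything that follows it.
prefix <- sub("_.*", "", drill)
prefix
# [1] "90" "45" "90"
```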