Shifting a column of non-numeric variables

Asked: 2018-09-19 08:17:57

Tags: r dataframe shift

If I have a dataframe of variables, how do I shift the entries in one column (e.g. Column 4) up by one and replace empty cells with "NA"?

For numeric data:

mydata <- data.frame(replicate(5,sample(1:20,10,rep=TRUE)))

> mydata
   X1 X2 X3 X4 X5
1  12  2  4  7 10
2  15  2 15  3  8
3  11 12 18 10  3
4  18  8  4 17 12
5  16 17  2  8 10
6   6  3 14 15 18
7  14  3 14 14 13
8  16 15 15  9 14
9  14 12 15 20  3
10 10 16  8 18  5

I can achieve this with a 'shift' function:

shift <- function(x, n){
 c(x[-(seq(n))], rep(NA, n))
 }

mydata[,4] <- shift(mydata[,4], 1)

> mydata
   X1 X2 X3 X4 X5
1  12  2  4  3 10
2  15  2 15 10  8
3  11 12 18 17  3
4  18  8  4  8 12
5  16 17  2 15 10
6   6  3 14 14 18
7  14  3 14  9 13
8  16 15 15 20 14
9  14 12 15 18  3
10 10 16  8 NA  5

This works when my data is numeric. But when my data is non-numeric, the shifted column is replaced by a numeric representation:

mydata<- data.frame(replicate(5,sample(c("apple", "banana", "peach", "grape"),10,rep=TRUE)))

> mydata
   X1     X2     X3     X4    X5
1  banana banana banana  grape apple
2   apple  peach  grape  grape apple
3   grape  grape banana  peach peach
4   apple  apple  peach banana peach
5   grape banana  grape  apple peach
6   grape  grape  grape banana apple
7   grape  grape  peach  apple peach
8  banana  grape banana  apple grape
9   peach  apple  peach  peach grape
10  apple  peach banana  grape grape


shift <- function(x, n){
 c(x[-(seq(n))], rep(NA, n))
 }
mydata[,4] <- shift(mydata[,4], 1)

> mydata
   X1     X2     X3 X4    X5
1  banana banana banana  3 apple
2   apple  peach  grape  4 apple
3   grape  grape banana  2 peach
4   apple  apple  peach  1 peach
5   grape banana  grape  2 peach
6   grape  grape  grape  1 apple
7   grape  grape  peach  1 peach
8  banana  grape banana  4 grape
9   peach  apple  peach  3 grape
10  apple  peach banana NA grape

Any ideas how to retain the "apple/banana/peach/grape" words after the shift? Or perhaps another approach is better? Thank you!

Desired result:

> mydata
   X1     X2     X3     X4    X5
1  banana banana banana  grape apple
2   apple  peach  grape  peach apple
3   grape  grape banana banana peach
4   apple  apple  peach  apple peach
5   grape banana  grape banana peach
6   grape  grape  grape  apple apple
7   grape  grape  peach  apple peach
8  banana  grape banana  peach grape
9   peach  apple  peach  grape grape
10  apple  peach banana     NA grape

1 answer:

Answer 0 (score: 0)

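The problem is that data.frame treats strings as factors (the default in R versions before 4.0.0, where stringsAsFactors = TRUE).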

set.seed(0)
fruit <- c("apple", "banana", "peach", "grape")
mydata <- data.frame(replicate(5,sample(fruit, 10, rep=T)))

> mydata
       X1     X2     X3     X4     X5
1   grape  apple  grape banana banana
2  banana  apple  grape banana  grape
3  banana  apple  apple  peach  peach
4   peach  peach  peach banana  grape
5   grape banana  apple  apple  peach
6   apple  grape banana  grape  peach
7   grape banana banana  peach  grape
8   grape  peach  apple  grape  apple
9   peach  grape banana  apple banana
10  peach banana  grape  peach  peach

> class(mydata[, 'X4'])
[1] "factor"
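To see where the numbers in the question come from, here is a minimal sketch on a toy vector (the name `f` is made up for illustration, it is not part of the data above). A factor stores integer codes plus a table of labels, and the codes are what surfaced after the shift.

```r
# Hypothetical toy factor, not the data from the question
f <- factor(c("grape", "apple", "peach"))

# Labels are stored once, sorted alphabetically by default
lv <- levels(f)          # "apple" "grape" "peach"

# Each element is stored as an integer index into those labels
codes <- as.integer(f)   # 2 1 3

# Indexing the labels by the codes recovers the original strings
restored <- lv[codes]    # "grape" "apple" "peach"
```

With the four fruits used here, the levels would be apple, banana, grape, peach, which is consistent with the 3, 4, 2, 1 values that appeared in column X4 of the question.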

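To fix this, you can use the data.table package, which does not treat strings as factors by default. It also ships with the shift function you need. To shift the values up by one, set the argument type='lead'. Note that setDT converts the data frame in place without changing column types, and shift keeps the labels, as the output below shows: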

library(data.table)
setDT(mydata)
mydata[, X4 := shift(X4, 1, type='lead')]

> mydata
        X1     X2     X3     X4     X5
 1:  grape  apple  grape banana banana
 2: banana  apple  grape  peach  grape
 3: banana  apple  apple banana  peach
 4:  peach  peach  peach  apple  grape
 5:  grape banana  apple  grape  peach
 6:  apple  grape banana  peach  peach
 7:  grape banana banana  grape  grape
 8:  grape  peach  apple  apple  apple
 9:  peach  grape banana  peach banana
10:  peach banana  grape   <NA>  peach
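If adding a package is not an option, a base-R variant of the question's own function may be enough. This is only a sketch: the name `shift_up` is hypothetical, it assumes 0 <= n <= length(x), and it returns a character vector rather than a factor.

```r
# Hypothetical helper, adapted from the shift() in the question
shift_up <- function(x, n = 1L) {
  # Work on the labels, not on the factor's integer codes
  if (is.factor(x)) x <- as.character(x)
  # Drop the first n entries and pad the end with NA
  c(x[seq_along(x) > n], rep(NA, n))
}

fruit <- c("apple", "banana", "peach", "grape")
# stringsAsFactors = FALSE keeps the columns as character in any R version
mydata <- data.frame(replicate(5, sample(fruit, 10, replace = TRUE)),
                     stringsAsFactors = FALSE)

mydata[, 4] <- shift_up(mydata[, 4], 1)
```

Either safeguard on its own should avoid the integer codes: creating the data frame with stringsAsFactors = FALSE, or converting with as.character inside the function. If the column needs to be a factor afterwards, wrap the result in factor().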