Question

I have a data.frame that contains one character variable and several numeric variables, like this:

sampleDF <- data.frame(a = c(1,2,3,"String"), b = c(1,2,3,4), c= c(5,6,7,8), stringsAsFactors = FALSE)

It looks like this:

       a b c
1      1 1 5
2      2 2 6
3      3 3 7
4 String 4 8

I want to transpose this data.frame so that it looks like this:

  V1 V2 V3     V4
1  1  2  3 String
2  1  2  3      4
3  5  6  7      8

I tried

c<-t(sampleDF)

as well as

d<-transpose(sampleDF)

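but both approaches leave V1, V2 and V3 as character columns, even though they contain only numeric values.

I know this question has been asked many times. However, I have not found a suitable answer explaining why V1, V2 and V3 are also converted to character in this case.

Is there a way to make sure these columns stay numeric?

Many thanks in advance, and apologies for the duplicate nature of this question.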

Edit:

as.data.frame(t(sampleDF))

does not solve the problem:

'data.frame':   3 obs. of  4 variables:
 $ V1: Factor w/ 2 levels "1","5": 1 1 2
  ..- attr(*, "names")= chr  "a" "b" "c"
 $ V2: Factor w/ 2 levels "2","6": 1 1 2
  ..- attr(*, "names")= chr  "a" "b" "c"
 $ V3: Factor w/ 2 levels "3","7": 1 1 2
  ..- attr(*, "names")= chr  "a" "b" "c"
 $ V4: Factor w/ 3 levels "4","8","String": 3 1 2
  ..- attr(*, "names")= chr  "a" "b" "c"

Answer 1

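After transposing, convert the columns back to numeric with type.convert:

out <- as.data.frame(t(sampleDF), stringsAsFactors = FALSE)
out[] <- lapply(out, type.convert, as.is = TRUE)
row.names(out) <- NULL
out
#  V1 V2 V3     V4
#1  1  2  3 String
#2  1  2  3      4
#3  5  6  7      8

str(out)
#'data.frame': 3 obs. of 4 variables:
# $ V1: int 1 1 5
# $ V2: int 2 2 6
# $ V3: int 3 3 7
# $ V4: chr "String" "4" "8"

Or rbind the first column, converted to the respective types, with the other columns transposed:

rbind(lapply(sampleDF[,1], type.convert, as.is = TRUE), as.data.frame(t(sampleDF[2:3])))

Note: the first method will be more efficient.

Or another approach is to paste the values of each column together and then read the result back in:

read.table(text=paste(sapply(sampleDF, paste, collapse=" "), collapse="\n"), header = FALSE, stringsAsFactors = FALSE)
#  V1 V2 V3     V4
#1  1  2  3 String
#2  1  2  3      4
#3  5  6  7      8

Or we can convert the 'data.frame' to a 'data.matrix', which changes the character elements to NA, then use is.na to find the index of the NA elements and replace them with the original string values:

m1 <- data.matrix(sampleDF)
out <- as.data.frame(t(m1))
out[is.na(out)] <- sampleDF[is.na(m1)]

Note: this relies on data.matrix turning character values into NA. As far as I know, since R 4.0.0 data.matrix converts character columns to factor codes instead, so check the behavior on your R version before using this variant.

Or another option is type_convert from readr.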
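The answer names readr's type_convert without showing it, so here is a minimal sketch of how that option might look. It also prints the type of the transposed object to illustrate where the coercion comes from: t() goes through as.matrix(), and a matrix can hold only one type, so a single character column makes every element character. The readr package being installed is an assumption. The fallback to base type.convert is my addition so the snippet still runs without readr.

```r
# Rebuild the example data from the question.
sampleDF <- data.frame(a = c(1, 2, 3, "String"),
                       b = c(1, 2, 3, 4),
                       c = c(5, 6, 7, 8),
                       stringsAsFactors = FALSE)

# t() coerces the data.frame to a matrix; because column 'a' is character,
# the whole matrix becomes character.
m <- t(sampleDF)
print(typeof(m))

# Back to a data.frame with plain character columns and default row names.
out <- as.data.frame(m, stringsAsFactors = FALSE)
row.names(out) <- NULL

# Re-guess the column types. readr is assumed to be installed; otherwise
# fall back to base R's type.convert (fallback added for illustration).
if (requireNamespace("readr", quietly = TRUE)) {
  out <- readr::type_convert(out)
} else {
  out[] <- lapply(out, type.convert, as.is = TRUE)
}

str(out)
```

With readr, the numeric columns may come back as double rather than integer, depending on the readr version and its guess_integer argument. V4 stays character in either branch because it contains "String".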

R: transposing a numeric data.frame produces character variables