Using unique() when naming data frame columns

Asked: 2018-07-24 13:46:38

Tags: r

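I'm having trouble naming the columns of a reshaped data frame. Using reshape alone gives me the wrong headers, so I tried to set the names myself, but I can't get the right names in the right positions.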

df<-data.frame(color=rep(c("red", "blue", "green"), 10), letter=c(letter=c("a", "b", "c", "d", "e", "b", "c", "d", "e", "f", "c", "d", "e", "f", "g", "d", "e", "f", "g", "h", "e", "b", "c", "d", "e", "f", "c", "d", "e", "f")))
b<-as.data.frame(table(df))
c<-reshape(b, direction="wide", idvar="color", timevar="letter")

  color Freq.a Freq.b Freq.c Freq.d Freq.e Freq.f Freq.g Freq.h
1  blue      0      1      2      1      3      2      0      1
2 green      0      1      2      2      2      2      1      0
3   red      1      1      1      3      2      1      1      0

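To get rid of the "Freq." prefix I assigned names myself, but that gives numbers as the column names instead of the letters. This happens no matter what I call the first column.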

names(c)<-c("color", unique(b$letter))
  color 1 2 3 4 5 6 7 8
1  blue 0 1 2 1 3 2 0 1
2 green 0 1 2 2 2 2 1 0
3   red 1 1 1 3 2 1 1 0

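I also tried unique on its own, without concatenating a name for the first column. Then the column names are the right letters, but they obviously end up in the wrong positions. How do I get the right unique values onto the right columns?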

names(c)<-unique(b$letter)

      a b c d e f g h NA
1  blue 0 1 2 1 3 2 0  1
2 green 0 1 2 2 2 2 1  0
3   red 1 1 1 3 2 1 1  0

2 answers:

Answer 0 (score: 1)

Is this what you mean?

> setNames(reshape(b, timevar="letter", idvar="color", direction="wide"), 
      c("Name", unique(b$letter)))
   Name 1 2 3 4 5 6 7 8
1  blue 0 1 2 1 3 2 0 1
2 green 0 1 2 2 2 2 1 0
3   red 1 1 1 3 2 1 1 0

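Answer 1 (score: 1)

Your b$letter column is a factor (so unique(b$letter) is a factor too). When you concatenate it with a character string, R implicitly coerces its underlying integer codes (not its level labels) to character, which gives you numbers.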

df <- data.frame(color=rep(c("red", "blue", "green"), 10), 
               letter=c(letter=c("a", "b", "c", "d", "e", 
                                 "b", "c", "d", "e", "f", 
                                 "c", "d", "e", "f", "g", 
                                 "d", "e", "f", "g", "h", 
                                 "e", "b", "c", "d", "e", 
                                 "f", "c", "d", "e", "f")))

b <- as.data.frame(table(df))
c <- reshape(b, direction="wide", idvar="color", timevar="letter")

You can easily verify this by comparing the following:

> unique(b$letter)
[1] a b c d e f g h
Levels: a b c d e f g h

> class(unique(b$letter))
[1] "factor"

> as.character(unique(b$letter))
[1] "a" "b" "c" "d" "e" "f" "g" "h"

> class(as.character(unique(b$letter)))
[1] "character"
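
To see the coercion in isolation, here is a minimal sketch. The factor f is a hypothetical stand-in for unique(b$letter), and the variable names are chosen only for illustration:

```r
# Hypothetical factor standing in for unique(b$letter)
f <- factor(c("a", "b", "c"))

# The first argument is a character string, so the default c() method is used;
# it drops the factor attributes and keeps only the underlying integer codes
codes_as_names <- c("color", f)

# Converting to character first keeps the level labels
labels_as_names <- c("color", as.character(f))

print(codes_as_names)   # "color" "1" "2" "3"
print(labels_as_names)  # "color" "a" "b" "c"
```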

The fix is as simple as using the second version, the one wrapped in as.character:

names(c) <- c("color", as.character(unique(b$letter)))

Alternatively, you can use sub to strip "Freq." from names(c) (IMO a safer and easier approach):

names(c) <- sub('^Freq\\.', '', names(c))

Result:

  color a b c d e f g h
1  blue 0 1 2 1 3 2 0 1
2 green 0 1 2 2 2 2 1 0
3   red 1 1 1 3 2 1 1 0
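
As a quick sanity check, the sketch below rebuilds the example data and confirms that both fixes produce the same names. The object name wide (used instead of c, to avoid masking the base function) and the two names_from_* variables are assumptions made for this illustration. One plausible reading of "safer" is that sub edits whatever names reshape actually produced, so it does not rely on the column order matching the order of unique(b$letter).

```r
# Rebuild the example data from the question
df <- data.frame(
  color  = rep(c("red", "blue", "green"), 10),
  letter = c("a", "b", "c", "d", "e", "b", "c", "d", "e", "f",
             "c", "d", "e", "f", "g", "d", "e", "f", "g", "h",
             "e", "b", "c", "d", "e", "f", "c", "d", "e", "f")
)
b <- as.data.frame(table(df))
wide <- reshape(b, direction = "wide", idvar = "color", timevar = "letter")

# Fix 1: build the names from the factor, converted to character first
names_from_unique <- c("color", as.character(unique(b$letter)))

# Fix 2: strip the "Freq." prefix from the names reshape() generated
names_from_sub <- sub("^Freq\\.", "", names(wide))

# Both approaches agree for this data
print(identical(names_from_unique, names_from_sub))

names(wide) <- names_from_sub
print(wide)
```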