Understanding the reshape() function better

Time: 2019-01-07 22:08:45

Tags: r function reshape

Please consider the following dataset, taken from this question: Going from wide to long w/ coupled-columns: Is there a more R way to do this (i.e. - without using a for loop)?

(The dput() output is at the end of the question.)

   phrase wo1sp wo2sp wo3sp wo1sc wo2sc wo3sc
1   hello   dan  mark  todd    10     5     4
2   hello  mark   dan chris     8     9     4
3 goodbye  mark   dan   kev     2     4    10
4    what   kev   dan  mark     4     5     5

The goal is to reshape the data from wide to long, taking advantage of the pattern in the column names. The expected output is

    phrase time    sp sc
1    hello    1   dan 10
2    hello    1  mark  8
3  goodbye    1  mark  2
4     what    1   kev  4
5    hello    2  mark  5
6    hello    2   dan  9
7  goodbye    2   dan  4
8     what    2   dan  5
9    hello    3  todd  4
10   hello    3 chris  4
11 goodbye    3   kev 10
12    what    3  mark  5

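@docendodiscimus provided a sound solution using melt() from data.table, but for the sake of practice I would like to use reshape() from the stats package.

The function is certainly powerful, but I almost always run into trouble with one of its arguments. This time it is new.row.names, an argument I have rarely used before.

I tried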

reshape(
  dat,
  idvar = "phrase",
  varying = list(
    "sp" = grep("sp$", names(dat)), 
    "sc" = grep("sc$", names(dat))
  ),
  direction = "long", 
  v.names = c("sp", "sc") # name of cols in long format
)

This returns the error

Error in `row.names<-.data.frame`(`*tmp*`, value = paste(ids, times[i],  : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique value when setting 'row.names': ‘hello.1’
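The 'hello.1' in the warning hints at where the clash comes from. As a minimal sketch, assuming the default row names are built by pasting each id value to the time index (which is what the paste(ids, times[i], ...) fragment in the error suggests), the repeated phrase value yields the same label twice. The names phrase, candidate and dup are illustrative only.

```r
# Copy of the phrase column from the question's data
phrase <- c("hello", "hello", "goodbye", "what")

# Assumed construction of the row names for the first time slice:
# id value, a dot, then the time index
candidate <- paste(phrase, 1, sep = ".")

# Labels that occur more than once cannot be used as row names
dup <- candidate[duplicated(candidate)]
print(dup)
```

If that assumption holds, any idvar that does not uniquely identify the rows of the wide data would trigger the same error.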

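After reading the error message, I figured out that the "solution" is the new.row.names argument, which I set to 1:12, see below. (I cheated here, because I looked at how many rows the data.table solution returned.)

My question is: what is the general solution to this problem?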

# works!
reshape(
  dat,
  idvar = "phrase",
  varying = list(
    "sp" = grep("sp$", names(dat)),
    "sc" = grep("sc$", names(dat))
  ),
  direction = "long", 
  v.names = c("sp", "sc"),
  new.row.names = 1:12 # 1:10000 would also work
)
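Two possible ways to avoid hard-coding 1:12 are sketched below; neither comes from an answer to this question, so treat them as assumptions to verify. The first computes the number of output rows as the number of input rows times the number of time points. The second leaves idvar out, on the assumption that the error stems from phrase not being unique, so that reshape() generates its own id column. The names sp_cols, sc_cols, long_fixed and long_auto are illustrative.

```r
# Same data as in the question, rebuilt so the sketch is self-contained
dat <- data.frame(
  phrase = c("hello", "hello", "goodbye", "what"),
  wo1sp = c("dan", "mark", "mark", "kev"),
  wo2sp = c("mark", "dan", "dan", "dan"),
  wo3sp = c("todd", "chris", "kev", "mark"),
  wo1sc = c(10L, 8L, 2L, 4L),
  wo2sc = c(5L, 9L, 4L, 5L),
  wo3sc = c(4L, 4L, 10L, 5L),
  stringsAsFactors = FALSE
)

sp_cols <- grep("sp$", names(dat))  # positions of the *sp columns
sc_cols <- grep("sc$", names(dat))  # positions of the *sc columns

# Option 1: keep idvar = "phrase" and derive the row names from the data
long_fixed <- reshape(
  dat,
  idvar = "phrase",
  varying = list(sp_cols, sc_cols),
  v.names = c("sp", "sc"),
  direction = "long",
  new.row.names = seq_len(nrow(dat) * length(sp_cols))
)

# Option 2: omit idvar so that reshape() adds a unique "id" column itself
long_auto <- reshape(
  dat,
  varying = list(sp_cols, sc_cols),
  v.names = c("sp", "sc"),
  direction = "long"
)

print(long_fixed)
print(long_auto)
```

Option 2 returns an extra id column, which can be dropped afterwards if only phrase, time, sp and sc are wanted.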

Data

dat <- structure(list(phrase = c("hello", "hello", "goodbye", "what"
), wo1sp = c("dan", "mark", "mark", "kev"), wo2sp = c("mark", 
"dan", "dan", "dan"), wo3sp = c("todd", "chris", "kev", "mark"
), wo1sc = c(10L, 8L, 2L, 4L), wo2sc = c(5L, 9L, 4L, 5L), wo3sc = c(4L, 
4L, 10L, 5L)), .Names = c("phrase", "wo1sp", "wo2sp", "wo3sp", 
"wo1sc", "wo2sc", "wo3sc"), class = "data.frame", row.names = c(NA, 
-4L))

0 answers:

No answers yet