How do I tidy mixed variables and observations in R?

Time: 2018-01-03 22:00:35

Tags: r dataset tidyr

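I can't find a solution for a dataset I imported from an HTML table. It mixes observations and variables together as rows (a nightmare).

It looks like this: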

w <- c(5,"A",1,2)
x <- c(5,"B",3,4)
y <- c(10,"A",5,6)
z <- c(10,"B",7,8)

df <- data.frame(w,x,y,z)

rownames(df) <- c("temp","cat","obs1", "obs2")
colnames(df) <- NA

df

temp  5  5 10 10
cat   A  B  A  B
obs1  1  3  5  7
obs2  2  4  6  8

The variables are temp and cat, while obs1 and obs2 are the observations. What I would like to get is:

obs   temp cat value
obs1  5    A   1  
obs1  5    B   3  
obs2  5    A   2  
obs2  5    B   4  
obs1  10   A   5  
obs1  10   B   7
obs2  10   A   6
obs2  10   B   8

I have been messing around with gather() and spread(), but got nowhere...

Any suggestions?

Thank you!

2 answers:

Answer 0 (score: 2)

Can't you just transpose it?

library(tidyverse)
w <- c(5,"A",1,2)
x <- c(5,"B",3,4)
y <- c(10,"A",5,6)
z <- c(10,"B",7,8)
df <- data.frame(w,x,y,z)

rownames(df) <- c("temp","cat","obs1", "obs2")
colnames(df) <- NA

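t(df) %>%                # transpose so that temp, cat, obs1, obs2 become columns
  as.data.frame() %>%    # t() returns a matrix, so convert back to a data frame
  gather(key = "k", value = "value", "obs1", "obs2") %>%  # stack obs1 and obs2 into one column
  select(-k) %>%         # drop the column that held the obs1/obs2 labels
  arrange(desc(temp))    # temp is not numeric here, so "5" sorts ahead of "10"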

  temp cat value
1    5   A     1
2    5   B     3
3    5   A     2
4    5   B     4
5   10   A     5
6   10   B     7
7   10   A     6
8   10   B     8
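
The output above no longer has the obs column the question asked for, because select(-k) drops it. As a minimal sketch of a variant that keeps it, the code below uses pivot_longer(), which superseded gather() in tidyr 1.0.0. It assumes tidyr >= 1.0.0 and dplyr are installed; it also keeps the column names w to z rather than setting them to NA, and the name tidy_df is only illustrative.

```r
library(dplyr)
library(tidyr)

# Rebuild the question's data. Every column is character because each
# vector mixes numbers and letters.
df <- data.frame(
  w = c(5, "A", 1, 2),
  x = c(5, "B", 3, 4),
  y = c(10, "A", 5, 6),
  z = c(10, "B", 7, 8),
  row.names = c("temp", "cat", "obs1", "obs2"),
  stringsAsFactors = FALSE
)

tidy_df <- as.data.frame(t(df), stringsAsFactors = FALSE) %>%
  # Stack obs1 and obs2, keeping their names in a new "obs" column
  pivot_longer(c(obs1, obs2), names_to = "obs", values_to = "value") %>%
  # Everything is still text after the transpose, so convert explicitly
  mutate(temp = as.numeric(temp), value = as.numeric(value)) %>%
  # Numeric temp now sorts 5 before 10
  arrange(temp, obs, cat) %>%
  select(obs, temp, cat, value)

tidy_df
```

Converting temp and value to numeric is a design choice rather than a requirement; without it the columns stay as text and sort alphabetically.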

Answer 1 (score: 0)

A solution using data.table. df3 is the final output.

library(data.table)

new_col <- rownames(df)      # Save row names as the new column name
df2 <- transpose(df)         # Transpose the data frame
names(df2) <- new_col        # Assign the column name
setDT(df2)                   # Convert to data.table

# Perform the transformation
df3 <- melt(df2, measure.vars = c("obs1", "obs2"), 
            variable.name = "obs")[
  order(-temp, obs), .(obs, temp, cat, value)
]

# Print df3
df3
#     obs temp cat value
# 1: obs1    5   A     1
# 2: obs1    5   B     3
# 3: obs2    5   A     2
# 4: obs2    5   B     4
# 5: obs1   10   A     5
# 6: obs1   10   B     7
# 7: obs2   10   A     6
# 8: obs2   10   B     8
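
For comparison, the same transpose-then-stack idea can be sketched in base R without any packages. This is only an illustration under a few assumptions: the column names w to z are kept rather than set to NA, temp and value are converted to numeric, and the names wide, obs_cols and long are hypothetical.

```r
# Rebuild the question's data; all columns are character
df <- data.frame(
  w = c(5, "A", 1, 2),
  x = c(5, "B", 3, 4),
  y = c(10, "A", 5, 6),
  z = c(10, "B", 7, 8),
  row.names = c("temp", "cat", "obs1", "obs2"),
  stringsAsFactors = FALSE
)

# Transpose so that temp, cat, obs1 and obs2 become columns
wide <- as.data.frame(t(df), stringsAsFactors = FALSE)
obs_cols <- c("obs1", "obs2")

# Stack the observation columns by hand: repeat temp and cat once per
# observation column, and label each block with the column it came from
long <- data.frame(
  obs   = rep(obs_cols, each = nrow(wide)),
  temp  = as.numeric(rep(wide$temp, times = length(obs_cols))),
  cat   = rep(wide$cat, times = length(obs_cols)),
  value = as.numeric(unlist(wide[obs_cols], use.names = FALSE)),
  stringsAsFactors = FALSE
)

# Sort by temp, then obs, then cat, and reset the row names
long <- long[order(long$temp, long$obs, long$cat), ]
rownames(long) <- NULL

long
```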