I would like to reshape this data frame:
mydf <- read.table(
text = "FORM DOSE gmean_AUC mean_AUC gmean_Cmax mean_Cmax
A 100 150 160 50 55
B 50 70 75 30 32",
header = TRUE, stringsAsFactors = FALSE)
into the following:
mydfout <-
EXPOSURE FORM DOSE gmean mean
AUC A 100 150 160
AUC B 50 70 75
Cmax A 100 50 55
Cmax B 50 30 32
How can I do this in R? Reshaping the data this way would let me easily generate and export my tables from R.
Answer 0 (score: 5)
This is a fairly standard "wide to long" reshaping problem, so a good place to start is the reshape() function.
reshape(mydf, direction = "long", idvar = 1:2, varying = 3:ncol(mydf),
timevar = "EXPOSURE", sep = "_")
## FORM DOSE EXPOSURE gmean mean
## A.100.AUC A 100 AUC 150 160
## B.50.AUC B 50 AUC 70 75
## A.100.Cmax A 100 Cmax 50 55
## B.50.Cmax B 50 Cmax 30 32
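To get from that result to the exact layout asked for in the question, the generated row names can be dropped and the columns reordered. This is only a minimal sketch in base R; the intermediate name `long` is my own choice and not part of the original answer.

```r
# Recreate the example data from the question
mydf <- read.table(
  text = "FORM DOSE gmean_AUC mean_AUC gmean_Cmax mean_Cmax
A 100 150 160 50 55
B 50 70 75 30 32",
  header = TRUE, stringsAsFactors = FALSE)

# Same reshape() call as above, with the id columns given by name
long <- reshape(mydf, direction = "long", idvar = c("FORM", "DOSE"),
                varying = 3:ncol(mydf), timevar = "EXPOSURE", sep = "_")

# Drop the generated row names (A.100.AUC, ...) and reorder the columns
rownames(long) <- NULL
mydfout <- long[, c("EXPOSURE", "FORM", "DOSE", "gmean", "mean")]
mydfout
```

From there, something like write.csv(mydfout, "mydfout.csv", row.names = FALSE) should be enough to export the table (the file name is just a placeholder).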
Another option is melt() from "data.table" (not melt() from "reshape2"):
library(data.table)
melt(as.data.table(mydf), measure.vars = patterns("^gmean", "^mean"))
The downside is that you don't get the "AUC" and "Cmax" values, but you can reintroduce them manually:
melt(as.data.table(mydf), measure.vars = patterns("^gmean", "^mean"))[
, variable := factor(variable, labels = c("AUC", "Cmax"))][]
To work around this while the "data.table" team works on it, you can also try ReshapeLong_() from this Gist.
Usage is:
ReshapeLong_(mydf, c(gmean = "^gmean_", mean = "^mean_"), variable.name = "EXPOSURE")
## DOSE FORM EXPOSURE gmean mean
## 1: 100 A AUC 150 160
## 2: 50 B AUC 70 75
## 3: 100 A Cmax 50 55
## 4: 50 B Cmax 30 32
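As far as I know, more recent "data.table" releases address this limitation with measure(), which can split the column names on a separator and keep the second part as its own column. The sketch below assumes an installed version that provides measure() (I believe 1.14.2 or later); the name `long` is just illustrative.

```r
library(data.table)

# Recreate the example data from the question
mydf <- read.table(
  text = "FORM DOSE gmean_AUC mean_AUC gmean_Cmax mean_Cmax
A 100 150 160 50 55
B 50 70 75 30 32",
  header = TRUE, stringsAsFactors = FALSE)

# Split each measure column name on "_": the first part names the value
# column (gmean / mean), the second part is stored in EXPOSURE
long <- melt(as.data.table(mydf),
             measure.vars = measure(value.name, EXPOSURE, sep = "_"))

# Put the columns in the order requested in the question
setcolorder(long, c("EXPOSURE", "FORM", "DOSE", "gmean", "mean"))
long
```

With this approach there is no need to relabel the `variable` column by hand.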
Answer 1 (score: 3)
library(dplyr)
library(tidyr)
mydfout <- mydf %>%
gather(Type, Value, -FORM, -DOSE) %>%
separate(Type, into = c("Summary", "EXPOSURE")) %>%
spread(Summary, Value) %>%
select(EXPOSURE, FORM, DOSE, gmean, mean) %>%
arrange(EXPOSURE)
mydfout
# EXPOSURE FORM DOSE gmean mean
# 1 AUC A 100 150 160
# 2 AUC B 50 70 75
# 3 Cmax A 100 50 55
# 4 Cmax B 50 30 32
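In current "tidyr" releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(), and the three reshaping steps above can be collapsed into a single call using the ".value" sentinel. This is a sketch assuming tidyr 1.0.0 or later, not a replacement the original answer endorsed.

```r
library(dplyr)
library(tidyr)

# Recreate the example data from the question
mydf <- read.table(
  text = "FORM DOSE gmean_AUC mean_AUC gmean_Cmax mean_Cmax
A 100 150 160 50 55
B 50 70 75 30 32",
  header = TRUE, stringsAsFactors = FALSE)

mydfout <- mydf %>%
  # ".value" keeps the part before "_" as the output column name
  # (gmean / mean); the part after "_" goes into EXPOSURE
  pivot_longer(cols = -c(FORM, DOSE),
               names_to = c(".value", "EXPOSURE"),
               names_sep = "_") %>%
  select(EXPOSURE, FORM, DOSE, gmean, mean) %>%
  arrange(EXPOSURE, FORM)
mydfout
```

The result is a tibble rather than a plain data frame; as.data.frame(mydfout) converts it back if that matters for the export step.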
Data
mydf <- read.table(text = "FORM DOSE gmean_AUC mean_AUC gmean_Cmax mean_Cmax
A 100 150 160 50 55
B 50 70 75 30 32",
header = TRUE, stringsAsFactors = FALSE)