Question

我的数据框看起来像

                ML1    ML1 SD       ML2    ML2 SD ...
aPhysics0 0.8730469 0.3329205 0.5950521 0.4908820
aPhysics1 0.8471074 0.3598839 0.6473829 0.4777848
aPhysics2 0.8593750 0.3476343 0.7031250 0.4568810
aPhysics3 0.8875000 0.3159806 0.7000000 0.4582576
aPhysics4 0.7962963 0.4027512 0.7654321 0.4237285
...

我想使用行名来创建一个看起来像

的数据框

     Institution Subject Class       ML1    ML1 SD       ML2    ML2 SD ...
[1,]           A Physics     0 0.8730469 0.3329205 0.5950521 0.4908820
[2,]           A Physics     1 0.8471074 0.3598839 0.6473829 0.4777848
[3,]           A Physics     2 0.8593750 0.3476343 0.7031250 0.4568810
[4,]           A Physics     3 0.8875000 0.3159806 0.7000000 0.4582576
[5,]           A Physics     4 0.7962963 0.4027512 0.7654321 0.4237285
...

最好的方法是什么？

Answer 1

假设您的data.frame是df，

header <- as.data.frame(do.call(rbind, strsplit(gsub("Physics", " Physics ", 
                rownames(df)), " ")))
names(header) <- c("Institution", "Subject", "Class")
cbind(header, df)
df.out <- cbind(header, df)
df.out$Institution <- toupper(df.out$Institution)

如果你有更多科目（广义解决方案）：

header <- as.data.frame(do.call(rbind, strsplit(gsub("^([a-z])(.*)([0-9])$", 
                 "\\1 \\2 \\3", rownames(df)), " ")))
names(header) <- c("Institution", "Subject", "Class")
df.out <- cbind(header, df)
df.out$Institution <- toupper(df.out$Institution)

Answer 2

假设行名称的形式（1个小写字符字符串-1位数字），您可以使用gsub的一些正则表达式：

#test data
x <- data.frame(ML1=runif(5),ML2=runif(5),row.names=paste0("aPhysics",1:5))

#logic
transform(x, Institution=toupper(gsub("^([a-z])([a-zA-Z]+)([0-9])$","\\1",rownames(x))), Subject=gsub("^([a-z])([a-zA-Z]+)([0-9])$","\\2",rownames(x)), Class=gsub("^([a-z])([a-zA-Z]+)([0-9])$","\\3",rownames(x)))
                 ML1       ML2 Institution Subject Class
aPhysics1 0.51680701 0.4102757           A Physics     1
aPhysics2 0.60388358 0.7438400           A Physics     2
aPhysics3 0.26504243 0.7598557           A Physics     3
aPhysics4 0.55900273 0.5263205           A Physics     4
aPhysics5 0.05589591 0.7903568           A Physics     5

从行名称中提取数据并插入表格中

2 个答案: