Here is my problem:
df1 <- data.frame(x = 1:5, y = 2:6, z = 3:7)
rownames(df1) <- LETTERS[1:5]
df1
x y z
A 1 2 3
B 2 3 4
C 3 4 5
D 4 5 6
E 5 6 7
df2 <- data.frame(x = 1:5, y = 2:6, z = 3:7)
rownames(df2) <- LETTERS[3:7]
df2
x y z
C 1 2 3
D 2 3 4
E 3 4 5
F 4 5 6
G 5 6 7
What I want is:
x y z
A 1 2 3
B 2 3 4
C 4 6 8
D 6 8 10
E 8 10 12
F 4 5 6
G 5 6 7
where rows that share the same row name are summed variable by variable.
Answer 0 (score: 10)
Use:
# create a new variable from the rownames
df1$rn <- rownames(df1)
df2$rn <- rownames(df2)
# bind the two dataframes together by row and aggregate
res <- aggregate(cbind(x,y,z) ~ rn, rbind(df1,df2), sum)
# or (thx to @alistaire for reminding me):
res <- aggregate(. ~ rn, rbind(df1,df2), sum)
# assign the rownames again
rownames(res) <- res$rn
# get rid of the 'rn' column
res <- res[, -1]
This gives:
> res
x y z
A 1 2 3
B 2 3 4
C 4 6 8
D 6 8 10
E 8 10 12
F 4 5 6
G 5 6 7
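The steps above can be wrapped into a small function that accepts any number of data frames. This is only a sketch: `sum_by_rowname` is a hypothetical helper name, not a base R function, and it assumes every input has the same numeric columns. Note that the formula interface of `aggregate()` drops rows containing `NA` by default.

```r
# Example data from the question
df1 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[1:5])
df2 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[3:7])

# sum_by_rowname() is a hypothetical helper, not part of base R
sum_by_rowname <- function(...) {
  dfs <- list(...)
  # copy the row names into an 'rn' column and drop the real row names,
  # so rbind() has nothing to rename
  dfs <- lapply(dfs, function(d) data.frame(rn = rownames(d), d, row.names = NULL))
  stacked <- do.call(rbind, dfs)
  # sum every remaining column within each value of 'rn'
  res <- aggregate(. ~ rn, data = stacked, FUN = sum)
  # restore the row names and drop the helper column
  rownames(res) <- res$rn
  res[, names(res) != "rn", drop = FALSE]
}

res <- sum_by_rowname(df1, df2)
res
```

Because the row names are stored in a column before binding, this also works when a third or fourth data frame is passed in.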
Answer 1 (score: 6)
Using dplyr,
library(dplyr)
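# add rownames as a column in each data.frame and bind rows
# (add_rownames() is deprecated; tibble::rownames_to_column() creates the same "rowname" column)
bind_rows(df1 %>% tibble::rownames_to_column(),
df2 %>% tibble::rownames_to_column()) %>%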
# evaluate following calls for each value in the rowname column
group_by(rowname) %>%
# add all non-grouping variables
summarise_all(sum)
## # A tibble: 7 x 4
## rowname x y z
## <chr> <int> <int> <int>
## 1 A 1 2 3
## 2 B 2 3 4
## 3 C 4 6 8
## 4 D 6 8 10
## 5 E 8 10 12
## 6 F 4 5 6
## 7 G 5 6 7
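In dplyr 1.0.0 and later, `summarise_all()` is superseded by `across()`. A minimal sketch of the same pipeline in that style, assuming dplyr >= 1.0.0 and the tibble package are installed:

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[1:5])
df2 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[3:7])

res <- bind_rows(tibble::rownames_to_column(df1),
                 tibble::rownames_to_column(df2)) %>%
  # one group per original row name
  group_by(rowname) %>%
  # sum every non-grouping column within each group
  summarise(across(everything(), sum), .groups = "drop")

res
```

The result is a tibble with the row names in a regular `rowname` column, since tibbles do not keep row names.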
Answer 2 (score: 2)
This might need some tweaking to make the rownames logic work on a longer example:
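dfr <- rbind(df1, df2)
# rbind() renames the duplicated rows to C1, D1, E1, so group on the first character
do.call(rbind, lapply(split(dfr, sapply(rownames(dfr), substr, 1, 1)), colSums))
x y z
A 1 2 3
B 2 3 4
C 4 6 8
D 6 8 10
E 8 10 12
F 4 5 6
G 5 6 7
If the rownames can all be assumed to be alphabetic characters, the solution should be easy.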
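Grouping on the first character only works for single-character row names. One way to handle longer names, sketched here as an assumption rather than as part of the original answer, is to take the grouping key from the original row names before `rbind()` makes the duplicates unique. The second pair of data frames (`d1`, `d2`) is made-up illustration data.

```r
# Example data from the question
df1 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[1:5])
df2 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[3:7])

# grouping key built from the untouched row names
key <- c(rownames(df1), rownames(df2))
dfr <- rbind(df1, df2)
# split by the key, sum each column per group, stack the sums into a matrix
res <- do.call(rbind, lapply(split(dfr, key), colSums))
res

# the same idea with multi-character row names (hypothetical data)
d1 <- data.frame(x = 1:2, y = 3:4, row.names = c("alpha", "beta"))
d2 <- data.frame(x = 5:6, y = 7:8, row.names = c("beta", "gamma"))
res2 <- do.call(rbind, lapply(split(rbind(d1, d2),
                                    c(rownames(d1), rownames(d2))),
                              colSums))
res2
```

The result is a numeric matrix; wrap it in `as.data.frame()` if a data frame is needed.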
Answer 3 (score: 1)
Another approach is to melt the data and then cast it. First, we add the row names as the last column of both data frames, thanks to @Procrastinatus Maximus:
df1$rn <- rownames(df1)
df2$rn <- rownames(df2)
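Then we melt the data by name (melt() and dcast() here are assumed to come from the reshape2 package):
library(reshape2)
melt(list(df1, df2), id.vars = "rn")
Then we use dcast together with the mget function, which retrieves multiple variables at once.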
mydf<- dcast(melt(mget(ls(pattern = "df\\d+")), id.vars = "rn"),
rn ~ variable, value.var = "value", fun.aggregate = sum)
rownames(mydf) <- mydf$rn
# get rid of the 'rn' column
mydf <- mydf[, -1]
> mydf
# x y z
#A 1 2 3
#B 2 3 4
#C 4 6 8
#D 6 8 10
#E 8 10 12
#F 4 5 6
#G 5 6 7
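To see what the `mget(ls(pattern = ...))` call collects, it can help to run it on its own. A small base R sketch follows; the anchored pattern `"^df\\d+$"` is a slight tightening of the one used above, added here so that names such as `mydf1` would not match.

```r
# Example data from the question
df1 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[1:5])
df2 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[3:7])

# ls() returns the names of matching objects; mget() fetches the objects
# themselves and returns them as a list named after those objects
dfs <- mget(ls(pattern = "^df\\d+$"))
names(dfs)
str(dfs, max.level = 1)
```

The list names are what `melt()` uses to label which data frame each melted row came from.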
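Answer 4 (score: 0)
You can also vectorize the operation by converting the data frames to matrices (this adds the values position by position, so it assumes both data frames have the same row names in the same order):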
result_df <- as.data.frame(as.matrix(df1) + as.matrix(df2))
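For the data in the question, where the row names only partly overlap, adding the matrices directly pairs row A of `df1` with row C of `df2`. A matrix-based sketch that does align by row name, using base R's `rowsum()` (a swapped-in technique, not what the answer above does), could look like this:

```r
# Example data from the question
df1 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[1:5])
df2 <- data.frame(x = 1:5, y = 2:6, z = 3:7, row.names = LETTERS[3:7])

# matrices may have duplicated row names, so rbind() keeps them as they are
stacked <- rbind(as.matrix(df1), as.matrix(df2))
# rowsum() adds up the rows that share the same group label
result_df <- as.data.frame(rowsum(stacked, group = rownames(stacked)))
result_df
```

`rowsum()` sorts the groups by default, so the rows come back in the order A to G.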