I have a data frame df in R; here are its first 6 rows.
df <- data.frame (npi_one = c('n1487','n1952','n1952','n1467','n1467','n1538'),
npi_two = c('n1467','n1467','n1487','n1508','n1538','n1508'),
weight = c(1,1,2,1,1,1),
hee_provn1=c(rep(015171,3),rep(015443,3)))
I want to group by hee_provn1 and then run a loop over the groups. The code for the first iteration is:
library(igraph)
library(dplyr)
library(data.table)
df2 <- filter(df, hee_provn1 == 015171)
df3 <- df2 [,c("npi_one","npi_two")]
l = c(apply(df3,1,c))
G <- graph(l,directed = FALSE)
d <- degree(G)
c <- closeness(G,weight = df2$weight)
b <- betweenness(G, weight = df2$weight)
e <- eigen_centrality(G,weight = df2$weight)$vector
cent_df = data.frame(d,c,b,e)
colnames(cent_df) <- c('degree', 'closeness','betweenness','eigen')
setDT(cent_df, keep.rownames = TRUE)[]
setnames(cent_df,1,"npi")
cbind(hee_provn1 = 015171,cent_df)
The result table for the first iteration (hee_provn1 == 015171) is:
hee_provn1 npi degree closeness betweenness eigen
1: 15171 n1487 2 0.3333333 0.0 1.0000000
2: 15171 n1467 2 0.5000000 0.5 0.7320508
3: 15171 n1952 2 0.3333333 0.0 1.0000000
The result table for the second iteration (hee_provn1 == 015443) is:
hee_provn1 npi degree closeness betweenness eigen
1: 15443 n1467 2 0.5 0 1
2: 15443 n1508 2 0.5 0 1
3: 15443 n1538 2 0.5 0 1
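I am new to R and don't know how to group and loop over a data frame based on one of its columns.
Also, I would like my final result to be one big table that stacks all of the per-group tables, for example: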
hee_provn1 npi degree closeness betweenness eigen
1: 15171 n1487 2 0.3333333 0.0 1.0000000
2: 15171 n1467 2 0.5000000 0.5 0.7320508
3: 15171 n1952 2 0.3333333 0.0 1.0000000
4: 15443 n1467 2 0.5 0 1
5: 15443 n1508 2 0.5 0 1
6: 15443 n1538 2 0.5 0 1
For some reason I cannot use the R package tidyverse. Thanks.
I tried Balter's method:
df <- data.frame (npi_one = c('n1487','n1952','n1952','n1467','n1467','n1538'),
npi_two = c('n1467','n1467','n1487','n1508','n1538','n1508'),
weight = c(1,1,2,1,1,1),
hee_provn1=c(rep(015171,3),rep(015443,3)))
library(igraph)
library(dplyr)
library(data.table)
final.df <- c()
for(x in unique(df$hee_provn1)){
  df2 <- subset(df, subset = hee_provn1 == x)
  df3 <- df2 [,c("npi_one","npi_two")]
  l = c(apply(df3,1,c))
  G <- graph(l,directed = FALSE)
  d <- degree(G)
  c <- closeness(G,weight = df2$weight)
  b <- betweenness(G, weight = df2$weight)
  e <- eigen_centrality(G,weight = df2$weight)$vector
  result <- data.frame(d,c,b,e)
  setDT(result, keep.rownames = TRUE)[]
  setnames(result,1,"npi")
  cbind(hee_provn1 = x,result)
  final.df <- rbind(final.df, result)
}
colnames(final.df) <- c('npi','degree', 'closeness','betweenness','eigen')
The result is:
npi degree closeness betweenness eigen
1: n1487 2 0.3333333 0.0 1.0000000
2: n1467 2 0.5000000 0.5 0.7320508
3: n1952 2 0.3333333 0.0 1.0000000
4: n1467 2 0.5000000 0.0 1.0000000
5: n1508 2 0.5000000 0.0 1.0000000
6: n1538 2 0.5000000 0.0 1.0000000
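This differs from my desired result: the hee_provn1 column is missing. What went wrong, and how can I keep track of which iteration produced each row?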
Answer 0 (score: 2)
Start over in a fresh session without loading dplyr. Then...
library(data.table)
library(igraph)
setDT(df)
# clean bad formatting
df[, `:=`(npi_one = as.character(npi_one), npi_two = as.character(npi_two))]
df[, {
  G = graph_from_edgelist(cbind(npi_one, npi_two), directed = FALSE)
  .(
    v = V(G)$name,
    d = degree(G),
    c = closeness(G, weight = weight),
    b = betweenness(G, weight = weight),
    e = eigen_centrality(G, weight = weight)$vector
  )
}, by=hee_provn1]
which gives...
hee_provn1 v d c b e
1: 15171 n1487 2 0.3333333 0.0 1.0000000
2: 15171 n1467 2 0.5000000 0.5 0.7320508
3: 15171 n1952 2 0.3333333 0.0 1.0000000
4: 15443 n1467 2 0.5000000 0.0 1.0000000
5: 15443 n1508 2 0.5000000 0.0 1.0000000
6: 15443 n1538 2 0.5000000 0.0 1.0000000
How it works
The data.table syntax is DT[i, j, by=]: subset by i (not needed here), group by by=, then compute j. j should evaluate to a list, and list() can be written as .() for short.
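As a small illustration of the DT[i, j, by=] pattern described above, here is a minimal sketch on a made-up table. The names DT, grp, and val are hypothetical and do not come from the question.

```r
library(data.table)

# Toy table (hypothetical data, not from the question)
DT <- data.table(grp = c("a", "a", "b"), val = c(1, 2, 10))

# i is left empty; j is a list built with .() and is evaluated
# once for each distinct value of the by= column
res <- DT[, .(n = .N, total = sum(val)), by = grp]
print(res)
```

Each group contributes the rows returned by j, and data.table stacks them with the grouping column prepended, which is why no explicit loop or rbind() is needed.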
Why not load dplyr? It isn't needed, and igraph already has enough namespace conflicts.
If you really want to use dplyr, I strongly suggest not using data.table at the same time...
library(dplyr)
library(magrittr)
library(igraph)
# fix bad formatting
df %<>% mutate(npi_one = as.character(npi_one), npi_two = as.character(npi_two))
df %>% group_by(hee_provn1) %>% do(with(., {
  G = graph_from_edgelist(cbind(npi_one, npi_two), directed = FALSE)
  data.frame(
    v = V(G)$name,
    d = degree(G),
    c = closeness(G, weight = weight),
    b = betweenness(G, weight = weight),
    e = eigen_centrality(G, weight = weight)$vector
  )
}))
# A tibble: 6 x 6
# Groups: hee_provn1 [2]
hee_provn1 v d c b e
<dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 15171 n1487 2 0.3333333 0.0 1.0000000
2 15171 n1467 2 0.5000000 0.5 0.7320508
3 15171 n1952 2 0.3333333 0.0 1.0000000
4 15443 n1467 2 0.5000000 0.0 1.0000000
5 15443 n1508 2 0.5000000 0.0 1.0000000
6 15443 n1538 2 0.5000000 0.0 1.0000000
Answer 1 (score: 1)
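The simplest way I can think of (without recreating your entire code) is to subset the table for each unique value in hee_provn1, perform your operations on that subset, and then append the resulting data frame to the accumulated results.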
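The answer's original code is not preserved here, so the following is only a sketch of the subset-and-append pattern it describes, assuming the question's df and computing just the degree for brevity (the other centrality measures could be added as extra columns in the same way). The name pieces is hypothetical. Note that the attempt in the question calls cbind(hee_provn1 = x, result) without assigning its value, which is why the group column never reaches the final table; the sketch assigns it back.

```r
library(igraph)

df <- data.frame(
  npi_one = c('n1487','n1952','n1952','n1467','n1467','n1538'),
  npi_two = c('n1467','n1467','n1487','n1508','n1538','n1508'),
  weight = c(1,1,2,1,1,1),
  hee_provn1 = c(rep(15171,3), rep(15443,3)),
  stringsAsFactors = FALSE
)

pieces <- list()  # hypothetical name: one data frame per group
for (x in unique(df$hee_provn1)) {
  # subset the table for this group
  sub <- df[df$hee_provn1 == x, ]
  G <- graph_from_edgelist(as.matrix(sub[, c("npi_one", "npi_two")]),
                           directed = FALSE)
  result <- data.frame(npi = V(G)$name, degree = degree(G),
                       stringsAsFactors = FALSE)
  # assign the cbind() result so the group id is kept
  result <- cbind(hee_provn1 = x, result)
  pieces[[length(pieces) + 1]] <- result
}

# stack all per-group tables into one
final.df <- do.call(rbind, pieces)
rownames(final.df) <- NULL
print(final.df)
```

Collecting the pieces in a list and binding once at the end avoids growing a data frame inside the loop, but calling rbind() on each iteration as in the question would work as well for data of this size.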