Filter rows of data frames stored in a list and create a new list

Time: 2017-06-16 10:52:44

Tags: r list dataframe filtering

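I have a list of 64 data frames. Data frame 1 and data frame 5 must have the same row names. The same goes for 2 and 6, 3 and 7, and so on. I can run a for loop and build a new list, but something is not working: I end up with the wrong number of rows.

Here is a simplified example that reproduces it: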

# Create dataframes and store in list
dfA <- data.frame(v1=c(1:6), v2=c("x1","x2","x3","x4","x5","x6"))
dfB <- data.frame(v1=c(1:6), v2=c("x1","x2","x3","x4","x5","x6"))
dfC <- data.frame(v1=c(1:5), v2=c("x1","x2","x3","x4","x5"))
dfD <- data.frame(v1=c(1:4), v2=c("x1","x2","x3","x4"))
example_dataframes = list(dfA, dfB, dfC, dfD)

# These vectors give the order of the process
vectorA = c(1,2)
vectorB = c(3,4)

# Create new list and start for loop
filtered_dataframes = list()
for (i in vectorA) {
  for (j in vectorB) {
    df1 = example_dataframes[[i]]
    df2 = example_dataframes[[j]]
    test = intersect(df1$v2, df2$v2)
    filtered_dataframes[[i]] <- df1[which(df1$v2 %in% test),]
    filtered_dataframes[[j]] <- df2[which(df2$v2 %in% test),]
  }
}

For this example, I expect to get:

sapply(filtered_dataframes, nrow)
> 5 4 5 4

1 Answer:

Answer 0 (score: 0)

This modified version should give you the result you need.

dfA <- data.frame(v1=c(1:6), v2=c("x1","x2","x3","x4","x5","x6"))
dfB <- data.frame(v1=c(1:6), v2=c("x1","x2","x3","x4","x5","x6"))
dfC <- data.frame(v1=c(1:5), v2=c("x1","x2","x3","x4","x5"))
dfD <- data.frame(v1=c(1:4), v2=c("x1","x2","x3","x4"))
example_dataframes = list(dfA, dfB, dfC, dfD)

# Put the comparison vectors into a list. Example: to compare dataframes 1 and 3, put in c(1,3)
vector.list <- list(c(1,3),c(2,4))

# Create new list and start for loop
filtered_dataframes = list()

# Loop through the list of vectors
for (i in vector.list) {
    # Get the first dataframe from the current vector being processed
    df1 = example_dataframes[[i[1]]]

    # Get the second dataframe from the current vector being processed
    df2 = example_dataframes[[i[2]]]

    # Get the intersection of the two dataframes
    test = intersect(df1$v2, df2$v2)

    # Add the first filtered dataframe to the list of filtered dataframes
    filtered_dataframes[[i[1]]] <- df1[which(df1$v2 %in% test),]

    # Add the second filtered dataframe to the list of filtered dataframes
    filtered_dataframes[[i[2]]] <- df2[which(df2$v2 %in% test),]
}
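
A note on why the loop in the question miscounts: the nested for loops visit every combination (1,3), (1,4), (2,3) and (2,4), so data frame 1 is filtered last against data frame 4 and ends up with 4 rows instead of 5. Pairing the indices by position avoids this. The following is only a sketch of the same idea with the filtering pulled into a function; `filter_pair`, `first` and `second` are hypothetical names that do not appear in the question, and `v2` is assumed to be the column to match on.

```r
# Hypothetical helper (not part of the original answer): keep only the rows
# of two data frames whose key column values occur in both.
filter_pair <- function(df1, df2, key = "v2") {
  common <- intersect(df1[[key]], df2[[key]])
  list(df1[df1[[key]] %in% common, , drop = FALSE],
       df2[df2[[key]] %in% common, , drop = FALSE])
}

# Same example data as in the question
dfA <- data.frame(v1 = 1:6, v2 = paste0("x", 1:6))
dfB <- data.frame(v1 = 1:6, v2 = paste0("x", 1:6))
dfC <- data.frame(v1 = 1:5, v2 = paste0("x", 1:5))
dfD <- data.frame(v1 = 1:4, v2 = paste0("x", 1:4))
example_dataframes <- list(dfA, dfB, dfC, dfD)

# Assumed pairing: element k of `first` is matched with element k of `second`
first  <- c(1, 2)
second <- c(3, 4)

# One slot per input data frame, filled in pair by pair
filtered_dataframes <- vector("list", length(example_dataframes))
for (k in seq_along(first)) {
  pair <- filter_pair(example_dataframes[[first[k]]],
                      example_dataframes[[second[k]]])
  filtered_dataframes[[first[k]]]  <- pair[[1]]
  filtered_dataframes[[second[k]]] <- pair[[2]]
}

sapply(filtered_dataframes, nrow)
# [1] 5 4 5 4
```

Looping over `seq_along(first)` visits each pair exactly once, which is what the list of vectors in the answer above achieves as well.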