How do I merge a list of data frames by a single column?

Asked: 2014-10-06 16:35:40

Tags: r merge dataframe

I have a list of data frames, each of which has a "Name" column and a sample column.

df1:

Name Sample1
A 23
B 445
C 456

df2:

Name Sample2
A 45
B 984
C 374

How can I merge all the data frames in the list so that the result looks like this:

merged:

Name Sample1 Sample2
A 23 45
B 445 984
C 456 374

I have tried the answers to similar questions on SO, but none of them produced the expected result. For example: merged.data.frame = Reduce(function(...) merge(..., all=T), list.of.data.frames)

Edit:

My original script is listed below.

# List all files in current working directory
fs <- list.files()

# Load the data from each file into a list of data frames
dfs <- lapply(fs, read.table, header=TRUE, sep="\t")

# Select only the Name and Concentration columns from the list of data frames
dfs <- lapply(dfs, subset, select=c(Name, Concentration))

# Sort each dataframe alphabetically by the Name column
dfs <- lapply(dfs, function(df){df[order(df$Name),]})

# Rename each Concentration heading with the basename of the filename where the data originates
for (i in 1:length(dfs)){colnames(dfs[[i]])[2] <- substr(fs[i], 1, nchar(fs[i]) - 4)}

# Merge all the dataframes together by the Name column

# Write merged dataframe out to a tab-delimited file
write.table(dfs, ".", sep="\t")
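
The merge step left empty in the script, and the final write, could be filled in roughly as follows. This is only a sketch under stated assumptions: the temporary directory, the two demo files, and the output name `merged.tsv` are hypothetical stand-ins for the real inputs, and `tools::file_path_sans_ext` is used in place of the `substr` call to drop the file extension.

```r
# Hypothetical input: two small tab-delimited files in a temporary directory
in_dir <- tempfile("merge_demo_")
dir.create(in_dir)
write.table(data.frame(Name = c("B", "A", "C"), Concentration = c(445, 23, 456)),
            file.path(in_dir, "Sample1.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(Name = c("A", "B", "C"), Concentration = c(45, 984, 374)),
            file.path(in_dir, "Sample2.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# Read every file and keep only the Name and Concentration columns
fs <- list.files(in_dir, full.names = TRUE)
dfs <- lapply(fs, read.table, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
dfs <- lapply(dfs, function(df) df[, c("Name", "Concentration")])

# Rename Concentration to the file's base name without its extension
for (i in seq_along(dfs)) {
  colnames(dfs[[i]])[2] <- tools::file_path_sans_ext(basename(fs[i]))
}

# Merge pairwise on Name; all = TRUE keeps names missing from some files
merged <- Reduce(function(x, y) merge(x, y, by = "Name", all = TRUE), dfs)

# Write the single merged data frame (not the list) to a file (not a directory)
out_file <- file.path(tempdir(), "merged.tsv")
write.table(merged, out_file, sep = "\t", row.names = FALSE, quote = FALSE)
```

Sorting each data frame beforehand is not needed here, because `merge` matches rows by the key value rather than by position.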

2 answers:

Answer 0 (score: 3)

This should work (see below):

dats <- list(df1=data.frame(Name=c("A", "B", "C"), Sample1=c(23, 445, 456), stringsAsFactors=FALSE),
             df2=data.frame(Name=c("A", "B", "C"), Sample2=c(45, 984, 374), stringsAsFactors=FALSE),
             df3=data.frame(Name=c("A", "B", "C"), Sample3=c(66, 111, 2), stringsAsFactors=FALSE))

dats

## $df1
##   Name Sample1
## 1    A      23
## 2    B     445
## 3    C     456
## 
## $df2
##   Name Sample2
## 1    A      45
## 2    B     984
## 3    C     374
## 
## $df3
##   Name Sample3
## 1    A      66
## 2    B     111
## 3    C       2

# with by="Name"
merged.data.frame <- Reduce(function(...) merge(..., by="Name", all=TRUE), dats)

merged.data.frame

##   Name Sample1 Sample2 Sample3
## 1    A      23      45      66
## 2    B     445     984     111
## 3    C     456     374       2

# without by="Name" (same result)
merged.data.frame <- Reduce(function(...) merge(..., all=TRUE), dats)

merged.data.frame

##   Name Sample1 Sample2 Sample3
## 1    A      23      45      66
## 2    B     445     984     111
## 3    C     456     374       2
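
The two calls agree here because "Name" is the only column the data frames share. When `by` is omitted, `merge` joins on every column name the inputs have in common, so the results can differ if the value columns have not been renamed yet. A small sketch of that situation, using made-up data frames `a` and `b` that both still carry a `Concentration` column:

```r
# Hypothetical inputs: the value column has the same name in both frames
a <- data.frame(Name = c("A", "B"), Concentration = c(1, 2), stringsAsFactors = FALSE)
b <- data.frame(Name = c("A", "B"), Concentration = c(1, 5), stringsAsFactors = FALSE)

# Without by: the join key is both Name and Concentration,
# so the non-matching B rows are stacked instead of placed side by side
m_default <- merge(a, b, all = TRUE)

# With by = "Name": one row per name, and the clashing value columns
# get the suffixes .x and .y
m_by <- merge(a, b, by = "Name", all = TRUE)

dim(m_default)  # 3 rows, 2 columns
names(m_by)     # "Name" "Concentration.x" "Concentration.y"
```

This may be one reason the same `Reduce` call gives unexpected output on real data, so passing `by="Name"` explicitly is the safer default.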

Answer 1 (score: 1)

It works for me. Is this an accurate reproduction of the problem you are trying to solve?

> X <- replicate(20, data.frame(name=letters, r=runif(26)), simplify=FALSE)
> for(i in 1:20) names(X[[i]])[2] <- paste0("Sample", i)
> M <- Reduce(function(x,y)merge(x,y,by="name"), X)
> str(M)
'data.frame':   26 obs. of  21 variables:
 $ name    : Factor w/ 26 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Sample1 : num  0.17 0.957 0.443 0.181 0.113 ...
 $ Sample2 : num  0.8983 0.1802 0.7817 0.0818 0.7741 ...
 $ Sample3 : num  0.7473 0.6888 0.0195 0.9815 0.9674 ...
 $ Sample4 : num  0.557 0.331 0.902 0.177 0.504 ...
 $ Sample5 : num  0.0784 0.1561 0.5524 0.2631 0.2082 ...
 $ Sample6 : num  0.7455 0.5604 0.7232 0.5651 0.0727 ...
 $ Sample7 : num  0.721 0.807 0.902 0.965 0.41 ...
 $ Sample8 : num  0.209 0.17 0.207 0.303 0.258 ...
 $ Sample9 : num  0.736 0.566 0.125 0.417 0.521 ...
 $ Sample10: num  0.639 0.778 0.499 0.57 0.934 ...
 $ Sample11: num  0.0104 0.1629 0.4513 0.4821 0.383 ...
 $ Sample12: num  0.20123 0.95563 0.39992 0.00256 0.69283 ...
 $ Sample13: num  0.466 0.735 0.857 0.695 0.673 ...
 $ Sample14: num  0.562 0.873 0.269 0.151 0.628 ...
 $ Sample15: num  0.809 0.75 0.414 0.644 0.953 ...
 $ Sample16: num  0.7729 0.0129 0.5654 0.5705 0.7514 ...
 $ Sample17: num  0.239 0.454 0.538 0.596 0.743 ...
 $ Sample18: num  0.29 0.1 0.806 0.66 0.668 ...
 $ Sample19: num  0.461 0.739 0.474 0.64 0.418 ...
 $ Sample20: num  0.631 0.369 0.913 0.655 0.641 ...
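
One difference from the other answer is worth noting: the `merge` call above does not pass `all=TRUE`, so it keeps only the names present in every data frame. That is harmless when all frames contain the same names, as in this reproduction, but it matters otherwise. A short sketch with made-up data frames `x` and `y` whose names only partly overlap:

```r
# Hypothetical inputs: "C" appears only in x, "D" only in y
x <- data.frame(Name = c("A", "B", "C"), Sample1 = c(23, 445, 456), stringsAsFactors = FALSE)
y <- data.frame(Name = c("A", "B", "D"), Sample2 = c(45, 984, 374), stringsAsFactors = FALSE)

# Default (inner join): only names found in both frames survive
inner <- merge(x, y, by = "Name")

# all = TRUE (full outer join): every name is kept, gaps become NA
outer <- merge(x, y, by = "Name", all = TRUE)

inner$Name  # "A" "B"
outer$Name  # "A" "B" "C" "D"
```

If rows seem to disappear after merging, comparing the two variants like this is a quick way to check whether mismatched names are the cause.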