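I have a list (selected_key_ratios) of 4 data frames ($ nestle; $ unilever; $ pepsico; $ abf). Each data frame contains one company's financial data. All data frames have the same row index and almost the same columns (sometimes only the currency differs). Here is a screenshot of the list.
I am trying to create a new list in which each item is a data frame built from one column, grouped by company. Here is a graphical example:
The same goes for every column of the data frames. I have been trying with lapply for hours, but nothing produces the expected result.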
Do you have any clue? Thanks a lot!
Answer 0: (score: 1)
You can try a nested lapply like this:
# Recreation of your list of data frames
w <- list(
  abc = data.frame(
    "eps_usd" = runif(10) * 10,
    "eps_gbp" = runif(10) * 8
  ),
  def = data.frame(
    "eps_usd" = runif(10) * 15,
    "eps_eur" = runif(10) * 13
  ),
  ghi = data.frame(
    "eps_gbp" = runif(10) * 35,
    "eps_aud" = runif(10) * 19
  ),
  jkl = data.frame(
    "eps_usd" = runif(10) * 2,
    "eps_aud" = runif(10) * 1.4
  )
)
# Create a new data frame with the year column
result <- data.frame(year = 2007:2016)
# Apply to each name in the list
lapply(names(w), function(tbl) {
  # Apply to each column name of each data frame
  lapply(colnames(w[[tbl]]), function(col) {
    # Copy the column into result under the name "<tbl>_<col>";
    # <<- is needed because result lives outside the function
    result[[paste0(tbl, "_", col)]] <<- w[[tbl]][[col]]
  })
})
Output:
> result
year abc_eps_usd abc_eps_gbp def_eps_usd def_eps_eur ghi_eps_gbp ghi_eps_aud jkl_eps_usd jkl_eps_aud
1 2007 8.107360 3.419094 11.660133 9.9744151 3.801628 1.936746299 1.36976914 0.58472812
2 2008 7.527040 2.342307 11.407357 5.6755403 13.433364 8.595490269 0.31085568 0.06655984
3 2009 5.155562 4.272123 8.506886 8.5367400 20.305427 18.191703109 0.01993349 0.31829031
4 2010 2.947270 2.983519 5.686625 5.2630734 14.064397 9.049538589 0.92122668 0.55233980
5 2011 8.645507 2.657100 12.445061 6.9406141 5.056093 18.787235097 0.41227465 0.01664083
6 2012 7.192367 5.695391 3.620765 9.1173421 26.452499 0.002014068 1.84031115 0.38873530
7 2013 4.878473 1.527182 11.769227 9.6991108 16.232696 6.934076956 1.07328960 0.28808505
8 2014 1.766486 5.272151 12.656086 0.7318888 32.855694 15.643783443 1.33677381 1.09871196
9 2015 9.428541 6.462755 11.473938 4.3658361 7.547359 17.634770134 1.27743503 1.35510589
10 2016 6.047083 3.437785 13.845070 12.9766045 7.401827 18.032713128 1.73208881 0.03394082
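The same wide table can also be built without the global assignment `<<-`. The following is only a minimal sketch, not the answerer's code: it recreates a two-company version of `w` with a seed (the seed and the name `result2` are my additions) and assumes every data frame has the same number of rows as the year column.

```r
set.seed(42)  # assumption: seed added only so the sketch is reproducible

# Smaller stand-in for the list w used above
w <- list(
  abc = data.frame(eps_usd = runif(10) * 10, eps_gbp = runif(10) * 8),
  def = data.frame(eps_usd = runif(10) * 15, eps_eur = runif(10) * 13)
)

# Prefix each data frame's column names with its name in the list
prefixed <- lapply(names(w), function(tbl) {
  df <- w[[tbl]]
  names(df) <- paste0(tbl, "_", names(df))
  df
})

# Bind the year column and all prefixed data frames side by side
result2 <- do.call(cbind, c(list(data.frame(year = 2007:2016)), prefixed))
names(result2)
```

Because nothing is modified outside the anonymous function, this version can be wrapped in a function and reused on other lists without side effects.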
Answer 1: (score: 0)
Since no dataset was provided, I made one up.
set.seed(5489)
n <- 20
df_list <- list(
  nestle = data.frame(A = runif(n), B = runif(n), C = runif(n)),
  unilever = data.frame(D = runif(n), E = runif(n), F = runif(n)),
  abf = data.frame(G = runif(n), H = runif(n), I = runif(n))
)
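The code below assumes that you want to extract the first column of each data frame, and that you want to name each result column by combining the original data frame's name with the name of its first column.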
result <- as.data.frame(do.call(cbind, lapply(df_list, `[[`, 1)))
names(result) <- paste(names(result), sapply(df_list, function(DF) names(DF)[1]))
row.names(result) <- row.names(df_list[[1]])
head(result)
# nestle A unilever D abf G
#1 0.2348625 0.007785561 0.6453142
#2 0.5951392 0.494773356 0.2167643
#3 0.3001674 0.381868381 0.7182713
#4 0.1745270 0.983473145 0.8829462
#5 0.3387269 0.178523104 0.6042962
#6 0.1103261 0.211874225 0.4545857
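The question asks for a list with one data frame per column, not just the first column. The first-column extraction can be repeated over every column name the data frames share. This is a rough sketch under assumptions: the data below is made up, the column names `revenue` and `eps` and the names `shared_list`, `common_cols` and `by_column` are hypothetical, and columns that exist in only some data frames (for example those differing by currency) are simply dropped.

```r
set.seed(5489)
n <- 20

# Assumption: unlike df_list above, these data frames share column names
shared_list <- list(
  nestle   = data.frame(revenue = runif(n), eps = runif(n)),
  unilever = data.frame(revenue = runif(n), eps = runif(n)),
  abf      = data.frame(revenue = runif(n), eps = runif(n))
)

# Column names present in every data frame
common_cols <- Reduce(intersect, lapply(shared_list, names))

# One data frame per shared column, with one column per company
by_column <- lapply(setNames(common_cols, common_cols), function(col) {
  out <- as.data.frame(lapply(shared_list, `[[`, col))
  row.names(out) <- row.names(shared_list[[1]])
  out
})

names(by_column)      # the shared column names
names(by_column$eps)  # the company names
```

Each element of `by_column` can then be accessed by column name, for example `by_column$eps$unilever`.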