R - 列表操作

时间:2018-02-08 14:32:31

标签: r list dataframe

我有一个包含4个数据框的列表(selected_key_ratios)($ nestle; $ unilever; $ pepsico; $ abf)。每个数据框都包含公司的财务数据。所有数据帧都具有相同的行索引和几乎相同的列(有时只有货币不同)。这是列表的截图。

enter image description here

我正在尝试创建一个新列表,其中每个项目都是数据框的列,按公司分组。这是一个图形示例:

enter image description here

对于每一列数据帧都是如此。我用lapply尝试了几个小时,但没有任何东西能产生预期的结果。

你有任何线索吗?非常感谢!

2 个答案:

答案 0 :(得分:1)

你可以尝试这样嵌套的lapply

# Recreation of your list of dataframes
w <- list(
  abc = data.frame(
    "eps_usd" = runif(10) * 10,
    "eps_gbp" = runif(10) * 8
  ),
  def = data.frame(
    "eps_usd" = runif(10) * 15,
    "eps_eur" = runif(10) * 13
  ),
  ghi = data.frame(
    "eps_gbp" = runif(10) * 35,
    "eps_aud" = runif(10) * 19
  ),
  jkl = data.frame(
    "eps_usd" = runif(10) * 2,
    "eps_aud" = runif(10) * 1.4
  )
)

# Create a new dataframe with the year column
result <- data.frame(year = 2007:2016)

# Apply to each name in the list
lapply(names(w), function(tbl) {
  # Apply to each colname of each df
  lapply(colnames(w[[tbl]]), function(col) {
    # Assign to the reult df column the corresponding column int he list of df's
    result[[paste0(tbl, "_", col)]] <<- w[[tbl]][[col]]
  })
})

输出:

> result
   year abc_eps_usd abc_eps_gbp def_eps_usd def_eps_eur ghi_eps_gbp  ghi_eps_aud jkl_eps_usd jkl_eps_aud
1  2007    8.107360    3.419094   11.660133   9.9744151    3.801628  1.936746299  1.36976914  0.58472812
2  2008    7.527040    2.342307   11.407357   5.6755403   13.433364  8.595490269  0.31085568  0.06655984
3  2009    5.155562    4.272123    8.506886   8.5367400   20.305427 18.191703109  0.01993349  0.31829031
4  2010    2.947270    2.983519    5.686625   5.2630734   14.064397  9.049538589  0.92122668  0.55233980
5  2011    8.645507    2.657100   12.445061   6.9406141    5.056093 18.787235097  0.41227465  0.01664083
6  2012    7.192367    5.695391    3.620765   9.1173421   26.452499  0.002014068  1.84031115  0.38873530
7  2013    4.878473    1.527182   11.769227   9.6991108   16.232696  6.934076956  1.07328960  0.28808505
8  2014    1.766486    5.272151   12.656086   0.7318888   32.855694 15.643783443  1.33677381  1.09871196
9  2015    9.428541    6.462755   11.473938   4.3658361    7.547359 17.634770134  1.27743503  1.35510589
10 2016    6.047083    3.437785   13.845070  12.9766045    7.401827 18.032713128  1.73208881  0.03394082

答案 1 :(得分:0)

没有数据集,我就组成了一个。

set.seed(5489)
n <- 20
df_list <- list(
    nestle = data.frame(A = runif(n), B = runif(n), C = runif(n)),
    unilever = data.frame(D = runif(n), E = runif(n), F = runif(n)),
    abf = data.frame(G = runif(n), H = runif(n), I = runif(n))
)

下面的代码假定您要提取每个数据框的第一列,并且您希望使用原始df名称和第一列的名称组合来命名结果列。

result <- as.data.frame(do.call(cbind, lapply(df_list, `[[`, 1)))
names(result) <- paste(names(result), sapply(df_list, function(DF) names(DF)[1]))
row.names(result) <- row.names(df_list[[1]])

head(result)
#   nestle A  unilever D     abf G
#1 0.2348625 0.007785561 0.6453142
#2 0.5951392 0.494773356 0.2167643
#3 0.3001674 0.381868381 0.7182713
#4 0.1745270 0.983473145 0.8829462
#5 0.3387269 0.178523104 0.6042962
#6 0.1103261 0.211874225 0.4545857