"Unknown column" in R

Time: 2019-01-05 18:07:12

Tags: r dataframe filtering regression

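I am working on an economics study and have filled a data frame with regression coefficients using melt and the tidy function from the broom package. My df: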

    > head(LmModGDP, 10)
           Country            variable        term      estimate    std.error statistic      p.value
    1  Netherlands   FDI_InFlow_MilUSD (Intercept)  5.354083e+02 5.974760e+01  8.961167 1.976417e-09
    2  Netherlands   FDI_InFlow_MilUSD       value  2.400677e-03 1.409779e-03  1.702875 1.005189e-01
    3  Netherlands  FDI_InFlow_percGDP (Intercept)  6.184273e+02 6.723554e+01  9.197923 1.173719e-09
    4  Netherlands  FDI_InFlow_percGDP       value -1.261933e+00 1.008740e+01 -0.125100 9.014067e-01
    5  Netherlands  FDI_InStock_MilUSD (Intercept)  3.110956e+02 2.719577e+01 11.439116 1.201802e-11
    6  Netherlands  FDI_InStock_MilUSD       value  7.025298e-04 5.307147e-05 13.237429 4.620706e-13
    7  Netherlands  FDI_OutFlow_MilUSD (Intercept)  5.106762e+02 5.939921e+01  8.597356 4.465840e-09
    8  Netherlands  FDI_OutFlow_MilUSD       value  1.920313e-03 8.646908e-04  2.220808 3.528536e-02
    9  Netherlands FDI_OutFlow_percGDP (Intercept)  2.593453e+02 5.334202e+01  4.861932 4.838082e-05
    10 Netherlands FDI_OutFlow_percGDP       value  3.931491e+00 5.332541e-01  7.372641 7.896681e-08

After filtering the df by any method (even plain subsetting, or the dplyr package):

LmModGDP[LmModGDP$variable == "FDI_InStock_MilUSD",]

LmModGDP %>%
  filter(variable == "FDI_InStock_MilUSD")

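It returns the df I want, but when I hover the mouse over the last column (p.value) in the RStudio viewer, it tells me it is an "Unknown column", even though the data is still correct. Also, when I run the str and class functions on it, they report the column as numeric, yet the viewer shows something else.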

The df I want:

           Country           variable        term     estimate    std.error statistic      p.value
    5  Netherlands FDI_InStock_MilUSD (Intercept) 3.110956e+02 2.719577e+01 11.439116 1.201802e-11
    6  Netherlands FDI_InStock_MilUSD       value 7.025298e-04 5.307147e-05 13.237429 4.620706e-13
    19     Romania FDI_InStock_MilUSD (Intercept) 3.122229e+01 3.313134e+00  9.423796 7.188216e-10
    20     Romania FDI_InStock_MilUSD       value 2.128223e-03 7.035679e-05 30.249006 8.588104e-22

When I try to display it in a Markdown report with the kable function, the p.value column shows only 0 values instead of the actual values.

Can anyone help me?

Update:

Here is the output of str:

Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 28 obs. of  7 variables:
 $ Country  : chr  "Netherlands" "Netherlands" "Netherlands" "Netherlands" ...
 $ variable : Factor w/ 7 levels "FDI_InFlow_MilUSD",..: 1 1 2 2 3 3 4 4 5 5 ...
 $ term     : chr  "(Intercept)" "value" "(Intercept)" "value" ...
 $ estimate : num  535.4083 0.0024 618.4273 -1.2619 311.0956 ...
 $ std.error: num  59.7476 0.00141 67.23554 10.0874 27.19577 ...
 $ statistic: num  8.961 1.703 9.198 -0.125 11.439 ...
 $ p.value  : num  1.98e-09 1.01e-01 1.17e-09 9.01e-01 1.20e-11 ...
 - attr(*, "vars")= chr  "Country" "variable"
 - attr(*, "drop")= logi TRUE
 - attr(*, "indices")=List of 14
  ..$ : int  0 1
  ..$ : int  2 3
  ..$ : int  4 5
  ..$ : int  6 7
  ..$ : int  8 9
  ..$ : int  10 11
  ..$ : int  12 13
  ..$ : int  14 15
  ..$ : int  16 17
  ..$ : int  18 19
  ..$ : int  20 21
  ..$ : int  22 23
  ..$ : int  24 25
  ..$ : int  26 27
 - attr(*, "group_sizes")= int  2 2 2 2 2 2 2 2 2 2 ...
 - attr(*, "biggest_group_size")= int 2
 - attr(*, "labels")='data.frame':  14 obs. of  2 variables:
  ..$ Country : chr  "Netherlands" "Netherlands" "Netherlands" "Netherlands" ...
  ..$ variable: Factor w/ 7 levels "FDI_InFlow_MilUSD",..: 1 2 3 4 5 6 7 1 2 3 ...
  ..- attr(*, "vars")= chr  "Country" "variable"
  ..- attr(*, "drop")= logi TRUE

1 Answer:

Answer 0 (score: 0)

I can't comment yet, which is why I am writing this as an answer.

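Can you show us the output of str(LmModGDP)? Maybe the df is nested? Maybe it is not a plain df but carries special attributes. Have you tried coercing it with LmModGDP <- as.data.frame(LmModGDP)?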
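A minimal sketch of that coercion, assuming dplyr is installed. The `toy` data frame and its values are made up for illustration and only mimic the grouped structure of `LmModGDP`:

```r
library(dplyr)

# Hypothetical stand-in for LmModGDP (values invented for illustration)
toy <- data.frame(
  Country  = c("Netherlands", "Netherlands", "Romania", "Romania"),
  variable = factor(c("A", "B", "A", "B")),
  p.value  = c(1.2e-11, 4.6e-13, 7.2e-10, 8.6e-22),
  stringsAsFactors = FALSE
)
grouped <- group_by(toy, Country, variable)

# The grouped object carries extra classes on top of "data.frame"
print(class(grouped))

# Coercion strips the grouping and the tibble classes
plain <- as.data.frame(grouped)
print(class(plain))
```

Comparing `class()` before and after is a quick way to see whether the extra attributes are what the viewer is reacting to.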

Have you tried coercing the column with LmModGDP$p.value <- as.numeric(LmModGDP$p.value)?
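If p.value already turns out to be numeric, the zeros in the kable table may simply be a rounding effect: kable rounds numeric columns to its `digits` argument (by default `getOption("digits")`, usually 7), and values such as 1.2e-11 round to 0 at that precision. A sketch of one possible workaround is to format the p-values as text before building the table. The `pvals` vector is copied from the question; `rounded` and `p_text` are hypothetical names:

```r
# p-values copied from the filtered rows in the question
pvals <- c(1.201802e-11, 4.620706e-13, 7.188216e-10, 8.588104e-22)

# Rounding to 7 decimal places turns every one of them into 0
rounded <- round(pvals, 7)
print(rounded)

# Character strings in scientific notation are not rounded by the table code
p_text <- formatC(pvals, format = "e", digits = 2)
print(p_text)

# Assumes knitr is installed; the table step is skipped otherwise
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(data.frame(p.value = p_text)))
}
```

The trade-off is that the column becomes character, so this formatting belongs at the reporting step, after any numeric filtering or sorting.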

Have you tried converting it to a data.table and checking whether it behaves differently after you apply the filter?

UPDATE1: Thanks for posting the str() output. Your object is a "grouped_df". Have you tried ungroup(LmModGDP)?
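A short sketch of ungrouping before the filter, assuming dplyr is installed. The `toy` tibble is a made-up stand-in for `LmModGDP`, and whether this actually clears the "Unknown column" tooltip in the RStudio viewer is not confirmed here:

```r
library(dplyr)

# Hypothetical stand-in for LmModGDP (values invented for illustration)
toy <- tibble(
  Country  = rep(c("Netherlands", "Romania"), each = 2),
  variable = factor(rep("FDI_InStock_MilUSD", 4)),
  term     = rep(c("(Intercept)", "value"), 2),
  p.value  = c(1.2e-11, 4.6e-13, 7.2e-10, 8.6e-22)
) %>%
  group_by(Country, variable)

# Check whether grouping metadata is attached
print(is_grouped_df(toy))

result <- toy %>%
  ungroup() %>%                               # drop the grouping first
  filter(variable == "FDI_InStock_MilUSD")    # then filter as before

print(is_grouped_df(result))
print(class(result))
```

Ungrouping keeps the object a tibble, so if the viewer still misbehaves, the `as.data.frame()` coercion suggested above is the next thing to try.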