我有一份包括体重在内的调查。我想计算每个状态组(承租人,所有者,承租人),年份和地区(Ost_B)住房成本的加权中位数。理想情况下,然后将值分配给数据帧额外一栏中的正确观察值。我使用软件包 dplyr 和 spatstat 来实现。
当我使用 weighted.mean 函数时,我的代码工作正常,但是,如果使用 weighted.median 函数,则会收到以下警告。
Error in mutate_impl(.data, dots) :
Evaluation error: All weights are zero.
这是我的数据的样子:
Ost_B hgowner bWohnk syear hhrf
109457 West [2] Main Tenant NA 2009 2249.8874
277839 West [1] Owner 883.3333 2002 1104.3820
245712 Ost [2] Main Tenant 352.0000 1997 863.4752
122197 Ost [1] Owner NA 2005 2618.3637
188236 West [2] Main Tenant 231.0000 2008 1397.6606
195256 West [1] Owner 1339.5833 1996 4940.1033
271210 Ost [2] Main Tenant 546.0000 2000 3711.1919
158995 West [1] Owner 788.0833 2014 3916.8540
20414 West [2] Main Tenant 706.0000 2015 385.5527
251343 West [2] Main Tenant 521.0000 1993 1082.2479
这是我到目前为止使用的代码:
library('ggplot2')
library('dplyr')
library('tidyr')
library('ggrepel')
library('spatstat')
data.stack <- data.stack %>%
group_by(syear, hgowner, Ost_B) %>%
mutate(mean_Wohnk=weighted.mean(bWohnk, hhrf, na.rm=TRUE)) ##works fine
data.stack <- data.stack %>%
group_by(syear, hgowner, Ost_B) %>%
mutate(med_Wohnk=weighted.median(bWohnk, hhrf, na.rm=TRUE)) ##doesn't work
我想这与 group_by 和 weighted.median 函数如何协同工作有关。但是,我不知道如何使它工作。
非常感谢您的帮助,我也将使用其他软件包,但 Hmisc 软件包中的 wtd.median()也会遇到相同的问题
稍后,我需要对decil执行相同的操作。
非常感谢您的帮助!
以下是我的数据示例:
structure(list(Ost_B = c("West", "West", "West", "West", "West",
"West", "West", "West", "West", "West", "West", "Ost", "Ost",
"West", "West", "Ost", "West", "West", "Ost", "Ost", "West",
"West", "West", "West", "Ost", "West", "West", "Ost", "West",
"West", "West", "Ost", "West", "West", "West", "West", "West",
"West", "Ost", "Ost", "West", "West", "West", "Ost", "West",
"West", "West", "West", "West", "West", "West", "West", "West",
"West", "West", "West", "Ost", "West", "West", "West", "Ost",
"Ost", "West", "West", "West", "West", "Ost", "West", "West",
"West", "West", "Ost", "West", "West", "West", NA, "Ost", "West",
"Ost", "West", "West", "Ost", "West", "West", "West", "West",
"Ost", "West", "West", "West", "West", "West", "West", "West",
NA, "West", "West", "West", "West", "West"), hgowner = structure(c(8L,
9L, 9L, 9L, 9L, 9L, 9L, 8L, 9L, 9L, 9L, 9L, 8L, 9L, 8L, 9L, 9L,
8L, 9L, 9L, 8L, 9L, 9L, 8L, 9L, 8L, 8L, 9L, 8L, 9L, 8L, 9L, 8L,
9L, 9L, 9L, 9L, 8L, 9L, 9L, 8L, 9L, 9L, 8L, 9L, 9L, 8L, 8L, 9L,
9L, 9L, 8L, 9L, 8L, 10L, 8L, 10L, 9L, 9L, 9L, 8L, 8L, 9L, 9L,
9L, 8L, 8L, 8L, 8L, 9L, 8L, 9L, 9L, 9L, 8L, 9L, 9L, 9L, 8L, 9L,
9L, 9L, 9L, 9L, 8L, 9L, 9L, 9L, 8L, 8L, 9L, 9L, 8L, 8L, 8L, 9L,
8L, 8L, 9L, 8L), .Label = c("[-8] Question this year not part of Survey program",
"[-6] Questionnaire Version With Modified Filter", "[-5] Not Included In Questionnaire Version",
"[-4] Forbidden Multiple Response", "[-3] Not Valid", "[-2] Does Not Apply",
"[-1] No Answer", "[1] Owner", "[2] Main Tenant", "[3] Sub-Tenant",
"[4] Tenant", "[5] Living In A Home"), class = "factor"), bWohnk = c(741.666666666667,
536, 675, NA, 316, 348, 301, 191.75, 723, 1035, 990, 41, 72.5,
640, 200, 351, 613, 666.666666666667, 280, 280, 1486.91666666667,
496, 367, NA, 756, 221.666666666667, NA, 407, 345.583333333333,
229, 584, 266, 1074.08333333333, 742, 62, 570, 564, 250, 306,
392, 3251.66666666667, NA, 560, 1041.66666666667, 500, 580, 349.166666666667,
1250, 383, 470, 620, NA, 510, 711.666666666667, NA, 400, NA,
521, 350, 385, 333.916666666667, 1637.5, 229, 780, 467, 844.083333333333,
NA, 199.166666666667, 262.916666666667, 249, 345.583333333333,
343, 255, 306, 860, NA, 360, 728, 81, 670, 590, 345, NA, 147,
NA, 800, 390, 409, 2279.16666666667, 858.833333333333, 252, 240,
129.166666666667, 1158.33333333333, NA, 370, 247.083333333333,
225, 350, 2048.33333333333), syear = c(2013L, 2000L, 2010L, 1991L,
1990L, 1999L, 1992L, 1996L, 2006L, 2013L, 2012L, 1991L, 2001L,
2014L, 2009L, 2008L, 1998L, 2007L, 2004L, 2010L, 2009L, 2013L,
2000L, 2004L, 1998L, 2005L, 2015L, 2006L, 2007L, 1988L, 1996L,
2015L, 2000L, 2013L, 2006L, 2001L, 2003L, 2008L, 1995L, 2015L,
2004L, 1985L, 2011L, 2007L, 2009L, 2008L, 2007L, 2007L, 1996L,
2015L, 2014L, 2001L, 2013L, 2008L, 2008L, 2007L, 1998L, 1997L,
2012L, 2009L, 1993L, 2010L, 1999L, 2010L, 2015L, 1991L, 2001L,
2004L, 1990L, 2000L, 1986L, 2012L, 2009L, 1991L, 2013L, 1984L,
2005L, 1991L, 1993L, 2010L, 2013L, 2012L, 1985L, 1988L, 2000L,
2013L, 2005L, 1999L, 2004L, 1994L, 1991L, 2000L, 2012L, 2002L,
1984L, 2006L, 1989L, 2004L, 2005L, 2009L), hhrf = c(921.252509823361,
2532.30483182391, 1087.25498433646, 5360.12838117604, 4452.44496134467,
7171.86195726776, 6954.99482158864, 5293.69413376784, 954.587233459912,
770.057906194315, 890.961226274402, 1722.1699042344, 3461.79376227083,
581.158893623584, 664.806040953876, 675.625047925726, 1574.07382275557,
3704.07702821766, 2314.05879164323, 9140.05504547961, 1005.97073867107,
681.798909716727, 3677.30043072148, 3520.56757967251, 2204.70262296716,
2713.90174652616, 2590.78246030509, 1596.21626704554, 1121.92427792032,
6002.35785636737, 2767.4864747415, 344.526432577656, 2086.31596835848,
4291.12236938177, 6949.80424056846, 1863.21490205662, 2931.02198881285,
3053.75478333405, 4740.24457203908, 3108.19722193157, 848.300965349622,
6782.34863226538, 5728.42824611808, 529.221471230179, 0, 3894.38448724548,
6191.35484028036, 5087.22552915895, 10586.6710234059, 310.171496020192,
395.575830034726, 3119.21033893799, 760.153320489404, 3242.17431586629,
938.995243924511, 1516.05334458071, 795.550785734506, 4310.61532022633,
449.076675395395, 2379.3439598687, 6342.0726785268, 0, 10800.2931706199,
10001.6994303796, 868.391937643362, 4053.41583759845, 2128.14552166684,
1992.9512907382, 5749.10182602249, 384.310560302864, 4499.40446551783,
3552.06832694212, 569.031301845848, 2192.44389667968, 142.161789518523,
5093.0220545886, 1890.93497387136, 2484.45841921283, 4870.87928765949,
0, 487.614412884924, 2649.97192542962, 736.845961792243, 894.230153879632,
4765.82775377142, 562.849351807804, 2365.25925889019, 2378.82322189005,
1087.29238473588, 6727.53539570428, 22616.3513598868, 1996.36668869512,
4727.07641240618, 992.560599469024, 5716.97578413211, 1774.61201799785,
5633.78476350193, 6772.43186294665, 3854.55479513783, 2691.89802789182
)), row.names = c(37169L, 72350L, 37533L, 269481L, 170828L, 38185L,
290578L, 85410L, 275082L, 52371L, 42623L, 229396L, 206395L, 65165L,
76781L, 13795L, 198625L, 103913L, 225237L, 171747L, 299594L,
38141L, 120416L, 272962L, 227829L, 91750L, 48097L, 218908L, 266404L,
157923L, 195375L, 44572L, 192725L, 150943L, 267952L, 263899L,
104001L, 91037L, 230106L, 135626L, 118682L, 75336L, 220588L,
221938L, 292125L, 103957L, 87702L, 281763L, 255340L, 115224L,
5651L, 114331L, 26278L, 78108L, 245838L, 90290L, 246453L, 265473L,
52804L, 180766L, 229731L, 292792L, 32416L, 111111L, 7194L, 28970L,
142415L, 265198L, 271291L, 1989L, 77440L, 131535L, 22310L, 177106L,
59567L, 11177L, 226138L, 177877L, 206061L, 294689L, 16921L, 151844L,
195428L, 237935L, 87825L, 285575L, 131939L, 267954L, 280806L,
145326L, 193033L, 96553L, 149493L, 77576L, 174725L, 101594L,
193089L, 116896L, 87604L, 156306L), class = "data.frame")
答案 0 :(得分:2)
如果您测试在一个或多个组中发现的情况并返回一个NA,则此代码成功:
# Apparently `spatstat::weighted.median` needs at least 3 points or it fails
> dat3 <- dat %>%
+ group_by(syear, hgowner, Ost_B) %>%
+ mutate(med_Wohnk=if( sum(!is.na(hhrf))<2 ){NA} else { weighted.median(bWohnk, hhrf, na.rm=TRUE)})
Error in mutate_impl(.data, dots) :
Evaluation error: All weights are zero.
dat3 <- dat %>%
group_by(syear, hgowner, Ost_B) %>%
mutate(med_Wohnk=if( sum(!is.na(hhrf))<3 ){NA} else { weighted.median(bWohnk, hhrf, na.rm=TRUE)})
dat3
# A tibble: 100 x 6
# Groups: syear, hgowner, Ost_B [57]
Ost_B hgowner bWohnk syear hhrf med_Wohnk
<chr> <fct> <dbl> <int> <dbl> <dbl>
1 West [1] Owner 742. 2013 921. NA
2 West [2] Main Tenant 536 2000 2532. 310.
3 West [2] Main Tenant 675 2010 1087. 722.
4 West [2] Main Tenant NA 1991 5360. 252
5 West [2] Main Tenant 316 1990 4452. NA
6 West [2] Main Tenant 348 1999 7172. 229
7 West [2] Main Tenant 301 1992 6955. NA
8 West [1] Owner 192. 1996 5294. NA
9 West [2] Main Tenant 723 2006 955. 62
10 West [2] Main Tenant 1035 2013 770. 655.
# ... with 90 more rows