dcast only one variable into a new column in R

Time: 2018-07-25 02:30:37

Tags: r data.table melt dcast

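I am trying to dcast my data so that only the Actual values are split out into a new column. However, the only way I have managed to do this is to dcast first and then melt. I would like to know whether there is a more efficient solution.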

Step 1:

I have already done some preparation on the data, and this is what it looks like:

> test_m <- melt(test, id.vars = c("category", "Budget_year", "State"))
> test_m <- test_m[,c("Year", "Type_of_observation"):= tstrsplit(variable, " ", fixed = TRUE)]
> test_m[,variable := NULL]
> head(test_m, n = 10)
          category Budget_year State value    Year Type_of_observation
 1:  Transfer Duty     2000_01     N  1916 1998-99              Actual
 2:       Land Tax     2000_01     N   948 1998-99              Actual
 3:    Payroll Tax     2000_01     N  3605 1998-99              Actual
 4: Total Gambling     2000_01     N  1419 1998-99              Actual
 5:            GST     2000_01     N  4705 1998-99              Actual
 6:  Transfer Duty     2000_01     N  1747 1999-00              Budget
 7:       Land Tax     2000_01     N   830 1999-00              Budget
 8:    Payroll Tax     2000_01     N  3616 1999-00              Budget
 9: Total Gambling     2000_01     N  1558 1999-00              Budget
10:            GST     2000_01     N  5162 1999-00              Budget

Now I would like to dcast a new column out of the Type_of_observation column, but using only the Actual observations and leaving all other observation types where they are. My current approach is to dcast first and then melt, like so:

Step 2: Desired output

> test_c <- dcast(test_m, category + Budget_year + State + Year ~ Type_of_observation)
> test_mc <- melt(test_c, id.vars = c("category", "Budget_year", "State", "Year", "Actual"), measure.vars = c("Budget", "Estimate", "Revised"))
> head(test_mc, n = 10)
    category Budget_year State    Year Actual variable value
 1:      GST     2000_01     N 1998-99   4705   Budget    NA
 2:      GST     2000_01     N 1999-00     NA   Budget  5162
 3:      GST     2000_01     N 2000-01     NA   Budget  8318
 4:      GST     2000_01     N 2001-02     NA   Budget    NA
 5:      GST     2000_01     N 2002-03     NA   Budget    NA
 6:      GST     2000_01     N 2003-04     NA   Budget    NA
 7: Land Tax     2000_01     N 1998-99    948   Budget    NA
 8: Land Tax     2000_01     N 1999-00     NA   Budget   830
 9: Land Tax     2000_01     N 2000-01     NA   Budget   921
10: Land Tax     2000_01     N 2001-02     NA   Budget    NA

So now I have a column for the Actual values, and all other observation types remain in the variable column.

Is there a way to get from test_m to test_mc without having to do both a dcast and a melt? I would prefer a data.table solution, but I am open to anything.

Here is the dput of the data.table:

[dput output not preserved]

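2 answers:

Answer 0: (score: 1)

You can build the complete set of cases first and then join the dataset onto it.

Finally, you do an update join to look up the Actual values.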

#create complete cases
ans <- test_m[CJ(category=category, Budget_year=Budget_year, State=State, Year=Year, Type_of_observation=c("Budget", "Estimate", "Revised"), unique=TRUE),
    on=.(category, Budget_year, State, Year, Type_of_observation)][
        #update join
        test_m[Type_of_observation=="Actual"], 
        Actual := i.value,
        on=.(category, Budget_year, State, Year)]

#order to match test_mc
setorder(ans, category, Budget_year, State, Year, Type_of_observation)[]
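To make the two joins easier to follow, here is a minimal sketch of the same idea on a cut-down table. The name toy and the reduced set of rows are assumptions made for illustration (the values are copied from the head() output in the question, keeping only two categories and the Budget type); this is not the asker's full data.

```r
library(data.table)

# Hypothetical toy data shaped like test_m (two categories, two years).
toy <- data.table(
  category            = rep(c("GST", "Land Tax"), times = 2),
  Budget_year         = "2000_01",
  State               = "N",
  value               = c(4705, 948, 5162, 830),
  Year                = rep(c("1998-99", "1999-00"), each = 2),
  Type_of_observation = rep(c("Actual", "Budget"), each = 2)
)

# Step 1: every combination of the id columns with the non-Actual types.
grid <- CJ(
  category            = toy$category,
  Budget_year         = toy$Budget_year,
  State               = toy$State,
  Year                = toy$Year,
  Type_of_observation = "Budget",
  unique = TRUE
)

# Step 2: join toy onto the grid; combinations missing from toy get value NA.
ans <- toy[grid, on = .(category, Budget_year, State, Year, Type_of_observation)]

# Step 3: update join that adds each id combination's Actual value by reference.
ans[toy[Type_of_observation == "Actual"],
    Actual := i.value,
    on = .(category, Budget_year, State, Year)]

print(ans)
```

In this sketch ans has one row per category and Year for the Budget type, with Actual filled in only for the years that have an Actual observation, which mirrors the pattern of NA values in test_mc.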

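Answer 1: (score: 0)

I think I have a simple data.table approach that uses setkey and a join inside the square brackets.

I will use a simpler data.table. The goal is to put interest_rate into its own column.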

samp <- data.table(
  group=c("a","a","a","b","b","b","c","c","c"),
  variable=c("balance", "end_balance","interest_rate"),
  value=c(1000, 940, .05, 1200, 1040, .08, 980, 970, .10)
)


setkey(samp, group)

#  This will create a data.table with just our desired variable value, interest_rate, by group
samp[variable=="interest_rate", .(interest_rate=unique(value)), by=.(group)]

#  We then join this to the original data.table using the already set key and
#  drop the interest_rate rows in the final data.table
samp[samp[variable=="interest_rate", .(interest_rate=unique(value)), by=.(group)]][variable!="interest_rate"]
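As a variation on the keyed join above, the same result can probably be reached with an update join that uses on= instead of setkey. This is a sketch of an alternative, not the answerer's method; the name res is an assumption, and the variable column is written out with rep() so that the recycling is explicit.

```r
library(data.table)

samp <- data.table(
  group    = rep(c("a", "b", "c"), each = 3),
  variable = rep(c("balance", "end_balance", "interest_rate"), times = 3),
  value    = c(1000, 940, .05, 1200, 1040, .08, 980, 970, .10)
)

# Look up each group's interest_rate row and add its value as a new column,
# by reference, matching on group (no key needed).
samp[samp[variable == "interest_rate"],
     interest_rate := i.value,
     on = .(group)]

# Drop the rows whose value now lives in the interest_rate column.
res <- samp[variable != "interest_rate"]
print(res)
```

Because := modifies samp in place, this avoids building the intermediate joined copy; whether that matters depends on the size of the data.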