Creating an arbitrary number of new columns with dplyr in R

Time: 2017-08-07 11:02:07

Tags: r loops dplyr metadata mutate

I'm not sure the title is worded well, but here is the situation:

I have a metadata dataset that can contain any number of rows, for example:

Control_DF <- cbind.data.frame(
  Scenario = c("A","B","C")
  ,Variable = c("V1","V2","V3")
  ,Weight = c("w1","w2","w3")
)

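Using the data in Control_DF, I want to create a new version of each variable in the main dataset, in which the variable is multiplied by its weight. So if my main dataset looks like this: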

Main_Data <- cbind.data.frame(
  V1 = c(1,2,3,4)
  ,V2 = c(2,3,4,5)
  ,V3 = c(3,4,5,6)
  ,w1 = c(0.1,0.5,1,0.8)
  ,w2 = c(0.2,1,0.3,0.6)
  ,w3 = c(0.3,0.7,0.1,0.2)
)

Then, written out in open code, what I want to do is something like this:

New_Data <- Main_Data %>%
  mutate(
    weighted_V1 = V1 * w1
    ,weighted_V2 = V2 * w2
    ,weighted_V3 = V3 * w3
  )

However, I need a way to do this without hard-coding, because the number of variables referenced is arbitrary.

Can anyone help me?

1 answer:

Answer 0: (score: 1)

In base R, you can do the following with lapply, Map, and cbind:

# with Control_DF create a list with pairs of <varName,wgt>
controlVarList = lapply(Control_DF$Scenario, function(x)
  as.vector(as.matrix(Control_DF[Control_DF$Scenario==x, c("Variable","Weight")]))
)

controlVarList
#[[1]]
#[1] "V1" "w1"
#
#[[2]]
#[1] "V2" "w2"
#
#[[3]]
#[1] "V3" "w3"

# A custom function for multiplication of both columns
fn_weightedVars = function(x) {
  # x = c("V1","w1"); hence x[1] = "V1", x[2] = "w1"
  # reference these columns in Main_Data and do scaling
  wgtedCol = matrix(Main_Data[,x[1]] * Main_Data[,x[2]], ncol=1)
  # rename as required
  colnames(wgtedCol) = paste0("weighted_", x[1])
  # return var
  wgtedCol
}

# call function on each list element
scaledList = Map(fn_weightedVars, controlVarList)
scaledDF = do.call(cbind, scaledList)

# combine datasets
New_Data = data.frame(Main_Data, scaledDF)
New_Data
#  V1 V2 V3  w1  w2  w3 weighted_V1 weighted_V2 weighted_V3
#1  1  2  3 0.1 0.2 0.3         0.1         0.4         0.9
#2  2  3  4 0.5 1.0 0.7         1.0         3.0         2.8
#3  3  4  5 1.0 0.3 0.1         3.0         1.2         0.5
#4  4  5  6 0.8 0.6 0.2         3.2         3.0         1.2
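Since the question asks about dplyr specifically, here is a rough sketch of how the same result might be obtained inside mutate. It is not part of the original answer. It assumes dplyr 0.7 or later (tidy evaluation with the `.data` pronoun and `!!!` splicing), and the names `weighted_exprs` and `New_Data_dplyr` are made up for illustration. The data frames are rebuilt here so the snippet runs on its own.

```r
library(dplyr)

# Rebuild the example data (strings kept as character, not factor)
Control_DF <- data.frame(
  Scenario = c("A", "B", "C"),
  Variable = c("V1", "V2", "V3"),
  Weight   = c("w1", "w2", "w3"),
  stringsAsFactors = FALSE
)
Main_Data <- data.frame(
  V1 = c(1, 2, 3, 4), V2 = c(2, 3, 4, 5), V3 = c(3, 4, 5, 6),
  w1 = c(0.1, 0.5, 1, 0.8), w2 = c(0.2, 1, 0.3, 0.6), w3 = c(0.3, 0.7, 0.1, 0.2)
)

vars <- as.character(Control_DF$Variable)
wts  <- as.character(Control_DF$Weight)

# Build one unevaluated expression per control row, e.g. .data[["V1"]] * .data[["w1"]]
weighted_exprs <- Map(
  function(v, w) rlang::expr(.data[[!!v]] * .data[[!!w]]),
  vars, wts
)
# The list names become the new column names
names(weighted_exprs) <- paste0("weighted_", vars)

# Splice the named expressions into a single mutate() call
New_Data_dplyr <- Main_Data %>% mutate(!!!weighted_exprs)
New_Data_dplyr
```

Because the expressions are generated from Control_DF, adding or removing rows there changes the set of weighted columns without touching the mutate call.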
