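I have data with these variables (branch, item, sales, stock), and I need to write a for loop that extracts the rows for the same item that have:
1- a different branch
2- sales higher than stock
and saves the result in a data frame. The code I used is: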
trials <- sample_n(Data_with_stock, 1000)
for (i in 1:nrow(trials)) {
  if (trials$sales[i] > trials$stock[i] & trials$item[i] == trials$item[i+1] & trials$branch[i] != trials$branch[i+1]) {
    s <- data.frame(trials$NAME[i], trials$branch[i])
  }
}
Answer 0 (score: 1)
If you only want to fix your code: you were missing one = in it (an equality comparison is written ==).
Use:
library(dplyr)  # provides sample_n()
trials <- sample_n(Data_with_stock, 1000)
# Define s before the loop: one row per row of trials, two columns (NAME, branch)
s <- array(NA, dim = c(nrow(trials), 2))
# Stop at nrow(trials) - 1 because each row is compared with the next one
for (i in seq_len(nrow(trials) - 1)) {
  # The second condition compares the item in row i with the item in row i + 1, as in the question
  if (trials$sales[i] > trials$stock[i] && trials$item[i] == trials$item[i + 1] && trials$branch[i] != trials$branch[i + 1]) {
    s[i, ] <- c(as.character(trials$NAME[i]), as.character(trials$branch[i]))
  }
  # Rows that do not match keep their initial NA, so the index i never causes a problem
}
s <- na.omit(s)  # afterwards, drop the rows that are still NA
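The same adjacent-row comparison can also be written without a loop. The following is only a minimal sketch: it assumes the column names from the question (NAME, item, branch, sales, stock), uses a small made-up data frame in place of the sampled data, and sorts by item first (my assumption, so that rows for the same item sit next to each other).

```r
# Hypothetical sample data standing in for the sampled `trials` data frame
trials <- data.frame(
  NAME   = c("a", "b", "c", "d", "e"),
  item   = c("x", "x", "y", "y", "z"),
  branch = c("B1", "B2", "B1", "B1", "B3"),
  sales  = c(10, 5, 8, 2, 7),
  stock  = c(4, 9, 3, 6, 1),
  stringsAsFactors = FALSE
)

# Assumption: sort by item so that equal items are adjacent
trials <- trials[order(trials$item), ]

n   <- nrow(trials)
cur <- seq_len(n - 1)  # rows 1 .. n-1
nxt <- cur + 1         # rows 2 .. n

# One logical value per pair of consecutive rows
keep <- trials$sales[cur] > trials$stock[cur] &
  trials$item[cur] == trials$item[nxt] &
  trials$branch[cur] != trials$branch[nxt]

# Keep NAME and branch of the matching rows as a data frame
s <- trials[cur[keep], c("NAME", "branch")]
```

Like the loop, this only compares each row with the one directly after it, so it can miss an item whose other branch is not in the neighbouring row.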
Answer 1 (score: 1)
I suggest you use the dplyr library. After installing it, and assuming "df" is your dataset, use the following commands for questions 1 and 2:
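library(dplyr)

# Question 1: count the distinct branches for each item
question_one = df %>%
  group_by(Item) %>%
  summarise(No_of_branches = n_distinct(Branch))
# Items that appear in more than one branch
items_with_more_than_one_branch = question_one[which(question_one$No_of_branches > 1), "Item"]

# Question 2: total stock and total sales for each item
question_two = df %>%
  group_by(Item) %>%
  summarise(Stock_Val = sum(Stock), Sales_Val = sum(Sales))
# Items whose total sales exceed their total stock
item_with_sales_greater_than_stock = question_two[which(question_two$Sales_Val > question_two$Stock_Val), "Item"]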
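The two conditions can also be applied together to pull out the matching rows directly. This is a sketch under assumptions: "df" here is a small made-up dataset with the columns Item, Branch, Sales and Stock, and I read the question as asking for rows whose own sales exceed their own stock, restricted to items found in more than one branch.

```r
library(dplyr)

# Hypothetical sample data with the column names used in this answer
df <- data.frame(
  Item   = c("x", "x", "y", "y", "z"),
  Branch = c("B1", "B2", "B1", "B1", "B3"),
  Sales  = c(10, 5, 8, 2, 7),
  Stock  = c(4, 9, 3, 9, 1)
)

result <- df %>%
  group_by(Item) %>%
  # keep items that occur in more than one distinct branch
  filter(n_distinct(Branch) > 1) %>%
  ungroup() %>%
  # keep rows where sales exceed stock
  filter(Sales > Stock)
```

If totals per item are wanted instead of a row-by-row comparison, the second filter would need to work on summed Sales and Stock.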
This can also be done without dplyr, but if you have not used dplyr yet, I recommend it; it is very useful for data processing.
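For comparison, here is a rough base R sketch of the same two questions without dplyr. It assumes the same column names (Item, Branch, Sales, Stock) and a small made-up "df"; tapply() and aggregate() are just one way to do it.

```r
# Hypothetical sample data with the column names used in this answer
df <- data.frame(
  Item   = c("x", "x", "y", "y", "z"),
  Branch = c("B1", "B2", "B1", "B1", "B3"),
  Sales  = c(10, 5, 8, 2, 7),
  Stock  = c(4, 9, 3, 9, 1)
)

# Question 1: number of distinct branches per item
branches_per_item <- tapply(df$Branch, df$Item, function(b) length(unique(b)))
items_with_more_than_one_branch <- names(branches_per_item)[branches_per_item > 1]

# Question 2: total stock and total sales per item
totals <- aggregate(cbind(Stock, Sales) ~ Item, data = df, FUN = sum)
item_with_sales_greater_than_stock <- as.character(totals$Item[totals$Sales > totals$Stock])
```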