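I have a dataset containing the bought and sold prices of different products. However, the bought and sold prices are not stored in the same row. They are stored in two separate rows, identified by the Bought and Sold indicator variables, as shown below.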
Product|Product Type|Price|Bought|Sold
---------------------------------------
Apples | Green | 2 | 0 | 1
---------------------------------------
Apples | Green | 1 | 1 | 0
---------------------------------------
Apples | Red | 3 | 0 | 1
---------------------------------------
Apples | Red | 4 | 1 | 0
---------------------------------------
I would like to combine the bought and sold prices into a single row, so that it looks like this:
Product|Product Type|Bought Price|Sold Price
---------------------------------------------
Apples | Green | 1 | 2
---------------------------------------------
Apples | Red | 4 | 3
Below is the code that creates my sample dataset. Thanks in advance for your help.
Product <- c("Apples", "Apples", "Apples", "Apples", "Apples", "Apples",
"Oranges", "Oranges", "Oranges", "Oranges", "Oranges", "Oranges",
"Buscuits", "Buscuits", "Buscuits", "Buscuits", "Buscuits", "Buscuits")
ProductType <- c("Green", "Green", "Red", "Red", "Pink", "Pink",
"Big", "Big", "Medium", "Medium", "Small", "Small",
"Chocolate", "Chocolate", "Oat", "Oat", "Digestive", "Digestive")
Price <- c(2, 1, 3, 4, 1, 2,
5, 3, 2, 1, 2, 3,
6, 4, 1, 8, 6, 2)
Bought <- c(0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1)
Sold <- c(1, 0, 1, 0, 1, 0,
1, 0, 1, 0, 1, 0,
1, 0, 1, 0, 1, 0)
sales <- data.frame(Product, ProductType, Price, Bought, Sold)
Answer 0 (score: 4)
Using dplyr:
library(dplyr)
sales %>%
group_by(Product, ProductType) %>%
summarise(BoughtPrice = Price[ Bought == 1 ],
SoldPrice = Price[ Sold == 1 ]) %>%
ungroup()
Answer 1 (score: 3)
library(dplyr)
df <- data.frame(Product, ProductType, Price, Bought, Sold)
df %>% group_by(Product, ProductType) %>%
summarise(Bought_Price = sum(Price * Bought),
Sold_Price = sum(Sold * Price))
# A tibble: 9 x 4
# Groups: Product [?]
# Product ProductType Bought_Price Sold_Price
# <fctr> <fctr> <dbl> <dbl>
# 1 Apples Green 1 2
# 2 Apples Pink 2 1
# 3 Apples Red 4 3
# 4 Buscuits Chocolate 4 6
# 5 Buscuits Digestive 2 6
# 6 Buscuits Oat 8 1
# 7 Oranges Big 3 5
# 8 Oranges Medium 1 2
# 9 Oranges Small 3 2
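Why the multiplication works, as a minimal base R sketch: Bought and Sold are 0/1 indicators, so multiplying them by Price zeroes out the non-matching row and the sum picks out the single flagged price. This assumes exactly one bought row and one sold row per group. With several flagged rows the sum would add their prices together instead of returning one of them. The vector names below are illustrative, and the values are the Apples/Green group from the sample data.

```r
# Apples / Green group from the question's sample data
price  <- c(2, 1)
bought <- c(0, 1)
sold   <- c(1, 0)

bought_price <- sum(price * bought)  # 2*0 + 1*1
sold_price   <- sum(price * sold)    # 2*1 + 1*0

# With exactly one flagged row, this matches direct subsetting
stopifnot(bought_price == price[bought == 1],
          sold_price == price[sold == 1])
```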
Answer 2 (score: 2)
Using dplyr, we group by 'Product' and 'ProductType' and use summarise to create 'BoughtPrice' and 'SoldPrice' by subsetting 'Price' where 'Bought' or 'Sold' is 1.
library(dplyr)
sales %>%
group_by(Product, ProductType) %>%
summarise(BoughtPrice = Price[Bought==1], SoldPrice = Price[Sold ==1])
A similar approach with data.table is:
library(data.table)
setDT(sales)[, lapply(.SD, function(x) Price[x==1]),
.(Product, ProductType), .SDcols = Bought:Sold]
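If you would rather avoid extra packages, the same idea can probably be sketched in base R by splitting the rows and merging them back together. This is only a rough sketch, not something taken from the answers above. It rebuilds a small Apples-only slice of the question's data so that it runs on its own, and the names keys, bought, sold and wide are made up for illustration. It assumes each Product/ProductType pair has exactly one bought row and one sold row.

```r
# Rebuild a small slice of the question's data (Apples only)
sales <- data.frame(
  Product     = rep("Apples", 4),
  ProductType = c("Green", "Green", "Red", "Red"),
  Price       = c(2, 1, 3, 4),
  Bought      = c(0, 1, 0, 1),
  Sold        = c(1, 0, 1, 0)
)

keys <- c("Product", "ProductType")

# Split into bought and sold rows, keeping only the keys and the price
bought <- sales[sales$Bought == 1, c(keys, "Price")]
sold   <- sales[sales$Sold == 1, c(keys, "Price")]
names(bought)[names(bought) == "Price"] <- "BoughtPrice"
names(sold)[names(sold) == "Price"] <- "SoldPrice"

# Join the two halves back together on the product keys
wide <- merge(bought, sold, by = keys)
print(wide)
```

Note that merge() does an inner join by default, so a product type with only a bought row or only a sold row would be dropped unless you pass all = TRUE.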