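I am a beginner in R. Could someone help me work out how to do the following in R?
I have connected R to a Redshift (AWS) database, and I am performing some operations on Redshift tables.
From the source table orders I created a data frame listing all the possible combinations in which different orders can be placed. It has an id column that identifies each unique combination (it is just the row number, since every row is a unique combination).
The data frame contains the following values: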
amt order_time order_day hour_day table_no item_grp id
2 1 2 14 16 1 1
1 2 1 18 12 2 2
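In total, the data frame contains 1500 rows (that is, 1500 possible combinations).
I want this data frame to act as a lookup table for the SQL table named orders, which contains order_id.
Orders table: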
order_id amt order_time order_day hour_day table_no item_grp
123 2 1 2 14 16 1
321 2 1 2 14 16 1
456 1 2 1 18 12 2
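How can I pass the values in the data frame into the WHERE condition of a SQL statement? That is, read each row of my data frame, fetch the rows from the orders table that satisfy the conditions, and list them in the format shown below.
The output table should look like this: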
order_id amt order_time order_day hour_day table_no item_grp id
123 2 1 2 14 16 1 1
321 2 1 2 14 16 1 1
456 1 2 1 18 12 2 2
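And so on...
Answer 0 (score: 0)
Here is one solution. It uses the left_join() function from the dplyr package, which operates on data frames.
For details, see the dplyr documentation: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html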
library(dplyr) # %>% , left_join()
library(purrr) # map_df() to remove factors from structure()
# sample data: the orders table (includes order_id)
order_details <-
dput(
structure(
list(
order_id = structure(1:3, .Label = c("123", "321",
"456"), class = "factor"),
amt = structure(c(2L, 2L, 1L), .Label = c("1",
"2"), class = "factor"),
order_time = structure(c(1L, 1L, 2L), .Label = c("1",
"2"), class = "factor"),
order_day = structure(c(2L, 2L, 1L), .Label = c("1",
"2"), class = "factor"),
hour_day = structure(c(1L, 1L, 2L), .Label = c("14",
"18"), class = "factor"),
table_no = structure(c(2L, 2L, 1L), .Label = c("12",
"16"), class = "factor"),
item_grp = structure(c(1L, 1L, 2L), .Label = c("1",
"2"), class = "factor")
),
.Names = c(
"order_id",
"amt",
"order_time",
"order_day",
"hour_day",
"table_no",
"item_grp"
),
row.names = c(NA,
-3L), class = "data.frame"))
order_details <- purrr::map_df(purrr::map_df(order_details, as.character), as.integer)
# sample data, continued: the lookup data frame of combinations (includes id)
orders <-
dput(structure(
list(
amt = c(2L, 1L),
order_time = 1:2,
order_day = c(2L,
1L),
hour_day = c(14L, 18L),
table_no = c(16L, 12L),
item_grp = 1:2,
id = 1:2
),
.Names = c(
"amt",
"order_time",
"order_day",
"hour_day",
"table_no",
"item_grp",
"id"
),
row.names = c(NA,-2L),
class = "data.frame"
))
# lookup order id
orders_augm <- orders %>%
left_join(
order_details,
by = c(
"amt",
"order_time",
"order_day",
"hour_day",
"table_no",
"item_grp"
)
)
Result:
orders_augm
# A tibble: 3 × 8
amt order_time order_day hour_day table_no item_grp id order_id
<int> <int> <int> <int> <int> <int> <int> <int>
1 2 1 2 14 16 1 1 123
2 2 1 2 14 16 1 1 321
3 1 2 1 18 12 2 2 456
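A note on the conversion step in the sample data: calling as.integer() directly on a factor returns the internal level codes, not the numbers shown in the labels, which is presumably why the code goes through as.character() first. A minimal base R sketch of the difference (the vector hour_day_f is a made-up illustration, not part of the original data):

```r
# Hypothetical factor column, mirroring hour_day in the sample data
hour_day_f <- factor(c("14", "14", "18"))

# Direct conversion yields the internal level codes
codes <- as.integer(hour_day_f)

# Converting via character first yields the labelled values
values <- as.integer(as.character(hour_day_f))

print(codes)   # [1] 1 1 2
print(values)  # [1] 14 14 18
```

If the columns arrive from the database as integers already, this step should not be needed.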
Columns reordered:
orders_augm %>%
select(order_id, amt,
order_time, order_day, hour_day,
table_no, item_grp, id )
Result:
# A tibble: 3 × 8
order_id amt order_time order_day hour_day table_no item_grp id
<int> <int> <int> <int> <int> <int> <int> <int>
1 123 2 1 2 14 16 1 1
2 321 2 1 2 14 16 1 1
3 456 1 2 1 18 12 2 2
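With around 1500 combinations in the lookup data frame, some may have no matching order. A left join from the lookup side keeps those rows with NA in order_id, while an inner join drops them. The sketch below illustrates the difference using base R merge() rather than dplyr, so it is a swapped-in technique, not the method above. The names lookup, orders_tbl and key_cols are illustrative, and the combination with id 3 is invented to show an unmatched row.

```r
# Lookup table of combinations; the row with id 3 is a made-up
# combination that no order matches
lookup <- data.frame(
  amt = c(2L, 1L, 3L), order_time = c(1L, 2L, 1L), order_day = c(2L, 1L, 1L),
  hour_day = c(14L, 18L, 9L), table_no = c(16L, 12L, 5L),
  item_grp = c(1L, 2L, 1L), id = 1:3
)

# Orders table, using the values from the question
orders_tbl <- data.frame(
  order_id = c(123L, 321L, 456L),
  amt = c(2L, 2L, 1L), order_time = c(1L, 1L, 2L), order_day = c(2L, 2L, 1L),
  hour_day = c(14L, 14L, 18L), table_no = c(16L, 16L, 12L),
  item_grp = c(1L, 1L, 2L)
)

key_cols <- c("amt", "order_time", "order_day",
              "hour_day", "table_no", "item_grp")

# Inner join: only combinations that have at least one order
matched <- merge(orders_tbl, lookup, by = key_cols)

# Left join from the lookup side: unmatched combinations are kept,
# with NA in order_id
all_combos <- merge(lookup, orders_tbl, by = key_cols, all.x = TRUE)

print(nrow(matched))                    # [1] 3
print(sum(is.na(all_combos$order_id)))  # [1] 1
```

Note that merge() places the key columns first and sorts by them, so the column and row order can differ from the dplyr result.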