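DF1
SIC   Value
350   100
460   500
140   200
290   400
506   450
DF2
SIC1      AREA
100-200   Forest
201-280   Hospital
281-350   Education
351-450   Government
451-550   Land
Note: the SIC1 classes in DF2 are character ranges, so they need to be converted to numeric ranges.
I am trying to get output like the following.
Desired output:
DF3
SIC   Value   AREA
350   100     Education
460   500     Land
140   200     Forest
290   400     Education
506   450     Land
I first tried converting the character SIC1 classes to numbers and then merging, but had no luck. Can anyone give some guidance on this?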
Answer 0 (score: 3)
We can do a non-equi join. Split the 'SIC1' column in 'DF2' into two numeric columns with tstrsplit, then do a non-equi join with the first dataset.
library(data.table)
setDT(DF2)[, c('start', 'end') := tstrsplit(SIC1, '-', type.convert = TRUE)]
DF2[, -1, with = FALSE][DF1, on = .(start <= SIC, end >= SIC),
mult = 'last'][, .(SIC = start, Value, AREA)]
#    SIC Value      AREA
#1:  350   100 Education
#2:  460   500      Land
#3:  140   200    Forest
#4:  290   400 Education
#5:  506   450      Land
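Alternatively, as @Frank mentioned, we can do a rolling join to extract 'AREA' and add it to the first dataset by reference. This reuses the 'start' column created above.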
setDT(DF1)[, AREA := DF2[DF1, on=.(start = SIC), roll=TRUE, x.AREA]]
Data:
DF1 <- structure(list(SIC = c(350L, 460L, 140L, 290L, 506L), Value = c(100L,
500L, 200L, 400L, 450L)), .Names = c("SIC", "Value"),
class = "data.frame", row.names = c(NA, -5L))
DF2 <- structure(list(SIC1 = c("100-200", "201-280", "281-350", "351-450",
"451-550"), AREA = c("Forest", "Hospital", "Education", "Government",
"Land")), .Names = c("SIC1", "AREA"), class = "data.frame",
row.names = c(NA, -5L))
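For readers without data.table, the lookup performed by the joins above can be sketched in base R with findInterval. This is only an illustrative sketch, not part of the original answer: the helper name lookup_area is hypothetical, and it assumes the ranges in DF2 are sorted by their start and do not overlap. Because it also checks the upper bound, a SIC that falls outside every range gets NA.

```r
# Sample data copied from the question.
DF1 <- data.frame(SIC = c(350L, 460L, 140L, 290L, 506L),
                  Value = c(100L, 500L, 200L, 400L, 450L))
DF2 <- data.frame(SIC1 = c("100-200", "201-280", "281-350", "351-450", "451-550"),
                  AREA = c("Forest", "Hospital", "Education", "Government", "Land"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: map each SIC code to the AREA whose range contains it.
# Assumes "start-end" strings, sorted by start, with no overlapping ranges.
lookup_area <- function(sic, ranges) {
  parts <- strsplit(ranges$SIC1, "-", fixed = TRUE)
  start <- as.integer(vapply(parts, `[`, character(1), 1))
  end   <- as.integer(vapply(parts, `[`, character(1), 2))

  # Index of the last range whose start is <= sic (0 if below every start).
  idx <- findInterval(sic, start)
  idx[idx == 0L] <- NA_integer_

  # Keep a match only when sic is also within the upper bound of that range.
  in_range <- !is.na(idx) & sic <= end[idx]
  ifelse(in_range, ranges$AREA[idx], NA_character_)
}

DF1$AREA <- lookup_area(DF1$SIC, DF2)
```

Calling lookup_area(c(600L, 50L), DF2) returns NA for both values, since neither lies inside any of the listed ranges.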
Answer 1 (score: 3)
One option is to use tidyr::separate together with sqldf to join the two tables on a range of values.
library(sqldf)
library(tidyr)
# Split "100-200" into Start and End; convert = TRUE stores them as integers
DF2 <- separate(DF2, "SIC1", c("Start", "End"), sep = "-", convert = TRUE)
sqldf("select DF1.*, DF2.AREA from DF1, DF2
WHERE DF1.SIC between DF2.Start AND DF2.End")
#   SIC Value      AREA
# 1 350   100 Education
# 2 460   500      Land
# 3 140   200    Forest
# 4 290   400 Education
# 5 506   450      Land
Data:
DF1 <- read.table(text =
"SIC Value
350 100
460 500
140 200
290 400
506 450",
header = TRUE, stringsAsFactors = FALSE)
DF2 <- read.table(text =
"SIC1 AREA
100-200 Forest
201-280 Hospital
281-350 Education
351-450 Government
451-550 Land",
header = TRUE, stringsAsFactors = FALSE)
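The BETWEEN query above amounts to pairing every row of DF1 with every row of DF2 and keeping the pairs where SIC lies inside the range. As a rough sketch of that idea in base R only (the names bounds, ranges, combos and res are made up for illustration, and this is not how sqldf works internally), it might look like this:

```r
# Sample data copied from the question.
DF1 <- data.frame(SIC = c(350L, 460L, 140L, 290L, 506L),
                  Value = c(100L, 500L, 200L, 400L, 450L))
DF2 <- data.frame(SIC1 = c("100-200", "201-280", "281-350", "351-450", "451-550"),
                  AREA = c("Forest", "Hospital", "Education", "Government", "Land"),
                  stringsAsFactors = FALSE)

# Parse "start-end" strings into two integer columns.
bounds <- read.table(text = DF2$SIC1, sep = "-", col.names = c("Start", "End"))
ranges <- data.frame(bounds, AREA = DF2$AREA, stringsAsFactors = FALSE)

# by = NULL asks merge() for the Cartesian product of the two data frames.
combos <- merge(DF1, ranges, by = NULL)

# Keep only the pairs where SIC falls within [Start, End].
hit <- combos$SIC >= combos$Start & combos$SIC <= combos$End
res <- combos[hit, c("SIC", "Value", "AREA")]

# Restore the row order of DF1 (SIC values are unique in this sample).
res <- res[order(match(res$SIC, DF1$SIC)), ]
rownames(res) <- NULL
```

The cross join builds nrow(DF1) * nrow(DF2) rows before filtering, so this is fine for small lookup tables like the one here but would be wasteful for large ones.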