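I have the table below, which lists the item names of car spare parts. For each part I have the ITEM code assigned by the car manufacturer (OEM), and I also have the corresponding ITEM codes for the same part as produced by parts manufacturers (OES).
I periodically receive an input that contains only the ITEM codes that were sold. How can I identify which parts were sold?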
> trial
# A tibble: 6 x 5
  Name         `OEM Part` `OES 1 Code`   `OES 2 Code` `OES 3 Code`
  <chr>        <chr>      <chr>          <chr>        <chr>
1 Brakes       231049A76  1910290/230023 NA           NA
2 Cables       2410ASD12  NA             219930       3213Q23
3 Tyres        9412HJ12   231233         NA           NA
4 Suspension   756634K71  782320/880716  NA           NA
5 Ball Bearing 2IW2WD23   231224         NA           NA
6 Clutches     9304JFW3   NA             QQW223       23RQR3
Suppose I receive the following input:
> item_code <- c("231049A76", "1910290", "1910290", "23RQR3")
I need the following output:
Name
Brakes
Brakes
Brakes
Clutches
Note: 1910290 and 230023 are separate parts; both are slightly modified brakes.
Answer 0 (score: 6)
If you reshape the data to long form, you can use a join:
library(tidyverse)

trial <- tibble(Name = c("Brakes", "Cables", "Tyres", "Suspension", "Ball Bearing", "Clutches"),
                `OEM Part` = c("231049A76", "2410ASD12", "9412HJ12", "756634K71", "2IW2WD23", "9304JFW3"),
                `OES 1 Code` = c("1910290/230023", NA, "231233", "782320/880716", "231224", NA),
                `OES 2 Code` = c(NA, "219930", NA, NA, NA, "QQW223"),
                `OES 3 Code` = c(NA, "3213Q23", NA, NA, NA, "23RQR3"))

trial_long <- trial %>%
    gather('code_type', 'code', -Name) %>%    # reshape to long form
    separate_rows(code, sep = "/") %>%    # split "a/b" cells into one row per code
    drop_na(code)    # drop unnecessary NA rows

# join to filter and duplicate; `by` is given explicitly to avoid the join message
trial_long %>%
    right_join(tibble(code = c("231049A76", "1910290", "1910290", "23RQR3")), by = "code")
#> # A tibble: 4 x 3
#>   Name     code_type  code
#>   <chr>    <chr>      <chr>
#> 1 Brakes   OEM Part   231049A76
#> 2 Brakes   OES 1 Code 1910290
#> 3 Brakes   OES 1 Code 1910290
#> 4 Clutches OES 3 Code 23RQR3
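The long-format idea can also be sketched with `pivot_longer()` and a `left_join()` that starts from the sold codes, so that the result has one row per input code, in input order. This is only an illustrative variant under the assumption that each code appears once in the table; the names `lookup` and `sold` are made up for this example, and how a right join orders its rows may depend on the dplyr version.

```r
library(dplyr)
library(tidyr)

trial <- tibble(
  Name = c("Brakes", "Cables", "Tyres", "Suspension", "Ball Bearing", "Clutches"),
  `OEM Part` = c("231049A76", "2410ASD12", "9412HJ12", "756634K71", "2IW2WD23", "9304JFW3"),
  `OES 1 Code` = c("1910290/230023", NA, "231233", "782320/880716", "231224", NA),
  `OES 2 Code` = c(NA, "219930", NA, NA, NA, "QQW223"),
  `OES 3 Code` = c(NA, "3213Q23", NA, NA, NA, "23RQR3")
)
item_code <- c("231049A76", "1910290", "1910290", "23RQR3")

# One row per (Name, code); NA cells are dropped, "a/b" cells are split.
lookup <- trial %>%
  pivot_longer(-Name, names_to = "code_type", values_to = "code",
               values_drop_na = TRUE) %>%
  separate_rows(code, sep = "/")

# Start from the sold codes so every input row is kept, duplicates included.
sold <- tibble(code = item_code) %>%
  left_join(lookup, by = "code")

sold$Name
```

A code that is not in the table would come back with `NA` in `Name` instead of disappearing, which makes unknown codes easy to spot.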
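Answer 1 (score: 3)
A not very efficient approach using sapply and apply: for each value in item_code we find which row of trial contains it, then take the corresponding Name value.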
sapply(item_code, function(x)
    trial$Name[apply(trial[-1], 1, function(y) any(grepl(x, y)))])
# 231049A76 1910290 1910290 23RQR3
# "Brakes" "Brakes" "Brakes" "Clutches"
If you don't need the names, set USE.NAMES = FALSE in sapply.
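One thing to watch with `grepl()` is that it matches substrings, so a short code can hit a longer one. The sketch below keeps the same row-by-row idea but compares whole codes after splitting cells on "/". The helper `name_for_code` and the test code "2312" are hypothetical, introduced only for illustration, and the table is rebuilt as a plain data frame so the block runs on its own.

```r
trial <- data.frame(
  Name = c("Brakes", "Cables", "Tyres", "Suspension", "Ball Bearing", "Clutches"),
  `OEM Part` = c("231049A76", "2410ASD12", "9412HJ12", "756634K71", "2IW2WD23", "9304JFW3"),
  `OES 1 Code` = c("1910290/230023", NA, "231233", "782320/880716", "231224", NA),
  `OES 2 Code` = c(NA, "219930", NA, NA, NA, "QQW223"),
  `OES 3 Code` = c(NA, "3213Q23", NA, NA, NA, "23RQR3"),
  check.names = FALSE, stringsAsFactors = FALSE
)
item_code <- c("231049A76", "1910290", "1910290", "23RQR3")

# Hypothetical helper: Name of the first row holding `code` as a whole code.
name_for_code <- function(code, parts) {
  hit <- apply(parts[-1], 1, function(row) {
    code %in% unlist(strsplit(row[!is.na(row)], "/", fixed = TRUE))
  })
  if (any(hit)) parts$Name[which(hit)[1]] else NA_character_
}

exact <- vapply(c(item_code, "2312"), name_for_code, character(1),
                parts = trial, USE.NAMES = FALSE)

# Substring matching treats "2312" as a hit for "231233" and "231224".
partial <- trial$Name[apply(trial[-1], 1, function(y) any(grepl("2312", y)))]
```

Here `exact` ends with `NA` for the made-up code "2312", while `partial` contains both "Tyres" and "Ball Bearing".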
Answer 2 (score: 1)
Here is an example similar to yours, using base R:
## Create a dummy matrix
example <- cbind(matrix(1:4, 4, 1), matrix(letters[1:16], 4, 4))  # 16 letters fill the 4 x 4 block
colnames(example) <- c("names", "W", "X", "Y", "Z")
# names W X Y Z
#[1,] "1" "a" "e" "i" "m"
#[2,] "2" "b" "f" "j" "n"
#[3,] "3" "c" "g" "k" "o"
#[4,] "4" "d" "h" "l" "p"
This table is similar to yours: the names are in the first column, and the patterns are matched against the other columns.
## The pattern of interest
pattern <- c("a","e", "f", "p")
For this pattern, we expect the following result: "1","1","2","4".
## Detecting the pattern per row
matching_rows <- row(example[,-1])[example[,-1] %in% pattern]
#[1] 1 1 2 4
## Returning the rows with the pattern
example[matching_rows,1]
#[1] "1" "1" "2" "4"
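Note that `%in%` returns matches in table order (column by column), and a code repeated in the input is reported only once. If the result should follow the input order, as in the question, one possible variation is to use `match()` instead. This is a sketch on the same dummy matrix; the input vector `pattern_in` and the other variable names are invented for this example.

```r
## Same dummy matrix as above
example <- cbind(matrix(1:4, 4, 1), matrix(letters[1:16], 4, 4))
colnames(example) <- c("names", "W", "X", "Y", "Z")

## Input with a repeated value, a different order, and one unknown value
pattern_in <- c("p", "a", "a", "z")

cells <- example[, -1]
pos <- match(pattern_in, cells)   # first position of each input value (column-major)
row_idx <- row(cells)[pos]        # convert positions to row numbers; NA if not found
result <- example[row_idx, "names"]
```

`result` has one entry per element of `pattern_in`, with `NA` where a value is not present in the table.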