rules
{Denny Frying Pan} => {Denny C-Size Batteries}
{Denny Scented Tissue} => {Denny Paper Plates}
{Blue Label Fancy Canned Clams} => {Blue Label Canned Tuna in Water}
{Denny Plastic Forks} => {Golden Frozen Peas}
{Denny Frying Pan} => {Denny D-Size Batteries}
{Denny Plastic Forks} => {Faux Products Apricot Shampoo}
{Golden Frozen Peas} => {Denny Plastic Forks}
{Faux Products Apricot Shampoo} => {Denny Plastic Forks}
{Blue Label Canned Tuna in Water} => {Blue Label Fancy Canned Clams}
{Blue Label Canned String Beans} => {Faux Products Buffered Aspirin}
{Denny D-Size Batteries} => {Denny Frying Pan}
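I have a single-column data frame like the one shown above, and I want to split each rule into an LHS and an RHS. The LHS should contain the characters inside the {} before the =>, and the RHS should contain the characters inside the {} after the =>. How can I do this in R?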
Answer 0 (score: 1)
RULES <- c("{Denny Frying Pan} => {Denny C-Size Batteries}",
"{Denny Scented Tissue} => {Denny Paper Plates}",
"{Blue Label Fancy Canned Clams} => {Blue Label Canned Tuna in Water}",
"{Denny Plastic Forks} => {Golden Frozen Peas}",
"{Denny Frying Pan} => {Denny D-Size Batteries}",
"{Denny Plastic Forks} => {Faux Products Apricot Shampoo}",
"{Golden Frozen Peas} => {Denny Plastic Forks}",
"{Faux Products Apricot Shampoo} => {Denny Plastic Forks}",
"{Blue Label Canned Tuna in Water} => {Blue Label Fancy Canned Clams}",
"{Blue Label Canned String Beans} => {Faux Products Buffered Aspirin}",
"{Denny D-Size Batteries} => {Denny Frying Pan}")
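# Split each rule on the literal separator "} => {" and bind the pieces into rows
df <- as.data.frame(do.call(rbind,strsplit(RULES,"} => {",fixed=TRUE)))
# Remove the leading "{" left on the LHS and the trailing "}" left on the RHS
df[,1] <- gsub("{","",df[,1],fixed = TRUE)
df[,2] <- gsub("}","",df[,2],fixed = TRUE)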
df
                                V1                              V2
1                 Denny Frying Pan          Denny C-Size Batteries
2             Denny Scented Tissue              Denny Paper Plates
3    Blue Label Fancy Canned Clams Blue Label Canned Tuna in Water
4              Denny Plastic Forks              Golden Frozen Peas
5                 Denny Frying Pan          Denny D-Size Batteries
6              Denny Plastic Forks   Faux Products Apricot Shampoo
7               Golden Frozen Peas             Denny Plastic Forks
8    Faux Products Apricot Shampoo             Denny Plastic Forks
9  Blue Label Canned Tuna in Water   Blue Label Fancy Canned Clams
10  Blue Label Canned String Beans  Faux Products Buffered Aspirin
11          Denny D-Size Batteries                Denny Frying Pan
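Answer 1 (score: 0)
You can try one of the following approaches. Both assume that you start with a character vector named "rules". If "rules" is already a column in a data.frame, slight modifications will be needed.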
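library(splitstackshape)
library(data.table)  # loaded explicitly in case splitstackshape does not attach it
library(dplyr)       # provides the %>% pipe
# Turn "=>" into a tab, strip the braces, then split each string on the tab
data.table(rules = gsub("[{}]", "", gsub("=>", "\t", rules))) %>%
  cSplit("rules", "\t")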
#                             rules_1                         rules_2
#  1:                Denny Frying Pan          Denny C-Size Batteries
#  2:            Denny Scented Tissue              Denny Paper Plates
#  3:   Blue Label Fancy Canned Clams Blue Label Canned Tuna in Water
#  4:             Denny Plastic Forks              Golden Frozen Peas
#  5:                Denny Frying Pan          Denny D-Size Batteries
#  6:             Denny Plastic Forks   Faux Products Apricot Shampoo
#  7:              Golden Frozen Peas             Denny Plastic Forks
#  8:   Faux Products Apricot Shampoo             Denny Plastic Forks
#  9: Blue Label Canned Tuna in Water   Blue Label Fancy Canned Clams
# 10:  Blue Label Canned String Beans  Faux Products Buffered Aspirin
# 11:          Denny D-Size Batteries                Denny Frying Pan
library(dplyr)
library(tidyr)
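data.frame(rules) %>%
  # Drop the whitespace around "=>" so the split leaves no stray spaces
  mutate(rules = gsub("\\s+=>\\s+", "=>", rules)) %>%
  # Strip the curly braces
  mutate(rules = gsub("[{}]", "", rules)) %>%
  # Split the single column into V1 (LHS) and V2 (RHS)
  separate(rules, into = c("V1", "V2"), sep = "=>")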
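Both snippets in this answer start from a character vector. For the data.frame case mentioned at the top of the answer, one possible adaptation is sketched below. It uses base R instead of splitstackshape or tidyr, so it is a swapped-in technique rather than the code above. The data frame name `df`, the new column names `LHS` and `RHS`, and the regular expression are assumptions made for illustration.

```r
# Hypothetical input: a one-column data.frame like the one in the question
df <- data.frame(
  rules = c("{Denny Frying Pan} => {Denny C-Size Batteries}",
            "{Golden Frozen Peas} => {Denny Plastic Forks}"),
  stringsAsFactors = FALSE
)

# as.character() guards against the column having been read in as a factor
rules <- as.character(df$rules)

# Capture group 1 is the text inside the first {}, group 2 the text inside
# the second {}; whitespace around "=>" is optional
pattern <- "^\\{(.*)\\}\\s*=>\\s*\\{(.*)\\}$"

df$LHS <- sub(pattern, "\\1", rules)
df$RHS <- sub(pattern, "\\2", rules)
df
```

A string that does not match the pattern is returned unchanged by `sub()`, so malformed rules would show up as identical `LHS` and `RHS` values and are easy to spot.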
Answer 2 (score: 0)
Here is an approach using qdapRegex, a package that I maintain:
RULES <- c("{Denny Frying Pan} => {Denny C-Size Batteries}",
"{Denny Scented Tissue} => {Denny Paper Plates}",
"{Blue Label Fancy Canned Clams} => {Blue Label Canned Tuna in Water}",
"{Denny Plastic Forks} => {Golden Frozen Peas}",
"{Denny Frying Pan} => {Denny D-Size Batteries}",
"{Denny Plastic Forks} => {Faux Products Apricot Shampoo}",
"{Golden Frozen Peas} => {Denny Plastic Forks}",
"{Faux Products Apricot Shampoo} => {Denny Plastic Forks}",
"{Blue Label Canned Tuna in Water} => {Blue Label Fancy Canned Clams}",
"{Blue Label Canned String Beans} => {Faux Products Buffered Aspirin}",
"{Denny D-Size Batteries} => {Denny Frying Pan}")
library(qdapRegex)
setNames(do.call(rbind.data.frame, rm_curly(RULES, extract=TRUE)), c("LHS", "RHS"))
##                                LHS                             RHS
## 1                 Denny Frying Pan          Denny C-Size Batteries
## 2             Denny Scented Tissue              Denny Paper Plates
## 3    Blue Label Fancy Canned Clams Blue Label Canned Tuna in Water
## 4              Denny Plastic Forks              Golden Frozen Peas
## 5                 Denny Frying Pan          Denny D-Size Batteries
## 6              Denny Plastic Forks   Faux Products Apricot Shampoo
## 7               Golden Frozen Peas             Denny Plastic Forks
## 8    Faux Products Apricot Shampoo             Denny Plastic Forks
## 9  Blue Label Canned Tuna in Water   Blue Label Fancy Canned Clams
## 10  Blue Label Canned String Beans  Faux Products Buffered Aspirin
## 11          Denny D-Size Batteries                Denny Frying Pan
We extract the content between the curly braces, then use do.call + rbind.data.frame to coerce the result into a data.frame.
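For readers who do not have qdapRegex installed, the same extract-then-bind idea can be sketched in base R. This is not how `rm_curly` works internally; it is only an illustration using `gregexpr` and `regmatches`, and the intermediate names `braced`, `items`, and `out` are made up for this example.

```r
RULES <- c("{Denny Frying Pan} => {Denny C-Size Batteries}",
           "{Golden Frozen Peas} => {Denny Plastic Forks}")

# Find every "{...}" group in each rule; the result is a list holding one
# character vector per rule
braced <- regmatches(RULES, gregexpr("\\{[^}]*\\}", RULES))

# Strip the surrounding braces from each extracted piece
items <- lapply(braced, function(x) gsub("[{}]", "", x))

# Bind the per-rule vectors into rows, then name the two columns
out <- setNames(
  as.data.frame(do.call(rbind, items), stringsAsFactors = FALSE),
  c("LHS", "RHS")
)
out
```

This sketch assumes every rule contains exactly two braced groups; rules with a different number of groups would need to be checked before the `rbind` step.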