How to remove specific characters from a data frame

Time: 2020-10-13 14:41:42

Tags: r

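I have a raw data frame that I need to clean up.

One column has thousands of rows that look like: c("round", "square", "triangle")

I'd like each row to end up looking like: round, square, triangle

Any help?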

dput(ItemisedOrders[1:10, "Products", drop = FALSE])

structure(list(Products = list("Meatlovers Pizza", c("Supreme Pizza", 
"BBQ Chicken Pizza"), c("Seafood Pizza", "Vegetarian Pizza"), 
    c("Margherita Pizza", "Supreme Pizza", "Meatlovers Pizza"
    ), c("BBQ Chicken Pizza", "Hawaiian Pizza", "Meatlovers Pizza"
    ), c("Hawaiian Pizza", "Supreme Pizza"), c("Hawaiian Pizza", 
    "Pepperoni Pizza"), c("Seafood Pizza", "BBQ Chicken Pizza", 
    "Vegetarian Pizza", "Hawaiian Pizza"), "Pepperoni Pizza", 
    c("Margherita Pizza", "Supreme Pizza"))), row.names = c(NA, 
10L), class = "data.frame")

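2 answers:

Answer 0 (score: 2)

Thanks for the dput. It looks like you have a list column, where each row is a vector! We can use sapply to apply a function to each row, and luckily the toString function does exactly what you want. Calling your data df: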

df$Products = sapply(df$Products, toString)
df
# 1                                                    Meatlovers Pizza
# 2                                    Supreme Pizza, BBQ Chicken Pizza
# 3                                     Seafood Pizza, Vegetarian Pizza
# 4                   Margherita Pizza, Supreme Pizza, Meatlovers Pizza
# 5                 BBQ Chicken Pizza, Hawaiian Pizza, Meatlovers Pizza
# 6                                       Hawaiian Pizza, Supreme Pizza
# 7                                     Hawaiian Pizza, Pepperoni Pizza
# 8  Seafood Pizza, BBQ Chicken Pizza, Vegetarian Pizza, Hawaiian Pizza
# 9                                                     Pepperoni Pizza
# 10                                    Margherita Pizza, Supreme Pizza
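As a variation on the same idea, here is a minimal sketch using vapply instead of sapply. It assumes a small made-up data frame with the same list-column shape as in the question (the id column and the three rows are only for illustration). vapply does the same row-by-row work but fails loudly if any row does not collapse to exactly one string.

```r
# Hypothetical stand-in for the question's data: a list column of character vectors
df <- data.frame(id = 1:3)
df$Products <- list("Meatlovers Pizza",
                    c("Supreme Pizza", "BBQ Chicken Pizza"),
                    c("Seafood Pizza", "Vegetarian Pizza"))

# paste(x, collapse = ", ") joins one row's vector into a single string;
# character(1) tells vapply to expect exactly one string per row
flat <- vapply(df$Products, paste, character(1), collapse = ", ")

df$Products <- flat
print(df$Products)
# [1] "Meatlovers Pizza"
# [2] "Supreme Pizza, BBQ Chicken Pizza"
# [3] "Seafood Pizza, Vegetarian Pizza"
```

After this the column is an ordinary character vector, so later steps such as writing it to CSV no longer have to deal with a list.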

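Answer 1 (score: 0)

I don't fully understand your question, but if it's in a data frame and you just want to remove the quotation marks, you could try: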

your_data$column_name <- gsub("\"", "", your_data$column_name)

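The first argument is the pattern to find, and the second is what replaces it; leaving the second as an empty string "" means each match is replaced with nothing.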
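To make the argument order concrete, here is a minimal sketch on a made-up character vector (x is only an illustration, not the asker's data). It writes the quote pattern inside single quotes so no escaping is needed, and uses fixed = TRUE so the pattern is matched literally and not as a regular expression.

```r
# Hypothetical input: one string with double quotes, one without
x <- c('say "hi"', "no quotes")

# gsub(pattern, replacement, x): every match of pattern is replaced;
# an empty replacement deletes the match
cleaned <- gsub('"', "", x, fixed = TRUE)

print(cleaned)
# [1] "say hi"    "no quotes"
```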

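Edit

I hadn't actually tried running it. I suppose that because symbols like the quotation mark have a special meaning in R, they are a bit tricky. I've got it working with this: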

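# gsub() coerces the list column to text, so a multi-item row becomes c("a", "b")
df$Products <- gsub("\"", "", df$Products)          # remove the double quotes
# remove only the leading c( and trailing ), so the c in "Chicken" is left alone
df$Products <- gsub("^c\\(|\\)$", "", df$Products)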
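For context on why gsub works on a list column at all: it coerces its input with as.character, which turns a multi-element vector into the text of its c(...) call. The sketch below uses a made-up two-row list (an assumption, not the asker's full data) to show that coercion, and to compare an anchored pattern for the c( ) wrapper against a bare "c" pattern, which also deletes the lowercase c inside words such as Chicken.

```r
# Hypothetical two-row list column
products <- list("Pepperoni Pizza", c("Supreme Pizza", "BBQ Chicken Pizza"))

# What gsub() sees after coercion; the second element is the literal text
# c("Supreme Pizza", "BBQ Chicken Pizza")
as_text <- as.character(products)

no_quotes <- gsub('"', "", as_text, fixed = TRUE)

# Anchored: strips only a leading c( and a trailing )
anchored <- gsub("^c\\(|\\)$", "", no_quotes)

# Unanchored: strips every lowercase c and every parenthesis
unanchored <- gsub("[()]", "", gsub("c", "", no_quotes))

print(anchored)
# [1] "Pepperoni Pizza"                  "Supreme Pizza, BBQ Chicken Pizza"
print(unanchored)
# [1] "Pepperoni Pizza"                 "Supreme Pizza, BBQ Chiken Pizza"
```

Working on the list elements directly, as in the sapply and toString answer, avoids this text round trip entirely.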