Question

I am trying to extract the first few characters from a column in a data frame. What I need is everything up to the first ",".

Data:

texts
12/5/15, 11:49 - thanks, take care
12/5/15, 11:51 - cool

What I need is:

texts                                   date
12/5/15, 11:49 - thanks, take care     12/5/15
12/10/15, 11:51 - cool                 12/10/15

I tried the following, but it returned the entire text with only the commas removed:

df$date <- sub(", ", "", df$date, fixed = TRUE)

and

df$date <- gsub( ".,","", df$texts)

The Excel equivalent:

=LEFT(A1, FIND(",",A1,1)-1)

Answer 1

You can use sub:

sub('(^.*?),.*', '\\1', df$texts)
# [1] "12/5/15" "12/5/15"

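The pattern matches

the start of the line ^, followed by any character . repeated zero to infinitely many times, but as few times as possible *?, all of it captured ( ... )
followed by a comma ,
followed by any character repeated zero to infinitely many times .*

which matches the whole line, and replaces it with

the captured group \\1.

Other options: substr, strsplit, stringr::str_extract.

If you plan to use those dates, as.Date (or strptime, if you want the times too) can actually pull out the part it needs by itself: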
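To see why the lazy quantifier matters, here is a small sketch comparing it with the greedy form on one of the sample strings. The variable names x, lazy and greedy are only for illustration.

```r
x <- "12/5/15, 11:49 - thanks, take care"

# Lazy *? stops at the first comma, so the captured group is just the date.
lazy <- sub("(^.*?),.*", "\\1", x)

# Greedy * extends to the last comma, so the group captures too much.
greedy <- sub("(^.*),.*", "\\1", x)

print(lazy)
print(greedy)
```

The string contains two commas, which is exactly the case where the two quantifiers differ.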
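As a rough sketch of the substr option, the following mirrors the Excel formula most directly: regexpr plays the role of FIND and substr the role of LEFT. The vector texts is a stand-in for df$texts, and comma_pos and dates are illustrative names.

```r
texts <- c("12/5/15, 11:49 - thanks, take care", "12/5/15, 11:51 - cool")

# regexpr() returns the 1-based position of the first match, like FIND;
# fixed = TRUE treats "," as a literal string rather than a regex.
comma_pos <- regexpr(",", texts, fixed = TRUE)

# substr() from 1 to (position - 1) plays the role of LEFT.
dates <- substr(texts, 1, comma_pos - 1)
print(dates)
```

Note that regexpr returns -1 when there is no comma, in which case substr yields an empty string, so rows without a comma may need separate handling.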

as.Date(df$texts, '%m/%d/%y')  # or '%d/%m/%y', if that's the format
# [1] "2015-12-05" "2015-12-05"
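If the time is wanted as well, a strptime call along these lines should work. This is a sketch that assumes the month/day/year order and uses tz = "UTC" purely so the result is reproducible; texts again stands in for df$texts.

```r
texts <- c("12/5/15, 11:49 - thanks, take care", "12/5/15, 11:51 - cool")

# Parse date and time in one pass; characters after the part matched by
# the format are ignored. tz = "UTC" is an assumption for reproducibility.
stamps <- strptime(texts, format = "%m/%d/%y, %H:%M", tz = "UTC")
print(format(stamps, "%Y-%m-%d %H:%M"))
```

strptime returns a POSIXlt object; wrap it in as.POSIXct if it is going to be stored as a data frame column.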

Data:

df <- structure(list(texts = structure(1:2, .Label = c("12/5/15, 11:49 - thanks, take care", 
                "12/5/15, 11:51 - cool"), class = "factor")), .Names = "texts", 
                class = "data.frame", row.names = c(NA, -2L))

Answer 2

Why not just:

sub(',.*', '', df$texts)
#[1] "12/5/15" "12/5/15"

Answer 3

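You can do

# split the texts column (converted from factor to character) at the commas
l <- strsplit(as.character(df$texts), split = ",")

to split the text at the commas, then

sapply(l, "[", 1)

to keep only the first part.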
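Put together as a self-contained sketch, assuming the texts column is a factor as in the data from Answer 1 (parts is an illustrative name, and vapply is swapped in for sapply only to guarantee a character result):

```r
df <- data.frame(
  texts = c("12/5/15, 11:49 - thanks, take care", "12/5/15, 11:51 - cool"),
  stringsAsFactors = TRUE
)

# strsplit() needs a character vector, so convert the factor first;
# fixed = TRUE splits on a literal comma instead of a regex.
parts <- strsplit(as.character(df$texts), split = ",", fixed = TRUE)

# Take the first piece of each split; vapply() enforces one string per row.
df$date <- vapply(parts, `[`, character(1), 1)
print(df$date)
```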

What is the R equivalent of Excel's LEFT plus FIND functions?
