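WhatsApp has an option to email a group conversation to yourself. I did that and now want to explore the export in R. The problem is that the file appears to use several different delimiters, and I don't know how to handle that in R.
This is what I tried: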
library(readr)
library(dplyr)
> gf <- read_delim('df.txt', col_names = F, skip = 2, delim='\t')
Warning message:
15 problems parsing 'df.txt'. See problems(...) for more details.
> head(gf)
Source: local data frame [6 x 12]
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
1 9:14pm Mar 31 umair: Great NA NA NA NA NA NA NA
2 9:14pm Mar 31 umair: I am back NA NA NA NA NA NA NA
3 9:15pm Mar 31 umair: ?? NA NA NA NA NA NA NA
4 10:27pm Mar 31 umair: Kon kon zinda hay NA NA NA NA NA NA NA
5 10:49pm Mar 31 Kazim: Sab zinda hain ..... NA NA NA NA NA NA NA
6 10:50pm Mar 31 umair: Very good NA NA NA NA NA NA NA
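Could you help me read this file so that "sender: message" is split into two columns? The first two columns should be read as separate columns, as shown above. Obviously I don't want columns X4 to X12.
Here are the first few lines of the raw file: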
9:14pm, Mar 31 - umair: Great
9:14pm, Mar 31 - umair: I am back
9:15pm, Mar 31 - umair:
10:27pm, Mar 31 - umair: Kon kon zinda hay
10:49pm, Mar 31 - Kazim: Sab zinda hain .....
10:50pm, Mar 31 - umair: Very good
10:52pm, Mar 31 - umair: Abid agaya dobara?
10:54pm, Mar 31 - Kazim: Nai wo nai aya
10:54pm, Mar 31 - umair: Hmmmmmmmmm
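Answer 0 (score: 3)
This question is old, but my Google search led me here when I wanted to do the same thing. I worked it out and put the solution into an R package. Install it and read the data: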
devtools::install_github("JBGruber/rwhatsapp")
library(rwhatsapp)
gf <- rwa_read("df.txt")
Or you can paste the lines in directly:
> lines <- c(
"9:14pm, Mar 31 - umair: Great",
"9:14pm, Mar 31 - umair: I am back",
"9:15pm, Mar 31 - umair: ",
"10:27pm, Mar 31 - umair: Kon kon zinda hay",
"10:49pm, Mar 31 - Kazim: Sab zinda hain .....",
"10:50pm, Mar 31 - umair: Very good",
"10:52pm, Mar 31 - umair: Abid agaya dobara?",
"10:54pm, Mar 31 - Kazim: Nai wo nai aya",
"10:54pm, Mar 31 - umair: Hmmmmmmmmm"
)
> rwa_read(lines)
# A tibble: 9 x 3
time author text
<dttm> <fct> <chr>
1 2018-03-31 21:14:13 umair Great
2 2018-03-31 21:14:13 umair I am back
3 2018-03-31 21:15:13 umair " "
4 2018-03-31 22:27:13 umair Kon kon zinda hay
5 2018-03-31 22:49:13 Kazim Sab zinda hain .....
6 2018-03-31 22:50:13 umair Very good
7 2018-03-31 22:52:13 umair Abid agaya dobara?
8 2018-03-31 22:54:13 Kazim Nai wo nai aya
9 2018-03-31 22:54:13 umair Hmmmmmmmmm
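If installing a package is not an option, the same split can be sketched in base R with a regular expression. This is only a minimal sketch, assuming every message follows the "time, date - sender: message" layout seen in the sample lines; the `pattern` string and the column names are my own choices, not part of rwhatsapp. Lines that do not match (for example continuation lines of multi-line messages, or system notices without a sender) are simply dropped here.

```r
# Sample lines copied from the question; in practice use readLines("df.txt").
lines <- c(
  "9:14pm, Mar 31 - umair: Great",
  "9:15pm, Mar 31 - umair: ",
  "10:49pm, Mar 31 - Kazim: Sab zinda hain ....."
)

# Assumed layout: "<time>, <date> - <sender>: <message>".
# Four capture groups: time, date, sender, message (message may be empty).
pattern <- "^([^,]+), ([^-]+) - ([^:]+): ?(.*)$"

# regexec() + regmatches() return, per line, the full match followed by
# each capture group (or an empty vector when the line does not match).
parts <- regmatches(lines, regexec(pattern, lines))

# Keep only lines that matched: full match + 4 groups = 5 elements.
parts <- parts[lengths(parts) == 5]

# Bind into a character matrix, drop the full-match column, build a data frame.
chat <- as.data.frame(do.call(rbind, parts)[, -1, drop = FALSE],
                      stringsAsFactors = FALSE)
names(chat) <- c("time", "date", "sender", "message")

print(chat)
```

The result keeps time and date as plain character columns; converting them to a proper date-time would need the year, which the export lines shown here do not contain.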