I need some help with data wrangling. I downloaded a conversation with someone on Facebook Messenger, but the output looks like this:
apply plugin: 'com.android.application'
android {
compileSdkVersion 26
defaultConfig {
applicationId "com.example.asus.apptest"
minSdkVersion 19
targetSdkVersion 26
versionCode 1
versionName "1.0"
testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"
}
buildTypes {
release {
minifyEnabled false
proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
}
}
}
dependencies {
implementation fileTree(dir: 'libs', include: ['*.jar'])
implementation 'com.android.support:appcompat-v7:26.1.0'
implementation 'com.android.support.constraint:constraint-layout:1.1.2'
implementation 'com.android.support:support-v4:26.1.0'
implementation 'com.google.firebase:firebase-core:16.0.1'
implementation 'com.google.firebase:firebase-database:16.0.1'
implementation 'com.google.firebase:firebase-storage:16.0.1'
implementation 'com.google.firebase:firebase-auth:16.0.1'
implementation 'com.firebaseui:firebase-ui-database:4.0.1'
implementation 'com.firebaseui:firebase-ui-auth:4.0.1'
testImplementation 'junit:junit:4.12'
androidTestImplementation 'com.android.support.test:runner:1.0.2'
androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.2'
implementation 'de.hdodenhof:circleimageview:2.2.0'
implementation 'com.theartofdev.edmodo:android-image-cropper:2.7.+'
implementation 'com.squareup.picasso:picasso:2.71828'
}
apply plugin: 'com.google.gms.google-services'
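Everything is in a single column, but I am trying to build a data frame with the speaker in one column, the message in another, and the date in a third. The problem is that a message is sometimes split across two lines, so I can't simply split the whole column into three. What would be the best solution? Thanks for any help with this :)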
Answer 0 (score: 0)
Since every entry in your input starts with "Person A" (or "Person B") and ends with a date in the format YYYY-MM-DD HH:MM, I would use a regular expression:
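library(stringr)
# chat_messenger is assumed to hold the whole export as a single string
# Every entry ends with a date line formatted as YYYY-MM-DD HH:MM
date_match <- "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}"
# Person A: capture everything between the name line and the date line
col_a <- str_match_all(chat_messenger,
paste0("(?<=\n|^)Person A\\s*\n([\\s\\S]*?)\n", date_match)
)[[1]][, 2]
# Person B: same idea (paste0 takes no sep argument, so it is dropped)
col_b <- str_match_all(chat_messenger,
paste0("(?<=\n)Person B\\s*\n([\\s\\S]*?)\n", date_match)
)[[1]][, 2]
col_a
col_b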
This gives the following result:
> col_a
[1] "Coolcool " "You called Person B \nDuration: 30 seconds "
[3] "Hey! "
> col_b
[1] "See you later \n:D " ". \nWhat's up? "
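To make the regular expression easier to follow, I'll break this pattern into parts: (?<=\n|^)Person A\\s*\n([\\s\\S]*?)\n
(?<=\n|^) : requires the match to be preceded by a newline or by the start of the document, in case the words "Person A" appear inside a chat message.
Person A\\s*\n : matches the name, followed by whitespace (zero or more characters) and a newline.
([\\s\\S]*?) : captures everything, including newlines, as lazily as possible.
\n : stops the capture before the newline that precedes the date.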
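The answer above returns each speaker's messages as separate vectors, which drops the dates and the original order of the conversation. As a minimal sketch, assuming the export follows the layout the answer implies (a speaker line, one or more message lines, then a date line), all three fields could be captured in one pass. The sample text, its dates, and the names `pattern`, `m` and `chat_df` are made up for illustration; `(?m)^` is swapped in for the lookbehind as a line-start anchor.

```r
library(stringr)

# Hypothetical sample: speaker line, message line(s), then a date line.
# The dates are invented; the real export may differ.
chat_messenger <- paste(
  "Person A", "Coolcool", "2018-08-03 18:20",
  "Person B", "See you later", ":D", "2018-08-03 18:19",
  "Person A", "Hey!", "2018-08-03 18:15",
  sep = "\n"
)

# Group 1: speaker, group 2: message (may span lines), group 3: date.
# (?m) lets ^ match at the start of every line.
pattern <- paste0(
  "(?m)^(Person [AB])\\s*\\n",
  "([\\s\\S]*?)\\n",
  "(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2})"
)

# One row per entry; column 1 is the full match, columns 2-4 the groups.
m <- str_match_all(chat_messenger, pattern)[[1]]

chat_df <- data.frame(
  speaker = m[, 2],
  message = str_trim(m[, 3]),
  date = as.POSIXct(m[, 4], format = "%Y-%m-%d %H:%M", tz = "UTC"),
  stringsAsFactors = FALSE
)
chat_df
```

Multi-line messages keep their internal newline in the `message` column; if a single line is preferred, something like `str_replace_all(chat_df$message, "\\s*\\n", " ")` could be applied afterwards.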