无监督的句子学习

时间:2017-02-07 21:13:36

标签: machine-learning

我有一个数据代表运营商对工业设备上执行的各种活动的评论。评论可能反映了例行维护/更换活动,或者可能表示发生了一些损坏,必须进行修复以纠正损坏。 我有一套200,000个句子需要分为两个部分 - 修复/预定维护(或未确定)。它们没有标签,因此寻找无监督的基于学习的解决方案。

一些样本数据如下所示:

"电机线圈损坏。更换电机" "看到皮带裂缝。安装新皮带" "偶尔的启动问题。更换了开关"

"更换了皮带" "上油和清洁完成"。 "是否有预防性维护计划"

前三个句子必须标记为修复,而后三个句子必须标记为计划维护。

这个问题的好方法是什么。虽然我对机器学习有所了解,但我还是基于NLP的机器学习的新手。

我看到很多与此相关的论文https://pdfs.semanticscholar.org/a408/d3b5b37caefb93629273fa3d0c192668d63c.pdf https://arxiv.org/abs/1611.07897

但想知道是否有任何标准方法来解决此类问题

1 个答案:

答案 0 :(得分:1)

似乎你可以使用一些可靠的关键词(在这种情况下似乎是动词)来为NLP分类器创建训练样本。或者你可以使用KMeans或KMedioids聚类并使用2作为K,这可以很好地分离集合。如果你想真正参与其中,你可以使用像Latent Derichlet Allocation这样的东西,这是一种无监督的主题建模。但是,对于这样的问题,对于你拥有的少量数据,你会对结果感到沮丧,你将成为IMO。

OpenNLP和StanfordNLP都有文本分类器,所以如果你想进入分类路线,我建议如下:

C:\reactprojects\myapp\android>gradlew installDebug --stacktrace
:app:preBuild UP-TO-DATE
:app:preDebugBuild UP-TO-DATE
:app:checkDebugManifest
:app:preReleaseBuild UP-TO-DATE
:app:prepareComAndroidSupportAppcompatV72301Library UP-TO-DATE
:app:prepareComAndroidSupportRecyclerviewV72340Library UP-TO-DATE
:app:prepareComAndroidSupportSupportV42340Library UP-TO-DATE
:app:prepareComFacebookFbuiTextlayoutbuilderTextlayoutbuilder100Library UP-TO-DATE
:app:prepareComFacebookFrescoDrawee0110Library UP-TO-DATE
:app:prepareComFacebookFrescoFbcore0110Library UP-TO-DATE
:app:prepareComFacebookFrescoFresco0110Library UP-TO-DATE
:app:prepareComFacebookFrescoImagepipeline0110Library UP-TO-DATE
:app:prepareComFacebookFrescoImagepipelineBase0110Library UP-TO-DATE
:app:prepareComFacebookFrescoImagepipelineOkhttp30110Library UP-TO-DATE
:app:prepareComFacebookReactReactNative0412Library UP-TO-DATE
:app:prepareComFacebookSoloaderSoloader010Library UP-TO-DATE
:app:prepareOrgWebkitAndroidJscR174650Library UP-TO-DATE
:app:prepareDebugDependencies
:app:compileDebugAidl UP-TO-DATE
:app:compileDebugRenderscript UP-TO-DATE
:app:generateDebugBuildConfig UP-TO-DATE
:app:generateDebugAssets UP-TO-DATE
:app:mergeDebugAssets UP-TO-DATE
:app:generateDebugResValues UP-TO-DATE
:app:generateDebugResources UP-TO-DATE
:app:mergeDebugResources UP-TO-DATE
:app:bundleDebugJsAndAssets SKIPPED
:app:processDebugManifest UP-TO-DATE
:app:processDebugResources UP-TO-DATE
:app:generateDebugSources UP-TO-DATE
:app:processDebugJavaRes UP-TO-DATE
:app:compileDebugJavaWithJavac UP-TO-DATE
:app:compileDebugNdk UP-TO-DATE
:app:compileDebugSources UP-TO-DATE
:app:preDexDebug UP-TO-DATE
:app:dexDebug FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':app:dexDebug'.
> com.android.ide.common.process.ProcessException: org.gradle.process.internal.ExecException: Process 'command 'C:\Program Files\Java\jdk1.8.0_112\bin\java.exe'' finished with non-zero exit value -1073740791

* Try:
Run with --info or --debug option to get more log output.

* Exception is:
org.gradle.api.tasks.TaskExecutionException: Execution failed for task ':app:dexDebug'.
        at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeActions(ExecuteActionsTaskExecuter.java:69)
        at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.execute(ExecuteActionsTaskExecuter.java:46)
        at org.gradle.api.internal.tasks.execution.PostExecutionAnalysisTaskExecuter.execute(PostExecutionAnalysisTaskExecuter.java:35)
        at org.gradle.api.internal.tasks.execution.SkipUpToDateTaskExecuter.execute(SkipUpToDateTaskExecuter.java:64)
        at org.gradle.api.internal.tasks.execution.ValidatingTaskExecuter.execute(ValidatingTaskExecuter.java:58)
        at org.gradle.api.internal.tasks.execution.SkipEmptySourceFilesTaskExecuter.execute(SkipEmptySourceFilesTaskExecuter.java:42)
        at org.gradle.api.internal.tasks.execution.SkipTaskWithNoActionsExecuter.execute(SkipTaskWithNoActionsExecuter.java:52)
        at org.gradle.api.internal.tasks.execution.SkipOnlyIfTaskExecuter.execute(SkipOnlyIfTaskExecuter.java:53)
        at org.gradle.api.internal.tasks.execution.ExecuteAtMostOnceTaskExecuter.execute(ExecuteAtMostOnceTaskExecuter.java:43)
        at org.gradle.api.internal.AbstractTask.executeWithoutThrowingTaskFailure(AbstractTask.java:310)
        at org.gradle.execution.taskgraph.AbstractTaskPlanExecutor$TaskExecutorWorker.executeTask(AbstractTaskPlanExecutor.java:79)
        at org.gradle.execution.taskgraph.AbstractTaskPlanExecutor$TaskExecutorWorker.processTask(AbstractTaskPlanExecutor.java:63)
        at org.gradle.execution.taskgraph.AbstractTaskPlanExecutor$TaskExecutorWorker.run(AbstractTaskPlanExecutor.java:51)
        at org.gradle.execution.taskgraph.DefaultTaskPlanExecutor.process(DefaultTaskPlanExecutor.java:23)
        at org.gradle.execution.taskgraph.DefaultTaskGraphExecuter.execute(DefaultTaskGraphExecuter.java:88)
        at org.gradle.execution.SelectedTaskExecutionAction.execute(SelectedTaskExecutionAction.java:37)
        at org.gradle.execution.DefaultBuildExecuter.execute(DefaultBuildExecuter.java:62)
        at org.gradle.execution.DefaultBuildExecuter.access$200(DefaultBuildExecuter.java:23)
        at org.gradle.execution.DefaultBuildExecuter$2.proceed(DefaultBuildExecuter.java:68)
        at org.gradle.execution.DryRunBuildExecutionAction.execute(DryRunBuildExecutionAction.java:32)
        at org.gradle.execution.DefaultBuildExecuter.execute(DefaultBuildExecuter.java:62)
        at org.gradle.execution.DefaultBuildExecuter.execute(DefaultBuildExecuter.java:55)
        at org.gradle.initialization.DefaultGradleLauncher.doBuildStages(DefaultGradleLauncher.java:149)
        at org.gradle.initialization.DefaultGradleLauncher.doBuild(DefaultGradleLauncher.java:106)
        at org.gradle.initialization.DefaultGradleLauncher.run(DefaultGradleLauncher.java:86)
        at org.gradle.launcher.exec.InProcessBuildActionExecuter$DefaultBuildController.run(InProcessBuildActionExecuter.java:90)
        at org.gradle.tooling.internal.provider.ExecuteBuildActionRunner.run(ExecuteBuildActionRunner.java:28)
        at org.gradle.launcher.exec.ChainingBuildActionRunner.run(ChainingBuildActionRunner.java:35)
        at org.gradle.launcher.exec.InProcessBuildActionExecuter.execute(InProcessBuildActionExecuter.java:41)
        at org.gradle.launcher.exec.InProcessBuildActionExecuter.execute(InProcessBuildActionExecuter.java:28)
        at org.gradle.launcher.exec.DaemonUsageSuggestingBuildActionExecuter.execute(DaemonUsageSuggestingBuildActionExecuter.java:50)
        at org.gradle.launcher.exec.DaemonUsageSuggestingBuildActionExecuter.execute(DaemonUsageSuggestingBuildActionExecuter.java:27)
        at org.gradle.launcher.cli.RunBuildAction.run(RunBuildAction.java:40)
        at org.gradle.internal.Actions$RunnableActionAdapter.execute(Actions.java:169)
        at org.gradle.launcher.cli.CommandLineActionFactory$ParseAndBuildAction.execute(CommandLineActionFactory.java:237)
        at org.gradle.launcher.cli.CommandLineActionFactory$ParseAndBuildAction.execute(CommandLineActionFactory.java:210)
        at org.gradle.launcher.cli.JavaRuntimeValidationAction.execute(JavaRuntimeValidationAction.java:35)
        at org.gradle.launcher.cli.JavaRuntimeValidationAction.execute(JavaRuntimeValidationAction.java:24)
        at org.gradle.launcher.cli.CommandLineActionFactory$WithLogging.execute(CommandLineActionFactory.java:206)
        at org.gradle.launcher.cli.CommandLineActionFactory$WithLogging.execute(CommandLineActionFactory.java:169)
        at org.gradle.launcher.cli.ExceptionReportingAction.execute(ExceptionReportingAction.java:33)
        at org.gradle.launcher.cli.ExceptionReportingAction.execute(ExceptionReportingAction.java:22)
        at org.gradle.launcher.Main.doAction(Main.java:33)
        at org.gradle.launcher.bootstrap.EntryPoint.run(EntryPoint.java:45)
        at org.gradle.launcher.bootstrap.ProcessBootstrap.runNoExit(ProcessBootstrap.java:54)
        at org.gradle.launcher.bootstrap.ProcessBootstrap.run(ProcessBootstrap.java:35)
        at org.gradle.launcher.GradleMain.main(GradleMain.java:23)
        at org.gradle.wrapper.BootstrapMainStarter.start(BootstrapMainStarter.java:30)
        at org.gradle.wrapper.WrapperExecutor.execute(WrapperExecutor.java:127)
        at org.gradle.wrapper.GradleWrapperMain.main(GradleWrapperMain.java:61)
Caused by: org.gradle.internal.UncheckedException: com.android.ide.common.process.ProcessException: org.gradle.process.internal.ExecException: Process 'command 'C:\Program Files\Java\jdk1.8.0_112\bin\java.exe'' finished with non-zero exit value -1073740791
        at org.gradle.internal.UncheckedException.throwAsUncheckedException(UncheckedException.java:45)
        at org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:78)
        at org.gradle.api.internal.project.taskfactory.AnnotationProcessingTaskFactory$IncrementalTaskAction.doExecute(AnnotationProcessingTaskFactory.java:243)
        at org.gradle.api.internal.project.taskfactory.AnnotationProcessingTaskFactory$StandardTaskAction.execute(AnnotationProcessingTaskFactory.java:219)
        at org.gradle.api.internal.project.taskfactory.AnnotationProcessingTaskFactory$IncrementalTaskAction.execute(AnnotationProcessingTaskFactory.java:230)
        at org.gradle.api.internal.project.taskfactory.AnnotationProcessingTaskFactory$StandardTaskAction.execute(AnnotationProcessingTaskFactory.java:208)
        at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeAction(ExecuteActionsTaskExecuter.java:80)
        at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeActions(ExecuteActionsTaskExecuter.java:61)
        ... 49 more
Caused by: com.android.ide.common.process.ProcessException: org.gradle.process.internal.ExecException: Process 'command 'C:\Program Files\Java\jdk1.8.0_112\bin\java.exe'' finished with non-zero exit value -1073740791
        at com.android.build.gradle.internal.process.GradleProcessResult.assertNormalExitValue(GradleProcessResult.java:42)
        at com.android.builder.core.AndroidBuilder.convertByteCode(AndroidBuilder.java:1276)
        at com.android.builder.core.AndroidBuilder$convertByteCode$2.call(Unknown Source)
        at com.android.build.gradle.tasks.Dex.doTaskAction(Dex.groovy:165)
        at com.android.build.gradle.tasks.Dex.this$6$doTaskAction(Dex.groovy)
        at com.android.build.gradle.tasks.Dex$this$6$doTaskAction.callCurrent(Unknown Source)
        at com.android.build.gradle.tasks.Dex.taskAction(Dex.groovy:99)
        at org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:75)
        ... 55 more
Caused by: org.gradle.process.internal.ExecException: Process 'command 'C:\Program Files\Java\jdk1.8.0_112\bin\java.exe'' finished with non-zero exit value -1073740791
        at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:365)
        at com.android.build.gradle.internal.process.GradleProcessResult.assertNormalExitValue(GradleProcessResult.java:40)
        ... 62 more


BUILD FAILED

Total time: 14.279 secs

C:\reactprojects\myapp\android>

如果您不想走这条路线,我建议您使用像SOLR或ElasticSearch这样的文本索引技术或者您最喜欢的RDBMS文本索引来执行"更像是这样的&#34 #34;类型功能,所以你不必玩机器学习连续模型更新游戏。