How to use SyntaxNet output to execute commands, for example saving a file in a folder, on a Linux system

Time: 2016-06-10 17:24:03

Tags: python xml terminal semantics syntaxnet

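After downloading and training SyntaxNet, I am trying to write a program that can open new or existing files, for example AutoCAD files, and save them to a specific directory by analyzing a text command such as: open LibreOffice file X. Consider this SyntaxNet output: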

echo "save AUTOCAD file  X in directory Y" | ./test.sh > output.txt


Input: save AUTOCAD file X in directory Y
Parse:
save VB ROOT
 +-- X NNP dobj
 |   +-- file NN compound
 |       +-- AUTOCAD CD nummod
 +-- directory NN nmod
     +-- in IN case
     +-- Y CD nummod

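First, I thought about converting the parsed text to XML and then querying the XML file with a semantic-analysis tool (such as SPARQL) to find ROOT = save, dobj = X and nummod = Y, and then writing a Python program that does what the text says.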

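  1. I don't know whether it makes sense to convert the parsed text to XML and then use query-based semantic analysis to match ROOT with a corresponding function or script that saves the dobj in the directory mentioned in nummod.

  2. I have some ideas about connecting Python to the terminal with the subprocess package, but I could not find anything that helps me save, for example, an AUTOCAD file or any other file from the terminal with the help of Python. Do I need to write a .sh script?

  3. I have done a lot of research on syntactic and semantic analysis of text, for example Christian Chiarcos, 2011; Hunter and Cohen, 2006; Verspoor et al., 2015, and I have also looked at Microsoft Cortana, Sirius, and Google Now, but none of them explain in detail how they turn parsed text into an executable command. This makes me think the step is considered too easy to be worth discussing, but since my background is not in computer science, I cannot figure out what to do.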
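The lookup described in the question (find the ROOT verb, its dobj, and the directory named under nmod) can also be done on the tree text directly, without going through XML or SPARQL. The following is only a minimal sketch, assuming the tree layout shown above, where each nesting level shifts the `+--` marker by four columns. The helper names `parse_tree` and `child` are hypothetical and not part of SyntaxNet.

```python
def parse_tree(text):
    """Turn the ASCII tree into a list of nodes that know their parent index."""
    nodes, stack = [], []  # stack holds (depth, node index) of the open ancestors
    for line in text.splitlines():
        if not line.strip():
            continue
        marker = line.find('+--')
        # The root has no '+--'; each deeper level shifts the marker 4 columns.
        depth = 0 if marker < 0 else (marker - 1) // 4 + 1
        body = line if marker < 0 else line[marker + 3:]
        word, tag, label = body.split()
        # Close every branch that is at the same depth or deeper.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1] if stack else None
        nodes.append({'word': word, 'tag': tag, 'label': label, 'parent': parent})
        stack.append((depth, len(nodes) - 1))
    return nodes


def child(nodes, parent, label):
    """Index of the first node with the given parent and label, or None."""
    for i, node in enumerate(nodes):
        if node['parent'] == parent and node['label'] == label:
            return i
    return None


# The parse from the question, copied line by line.
TREE = "\n".join([
    "save VB ROOT",
    " +-- X NNP dobj",
    " |   +-- file NN compound",
    " |       +-- AUTOCAD CD nummod",
    " +-- directory NN nmod",
    "     +-- in IN case",
    "     +-- Y CD nummod",
])

nodes = parse_tree(TREE)
root = child(nodes, None, 'ROOT')
action = nodes[root]['word']                               # verb to dispatch on
filename = nodes[child(nodes, root, 'dobj')]['word']       # file to act on
place = child(nodes, root, 'nmod')                         # the "directory" node
directory = nodes[child(nodes, place, 'nummod')]['word']   # its name
print(action, filename, directory)
```

For the tree above this picks out save, X and Y. Walking parent links matters here because nummod occurs twice (AUTOCAD and Y), so a flat search by label alone would be ambiguous.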

1 Answer:

Answer 0: (score: 10)

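I am a beginner in the computer science world and with SyntaxNet. I wrote a simple SyntaxNet-Python algorithm that uses SyntaxNet to analyze a text command entered by the user, "open the file book which I wrote with the lab writer with LibreOffice writer", and then analyzes the SyntaxNet output with a Python algorithm in order to turn it into an executable command. In this case the command opens a file, in any supported format, with LibreOffice in a Linux (Ubuntu 14.04) environment. You can see here the different command-line options that LibreOffice defines for using the different applications in this package.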

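  1. After installing and running SyntaxNet (the installation process is explained here), open the shell script demo.sh in the ~/models/syntaxnet/syntaxnet/ directory and remove the conll2tree function (lines 54 to 56), so that SyntaxNet produces tab-delimited output instead of tree-format output.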

  2. Enter this command in a terminal window:

    echo 'open file book which I wrote with the lab writer with libreOffice writer' | syntaxnet/demo.sh > output.txt

  3. The output.txt file is saved in the directory where demo.sh is located, and it looks like the figure below:

    [image: contents of output.txt]

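  4. With output.txt as the input file, the Python algorithm below analyzes the SyntaxNet output and identifies the name of the file you want, the target application from the LibreOffice package, and the command the user wants to run.

      #!/usr/bin/env python
      # Python 2 script (it uses print statements, iteritems and raw_input)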

      import csv
      import subprocess
      import sys
      import os
      
      #get SyntaxNet output as the Python algorithm input file
      filename='/home/username/models/syntaxnet/work/output.txt'
      
      # LibreOffice command line for each (action, package, sub-application) triple
      commands={
      ('open',  'libreoffice',  'writer'):  ('libreoffice', '--writer'),
      ('open',  'libreoffice',  'calculator'):  ('libreoffice' ,'--calc'),
      ('open',  'libreoffice',  'draw'):  ('libreoffice' ,'--draw'),
      ('open',  'libreoffice',  'impress'): ('libreoffice' ,'--impress'),
      ('open',  'libreoffice',  'math'):  ('libreoffice' ,'--math'),
      ('open',  'libreoffice',  'global'):  ('libreoffice' ,'--global'),
      ('open',  'libreoffice',  'web'): ('libreoffice' ,'--web'),
      ('open',  'libreoffice',  'show'):  ('libreoffice', '--show'),
      }
      # words the user may use to refer to each LibreOffice sub-application
      comments={
       'writer': ['word','text','writer'],
       'calculator': ['excel','calc','calculator'],
       'draw': ['paint','draw','drawing'],
       'impress': ['powerpoint','impress'],
       'math': ['mathematic','calculator','math'],
       'global': ['global'],
       'web': ['html','web'],
       'show':['presentation','show']
       }
      
      root ='ROOT'            # ROOT of the sentence
      noun='NOUN'             # noun tag
      verb='VERB'             # verb tag
      adjmod='amod'           # adjectival modifier
      dirobj='dobj'           # direct object
      apposmod='appos'        # appositional modifier
      prepos_obj='pobj'       # object of a preposition
      app='libreoffice'       # name of the package
      preposition='prep'      # prepositional modifier
      noun_modi='nn'          # noun compound modifier
      
      #read from Syntaxnet output tab delimited textfile
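      def readata(filename):
          # the with-block closes the file even if reading fails
          with open(filename,'r') as f:
              lines=f.readlines()
          lines=lines[:-1]    # drop the last line (expected to be the blank line that ends the sentence)
          data=csv.reader(lines,delimiter='\t')
          lol=list(data)
          return  lol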
      
      # identifies the action, the name of the file and whether the user mentioned the name of the application implicitly
      def exe(root,noun,verb,adjmod,dirobj,apposmod,commands,noun_modi):
          interprete='null'
          lists=readata(filename)
          for sublist in lists:
              if sublist[7]==root and sublist[3]==verb: # when the ROOT is verb the dobj is probably the name of the file you want to have
                      action=sublist[1]
                      dep_num=sublist[0]
                      for sublist in lists:
                          if sublist[6]==dep_num and sublist[7]==dirobj:
                              direct_object=sublist[1]
                              dep_num=sublist[0]
                              dep_num_obj=sublist[0]
                              for sublist in lists:
                                  if direct_object=='file' and sublist[6]==dep_num_obj and sublist[7]==apposmod:
                                      direct_object=sublist[1]
                                  elif  direct_object=='file' and sublist[6]==dep_num_obj and sublist[7]==adjmod:
                                      direct_object=sublist[1]
                      for sublist in lists:
                          if sublist[6]==dep_num_obj and sublist[7]==adjmod:
                                  for key, v in  comments.iteritems():
                                      if sublist[1] in v:
                                          interprete=key
                      for sublist in lists:
                          if sublist[6]==dep_num_obj and sublist[7]==noun_modi:
                              dep_num_nn=sublist[0]
                              for key, v in  comments.iteritems():
                                  if sublist[1] in v:
                                      interprete=key
                                      print interprete
                              if interprete=='null':
                                  for sublist in lists:
                                      if sublist[6]==dep_num_nn and sublist[7]==noun_modi:
                                          for key, v in  comments.iteritems():
                                              if sublist[1] in v:
                                                  interprete=key
              elif  sublist[7]==root and sublist[3]==noun: # you have to find the word which is in a adjective form and depends on the root
                  dep_num=sublist[0]
                  dep_num_obj=sublist[0]
                  direct_object=sublist[1]
                  for sublist in lists:
                      if sublist[6]==dep_num and sublist[7]==adjmod:
                          actionis=any(t1==sublist[1] for (t1, t2, t3) in commands)
                          if actionis==True:
                              action=sublist[1]
                      elif sublist[6]==dep_num and sublist[7]==noun_modi:
                          dep_num=sublist[0]
                          for sublist in lists:
                              if sublist[6]==dep_num and sublist[7]==adjmod:
                                  if any(t1==sublist[1] for (t1, t2, t3) in commands):
                                      action=sublist[1]
                  for sublist in lists:
                      if direct_object=='file' and sublist[6]==dep_num_obj and sublist[7]==apposmod and sublist[1]!=action:
                          direct_object=sublist[1]
                      if  direct_object=='file' and sublist[6]==dep_num_obj and sublist[7]==adjmod and sublist[1]!=action:
                          direct_object=sublist[1]
                  for sublist in lists:
                      if sublist[6]==dep_num_obj and sublist[7]==noun_modi:
                          dep_num_obj=sublist[0]
                          for key, v in  comments.iteritems():
                              if sublist[1] in v:
                                  interprete=key
                              else:
                                  for sublist in lists:
                                      if sublist[6]==dep_num_obj and sublist[7]==noun_modi:
                                          for key, v in  comments.iteritems():
                                              if sublist[1] in v:
                                                  interprete=key
          return action, direct_object, interprete
      
      action, direct_object, interprete = exe(root,noun,verb,adjmod,dirobj,apposmod,commands,noun_modi)
      
      # find the sub-application (we assume the user wants LibreOffice but do not know which sub-application to use)
      def application(app,prepos_obj,preposition,noun_modi):
          lists=readata(filename)
          subapp='not mentioned'
          for sublist in lists:
              if sublist[1]==app:
                  dep_num=sublist[6]
                  for sublist in lists:
                      if sublist[0]==dep_num and sublist[7]==prepos_obj:
                          actioni=any(t3==sublist[1] for (t1, t2, t3) in commands)
                          if actioni==True:
                              subapp=sublist[1]
                          else:
                              for sublist in lists:
                                  if sublist[6]==dep_num and sublist[7]==noun_modi:
                                      actioni=any(t3==sublist[1] for (t1, t2, t3) in commands)
                                      if actioni==True:
                                          subapp=sublist[1]
                      elif sublist[0]==dep_num and sublist[7]==preposition:
                          sublist[6]=dep_num
                          for subline in lists:
                              if subline[0]==dep_num and subline[7]==prepos_obj:
                                  if any(t3==sublist[1] for (t1, t2, t3) in commands):
                                      subapp=sublist[1]
                                  else:
                                      for subline in lists:
                                          if subline[0]==dep_num and subline[7]==noun_modi:
                                              if any(t3==sublist[1] for (t1, t2, t3) in commands):
                                                  subapp=sublist[1]
          return subapp
      
      sub_application=application(app,prepos_obj,preposition,noun_modi)
      
      # fall back to the implicitly mentioned application ('null' if there is none)
      if sub_application=='not mentioned':
          sub_application=interprete
      
      # the format of file
      def format_function(sub_application):
          subapp=sub_application
          Dobj=exe(root,noun,verb,adjmod,dirobj,apposmod,commands,noun_modi)[1]
          if subapp!='null':
              if subapp=='writer':
                  a='.odt'
                  Dobj=Dobj+a
              elif subapp=='calculator':
                  a='.ods'
                  Dobj=Dobj+a
              elif subapp=='impress':
                  a='.odp'
                  Dobj=Dobj+a
              elif subapp=='draw':
                  a='.odg'
                  Dobj=Dobj+a
              elif subapp=='math':
                  a='.odf'
                  Dobj=Dobj+a
              elif subapp=='web':
                  a='.html'
                  Dobj=Dobj+a
          else:
              Dobj='null'
          return Dobj
      
      def get_filepaths(directory):
          myfile=format_function(sub_application)
          file_paths = []  # List which will store all of the full filepaths.
          # Walk the tree.
          for root, directories, files in os.walk(directory):
              for filename in files:
              # Join the two strings in order to form the full filepath.
                  if filename==myfile:
                      filepath = os.path.join(root, filename)
                      file_paths.append(filepath)  # Add it to the list.
          return file_paths  # Self-explanatory.
      
      # Run the above function and store its results in a variable.
      full_file_paths = get_filepaths("/home/ubuntu/")
      
      if full_file_paths==[]:
          print 'No file with name %s is found' % format_function(sub_application)
      if full_file_paths!=[]:
          path=full_file_paths[0]     # default to the first match
          prompt='> '
          if len(full_file_paths) >1:
              print full_file_paths
              print 'which %s do you mean?'% direct_object
              inputname=raw_input(prompt)
              if inputname in full_file_paths:
                  path=inputname
          # the main code structure: look up the command and run it on the chosen file
          if sub_application!='null':
              command= commands[action,app,sub_application]
              subprocess.call([command[0],command[1],path])
          else:
              print "The sub application is not mentioned clearly"
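The script above addresses the parser output by column position (0, 1, 3, 6, 7). As a reading aid, here is a minimal Python 3 sketch that gives those columns names, assuming the tab-delimited output follows the CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, ...). The `Token` tuple, the helpers `read_conll` and `children`, and the two sample rows are hypothetical and written by hand for illustration; they are not real SyntaxNet output.

```python
import csv
import io
from collections import namedtuple

# Only the columns the answer's script actually uses.
Token = namedtuple('Token', 'id form cpostag head deprel')


def read_conll(stream):
    """Read tab-delimited rows, skipping the blank line that ends a sentence."""
    tokens = []
    for row in csv.reader(stream, delimiter='\t', quoting=csv.QUOTE_NONE):
        if len(row) < 8:
            continue
        tokens.append(Token(row[0], row[1], row[3], row[6], row[7]))
    return tokens


def children(tokens, head_id, deprel):
    """All tokens attached to head_id with the given dependency label."""
    return [t for t in tokens if t.head == head_id and t.deprel == deprel]


# Hand-written sample rows (an assumption, for illustration only).
SAMPLE = (
    "1\topen\t_\tVERB\tVB\t_\t0\tROOT\t_\t_\n"
    "2\treport\t_\tNOUN\tNN\t_\t1\tdobj\t_\t_\n"
    "\n"
)

tokens = read_conll(io.StringIO(SAMPLE))
root = next(t for t in tokens if t.deprel == 'ROOT' and t.cpostag == 'VERB')
objects = children(tokens, root.id, 'dobj')
print(root.form, [t.form for t in objects])
```

With named fields, a test such as `sublist[6]==dep_num and sublist[7]==dirobj` becomes `t.head == root.id and t.deprel == 'dobj'`, which is easier to check against the parse.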
      

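Again, I am a beginner, and the code may not look very tidy or professional, but I tried to use everything I know about the fascinating SyntaxNet to build a practical algorithm. This simple algorithm can open a file:

  1. In any format supported by LibreOffice, e.g. .odt, .odf, .ods, .html, .odp.

  2. It can understand implicit references to the different LibreOffice applications, e.g. "open the text file book with libreoffice" instead of "open the file book with libreoffice writer".

  3. It can cope with SyntaxNet interpreting the name of the file as an adjective.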
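The implicit-reference handling in point 2 comes down to a synonym lookup followed by a table lookup for the command-line flag and the file extension. A minimal Python 3 sketch of that idea, using a subset of the synonyms, flags and extensions from the script above; the function names `resolve_subapp` and `build_command` are hypothetical, and the command is only built here, not executed:

```python
# Subset of the tables from the answer's script.
SYNONYMS = {
    'writer': ['word', 'text', 'writer'],
    'calculator': ['excel', 'calc', 'calculator'],
    'draw': ['paint', 'draw', 'drawing'],
    'impress': ['powerpoint', 'impress'],
}
FLAGS = {'writer': '--writer', 'calculator': '--calc',
         'draw': '--draw', 'impress': '--impress'}
EXTENSIONS = {'writer': '.odt', 'calculator': '.ods',
              'draw': '.odg', 'impress': '.odp'}


def resolve_subapp(word):
    """Map a word such as 'text' to a sub-application key, or None."""
    word = word.lower()
    for subapp, names in SYNONYMS.items():
        if word in names:
            return subapp
    return None


def build_command(action, modifier, name):
    """Return the argument list for subprocess, or None if nothing matches."""
    if action != 'open':
        return None
    subapp = resolve_subapp(modifier)
    if subapp is None:
        return None
    return ['libreoffice', FLAGS[subapp], name + EXTENSIONS[subapp]]


# "open the text file book with libreoffice": 'text' implies writer.
cmd = build_command('open', 'text', 'book')
print(cmd)
# Passing cmd to subprocess.call would launch LibreOffice; it is left out here
# so the sketch has no side effects.
```

Keeping flags and extensions in dictionaries replaces the if/elif chain in `format_function`, so supporting another sub-application only means adding one entry per table.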