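Chatbot programming, XML and Perl

Date: 2014-05-09 21:47:27

Tags: xml perl chatbot

I am writing a chatbot program in Perl that uses an XML file containing a pattern for each answer. For example, if the string entered by the user matches the pattern "you know michael jordan", a possible answer should be "who is michael jordan?". The XML code is shown below.

The problem is that I don't know how to extract the second part of the string entered by the user ("michael jordan" in the example above) and put it in my output.

Also, what does <star/><star index="2"/> mean in XML?

Thanks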

<category> 
<pattern>you know *</pattern>
  <template> 
    <random> 
      <li>No, who is?</li>
      <li>who is <star/>?</li>
      <li>i don't know.</li>
    </random>
  </template>
</category>

Perl code:

use strict;
use warnings;
use XML::LibXML;

my $parser  = XML::LibXML->new();
my $xmlfile = $parser->parse_file( $ARGV[0] );

my %palabras;
my @respuestas;

$xmlfile = $xmlfile->getDocumentElement();

my @kids = $xmlfile->findnodes('//category');

foreach my $child (@kids) {
    my $pattern = $child->findvalue('pattern');

    @respuestas = $child->findnodes('template/random/li');

    for my $answer (@respuestas) {
        push @{ $palabras{$pattern} }, $answer->textContent;
    }

}

my $cadena = <STDIN>;

while ( $cadena ne "adios\n" ) {
    foreach my $pattern ( keys %palabras ) {
        if ( index( uc $cadena, $pattern ) != -1 ) {
            @respuestas = @{ $palabras{$pattern} };
            my $n = int rand( $#respuestas + 1 );
            print $respuestas[$n] . "\n";    #
            last;
        }
    }

    $cadena = <STDIN>;
}

1 answer:

Answer 0 (score: 0)

  

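And what does <star/><star index="2"/> mean in XML?

Per section 3.1 of the XML spec, grammar rule [44] describes "Tags for Empty Elements": such an element may have attributes, but it has no content (in other words, no descendants and no text). So <star/> is an empty element named star, and <star index="2"/> is the same element carrying an index attribute.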
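As for what the tag is used for: in AIML-style files, which the category/pattern/template layout in the question resembles, `<star/>` conventionally stands for whatever the first `*` in the pattern matched, and `<star index="2"/>` for the second. The sketch below illustrates that convention on plain strings, without parsing any XML. It is only a minimal illustration under that assumption; `pattern_to_regex` and `fill_stars` are hypothetical helper names, not part of any library.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an AIML-style pattern such as "you know *"
# into an anchored, case-insensitive regex with one capture per "*".
sub pattern_to_regex {
    my ($pattern) = @_;
    # Escape the literal pieces so only "*" acts as a wildcard.
    my $re = join '(.+?)', map { quotemeta($_) } split /\*/, $pattern, -1;
    return qr/^\s*$re\s*$/i;
}

# Hypothetical helper: replace <star/> and <star index="N"/> in a template
# string with the N-th wildcard capture (N defaults to 1).
sub fill_stars {
    my ( $template, @captures ) = @_;
    $template =~ s{<star(?:\s+index="(\d+)")?\s*/>}
                  { $captures[ ( $1 // 1 ) - 1 ] // '' }gex;
    return $template;
}

my $input = "you know michael jordan";
if ( my @captures = $input =~ pattern_to_regex("you know *") ) {
    print fill_stars( "who is <star/>?", @captures ), "\n";
}
```

Run as is, this should print "who is michael jordan?". Escaping the literal parts with `quotemeta` and anchoring the regex are design choices of this sketch; they keep characters such as `?` or `.` in a pattern from being read as regex syntax.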

Update

After reading more of the OP's comments and the latest updates to the question, here is a possible solution:

test.pl

#!/usr/bin/env perl
package Bot::Find::Answer;
use strict;
use warnings;
use XML::LibXML;
use Data::Dumper;
use List::Util qw/first/;

#### Constructor
#### Get path to XML with question/answer data.
#### Calls init to process data.
#### Returns new instance of object Bot::Find::Answer
sub new {
    my ($class,$xml_path) = @_;
    my $obj = bless {
        #### Path on disk to XML
        xml_path => $xml_path,
        #### Knowledge base
        kb       => [],
    }, $class;
    $obj->init();
    return $obj;
};

#### Parse XML
#### Get stars in question and replace them with regex capture groups
#### Get all answers for each question and store them.
#### Store everything in $self->{kb}
sub init {
    my ($self) = @_;

    my $kb = $self->{kb};

    my $xml = XML::LibXML->load_xml(
        location => $self->{xml_path}
    );

    for my $cat ($xml->findnodes('//category')) {
        my $question_pattern = ($cat->findnodes('pattern'))[0]->textContent;
        $question_pattern =~ s/\*/(.*)/g;
        my @answers = 
        map { $_->textContent }
        $cat->findnodes('template/random/li');

        push @$kb, {
            p => $question_pattern,
            a => \@answers
        };
    };

};


#### Get first category for which the question matches the associated pattern
#### Pick a random answer
#### Fill random answer with captures from pattern.
#### Return answer
sub compute_answer {
    my ($self,$q) = @_;
    my $kb = $self->{kb};
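    my $cat_found = first { $q =~ /$_->{p}/ } @$kb;
    #### No pattern matched: return nothing instead of dereferencing undef
    return unless defined $cat_found;
    my $idx = int(rand(@{ $cat_found->{a}}));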
    my $picked_answer = $cat_found->{a}->[$idx];
    my (@captures) = $q =~ $cat_found->{p};
    for my $i (0..(-1+@captures)) {
        my $j = $i + 1;
        my $capture_val = $captures[$i];
        $picked_answer =~ s/\[capture$j\]/$capture_val/g;
    };

    return $picked_answer;
}

package main;

my $o = Bot::Find::Answer->new('sample.xml');
print $o->compute_answer("you know michael jordan"), "\n";

sample.xml

<?xml version="1.0" encoding="iso-8859-1"?>
<data>
    <category> 
        <pattern>you know *</pattern>
        <template> 
            <random> 
                <li>No, who is [capture1]?</li>
                <li>who is [capture1]?</li>
                <li>i don't know.</li>
            </random>
        </template>
    </category>
    <category> 
        <pattern>name a country from south america</pattern>
        <template> 
            <random> 
                <li>ecuador</li>
                <li>uruguay</li>
                <li>chile</li>
                <li>panama</li>
                <li>brazil</li>
            </random>
        </template>
    </category>
</data>
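
If the templates keep `<star/>` elements as in the question, rather than the `[captureN]` markers used in sample.xml, note that `textContent` returns only the text and drops empty elements, so `who is <star/>?` comes back as `who is ?`. One possible way around that is to walk the child nodes of each `<li>` and substitute the captures while building the reply. The following is a sketch under that assumption, not a tested replacement for the code above; `render_li` is a hypothetical helper name, and XML::LibXML is assumed to be installed.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: build the reply text from one <li> node, replacing
# each <star/> child element with the matching wildcard capture.
sub render_li {
    my ( $li, @captures ) = @_;
    my $out = '';
    for my $node ( $li->childNodes ) {
        if ( $node->nodeName eq 'star' ) {
            # index="N" picks the N-th capture; a bare <star/> means the first
            my $n = $node->getAttribute('index') || 1;
            $out .= $captures[ $n - 1 ] // '';
        }
        else {
            $out .= $node->textContent;    # plain text around the stars
        }
    }
    return $out;
}

my $doc = XML::LibXML->load_xml( string => '<li>who is <star/>?</li>' );
print render_li( $doc->documentElement, 'michael jordan' ), "\n";
```

With this approach `init` would store the `<li>` nodes themselves instead of their `textContent`, and `compute_answer` would pass the captures from the pattern match to `render_li`.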