我正在改写这个问题,希望它能更好地表达出来。我正在使用AJAX将字符串传递给此PHP脚本。 PHP脚本采用一串单词,将它们分成一个数组,然后尝试分析每个单词。我遇到的问题是,如果有一个单词无法分析,它就不会响应AJAX请求。如果一个单词无法分解,我希望它返回单词本身,但是它似乎跳过了这一行。
我也不知道如何将这个脚本制作成可以在自己的计算机上轻松测试的脚本,因为它有很多活动部件。我已经提取了相关的HTML,JS和PHP代码。让我知道是否还有其他帮助,或者无论如何我都可以澄清一下。
我是第一次使用PHP的Python程序员,但是调试起来很困难。
<?php
class MorphologyAnalyzer
{
protected $affixes;
public function customSort($a,$b){
// sort compares based on string length
if(mb_strlen($b, 'UTF-8') == mb_strlen($a, 'UTF-8')){
return 0;
}
return (mb_strlen($a, 'UTF-8') < mb_strlen($b, 'UTF-8')) ? 1 : -1;
}
public function __construct()
{
$this->affixes = array();
// draw affixes from the Phrasicon
$xmlDoc = new DOMDocument();
$xmlDoc->load("../../morphemes/morphemes.xml");
$xpath = new DOMXPath($xmlDoc);
$result = $xpath->query("//source");
foreach($result as $entry) {
// building the list $affixes. Basically, take all possible affixes and exclude duplicates
$m = $entry->nodeValue;
if (!in_array($m, $this->affixes)) {
array_push($this->affixes, $m);
}
}
// sort affixes by length? why the calls to both sort and usort?
sort($this->affixes);
usort($this->affixes, array($this, 'customSort'));
}
public function beginsWith($haystack, $needle) {
if(strlen($haystack) <= strlen($needle)){
return FALSE;
} else{
return $needle === "" || strpos($haystack, $needle) !== FALSE;
}
}
public function endsWith($haystack, $needle) {
if(strlen($haystack) <= strlen($needle)){
return FALSE;
} else{
return $needle === "" || strpos($haystack, $needle, strlen($haystack) - strlen($needle)) !== FALSE;
}
}
public function analyze($word){ // to analyze each word: recursion! Yay!
if($word == ""){
return "";
} else{
for($a=0; $a<count($this->affixes); $a++) {
$affix = $this->affixes[$a];
// cut off the initial hyphen that is on the affixes
if(substr($affix, 0, 1) == "-"){
$affix = substr($affix, 1);
} elseif(substr($affix, -1, 1) == "-"){
$affix = substr($affix, 0, -1);
};
// if the whole word matches a given morpheme, just return it.
if($word == $affix){
return $word;
} elseif ($this->endsWith($word,$affix) !== FALSE) {
// we need to add a call here that recursively calls analyze on smaller and smaller strings
// transform the word from hayunam to hayu-nam, I think
return $this->analyze(preg_replace('/'.$affix.'$/', '', $word)) . " -" . $affix;
}
}
// return the word if no match was found
return $word;
}
}
}
?>
<?php
$txt = $_GET['text'];
// split input text into a list of words.
$words = explode(' ', $txt);
// this is the text that will be returned ultimately -- it will be a list of morphemes
$morphemes = "";
// create an analyzer
$analyzer = new MorphologyAnalyzer();
// go through the list of words...
for ($i=0; $i < sizeof($words); $i++){
// the current word
$word = $words[$i];
$morphemes = $morphemes . $analyzer->analyze($word) . " ";
}
// return the list of morphemes at the end.
echo rtrim($morphemes);
?>
这是相关的JS
function segment(line) {
// this code segments the source text in the "create story" section
var lineNumber = line.split("line")[1];
var lineSource = document.getElementById((line + "_source")).value;
var morphemes = document.getElementsByClassName((lineNumber + "morpheme"));
var morphemeCounter = morphemes.length;
// if source is blank, fill in all the morphemes with blanks
for (i = 1; i <= morphemeCounter; i++) {
document.getElementById(line + "_m" + i).value = "";
document.getElementById(line + "_g" + i).value = "";
}
if (window.XMLHttpRequest) {
// code for IE7+, Firefox, Chrome, Opera, Safari
xmlhttp = new XMLHttpRequest();
} else { // code for IE6, IE5
xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
xmlhttp.onreadystatechange = function() {
if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
response = xmlhttp.responseText.trim().split(" ");
console.log("response" + response)
for (i = 0; i < response.length; i++) {
// if there aren't any remaining morpheme spaces to segment the text...
if(document.getElementById(line + '_m' + (i + 1))==null){
// ... add one.
addMorpheme((line + "_m" + (i+1)));
}
document.getElementById(line + '_m' + (i + 1)).value = response[i];
}
suggest(response[i], ((line + '_g') + (i)));
}
}
console.log(lineSource)
xmlhttp.open("GET", "segmentation.php?text="+lineSource, true);
xmlhttp.send();
}
这是HTML
<h3 class="text-center">Story Text</h3>
<div class="container-fluid bodyContainer" id="lineContainer">
<fieldset id="line1">
<h4>Line 1</h4>
<div class="row">
<div class="col">
Source<br/>
<input class="form-control" type="text" id="line1_source" name="line[0][source]">
<a id="line1_segmentation" class="btn btn-success" onclick="segment('line1')">Segment <i class="fas fa-bolt"></i></a>
</div>
</div>
<div class="row">
<div class="col">
Translation<br/>
<input class="form-control" id="line1_translation" type="text" name="line[0][translation]">
</div>
</div>
<div class="row">
<div class="col">
Hints/Notes (for comparing multiple versions of a story)<br/>
<input class="form-control" type="text" name="line[0][note]">
</div>
</div>
<div class="row">
<div class="col">
Audio Filename<br/>
<input class="form-control" type="text" name="line[0][lineaudio]">
</div>
</div>
<h5>Morphemes</h5>
<div class="row">
<div class="col">
<input id="line1_m1" class="1morpheme" type="text" name="line[0][morpheme][0][m]" onkeyup="suggest(this.value, 'line1_g1')">
</div>
</div>
<div class="row">
<div class="col">
<input id="line1_g1" type="text" name="line[0][morpheme][0][g]" list="list_line1_g1" autocomplete="off">
</div>
</div>
<div class="row">
<div class="col">
<a class="btn btn-danger" id="delete_line1_m1" onclick="deleteMorpheme('line1_m1')"><i class="fas fa-trash-alt fa-lg"></i> Morpheme 1</a>
<a class="btn btn-info" id="addMorpheme_line1_m1" onclick="addMorpheme('line1_m2')"><i class="fas fa-plus"></i></a>
</div>
</div>
</fieldset>
</div>
<div class="row">
<div class="col">
<a class="btn btn-info" onclick="addLine()">+ Add Line</a>
</div>
</div>
<div class="row">
<div class="col">
<button class="btn btn-primary btn-lg" type="submit" value="Create">Create <i class="fas fa-rocket"></i></button>
</div>
</div>
</div>
<div id="datalistContainer">
<datalist id="list_line1_g1">
<option value=" "></option>
</datalist>
</div>
XML文件morphemes.xml包含如下条目:
<morpheme>
<source>hayu</source>
<gloss>dog</gloss>
</morpheme>
对于像hayunam xyz这样的字符串(第一个单词可以分解,而第二个单词不能分解),我希望它返回hayu -nam xyz