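I downloaded GCIDE (the GNU project's publication of CIDE, the Collaborative International Dictionary of English) from a website.
The package contains a number of XML files. I am running PHP and Apache on a Windows PC. How can I use PHP to search these XML files for words and their definitions?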
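Answer 0 (score: 7)
Your project piqued my interest, and I thought I might find it useful myself, so I did some research and found the code on this page. I ran the PHP, and I now have a fully functional dictionary in my database!
Here is everything I did to get it up and running (I unzipped the XML files into a folder named XML inside the folder that contains these files).
SQL for the table - gcide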
CREATE TABLE `gcide` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`word` varchar(255) DEFAULT NULL,
`definition` text,
`pos` varchar(50) DEFAULT NULL,
`fld` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `word` (`word`)
) ENGINE=MyISAM
PHP for gcide XML Import - import_gcide_xml.php
<?php
// The mysql_* functions were removed in PHP 7, so this import uses PDO with a prepared statement.
$db = new PDO('mysql:host=localhost;dbname=fiddle', 'root', '');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('TRUNCATE TABLE gcide');
$insert = $db->prepare('INSERT INTO gcide (word, definition, pos, fld) VALUES (?, ?, ?, ?)');

// first <$tag>...</$tag> in $block with tags stripped, hyphens turned into spaces,
// and everything except letters, digits and whitespace removed; '' if the tag is absent
function field($tag, $block) {
    if (!preg_match("/<$tag>(.*?)<\/$tag>/", $block, $match)) {
        return '';
    }
    $text = str_replace('-', ' ', strip_tags($match[1]));
    return preg_replace("/[^a-zA-Z0-9\s]/", "", $text);
}

// xml/gcide_a.xml ... xml/gcide_z.xml
foreach (range('a', 'z') as $letter) {
    // original file contents
    $original_file = @file_get_contents("xml/gcide_$letter.xml");
    // if file_get_contents fails to open the file, skip it
    if (!$original_file) {
        continue;
    }
    // find words in original file contents
    preg_match_all("/<hw>(.*?)<\/hw>(.*?)<def>(.*?)<\/def>/", $original_file, $results);
    // traverse blocks array
    foreach ($results[0] as $block) {
        $word = field('hw', $block);
        $definition = field('def', $block);
        $pos = field('pos', $block);
        $fld = field('fld', $block);
        $insert->execute(array($word, $definition, $pos, $fld));
        echo $word . " " . $definition . "\n";
    }
}
echo 'Done!';
?>
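The extraction step of the import can be tried on its own, without a database. The following is only a minimal sketch: the two sample lines are made up in the style of the GCIDE markup (they are not copied from the real files), and `gcide_field()` is a hypothetical helper name.

```php
<?php
// Hypothetical sample in the style of the GCIDE markup; not taken from the real files.
$sample = '<p><hw>Ab"a*cus</hw> <pos>n.</pos> <fld>(Arch.)</fld> <def>A calculating table or frame.</def></p>' . "\n"
        . '<p><hw>Zeal</hw> <def>Passionate ardor in the pursuit of anything.</def></p>';

// Hypothetical helper: first <$tag>...</$tag> in $block, reduced to letters, digits and whitespace.
function gcide_field($tag, $block) {
    if (!preg_match('/<' . $tag . '>(.*?)<\/' . $tag . '>/', $block, $m)) {
        return ''; // the tag is missing from this entry
    }
    $text = str_replace('-', ' ', strip_tags($m[1]));
    return preg_replace('/[^a-zA-Z0-9\s]/', '', $text);
}

// Same block pattern as the import script: a headword followed by a definition on one line.
preg_match_all('/<hw>(.*?)<\/hw>(.*?)<def>(.*?)<\/def>/', $sample, $results);

$entries = array();
foreach ($results[0] as $block) {
    $entries[] = array(
        'word'       => gcide_field('hw', $block),
        'definition' => gcide_field('def', $block),
        'pos'        => gcide_field('pos', $block),
        'fld'        => gcide_field('fld', $block),
    );
}
print_r($entries);
```

Note that the cleanup removes all punctuation, so the pronunciation marks in the headword disappear, and an entry without a `<pos>` or `<fld>` tag simply gets an empty string for that column.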
CSS for the search page - gcide.css
body{ font-family:Arial, Helvetica, sans-serif; }
#search_box { padding:4px; border:solid 1px #666666; margin-bottom:15px; width:300px; height:30px; font-size:18px;-moz-border-radius: 6px;-webkit-border-radius: 6px; }
#search_results { display:none;}
.word { font-weight:bold; }
.found { font-weight: bold; }
dl { font-family:serif;}
dt { font-weight:bold;}
dd { font-weight:normal;}
.pos { font-weight: normal;}
.fld { margin-right:10px;}
HTML for the search page - index.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>PHP, jQuery search of GCIDE</title>
<link href="gcide.css" rel="stylesheet" type="text/css"/>
<link href="http://ajax.googleapis.com/ajax/libs/jqueryui/1.8/themes/ui-lightness/jquery-ui.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
<script src="http://ajax.googleapis.com/ajax/libs/jqueryui/1.8/jquery-ui.min.js"></script>
<script type="text/javascript">
$(function() {
$("#search_box").keyup(function() {
// getting the value that user typed
var searchString = $("#search_box").val();
// forming the request data; passing an object lets jQuery URL-encode the value
var data = { search: searchString };
// if searchString is not empty
if(searchString) {
// ajax call
$.ajax({
type: "POST",
url: "gcide_search.php",
data: data,
beforeSend: function(html) { // this happens before actual call
$("#results").html('');
$("#search_results").show();
$(".word").text(searchString); // .text() so the typed string is not parsed as HTML
},
success: function(html){ // this happens after we get results
$("#results").show();
$("#results").append(html);
}
});
}
return false;
});
});
</script>
</head>
<body>
<div class="ui-widget-content" style="padding:10px;">
<input id="search_box" class='search_box' type="text" />
<div id="search_results">Search results for <span class="word"></span></div>
<dl id="results"></dl>
</div>
</body>
</html>
PHP for the jQuery search - gcide_search.php
<?php
if (isset($_POST['search'])) {
$db = new PDO("mysql:host=localhost;dbname=fiddle", "root", "");
// never trust what user wrote! Bind user input as a parameter instead of concatenating it into the SQL
// (mysql_real_escape_string() needs a mysql_* connection, which this script never opens)
$word = $_POST['search'];
$result = $db->prepare("SELECT * FROM gcide WHERE word LIKE ? ORDER BY word LIMIT 10");
$result->execute(array($word . '%'));
$end_result = '';
if ($result) {
while ( $r = $result->fetch(PDO::FETCH_ASSOC) ) {
$end_result .= '<dt>' . $r['word'];
if($r['pos']) $end_result .= ', <span class="pos">'.$r['pos'].'</span>';
$end_result .= '</dt>';
$end_result .= '<dd>';
if($r['fld']) $end_result .= '<span class="fld">('.$r['fld'].')</span>';
$end_result .= $r['definition'];
$end_result .= '</dd>';
}
}
if(!$end_result) {
$end_result = '<dt><div class="ui-state-highlight ui-corner-all" style="margin-top: 20px; padding: 0 .7em;">
<p><span class="ui-icon ui-icon-info" style="float: left; margin-right: .3em;"></span>
No results found.</p>
</div></dt>';
}
echo $end_result;
}
?>
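The search script prints the database values straight into the page. That is harmless while the table only holds the stripped-down text produced by the import above, but if the table is ever filled some other way it may be safer to escape each value first. A minimal sketch of the same `<dt>`/`<dd>` output with escaping follows; `render_entries()` is a hypothetical helper, and the row is made up to stand in for what `$result->fetch(PDO::FETCH_ASSOC)` returns.

```php
<?php
// Hypothetical helper: build the <dt>/<dd> markup from result rows, escaping every value.
function render_entries(array $rows) {
    $html = '';
    foreach ($rows as $r) {
        $html .= '<dt>' . htmlspecialchars($r['word'], ENT_QUOTES, 'UTF-8');
        if ($r['pos']) {
            $html .= ', <span class="pos">' . htmlspecialchars($r['pos'], ENT_QUOTES, 'UTF-8') . '</span>';
        }
        $html .= '</dt><dd>';
        if ($r['fld']) {
            $html .= '<span class="fld">(' . htmlspecialchars($r['fld'], ENT_QUOTES, 'UTF-8') . ')</span>';
        }
        $html .= htmlspecialchars($r['definition'], ENT_QUOTES, 'UTF-8') . '</dd>';
    }
    return $html;
}

// Made-up row; the <b> tag shows what escaping does to markup stored in the table.
$rows = array(
    array('word' => 'abacus', 'pos' => 'n', 'fld' => 'Arch', 'definition' => 'A <b>calculating</b> table'),
);
$html = render_entries($rows);
echo $html;
```

In gcide_search.php the loop body could be replaced by collecting the fetched rows into an array and passing it to a function like this one.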
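Answer 1 (score: 1)
I happened to stumble upon this PHP and AJAX example, which may point you in the right direction. With this much data, though, you might consider importing it into a database and using its search features. That is what databases are designed for, and performance could be an issue when searching through that much plain text in XML files. See this answer for XML import. I also found this answer about importing GCIDE XML.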
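For comparison, searching a file directly without a database might look roughly like the sketch below. It assumes each entry keeps its `<hw>` and `<def>` tags on one line, `search_gcide_file()` is a hypothetical helper, and the sample file is made up and written to a temporary location only for the demonstration. Every search rereads the file from the start, which is the performance concern mentioned above.

```php
<?php
// Hypothetical helper: scan one file line by line and collect entries whose headword starts with $term.
function search_gcide_file($path, $term, $limit = 10) {
    $matches = array();
    if ($term === '') {
        return $matches; // nothing to search for
    }
    $handle = @fopen($path, 'r');
    if (!$handle) {
        return $matches; // file missing or unreadable
    }
    $pattern = '/<hw>(.*?)<\/hw>.*?<def>(.*?)<\/def>/';
    while (count($matches) < $limit && ($line = fgets($handle)) !== false) {
        if (!preg_match($pattern, $line, $m)) {
            continue;
        }
        // drop pronunciation marks and other punctuation from the headword
        $word = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags($m[1]));
        if (stripos($word, $term) === 0) {
            $matches[] = array('word' => $word, 'definition' => trim(strip_tags($m[2])));
        }
    }
    fclose($handle);
    return $matches;
}

// Made-up sample standing in for a file such as xml/gcide_a.xml.
$path = tempnam(sys_get_temp_dir(), 'gcide');
file_put_contents($path,
    '<p><hw>Ab"a*cus</hw> <pos>n.</pos> <def>A calculating table or frame.</def></p>' . "\n" .
    '<p><hw>Zeal</hw> <def>Passionate ardor in the pursuit of anything.</def></p>' . "\n");

$found = search_gcide_file($path, 'ab');
print_r($found);
unlink($path);
```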