Question

I have been stuck on a problem for a few days now. I am trying to extract some changing data from a string that looks like this:

<docdata>
 <!-- News Identifier -->
        <doc-id id-string ="YBY15349" />

        <!-- Date of issue -->
        <date.issue norm ="2012-09-22 19:52" />
        <!-- Date of release -->
        <date.release norm ="2012-09-22 19:52" />
      </docdata>

All I need is the date, "2012-09-22 19:52". By the way, the XML is held in a string, so I can't use a normal XML parser. I have already loaded/read the file in order to change some character sets:

    $fname = $file;
    $fhandle = fopen($fname,"r");
    $content = fread($fhandle,filesize($fname));
    str_replace("<?xml version=\"1.0\" encoding=\"UTF-8\"?>", "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>", $content); 
etc..

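This works like a charm, but I can't get the extraction to work on the string. I tried preg_match_all, but I couldn't get it right. Is there a simple way to search for this value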

<date.issue norm ="2012-09-22 19:52" />

and get the date into a variable?

Thanks in advance, and sorry for my English.

Answer 1

A regular expression that matches the following:

<date.issue norm ="2012-09-22 19:52" />

would be:

/<date\.issue\s*norm\s*="([^"]*)"/

In code:

preg_match_all('/<date\.issue\s*norm\s*="([^"]*)"/', $content, $matches);
// $matches[1] contains all the dates
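As a minimal sketch of how that pattern behaves on the sample from the question, the snippet below wraps it in a small helper. The function name extractIssueDates and the $sample string are assumptions made for illustration only; they are not part of the original answer.

```php
<?php
// Hypothetical helper (name assumed for illustration): returns every
// "norm" value of a <date.issue ...> element found in $content.
function extractIssueDates(string $content): array
{
    // \s* tolerates optional whitespace between "issue", "norm" and "=".
    preg_match_all('/<date\.issue\s*norm\s*="([^"]*)"/', $content, $matches);

    // Capture group 1 holds the text between the quotes.
    return $matches[1];
}

// Sample input copied from the question; date.release must not match.
$sample = '<date.issue norm ="2012-09-22 19:52" />' . "\n"
        . '<date.release norm ="2012-09-22 19:52" />';

$dates = extractIssueDates($sample);
echo $dates[0] ?? 'no match', "\n";
```

Because the pattern names date.issue explicitly, the date.release line is skipped, and an input without any match should yield an empty array rather than an error.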

Answer 2

From the PHP documentation:

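file_get_contents() is the preferred way to read the contents of a file into a string. It will use memory mapping techniques if supported by your OS to enhance performance.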

So your code would become:

$content = file_get_contents($file);
$content = str_replace("<?xml version=\"1.0\" encoding=\"UTF-8\"?>", "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>", $content);
preg_match_all('/date\.issue norm ="([^"]+)" /', $content, $date);

By default, the parenthesized matches are stored in the array $date[1], so you can iterate over $date[1][0], $date[1][1], and so on.
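The iteration over $date[1] could look roughly like the sketch below. Turning each match into a DateTimeImmutable is an added assumption, not something the answer prescribes, and the $content string here is just the sample line from the question rather than a real file.

```php
<?php
// Sample line from the question, standing in for the file contents.
$content = '<date.issue norm ="2012-09-22 19:52" />';

// Same pattern as above; note it expects exactly this spacing.
preg_match_all('/date\.issue norm ="([^"]+)" /', $content, $date);

$parsed = [];
foreach ($date[1] as $raw) {
    // Assumption: the value always has the form "YYYY-MM-DD HH:MM".
    // The leading '!' resets unspecified fields (such as seconds) to zero.
    $dt = DateTimeImmutable::createFromFormat('!Y-m-d H:i', $raw);
    if ($dt !== false) {
        $parsed[] = $dt; // keep only values that parsed cleanly
    }
}
```

If the raw string is all that is needed, the loop body can simply use $raw and skip the parsing step.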

Answer 3

Instead of using

fopen($filename)

use

$filename = '/path/to/file.xml';
$filearray = file($filename); // pulls the whole file into an array of lines

$searchstr = 'date.issue';

foreach($filearray as $line) {
   if(stristr($line,$searchstr)) { // case-insensitive search for the tag name
      $linearray = explode('"',$line);
      // your date should be $linearray[1];
      echo $linearray[1]."\n";  // to test your output
      // rest of your code here
   }
}

This way you can search the whole file for your search string, and malformed XML is not a problem.

Get the data between quotes, etc.
