Need help getting a string from a file in PHP

Asked: 2016-12-21 16:30:28

Tags: php string

I have a txt file full of HTML code. I'm trying to create a PHP page that searches that code and grabs the "username" for me.

Here is a small sample of the page:

  <div class="search-result-details">
    <div class="employee-name">This is my name!</div>
    <ul class="employee-details">
      <li><span class="label">Login</span>username</li>
      <li><span class="label">Employee ID</span>####</li>
      <li><span class="label">Barcode ID</span>###</li>
      <li><span class="label">Status</span>Active</li>
    </ul>
    <ul class="org-details">
      <li><span class="label">Location</span>SAT1 (755)</li>
      <li><span class="label">Shift</span>AAAA</li>
      <li><span class="label">Department</span>1231</li>
      <li><span class="label">Area</span>26</li>
      <li><span class="label">Crew</span>0</li>
      <li><span class="label">Supervisor</span>manager name</li>
    </ul>
  </div>
</a></li>
                    </ol>
                </div>

I need to get the username from the following line:

<li><span class="label">Login</span>username</li>

This is what I have so far. It at least grabs the lines I need:

<?php
$file = 'log.txt';
$searchfor = '<ul class="employee-details">
      <li><span class="label">Login</span>';

// the following line prevents the browser from parsing this as HTML.
header('Content-Type: text/plain');

// get the file contents, assuming the file to be readable (and exist)
$contents = file_get_contents($file);
// escape special characters in the query
$pattern = preg_quote($searchfor, '/');
// finalise the regular expression, matching the whole line
$pattern = "/^.*$pattern.*\$/m";
// search, and store all matching occurrences in $matches
if(preg_match_all($pattern, $contents, $matches)){
   echo "Found matches:\n";
   echo implode("\n", $matches[0]);
}
else{
   echo "No matches found";
}

?>

Current output:

<ul class="employee-details">
  <li><span class="label">Login</span>username</li>

Any help is greatly appreciated. Thanks.

2 answers:

Answer 0: (score: 0)

It's a bit hacky, but here is one way you could do it.

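// $file is the path from the question, e.g. 'log.txt'
$contents = file_get_contents($file);

// Group 2 captures the text between "Login</span>" and "</li>".
// preg_match returns 1 only when the pattern matched.
if (preg_match("/(Login<\/span>)([a-zA-Z0-9]*)(<\/li>)/", $contents, $matches) === 1) {
   $username = trim($matches[2]);
}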

Of course, the middle capture group needs to allow whatever characters a username may contain.
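As a minimal sketch of that point, the character class could be widened to `[^<]*`, which accepts any character up to the next tag. The sample string and the username `j.doe-2` below are invented for illustration and do not come from the original file.

```php
<?php
// Hypothetical sample line; the username value is invented for illustration.
$contents = '<li><span class="label">Login</span>j.doe-2</li>';

// [^<]* matches everything up to the next "<", so dots, dashes,
// underscores and similar characters in the username are kept.
$username = null;
if (preg_match('/Login<\/span>([^<]*)<\/li>/', $contents, $matches)) {
    $username = trim($matches[1]);
}

echo $username, "\n"; // prints "j.doe-2"
```

The trade-off is that `[^<]*` also accepts whitespace and punctuation, so the `trim()` call matters more here than with the stricter `[a-zA-Z0-9]*` class.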

Also note that this will break if the HTML structure changes.

Finally, if the file can contain more than one username, you can use preg_match_all instead, and $matches[2] will then be an array of usernames.
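A short sketch of the preg_match_all variant, assuming a file with several "Login" entries. The two-record sample and the usernames `alice` and `bob` are made up for illustration; in practice `$contents` would come from file_get_contents.

```php
<?php
// Hypothetical sample with two records; the usernames are invented.
$contents = <<<'HTML'
<li><span class="label">Login</span>alice</li>
<li><span class="label">Employee ID</span>1001</li>
<li><span class="label">Login</span>bob</li>
HTML;

// Same pattern as above; group 2 holds the username.
$pattern = '/(Login<\/span>)([a-zA-Z0-9]*)(<\/li>)/';

$usernames = [];
if (preg_match_all($pattern, $contents, $matches)) {
    // With the default PREG_PATTERN_ORDER flag, $matches[2] collects
    // the second capture group of every match.
    $usernames = $matches[2];
}

print_r($usernames); // an array holding "alice" and "bob"
```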

Answer 1: (score: 0)

Use DOMDocument:

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML('<div class="search-result-details">
    <div class="employee-name">This is my name!</div>
    <ul class="employee-details">
      <li><span class="label">Login</span>username</li>
      <li><span class="label">Employee ID</span>####</li>
      <li><span class="label">Barcode ID</span>###</li>
      <li><span class="label">Status</span>Active</li>
    </ul>
    <ul class="org-details">
      <li><span class="label">Location</span>SAT1 (755)</li>
      <li><span class="label">Shift</span>AAAA</li>
      <li><span class="label">Department</span>1231</li>
      <li><span class="label">Area</span>26</li>
      <li><span class="label">Crew</span>0</li>
      <li><span class="label">Supervisor</span>manager name</li>
    </ul>
  </div>
</a></li>
                    </ol>
                </div>');
libxml_use_internal_errors(false);

$html = new DOMXPath($doc);
$result = '';
foreach ($html->query("//*[@class='label']") as $value) {
    if ($value->textContent == 'Login') {
        $result = $value->nextSibling->textContent;
        break;
    }
}

echo $result;

Output:

username

libxml_use_internal_errors is there to suppress the validation errors listed in this answer.
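A variation on the same idea, offered as a sketch rather than as the answerer's method: the XPath expression itself can select the text node that follows the "Login" label, so no loop over every label is needed. The shortened fragment and the variable names `$htmlSource` and `$usernames` are assumptions for illustration; in practice the markup would come from `file_get_contents('log.txt')` as in the question.

```php
<?php
// Hypothetical fragment; in practice this would be read from the file.
$htmlSource = <<<'HTML'
<ul class="employee-details">
  <li><span class="label">Login</span>username</li>
  <li><span class="label">Employee ID</span>####</li>
</ul>
HTML;

// Collect parse warnings internally instead of printing them,
// then restore whatever setting was active before.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($htmlSource);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($doc);
// Select the first text node that follows a span.label whose text is "Login".
$nodes = $xpath->query(
    "//span[@class='label'][normalize-space()='Login']/following-sibling::text()[1]"
);

$usernames = [];
foreach ($nodes as $node) {
    $usernames[] = trim($node->nodeValue);
}

print_r($usernames); // an array holding "username"
```

Because the query returns every matching node, the same code also handles a file that contains several employee records.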