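Parse an HTML or plain-text file and return text separated by strings

Time: 2012-05-25 03:23:48

Tags: php javascript html regex preg-match-all

I need help parsing a text file that contains some HTML markup. I am looking for a solution (in PHP, JS, or both) that strips all of the markup and stores the output in separate variables.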

  Integration/QA  
<http://shopfloor/sfweb/secure/CancelOrders>


  Development  
<http://shopfloor/sfweb/secure/CancelOrders>


------------------------------------------------------------------------

*HEADER INFO*
    *View Object:* 6541997  *BPO:* 0020064484   *Ack Date:* 2012-05-25
    *Operation(s):* PS_Queue, PS_BoxAll, JPN_End

------------------------------------------------------------------------

*EXTERNAL ORDER NUMBER REFERENCE*
*SAP Sales Order Number*    *Customer P.O. Number*  *Legacy Order Number*
0310407774      89FC37763001

------------------------------------------------------------------------

*PRODUCTS FOR THIS WORK OBJECT/OPERATION(S)*
*PL*    *Product #*     *Qty*   *Options*   *Serial #*
LN  AE241A  1        

------------------------------------------------------------------------

*Station Info*
*Start Station:* JPN_End    *Location:* Done    *Station:*
*Birth Date/Time:* 2012-05-22 08:26:17 SGT  *Power Cord:*   *Voltage:*

------------------------------------------------------------------------

*MATERIAL LIST FOR THIS WORK OBJECT/OPERATION(S)*
*Part Number*   *Qty*   *Description*   *BB Type*   *Material
Location*   *Serial Number*
AE241-90001     1   XP Remote Support Service Leaflet   BOM     PACK     


Privacy Statement

Basically, I want to extract pieces of this text into PHP variables, so that it would return:

$viewobject = "6541997"
$BPO = "0020064484"
$ackdate = "2012-05-25"
$operations = "PS_Queue, PS_BoxAll, JPN_End"
$sapSO = "0310407774"
$legacyON = "89FC37763001"
$pl = "LN"
$product = "AE241A"
$qty = 1;
$startstn = "JPN_End"
$location = "Done"
$bdate = "2012-05-22 08:26:17 SGT"
$pn = "AE241-90001"
$qty = 1;
$description =" XP Remote Support Service Leaflet";

and so on. Is this possible?

1 answer:

Answer 0 (score: 1)

Use a regular expression:

preg_match_all('/\*(view object|bpo|ack date):\*\s+([0-9\-]+)/i', $text, $m);

// $m contains matches, try to print_r($m)

$viewobject = $m[2][0];  // 6541997
$bpo = $m[2][1];         // 0020064484
$ackdate = $m[2][2];     // 2012-05-25
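
The same idea can be extended to the other labelled fields and to the tabular rows. The following is only a minimal sketch, not a definitive implementation. It assumes that every label has the form `*Name:*`, that a value runs up to the next `*` or the end of the line, and that table columns are separated by a tab or by two or more spaces. The function names `extractFields` and `splitColumns` are made up for this illustration, and the sample input is a fragment of the report from the question.

```php
<?php
// Hypothetical helper: collect every "*Label:* value" pair into an array.
function extractFields(string $text): array
{
    // Drop leftover HTML tags first. Note that strip_tags() also removes
    // anything that merely looks like a tag, such as the <http://...> links.
    $text = strip_tags($text);

    // Label: text between "*" and ":*". Value: everything up to the next
    // "*" or the end of the line (it may be empty).
    preg_match_all(
        '/\*([^*:\r\n]+):\*[ \t]*([^*\r\n]*)/',
        $text,
        $matches,
        PREG_SET_ORDER
    );

    $fields = [];
    foreach ($matches as $m) {
        $fields[trim($m[1])] = trim($m[2]);
    }
    return $fields;
}

// Hypothetical helper: split one table row into its columns.
// Assumption: columns are separated by tabs or by runs of 2+ spaces,
// so single spaces inside a description are preserved.
function splitColumns(string $line): array
{
    return array_map('trim', preg_split('/\t+| {2,}/', trim($line)));
}

// Fragment of the report from the question.
$text = <<<'TXT'
*HEADER INFO*
    *View Object:* 6541997  *BPO:* 0020064484   *Ack Date:* 2012-05-25
    *Operation(s):* PS_Queue, PS_BoxAll, JPN_End

*Station Info*
*Start Station:* JPN_End    *Location:* Done    *Station:*
*Birth Date/Time:* 2012-05-22 08:26:17 SGT  *Power Cord:*   *Voltage:*
TXT;

$materialRow = 'AE241-90001     1   XP Remote Support Service Leaflet   BOM     PACK';

$fields = extractFields($text);

$viewobject = $fields['View Object'];      // 6541997
$bpo        = $fields['BPO'];              // 0020064484
$ackdate    = $fields['Ack Date'];         // 2012-05-25
$operations = $fields['Operation(s)'];     // PS_Queue, PS_BoxAll, JPN_End
$startstn   = $fields['Start Station'];    // JPN_End
$location   = $fields['Location'];         // Done
$bdate      = $fields['Birth Date/Time'];  // 2012-05-22 08:26:17 SGT

// First three columns of the material row; the remaining ones are ignored.
[$pn, $qty, $description] = splitColumns($materialRow);
// $pn = AE241-90001, $qty = 1, $description = XP Remote Support Service Leaflet
```

Keying the result by label avoids depending on the order in which the fields appear. Labels without a value, such as `*Station:*`, come back as empty strings. Empty table columns collapse under this splitting rule, so rows with missing cells (the SAP / P.O. / legacy order line, for example) would need the original tab characters to be mapped reliably.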