Parsing a text file with a regular expression to get strings from multiple lines

Asked: 2017-05-21 13:27:50

Tags: php regex string

I have a file, dumpsys.txt, that contains:

...
...
Receiver Resolver Table:
  Full MIME Types:
      application/vnd.wap.mms-message:
        53929df com.android.messaging/.receiver.MmsWapPushReceiver
        592f62c com.android.messaging/.receiver.AbortMmsWapPushReceiver
        89c68f5 com.android.messaging/.receiver.MmsWapPushDeliverReceiver

  Base MIME Types:
      application:
        53929df com.android.messaging/.receiver.MmsWapPushReceiver
        592f62c com.android.messaging/.receiver.AbortMmsWapPushReceiver
        89c68f5 com.android.messaging/.receiver.MmsWapPushDeliverReceiver

  Schemes:
      content:
        511868a com.android.messaging/.receiver.SendStatusReceiver (3 filters)

  Non-Data Actions:
      android.intent.action.ACTION_DEFAULT_SMS_SUBSCRIPTION_CHANGED:
        83084fb com.android.messaging/.receiver.DefaultSmsSubscriptionChangeReceiver
      com.android.Bugle.intent.action.ACTION_NOTIFY_CONVERSATIONS_CHANGED:
        a4f4918 com.android.messaging/.widget.BugleWidgetProvider
      com.android.Bugle.intent.action.ACTION_NOTIFY_MESSAGES_CHANGED:
        c979f71 com.android.messaging/.widget.WidgetConversationProvider
      android.intent.action.DEVICE_STORAGE_LOW:
        1898156 com.android.messaging/.receiver.StorageStatusReceiver
...
...

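As shown above, the section named Receiver Resolver Table contains several subsections, one of which is Non-Data Actions. Other sections may also contain a subsection named Non-Data Actions.

I want to extract substrings from the Non-Data Actions subsection of Receiver Resolver Table, preferably with a regular expression in PHP. In my case, I want the substring after /. on every line under that subsection.

Example output: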

Non-Data Actions

1 answer:

Answer 0 (score: 1)

I would split this into two steps:

1 - Match the Non-Data Actions: block of Receiver Resolver Table: with:

/Receiver Resolver Table:.*?Non-Data Actions:(.*?)^[\r\n]/sm

2 - Match the widgets/receivers with:

%/\.(.*?)$%sm

Putting both together:

$text = <<< EOF
Receiver Resolver Table:
  Full MIME Types:
      application/vnd.wap.mms-message:
        53929df com.android.messaging/.receiver.MmsWapPushReceiver
        592f62c com.android.messaging/.receiver.AbortMmsWapPushReceiver
        89c68f5 com.android.messaging/.receiver.MmsWapPushDeliverReceiver

  Base MIME Types:
      application:
        53929df com.android.messaging/.receiver.MmsWapPushReceiver
        592f62c com.android.messaging/.receiver.AbortMmsWapPushReceiver
        89c68f5 com.android.messaging/.receiver.MmsWapPushDeliverReceiver

  Schemes:
      content:
        511868a com.android.messaging/.receiver.SendStatusReceiver (3 filters)

  Non-Data Actions:
      android.intent.action.ACTION_DEFAULT_SMS_SUBSCRIPTION_CHANGED:
        83084fb com.android.messaging/.receiver.DefaultSmsSubscriptionChangeReceiver
      com.android.Bugle.intent.action.ACTION_NOTIFY_CONVERSATIONS_CHANGED:
        a4f4918 com.android.messaging/.widget.BugleWidgetProvider
      com.android.Bugle.intent.action.ACTION_NOTIFY_MESSAGES_CHANGED:
        c979f71 com.android.messaging/.widget.WidgetConversationProvider
      android.intent.action.DEVICE_STORAGE_LOW:
        1898156 com.android.messaging/.receiver.StorageStatusReceiver

  something:
      content:
        511868a com.android.messaging/.receiver.SendStatusReceiver (3 filters)
EOF;

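// Step 1: capture the body of "Non-Data Actions:" inside "Receiver Resolver Table:".
// The capture ends at the first empty line after the subsection header.
preg_match_all('/Receiver Resolver Table:.*?Non-Data Actions:(.*?)^[\r\n]/sm', $text, $m, PREG_PATTERN_ORDER);
$m = $m[1][0];

// Step 2: capture everything after "/." up to the end of each line.
// $m is reused: it is the subject string here and is then overwritten with the matches.
preg_match_all('%/\.(.*?)$%sm', $m, $m, PREG_PATTERN_ORDER);
$m = $m[1];

print_r($m);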

Output:

Array
(
    [0] => receiver.DefaultSmsSubscriptionChangeReceiver
    [1] => widget.BugleWidgetProvider
    [2] => widget.WidgetConversationProvider
    [3] => receiver.StorageStatusReceiver
)
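
To apply the same two steps to the contents of dumpsys.txt instead of a heredoc, they could be wrapped in a small helper. This is only a sketch: the function name extractNonDataActionComponents is hypothetical, the inline sample (including the Activity Resolver Table entry) is made up for illustration, and it assumes the whole file fits comfortably in memory.

```php
<?php
// Hypothetical helper (not part of the original answer): applies the two
// regexes from the answer to a dumpsys text and returns the component names.
function extractNonDataActionComponents(string $text): array
{
    // Step 1: isolate the "Non-Data Actions:" block of "Receiver Resolver Table:".
    // The block is assumed to end at the first empty line.
    if (!preg_match('/Receiver Resolver Table:.*?Non-Data Actions:(.*?)^[\r\n]/sm', $text, $section)) {
        return []; // section not found
    }

    // Step 2: collect whatever follows "/." on each line of that block.
    preg_match_all('%/\.(.*?)$%sm', $section[1], $matches);

    return $matches[1];
}

// Made-up sample so the sketch runs without dumpsys.txt. The first section
// also has a "Non-Data Actions:" subsection, which should be ignored.
$sample = "Activity Resolver Table:\n"
    . "  Non-Data Actions:\n"
    . "      android.intent.action.MAIN:\n"
    . "        0000000 com.example.app/.ExampleActivity\n"
    . "\n"
    . "Receiver Resolver Table:\n"
    . "  Schemes:\n"
    . "      content:\n"
    . "        511868a com.android.messaging/.receiver.SendStatusReceiver (3 filters)\n"
    . "\n"
    . "  Non-Data Actions:\n"
    . "      android.intent.action.DEVICE_STORAGE_LOW:\n"
    . "        1898156 com.android.messaging/.receiver.StorageStatusReceiver\n"
    . "\n";

$result = extractNonDataActionComponents($sample);
print_r($result);
```

With a real file, the call would look like extractNonDataActionComponents(file_get_contents('dumpsys.txt')). Note that if Receiver Resolver Table had no Non-Data Actions subsection at all, the lazy .*? could run on into a later section and pick up that section's subsection instead.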

Regex explanation:

1 - Receiver Resolver Table:.*?Non-Data Actions:(.*?)^[\r\n]

Match the character string “Receiver Resolver Table:” literally (case sensitive) «Receiver Resolver Table:»
Match any single character «.*?»
   Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character string “Non-Data Actions:” literally (case sensitive) «Non-Data Actions:»
Match the regex below and capture its match into backreference number 1 «(.*?)»
   Match any single character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the beginning of a line (at beginning of the string or after a line break character) (line feed) «^»
Match a single character present in the list below «[\r\n]»
   The carriage return character «\r»
   The line feed character «\n»
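
One consequence of ending the capture with ^[\r\n] is that the pattern needs an empty line after the subsection. The sketch below shows what happens when Non-Data Actions: is the last block in the input and no empty line follows. The variant with \z is an assumption added here for illustration, not part of the original answer.

```php
<?php
// Sample where "Non-Data Actions:" is the last block and no empty line follows.
$text = "Receiver Resolver Table:\n"
    . "  Non-Data Actions:\n"
    . "      android.intent.action.DEVICE_STORAGE_LOW:\n"
    . "        1898156 com.android.messaging/.receiver.StorageStatusReceiver\n";

// Pattern from the answer: needs an empty line to close the capture.
$original = '/Receiver Resolver Table:.*?Non-Data Actions:(.*?)^[\r\n]/sm';

// Variant (an assumption, not from the answer): also stop at the end of the input.
$variant = '/Receiver Resolver Table:.*?Non-Data Actions:(.*?)(?:^[\r\n]|\z)/sm';

$originalFound = preg_match($original, $text, $a);
$variantFound = preg_match($variant, $text, $b);

var_dump($originalFound); // int(0): no empty line, so no match
var_dump($variantFound);  // int(1): \z closes the capture at end of input
echo $b[1];
```

Whether the extra alternative is needed depends on how the dump is produced. If the section of interest is always followed by an empty line, the pattern from the answer is enough.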

2 - /\.(.*?)$

Match the character “/” literally «/»
Match the character “.” literally «\.»
Match the regex below and capture its match into backreference number 1 «(.*?)»
   Match any single character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the end of a line (at the end of the string or before a line break character) (line feed) «$»
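
Because the second pattern captures up to the end of the line, any trailing annotation is captured as well. None of the Non-Data Actions lines in the sample have one, but the Schemes line does, which makes it a convenient test case. The \S+ variant below is an assumption for illustration and presumes component names never contain whitespace.

```php
<?php
// A line with a trailing annotation, as it appears under "Schemes:" in the sample.
$line = "        511868a com.android.messaging/.receiver.SendStatusReceiver (3 filters)\n";

// Pattern from the answer: captures up to the end of the line.
preg_match('%/\.(.*?)$%sm', $line, $toEol);

// Variant (an assumption): stop at the first whitespace character instead.
preg_match('%/\.(\S+)%', $line, $toSpace);

echo $toEol[1], "\n";   // receiver.SendStatusReceiver (3 filters)
echo $toSpace[1], "\n"; // receiver.SendStatusReceiver
```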