This text has 4 rows and 5 columns:
    Compliance: 7-Day RN Waiver Indicator            1     443   443   VARCHAR2
    Related Provider Number                          10    686   695   CHAR
    Services: Speech Pathology Off-Site Residents    1     834   834   VARCHAR2
    Staff Count: Food Service Worker - Contract      25    1022  1029  NUMBER
What regular expression extracts columns 1, 2, and 5? The result should look like this:
Compliance: 7-Day RN Waiver Indicator|1|VARCHAR2
Related Provider Number|10|CHAR
Services: Speech Pathology Off-Site Residents|1|VARCHAR2
Staff Count: Food Service Worker - Contract|25|NUMBER
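This is the regex I am working with: \s{4}([\w\s]*)
Demo: https://regex101.com/r/uQxRzA/1/
Update
The only assumption that can help is that no name in column 1 contains 2 or more consecutive spaces.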
Answer 0 (score: 0)
You need to normalize the lines first; after that you can split each line on runs of 2 or more spaces.
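// Sample input: columns are separated by runs of two or more spaces.
$string = '    Compliance: 7-Day RN Waiver Indicator            1     443   443   VARCHAR2
    Related Provider Number                          10    686   695   CHAR
    Services: Speech Pathology Off-Site Residents    1     834   834   VARCHAR2
    Staff Count: Food Service Worker - Contract      25    1022  1029  NUMBER';
// Split on any line break so the code does not depend on the platform's PHP_EOL.
$bits = preg_split('/\R/', $string);
foreach ($bits as $bit) {
    // Trim the leading indentation, then split on 2+ horizontal whitespace characters.
    print_r(preg_split('/\h{2,}/', trim($bit)));
}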
Or, in your case, change
print_r(preg_split('/\h{2,}/', trim($bit)));
to
$columns = preg_split('/\h{2,}/', trim($bit));
Then $columns[0] is column 1, $columns[1] is column 2, and $columns[4] is column 5.
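The split described above can be turned directly into the pipe-delimited output the question asks for. The following is a minimal sketch, assuming (as the question's update states) that columns are separated by at least two spaces; the helper name `extractColumns` and the shortened two-row sample are hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: returns one "col1|col2|col5" string per data line.
// Assumption: columns are separated by two or more horizontal whitespace characters.
function extractColumns(string $text): array
{
    $rows = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $columns = preg_split('/\h{2,}/', $line);
        if (count($columns) < 5) {
            continue; // line does not have the expected 5 columns
        }
        $rows[] = implode('|', [$columns[0], $columns[1], $columns[4]]);
    }
    return $rows;
}

$text = "    Compliance: 7-Day RN Waiver Indicator    1    443    443    VARCHAR2\n"
      . "    Related Provider Number    10    686    695    CHAR";

echo implode(PHP_EOL, extractColumns($text)), PHP_EOL;
```

Lines with fewer than 5 columns are skipped rather than reported; whether that is the right behavior depends on how clean the real input is.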
Answer 1 (score: 0)
<?php
$input = <<<INPUT
Compliance: 7-Day RN Waiver Indicator 1 443 443 VARCHAR2
Related Provider Number 10 686 695 CHAR
Services: Speech Pathology Off-Site Residents 1 834 834 VARCHAR2
Staff Count: Food Service Worker - Contract 25 1022 1029 NUMBER
INPUT;
preg_match_all("/(.*?)([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)(\n|$)/", $input, $m);
print_r($m);
/* output:
Array
(
    [0] => Array
        (
            [0] => Compliance: 7-Day RN Waiver Indicator 1 443 443 VARCHAR2
            [1] => Related Provider Number 10 686 695 CHAR
            [2] => Services: Speech Pathology Off-Site Residents 1 834 834 VARCHAR2
            [3] => Staff Count: Food Service Worker - Contract 25 1022 1029 NUMBER
        )

    [1] => Array
        (
            [0] => Compliance: 7-Day RN Waiver Indicator
            [1] => Related Provider Number
            [2] => Services: Speech Pathology Off-Site Residents
            [3] => Staff Count: Food Service Worker - Contract
        )

    [2] => Array
        (
            [0] => 1
            [1] => 10
            [2] => 1
            [3] => 25
        )

    [3] => Array
        (
            [0] => 443
            [1] => 686
            [2] => 834
            [3] => 1022
        )

    [4] => Array
        (
            [0] => 443
            [1] => 695
            [2] => 834
            [3] => 1029
        )

    [5] => Array
        (
            [0] => VARCHAR2
            [1] => CHAR
            [2] => VARCHAR2
            [3] => NUMBER
        )

    [6] => Array
        (
            [0] =>
            [1] =>
            [2] =>
            [3] =>
        )

)
*/
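The result above is grouped by capture group, so building one output row per input line means reading across several arrays. As a hedged variation (not the answerer's exact pattern), the sketch below anchors the match per line with the `m` flag, captures only columns 1, 2, and 5, and uses `PREG_SET_ORDER` so each match arrives as one row. The two-row sample input is shortened for illustration.

```php
<?php
$input = "Compliance: 7-Day RN Waiver Indicator 1 443 443 VARCHAR2\n"
       . "Related Provider Number 10 686 695 CHAR";

// Lazy first group for column 1, then exactly four whitespace-separated tokens
// up to the end of the line; only the first and last of those four are captured.
$pattern = '/^\h*(.*?)\h+(\S+)\h+\S+\h+\S+\h+(\S+)\h*$/m';

// PREG_SET_ORDER yields one array per matched line: [full, col1, col2, col5].
preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $m) {
    $rows[] = $m[1] . '|' . $m[2] . '|' . $m[3];
}
print_r($rows);
```

Because column 1 is matched lazily and the remaining four tokens are pinned to the end of the line, this works whether the columns are separated by one space or several.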
Answer 2 (score: 0)
To extract columns 1, 2, and 5, use the preg_split and preg_match functions:
$text = 'Compliance: 7-Day RN Waiver Indicator 1 443 443 VARCHAR2
Related Provider Number 10 686 695 CHAR
Services: Speech Pathology Off-Site Residents 1 834 834 VARCHAR2
Staff Count: Food Service Worker - Contract 25 1022 1029 NUMBER';
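// Split into lines, discarding any whitespace around each line break.
$lines = preg_split('/\s*\n\s*/', $text);
foreach ($lines as $line) {
    // Group 1: column 1 (may contain spaces); group 2: column 2; group 3: column 5.
    preg_match('/^(.+\S+)\s+(\S+)\s+\S+\s+\S+\s+(\S+)$/', $line, $m);
    array_shift($m); // drop the full match, keeping only the captured groups
    echo implode('|', $m) . PHP_EOL;
}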
Output:
Compliance: 7-Day RN Waiver Indicator|1|VARCHAR2
Related Provider Number|10|CHAR
Services: Speech Pathology Off-Site Residents|1|VARCHAR2
Staff Count: Food Service Worker - Contract|25|NUMBER
Answer 3 (score: 0)
^\h{2,}((?:(?!\h{2})[\s\S])*)\h*(\S+)(?:\h*\S+){2}\h*(\S+)
Replacement:
$1|$2|$3
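Explanation:
^ asserts the position at the start of a line
\h{2,} matches 2 or more horizontal whitespace characters
((?:(?!\h{2})[\s\S])*) captures into group 1 any characters up to the next run of two horizontal whitespace characters
\h* matches any number of horizontal whitespace characters
(\S+) captures one or more non-whitespace characters into group 2
(?:\h*\S+){2} matches the following exactly twice:
    \h* matches any number of horizontal whitespace characters
    \S+ matches one or more non-whitespace characters
\h* matches any number of horizontal whitespace characters
(\S+) captures one or more non-whitespace characters into group 3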
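For completeness, here is a minimal sketch of applying this pattern and replacement in PHP with `preg_replace`. It assumes each line starts with at least two spaces of indentation and that columns are separated by two or more spaces, which the pattern requires; the `m` flag is added so that `^` matches at the start of every line. The two-row sample input is shortened for illustration.

```php
<?php
// Assumption: every line is indented and columns are separated by 2+ spaces.
$text = "    Compliance: 7-Day RN Waiver Indicator    1    443    443    VARCHAR2\n"
      . "    Related Provider Number    10    686    695    CHAR";

// The pattern from this answer, wrapped in delimiters with the multiline flag.
$pattern = '/^\h{2,}((?:(?!\h{2})[\s\S])*)\h*(\S+)(?:\h*\S+){2}\h*(\S+)/m';

// Each matched line is replaced by "column1|column2|column5".
$result = preg_replace($pattern, '$1|$2|$3', $text);

echo $result, PHP_EOL;
```

Lines without the leading indentation do not match and are left unchanged, so trimmed input would need the leading `\h{2,}` relaxed to `\h*`.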