Question

Given a string such as:

$string = "  this     is   a   string  ";

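What is the best way to return a CSV array containing one number per word, where each number is the position of that word's first character, like this: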

$string = "  this     is   a   string  ";
             ^        ^    ^   ^
             2        11   16  20

Ideally, the output would be just an array:

2,11,16,20

This is what I have so far, but I suspect it is overkill given my limited skills:

$string = "  this     is   a   string  ";
$string = rtrim($string); //just trim the right sides spaces
$len = strlen($string);
$is_prev_white = true;
$result = "";
for( $i = 0; $i <= $len; $i++ ) {
    $char = substr( $string,$i,1);
    if(!preg_match("/\s/", $char) AND $prev_white){
        $result .= $i.",";
        $prev_white = false;
    }else{
        $prev_white = true;
    }   
}
echo $result;

I get: 2,4,11,16,20,22,24,26

Answer 1

PHP's regex matching provides a flag that returns the offset of each match along with the matched substring. Use the following snippet:

$hits = [];
preg_match_all("/(?<=\s)\w/", "  this     is   a   string  ", $hits, PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);
$result = array_column ( $hits[0], 1 );
$s_result = join ( ", ", $result);
echo $s_result;

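The regex pattern uses a positive lookbehind to find the first word character that follows a whitespace character. The call to array_column extracts the offsets from the multidimensional array that describes the pattern matches. join concatenates the array elements into a string, and the chosen separator turns it into a CSV line.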
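One thing to watch with the lookbehind: `(?<=\s)` requires a whitespace character before the word, so a word at offset 0 would be skipped. Below is a minimal sketch of a variant, assuming words are simply runs of non-whitespace. The negative lookbehind `(?<!\S)` and the helper name `word_starts` are illustrative choices, not part of the snippet above.

```php
<?php
// Hypothetical helper: offsets of the first character of each whitespace-delimited word.
function word_starts(string $subject): array
{
    // (?<!\S)\S matches a non-whitespace character that is not preceded by
    // another non-whitespace character, so it also matches at offset 0.
    preg_match_all('/(?<!\S)\S/', $subject, $hits, PREG_OFFSET_CAPTURE);
    return array_column($hits[0], 1);
}

echo implode(',', word_starts("this     is   a   string  ")), "\n"; // 0,9,14,18
```

For the string in the question, which starts with spaces, both patterns give the same offsets.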

For details, see the PHP documentation for array_column and preg_match_all.

Live example here. According to that site, the solution works as of PHP 5.5.0.

Answer 2

You want the PREG_OFFSET_CAPTURE flag:

$string = "  this     is   a   string  ";
preg_match_all('/(?:^|\s)([^\s])/', $string, $matches, PREG_OFFSET_CAPTURE);

$result = $matches[1];

var_dump($result);

The regex is:

(?:^|\s) // Matches white space or the start of the string (non capturing group)
([^\s]) // Matches anything *but* white space (capturing group)

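Passing PREG_OFFSET_CAPTURE makes preg_match() or preg_match_all() return each match as a two-element array containing the matched string and the index of the match within the searched string. The result of the code above is: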

array(4) { 
    [0]=> array(2) { [0]=> string(1) "t" [1]=> int(2) } 
    [1]=> array(2) { [0]=> string(1) "i" [1]=> int(11) } 
    [2]=> array(2) { [0]=> string(1) "a" [1]=> int(16) } 
    [3]=> array(2) { [0]=> string(1) "s" [1]=> int(20) } 
}

So you can get an array of just the indexes with

$firstChars = array_column($result, 1);
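To make the two-element shape concrete, here is a small sketch that walks the captured group and prints each character with its offset. It assumes the same subject string as the question, and the variable names are only illustrative.

```php
<?php
$string = "  this     is   a   string  ";
preg_match_all('/(?:^|\s)([^\s])/', $string, $matches, PREG_OFFSET_CAPTURE);

$lines = [];
foreach ($matches[1] as [$char, $offset]) {
    // Each entry is [matched string, byte offset within $string].
    $lines[] = "$char@$offset";
}
echo implode(' ', $lines), "\n"; // t@2 i@11 a@16 s@20
```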

Answer 3

A simple but progressive :) solution using the preg_match_all and array_walk functions. Use preg_match_all with the PREG_OFFSET_CAPTURE flag:

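PREG_OFFSET_CAPTURE: If this flag is passed, for every occurring match the appendant string offset will also be returned. Note that this changes the value of matches into an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1.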

$string = "  this     is   a   string  ";   // subject
preg_match_all("/\b\w+\b/iu", $string, $matches, PREG_OFFSET_CAPTURE);

array_walk($matches[0], function(&$v){   // filter string offsets
    $v = $v[1];
});
var_dump($matches[0]);

// the output:
array (size=4)
  0 => int 2
  1 => int 11
  2 => int 16
  3 => int 20

http://php.net/manual/en/function.preg-match-all.php

http://php.net/manual/en/function.array-walk.php
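One caveat worth adding: the offsets returned with PREG_OFFSET_CAPTURE are byte offsets, even when the u modifier is used, so they differ from character positions once the subject contains multibyte UTF-8 characters. Below is a rough sketch of converting them, assuming UTF-8 input. byte_to_char_offset is a made-up helper name, and the sample string and the `\S+` pattern are only illustrations.

```php
<?php
// Hypothetical helper: convert a byte offset into a character offset (UTF-8 assumed).
function byte_to_char_offset(string $subject, int $byteOffset): int
{
    // Count the UTF-8 characters in the prefix that precedes the byte offset.
    return (int) preg_match_all('/./su', substr($subject, 0, $byteOffset));
}

$string = "  caf\u{00E9}   \u{00E9}t\u{00E9}  is";   // "é" takes two bytes in UTF-8
preg_match_all('/\S+/u', $string, $matches, PREG_OFFSET_CAPTURE);

$byteOffsets = array_column($matches[0], 1);
$charOffsets = array_map(function ($offset) use ($string) {
    return byte_to_char_offset($string, $offset);
}, $byteOffsets);

echo implode(',', $byteOffsets), "\n"; // 2,10,17
echo implode(',', $charOffsets), "\n"; // 2,9,14
```

For plain ASCII input such as the string in the question, the two sets of offsets are identical.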

Answer 4

The pattern you're looking for is simple enough that you don't need a regex to match it. You can do this by looping over the string.

$l = strlen($string);
$result = array();

// use this flag to keep track of whether the previous character was NOT a space
$c = false;

for ($i=0; $i < $l; $i++) {
    // if the previous character was a space and the current one isn't...
    if (!$c && $string[$i] != ' ') {
        // add current index to result
        $result[] = $i;
    }
    // set the 'not a space' flag for the current character
    $c = $string[$i] != ' ';
}
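The loop above only treats the space character as a separator. If tabs or newlines can also separate words, a sketch along the same lines could use ctype_space. This assumes the ctype extension is available, and the function name is made up for illustration.

```php
<?php
// Hypothetical helper: same scan as above, but any whitespace character separates words.
function word_start_offsets(string $string): array
{
    $result = [];
    $inWord = false; // true while the previous character was part of a word
    $length = strlen($string);
    for ($i = 0; $i < $length; $i++) {
        $isWordChar = !ctype_space($string[$i]);
        if ($isWordChar && !$inWord) {
            $result[] = $i; // first character of a new word
        }
        $inWord = $isWordChar;
    }
    return $result;
}

echo implode(',', word_start_offsets("  this \t is\na   string  ")), "\n"; // 2,9,12,16
```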

Answer 5

Alternatively, you can use preg_split with two flags.

$string = "  this     is   a   string  ";

$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE;

// \W+ matches one or more non word characters
$csv = implode(",", array_column(preg_split('/\W+/', $string, -1, $flags), 1));

echo $csv;

2,11,16,20

If you need the words along with their offsets, just drop the array_column and implode parts.

$res = preg_split('/\W+/', $string, -1, $flags);
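For reference, here is a short sketch of what $res holds and how it could be consumed. Each element is a [word, offset] pair. The `word=offset` output format is just an illustration.

```php
<?php
$string = "  this     is   a   string  ";
$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE;
$res = preg_split('/\W+/', $string, -1, $flags);

$pairs = [];
foreach ($res as [$word, $offset]) {
    // $word is the token, $offset is where it starts in $string.
    $pairs[] = "$word=$offset";
}
echo implode(', ', $pairs), "\n"; // this=2, is=11, a=16, string=20
```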

Answer 6

Let's try this without a regex. I hope it works for you.

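$str="   w  this     is   a   string  ";
echo "<pre>";
print_r(first_letter_index($str));

// Returns an array mapping each word's starting index to its first character.
function first_letter_index($str)
{
    // Split into single characters; trim() turns whitespace characters into ''.
    $arr2 = array_map('trim',str_split($str));
    $result=array();
    foreach($arr2 as $k=>$v)
    {
        // Compare with '' rather than using empty(), because empty('0') is true.
        if($v !== '' && ($k === 0 || $arr2[$k-1] === ''))
        {
            $result[$k]=$v;
        }
    }
    return $result;
}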

PHP: put the position of the first character of each word in a string into an array
