Manipulating a large number of strings

Time: 2015-07-09 14:58:58

Tags: php arrays string command-line-interface

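We received a DVD with 12K+ images from one of our suppliers. Before I put them on our web server, I need to resize, rename, and copy them. To do that, I'm writing a PHP CLI program, and it seems I'm a bit stuck...

All the files follow a certain pattern.

Copying and renaming are not the problem; the string manipulation is.

So, to simplify the example code: say I have an array of strings, and I want to put them into a new array.

The original array looks like this: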

$names = array (
 'FIX1_VARA_000.1111_FIX2',
 'FIX1_VARB_000.1111.2_FIX2',
 'FIX1_VARB_222.2582_FIX2',
 'FIX1_VARC_555.8794_FIX2',
 'FIX1_VARD_111.0X00(2-5)_FIX2',
 'FIX1_VARA_112.01XX(09-13)_FIX2',
 'FIX1_VARB_444.XXX1(203-207).2_FIX2'    
);

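Each string in this array starts with the same fixed part at the front and ends with the same fixed part at the end: FIX1 and FIX2, respectively. After FIX1 there is always an underscore, followed by a variable part, followed by another underscore. I'm not interested in the fixed parts or the variable part, so I cut all of that away.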
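The cutting step described above could also be done with a single regular expression. This is only a minimal sketch, assuming the variable part (VARA, VARB, ...) never contains an underscore, as in the sample data; the helper name extractItem is made up for illustration.

```php
<?php
// Hypothetical helper: returns the part between "FIX1_<VAR>_" and "_FIX2",
// or null when the name does not follow that pattern.
// Assumption: the variable part contains no underscore.
function extractItem(string $name): ?string
{
    if (preg_match('/^FIX1_[^_]+_(.+)_FIX2$/', $name, $m) === 1) {
        return $m[1];
    }
    return null;
}

echo extractItem('FIX1_VARB_444.XXX1(203-207).2_FIX2'), "\n"; // 444.XXX1(203-207).2
```

Returning null for names that do not match makes it easy to skip or log unexpected files instead of silently producing a wrong item.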

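The remaining string can be one of two types. If it contains only digits and dots, it is a valid string and I put it in the $clean array (e.g. 000.1111 or 000.1111.2). If it contains anything other than digits and dots, it always has a few X's and a pair of parentheses enclosing two numbers separated by a -, like 444.XXX1(203-207).2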
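The two types described above can be told apart with a simple check. A small sketch, assuming the item has already been stripped of its fixed and variable parts; isPlainItem is a hypothetical name, not something from the original code.

```php
<?php
// Hypothetical helper: true when the item consists only of digits and dots.
function isPlainItem(string $item): bool
{
    return preg_match('/^[0-9.]+$/', $item) === 1;
}

var_dump(isPlainItem('000.1111.2'));          // bool(true)
var_dump(isPlainItem('444.XXX1(203-207).2')); // bool(false)
```

Compared with only looking for an X, this check also flags any other unexpected character, which may or may not be what you want.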

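The numbers between the parentheses define a series, and each number in that series needs to replace the X's. For the example above, the strings that should be put in the $clean array are: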

444.2031.2

444.2041.2

444.2051.2

444.2061.2

444.2071.2

This is the part I'm struggling with.

$clean = array();
foreach ($names as $name){
    $item = trim(strstr(str_replace(array('FIX1_', '_FIX2'),'',$name), '_'),'_');
    // $item get the values:
    /*  
     * 000.1111, 
     * 000.1111.2, 
     * 222.2582, 
     * 555.8794, 
     * 111.0X00(2-5), 
     * 112.01XX(09-13), 
     * 444.XXX1(203-207).2 
     *  
     */

    // IF an item has no X in it, it can be put in the $clean array
    if (strpos($item,'X') === false){
        //this is true for the first 4 array values in the example
        $clean[] = $item;
    }
    else {
        //this is for the last 3 array values in the example
        $b = strpos($item,'(');
        $e = strpos($item,')');
        $sequence = substr($item,$b,$e-$b+1);

        $item = str_replace($sequence,'',$item);

        /* This is the part where I'm stuck */
        /* ------------------------------- */
        /* it should get the values in the sequence variable and iterate over them:
         * 
         * So for $names[5] ('FIX1_VARA_112.01XX(09-13)_FIX2') I want the following values entered into the $clean array:
         * Value of $sequence = '(09-13)'
         * 
         * 112.0109
         * 112.0110
         * 112.0111
         * 112.0112
         * 112.0113
         *  
         */      
    }
}

//NOW ECHO ALL VALUES IN $clean:
foreach ($clean as $c){
    echo $c . "\n";
}

The final output should be:

000.1111
000.1111.2
222.2582
555.8794
111.0200
111.0300
111.0400
111.0500
112.0109
112.0110
112.0111
112.0112
112.0113
444.2031.2
444.2041.2
444.2051.2
444.2061.2
444.2071.2

Any help with the "this is where I'm stuck" part would be greatly appreciated.

2 answers:

Answer 0: (score: 1)

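First, I assume all your files follow a valid pattern, so no file name is malformed; otherwise, just add some safety checks...
In $sequence you have (09-13). To work with the numbers, you have to remove the ( and ), so create another variable: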

$range = substr($sequence, 1, -1); // strip the surrounding parentheses
// you get '09-13'

Then you need to split it:

list($min, $max) = explode("-",$range);
// $min = '09', $max = '13'
$nbDigits = strlen($max);
// $nbDigits = 2

Then you need all the numbers from min to max:

$numbersList = array();
$min = (int)$min; // $min becomes 9, instead of '09'
$max = (int)$max;
for ($i = $min; $i <= $max; $i++) {
    // set a number, including leading zeros
    $numbersList[] = str_pad((string)$i, $nbDigits, '0', STR_PAD_LEFT);
}

Then you have to generate the file names with those numbers:

$xPlace = strpos($item,'X');
foreach($numbersList as $number) {
    $filename = $item;
    for($i=0; $i<$nbDigits; $i++) {
        // replacing one digit at a time, to replace each 'X'
        $filename[$xPlace+$i] = $number[$i];
    }
    $clean[] = $filename;
}

This should do the job. There may be some bugs, but it's a good start, so give it a try :)
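The steps above can be pulled together into one function, with a few of the safety checks mentioned at the start. This is only a sketch under stated assumptions: the name expandItem is hypothetical, each item has at most one (min-max) group and one contiguous run of X's, and min is not greater than max. Unlike the snippets above, it pads to the number of X's rather than to strlen($max), which is a swapped-in choice.

```php
<?php
// Hypothetical helper: expands an item such as '112.01XX(09-13)' into
// ['112.0109', ..., '112.0113']. Items without an X are returned unchanged.
// Assumptions: one "(min-max)" group, one contiguous run of X's, min <= max.
function expandItem(string $item): array
{
    if (strpos($item, 'X') === false) {
        return [$item];
    }
    $b = strpos($item, '(');
    $e = strpos($item, ')');
    if ($b === false || $e === false || $e < $b) {
        throw new InvalidArgumentException("Malformed item: $item");
    }
    // Text between the parentheses, e.g. '09-13'
    $range = substr($item, $b + 1, $e - $b - 1);
    $parts = explode('-', $range);
    if (count($parts) !== 2 || !ctype_digit($parts[0]) || !ctype_digit($parts[1])) {
        throw new InvalidArgumentException("Malformed range: $range");
    }
    // Remove '(min-max)' from the item, keeping whatever follows it
    $template = substr($item, 0, $b) . substr($item, $e + 1);
    $xPlace = strpos($template, 'X');
    $xCount = substr_count($template, 'X');

    $result = [];
    for ($i = (int) $parts[0]; $i <= (int) $parts[1]; $i++) {
        // Pad to the number of X's so that 9 becomes '09'
        $number = str_pad((string) $i, $xCount, '0', STR_PAD_LEFT);
        $result[] = substr_replace($template, $number, $xPlace, $xCount);
    }
    return $result;
}

print_r(expandItem('112.01XX(09-13)'));
```

Inside the original foreach loop, the results could then be appended with something like `$clean = array_merge($clean, expandItem($item));`.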

Answer 1: (score: 1)

As @stdob-- mentioned, regular expressions are indeed what you want here. Here's a working version of the code:

$names = array (
 'FIX1_VARA_000.1111_FIX2',
 'FIX1_VARB_000.1111.2_FIX2',
 'FIX1_VARB_222.2582_FIX2',
 'FIX1_VARC_555.8794_FIX2',
 'FIX1_VARD_111.0X00(2-5)_FIX2',
 'FIX1_VARA_112.01XX(09-13)_FIX2',
 'FIX1_VARB_444.XXX1(203-207).2_FIX2'
);

$clean = array();
foreach ($names as $name){
    $item = trim(strstr(str_replace(array('FIX1_', '_FIX2'),'',$name), '_'),'_');
    // $item get the values:
    /*
     * 000.1111,
     * 000.1111.2,
     * 222.2582,
     * 555.8794,
     * 111.0X00(2-5),
     * 112.01XX(09-13),
     * 444.XXX1(203-207).2
     *
     */

    // IF an item has no X in it, it can be put in the $clean array
    if (strpos($item,'X') === false){
        //this is true for the first 4 array values in the example
        $clean[] = $item;
    }
    else {
        // Initialize the empty matches array (I prefer [] to array(), but pick your poison)
        $matches = [];

        // Check out: https://www.regex101.com/r/qG4jS4/1 to see visually how this works (also, regex101.com is just rad)
        // This uses capture groups, which get stored in the $matches array.
        preg_match('/\((\d*)-(\d*)\)/', $item, $matches);

        // Now we've got the array of values that we want to have in our clean array
        $range = range($matches[1], $matches[2]);

        // Since preg_match has our parenthesis and digits grabbed for us, get rid of those from the string
        $item = str_replace($matches[0],'',$item);


        // Truly regrettable variable names, but you get the idea!
        foreach($range as $number){
            // Here's where it gets ugly. You're wanting the numbers to work like strings (have consistent length
            // like 09 and 13) but also work like numbers (when you create a sequence of numbers). That kind of
            // thinking begets hackery. This probably isn't your fault, but it seems helpful to point out.

            // Anyways, we can use the number of X's in the string to figure out how many characters we ought
            // to be adding. This is important because otherwise we'll end up with 112.019 instead of 112.0109.
            // PHP casts that '09' to (int) 9 when we run the range() function, so we lose the leading zero.
            $xCount = substr_count($item, 'X');

            if($xCount > strlen($number)){
                // str_pad() pads a string to a given total length ($xCount, in our case) with a
                // character ('0'). By default the padding goes on the right; with the
                // STR_PAD_LEFT flag it goes on the left side instead.
                $number = str_pad($number, $xCount, '0', STR_PAD_LEFT);
            }

            // With a quick cheat by padding an empty string with the same number of X's we counted earlier...
            $xString = str_pad('', $xCount, 'X');

            // Now we can add the fixed string into the clean array.
            $clean[] = str_replace($xString, $number, $item);
        }
    }
}

// I also happen to prefer var_dump to echo, but again, your mileage may vary.
var_dump($clean);

Output:

array (size=18)
  0 => string '000.1111' (length=8)
  1 => string '000.1111.2' (length=10)
  2 => string '222.2582' (length=8)
  3 => string '555.8794' (length=8)
  4 => string '111.0200' (length=8)
  5 => string '111.0300' (length=8)
  6 => string '111.0400' (length=8)
  7 => string '111.0500' (length=8)
  8 => string '112.0109' (length=8)
  9 => string '112.0110' (length=8)
  10 => string '112.0111' (length=8)
  11 => string '112.0112' (length=8)
  12 => string '112.0113' (length=8)
  13 => string '444.2031.2' (length=10)
  14 => string '444.2041.2' (length=10)
  15 => string '444.2051.2' (length=10)
  16 => string '444.2061.2' (length=10)
  17 => string '444.2071.2' (length=10)
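The leading-zero issue mentioned in the code comments can be seen in isolation. A small sketch that uses sprintf instead of str_pad (a swapped-in alternative, not what the code above does); the fixed width of 2 is an assumption made for this example.

```php
<?php
// The explicit casts make the loss of the leading zero visible: '09' becomes 9.
$numbers = range((int) '09', (int) '13');

// Re-apply a fixed width of 2 digits (assumed here) when formatting.
$padded = array_map(function ($n) {
    return sprintf('%02d', $n);
}, $numbers);

echo implode(',', $padded), "\n"; // 09,10,11,12,13
```

In the real loop the width would come from the number of X's in the item rather than being hard-coded.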

- Edit - Removed the warning about strpos and ==; it looks like someone already pointed that out in the comments.