Parsing a text file containing image data

Time: 2012-10-03 19:57:53

Tags: php javascript regex

I have been able to use ImageMagick from http://www.imagemagick.org to read the data from a text file.

This is what I get:

0,0: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)
1,0: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)
2,0: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)


40,23: (162,167, 32, 24)  #A2A72018  srgba(162,167,32,0.0941176)
41,23: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)
42,23: (162,166, 48, 40)  #A2A63028  srgba(162,166,48,0.156863)
43,23: (162,166, 47, 40)  #A2A62F28  srgba(162,166,47,0.156863)

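I'm not familiar with regular expressions. Which expression would I use to get the coordinates at the start of each line and the rgba() at the end?

OK, I have partly figured out the regex. Here is part of it: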

/rgba\([0-9]{1,3},[0-9]{1,3},[0-9]{1,3},[0-9]\.[0-9]{1-9}/gi

But the last part does not match the decimal part of rgba().
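A likely culprit is the quantifier `{1-9}`: a range quantifier is written with a comma, `{1,9}`. PCRE, and JavaScript without the `u` flag, treat a malformed `{1-9}` as literal text rather than a repeat count. The alpha value can also be a bare `0` or `1`, so the decimal part probably needs to be optional. A minimal sketch in JavaScript, assuming the line format shown above; `rgbaPattern` and `sampleLines` are illustrative names, not part of the original post:

```javascript
// Sketch: match the trailing rgba(...) part, with an optional decimal alpha.
// `rgbaPattern` and `sampleLines` are illustrative names (assumptions).
const rgbaPattern = /rgba\((\d{1,3}),(\d{1,3}),(\d{1,3}),(\d(?:\.\d+)?)\)/i;

const sampleLines = [
  '0,0: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)',
  '40,23: (162,167, 32, 24)  #A2A72018  srgba(162,167,32,0.0941176)',
];

// Group 4 is the alpha value, as text.
const alphas = sampleLines.map((line) => {
  const m = line.match(rgbaPattern);
  return m ? m[4] : null;
});

console.log(alphas); // [ '0', '0.0941176' ]
```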

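OK, I know it has been a while, and I'm not sure whether I should start a new thread,

but I have worked out how to remove the two middle parts, all the parentheses and the srgba, as well as the lines with zero alpha. Somehow, though, it leaves a blank space in the text file. If anyone can see any improvements, please let me know.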

$fh = fopen("pcmanD.txt", "r");
$fg = fopen("pcmanJ.txt", "wt");
$new_array = ""; $parts = "";
while (!feof($fh)) {
    $line = fgets($fh);
    $lines[] = $line;
    $newword = "";
    $match1 ="/:\s?\s?\(\s?\s?\d+,\s?\s?\d+,\s?\s?\d+,\s?\s?\d+\)\s?\s?#[a-zA-Z0-9]{6,8}\s?\s?srgba/";
    $match2 ="/^\s\s?/";
    $match3 = "/\s\(/";
    $match4 = "/\)/";
    $match5 = "/[0-9]{1,3},[0-9]{1,3},[0-9]{1,3},[0-9]{1,3},[0-9]{1,3},0[^\.]/";

    $parts1 = preg_replace($match1,"", $line );
    $parts2 = preg_replace($match2,"", $parts1 );
    $parts3 = preg_replace($match3,",", $parts2 );
    $parts4 = preg_replace($match4,"", $parts3 );
    $parts5 = preg_replace($match5,"",$parts4);


    echo "<pre>";
    print_r($parts5);
    echo "</pre>";

    fwrite($fg, $parts5);
} 

fclose($fg);
fclose($fh);
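One way to avoid leftover whitespace is to decide, per line, whether to keep it, and to write only the trimmed lines that survive, rather than blanking out a line's contents and still writing what remains. This is a sketch of that idea in JavaScript, not a diagnosis of the PHP above; `toCsvLine` and `convert` are hypothetical helper names, and the input format is assumed to match the dump shown earlier:

```javascript
// Sketch: turn ImageMagick txt lines into "x,y,r,g,b,a" and drop transparent pixels.
// `toCsvLine` and `convert` are hypothetical helper names (assumptions).
const linePattern = /^\s*(\d+),(\d+):.*srgba\((\d+),(\d+),(\d+),([\d.]+)\)\s*$/;

function toCsvLine(line) {
  const m = line.match(linePattern);
  if (!m) return null;                  // blank or unrecognised line
  if (Number(m[6]) === 0) return null;  // zero alpha: skip the whole line
  return m.slice(1).join(',');          // x,y,r,g,b,a with no padding
}

function convert(text) {
  return text
    .split(/\r?\n/)
    .map(toCsvLine)
    .filter((csv) => csv !== null)      // skipped lines leave nothing behind
    .join('\n');
}

const dump = [
  '41,23: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)',
  '42,23: (162,166, 48, 40)  #A2A63028  srgba(162,166,48,0.156863)',
  '43,23: (162,166, 47, 40)  #A2A62F28  srgba(162,166,47,0.156863)',
].join('\n');

console.log(convert(dump));
// 42,23,162,166,48,0.156863
// 43,23,162,166,47,0.156863
```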

But a new problem has come up with this code: after the match I get the floating-point numbers two or three times over.

$thisisit[] = "";
$thisisit2[] = "";
$countThis = 0;
$fh = fopen("sometext.txt", "r");
$new_array = ""; $parts = "";
while (!feof($fh)) {
    $line = fgets($fh);
    $line2 = $line;
    $newword = "";
    $match1 ="/^\s*?[\d]+,[\d]+/";
    $parts1 = preg_match($match1, $line, $regs);
    foreach($regs as $key => $lame) {
        $thisisit[] = $lame;
    }
    $match2 ="/(?:(\d{1,3},\d{1,3},))(\d{1,3},\d{1,3},\d{1,3},[01][\.]?[\d]*)/";
    $parts2 = preg_match($match2, $line2, $regs2);
    foreach($regs2 as $key2 => $lame2) {
        $thisisit2[] =$lame2;
    }
    $countLame = count($thisisit);
} 

echo "</script>";
$newCounter = 0;
for($i = 0; $i < (500); $i++) {
    echo  $thisisit[$i] . "<br />";
    echo $thisisit2[$newCounter] . "<br />" ;
    $newCounter = $newCounter +4;
}
fclose($fh);
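A possible explanation for the repeated values: `preg_match` stores the whole match in element 0 of the result array, followed by one element per capture group, so a `foreach` over the entire array appends the same text more than once. JavaScript's `String.prototype.match` lays out its result the same way, which makes it easy to see. A minimal sketch, with `rowPattern` and the other variable names being illustrative only:

```javascript
// Sketch: a match result holds the whole match first, then one entry per capture group.
// `rowPattern`, `row` and the other names are illustrative (assumptions).
const rowPattern = /^\s*(\d+,\d+),(\d{1,3},\d{1,3},\d{1,3},[01](?:\.\d+)?)\s*$/;

const row = ' 42,23,162,166,48,0.156863';
const m = row.match(rowPattern);

// Copying every element stores overlapping text three times over.
const everything = Array.from(m);

// Picking the groups by index stores each value exactly once.
const coords = m[1];
const rgba = m[2];

console.log(everything.length); // 3
console.log(coords);            // 42,23
console.log(rgba);              // 162,166,48,0.156863
```

Reading only the group indexes that are needed (the equivalent of `$regs2[1]` and `$regs2[2]` in PHP) avoids having to step through the collected array in strides afterwards.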

Here is some of the text from the text file I'm working with:

 42,23,162,166,48,0.156863
 43,23,162,166,47,0.156863
 44,23,167,170,67,0.219608
 45,23,162,166,47,0.156863
 46,23,167,170,67,0.219608
 47,23,162,166,37,0.117647
 48,23,162,167,32,0.0941176
                                      86,23,163,167,40,0.12549
 87,23,160,164,47,0.164706
 88,23,188,190,122,0.352941
 86,24,233,234,197,0.486275
 87,24,251,250,250,1
 88,24,251,250,250,1
 89,24,251,250,250,1

3 Answers:

Answer 0: (score: 0)

> '0,0: (255,255,255,  0)'.match(/^([^:]+) *: *\(([^)]+)\)/)
[ '0,0: (255,255,255,  0)',
  '0,0',
  '255,255,255,  0',
  index: 0,
  input: '0,0: (255,255,255,  0)' ]

Answer 1: (score: 0)

/^(\d+,\d+):.*\(([^)]*)\)$/

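This should pull the coordinates into capture group 1 and the RGBA values into group 2. Note, however, that plain string manipulation may be more efficient for you. You can split the string on : to get the coordinates, then split on srgba( and remove the last character to get the color values.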
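Both approaches from this answer can be sketched side by side. The function names `parseWithRegex` and `parseWithSplit` are hypothetical, and the code assumes each line ends with the closing parenthesis of `srgba(...)` once trimmed:

```javascript
// Sketch of the two approaches described above; function names are hypothetical.
function parseWithRegex(line) {
  const m = line.trim().match(/^(\d+,\d+):.*\(([^)]*)\)$/);
  return m ? { coords: m[1], rgba: m[2] } : null;
}

function parseWithSplit(line) {
  const trimmed = line.trim();
  const coords = trimmed.split(':')[0];          // text before the first ":"
  const afterSrgba = trimmed.split('srgba(')[1]; // text after "srgba("
  if (afterSrgba === undefined) return null;     // no srgba(...) on this line
  return { coords, rgba: afterSrgba.slice(0, -1) }; // drop the closing ")"
}

const line = '43,23: (162,166, 47, 40)  #A2A62F28  srgba(162,166,47,0.156863)';
console.log(parseWithRegex(line));
console.log(parseWithSplit(line));
// both print: { coords: '43,23', rgba: '162,166,47,0.156863' }
```

The split version does no backtracking, but it silently trusts the line layout; the regex version returns `null` for lines that do not fit the expected shape.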

Answer 2: (score: 0)

Example #1:

If you have a multi-line $input

Code

$input = '
    0,0: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)
    43,23: (162,166, 47, 40)  #A2A62F28  srgba(162,166,47,0.156863)
';
preg_match_all('~^\s*(.+?):.+srgba\((.+?)\)\s*$~m', $input, $match);
print_r($match);

Output

Array
(
    [0] => Array
        (
            [0] => 0,0: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)
            [1] => 43,23: (162,166, 47, 40)  #A2A62F28  srgba(162,166,47,0.156863)
        )
    [1] => Array
        (
            [0] => 0,0
            [1] => 43,23
        )
    [2] => Array
        (
            [0] => 255,255,255,0
            [1] => 162,166,47,0.156863
        )
)

Example #2:

If you have a single-line $input and want a more detailed regex that captures each value separately:

Code

$input = '43,23: (162,166, 47, 40)  #A2A62F28  srgba(162,166,47,0.156863)';
preg_match('~^\s*(\d+),\s*(\d+):.+srgba\(\s*(\d+),\s*(\d+),\s*(\d+),\s*(.+?)\)\s*$~m', $input, $match);
print_r($match);

Output

Array
(
    [0] => 43,23: (162,166, 47, 40)  #A2A62F28  srgba(162,166,47,0.156863)
    [1] => 43
    [2] => 23
    [3] => 162
    [4] => 166
    [5] => 47
    [6] => 0.156863
)
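Since the question is also tagged javascript, here is a rough JavaScript counterpart that applies the pattern from Example #2 to a whole multi-line dump and converts each captured value to a number. `parsePixels` and the field names `x, y, r, g, b, a` are assumptions made for illustration, and `String.prototype.matchAll` needs a reasonably recent runtime:

```javascript
// Sketch: parse a whole dump into objects with numeric fields.
// `parsePixels` and the field names are hypothetical; the pattern mirrors Example #2.
const pixelPattern =
  /^\s*(\d+),\s*(\d+):.+srgba\(\s*(\d+),\s*(\d+),\s*(\d+),\s*(.+?)\)\s*$/gm;

function parsePixels(text) {
  return Array.from(text.matchAll(pixelPattern), (m) => {
    // m[1..6] are the captured strings; convert them all to numbers.
    const [x, y, r, g, b, a] = m.slice(1).map(Number);
    return { x, y, r, g, b, a };
  });
}

const input = `
    0,0: (255,255,255,  0)  #FFFFFF00  srgba(255,255,255,0)
    43,23: (162,166, 47, 40)  #A2A62F28  srgba(162,166,47,0.156863)
`;

const pixels = parsePixels(input);
console.log(pixels.length); // 2
console.log(pixels[1].a);   // 0.156863
```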