PHP cURL: searching for words in text

Time: 2013-02-15 14:56:50

Tags: php curl preg-match-all

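I want to write PHP code that grabs text from a web page. This code fetches the text from the site, then searches it for words that start with "N" and end with "a", and the script should add them to array[0]. My mistake is in the regular expression. Please help me fix it, and sorry for my bad English.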

Here is my code:

<?php 
$url = "http://www.nokia.com/global/products/phone/lumia820/";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch,CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_exec($ch);
curl_close($ch);

preg_match_all('/^N(.*)\a$/', $result, $matches[0]);
preg_match_all('/^M(.*)\m$/', $result, $matches[0]);
preg_match_all('/^M(.*)\c$/', $result, $matches[0]);

print_r($matches[0]);
?>

If I try it with HTML, for example, how can I get 700 mAh as array[0], BL-5CA as array[2], and Nokia as array[3]?

<div class="b-goods-specifications mod_cutted">
  <div class="b-goods-specifications-item">
    <ul class="b-goods-specifications-list">
      <li class="b-goods-specifications-row g-clearfix">
        <div class="b-goods-specifications-cell"> <span> **capacity** </span> </div>
        <div class="b-goods-specifications-cell"> **700 mAh** </div>
      </li>
      <li class="b-goods-specifications-row g-clearfix">
        <div class="b-goods-specifications-cell"> <span> **Model** </span> </div>
        <div class="b-goods-specifications-cell"> **BL-5CA** </div>
      </li>
      <li class="b-goods-specifications-row g-clearfix">
        <div class="b-goods-specifications-cell"> <span> **Brand** </span> </div>
        <div class="b-goods-specifications-cell"> **Nokia** </div>
      </li>
      <li class="b-goods-specifications-row g-clearfix">
        <div class="b-goods-specifications-cell"> <span> Type </span> </div>
        <div class="b-goods-specifications-cell"> Li-ion </div>
      </li>
    </ul>
  </div>
</div>
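One possible way to pull those values out with preg_match_all is sketched below. It is only an illustration under some assumptions: the $html string is a shortened, cleaned-up copy of the fragment (no ** markers, well-formed closing tags), and the variable names $pattern, $values and $specs are made up for this example. A regex like this is tied to the exact markup, so it is a sketch rather than a robust HTML parser.

```php
<?php
// Cleaned-up, shortened copy of the fragment above. Assumption: the real
// page uses well-formed closing tags and has no ** markers.
$html = <<<HTML
<li class="b-goods-specifications-row g-clearfix">
  <div class="b-goods-specifications-cell"><span>capacity</span></div>
  <div class="b-goods-specifications-cell">700 mAh</div>
</li>
<li class="b-goods-specifications-row g-clearfix">
  <div class="b-goods-specifications-cell"><span>Model</span></div>
  <div class="b-goods-specifications-cell">BL-5CA</div>
</li>
<li class="b-goods-specifications-row g-clearfix">
  <div class="b-goods-specifications-cell"><span>Brand</span></div>
  <div class="b-goods-specifications-cell">Nokia</div>
</li>
HTML;

// Group 1 captures the label inside <span>, group 2 the value in the next cell.
$pattern = '~<div class="b-goods-specifications-cell">\s*<span>\s*(.*?)\s*</span>\s*</div>\s*'
         . '<div class="b-goods-specifications-cell">\s*(.*?)\s*</div>~s';

preg_match_all($pattern, $html, $m);

$values = $m[2];                      // values only, indexed 0, 1, 2
$specs  = array_combine($m[1], $m[2]); // label => value

print_r($values);
print_r($specs);
```

Here $values holds the three values in document order under consecutive indexes, and $specs lets a value be looked up by its label, for example $specs['Model'].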

3 answers:

Answer 0 (score: 0)

This is invalid:

preg_match_all('/^N(.*)\a$/', $result, $matches[0]);
                                               ^^^--- here

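It should be just $matches. When you look at $matches afterwards, [0] holds the ENTIRE matched strings (i.e. the full text that each match of the regex covered), and [1], [2], etc. hold the strings captured by the groups in the match.

Also, you are using the same $matches[0] for all three regexes, which means you overwrite the results and end up with only the results of the LAST regex. You should capture into different arrays, or output after each call, so you don't lose what you want.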
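As a rough sketch of both points, the snippet below passes a separate, plain array variable to each call and then reads index [0] (full matches) and index [1] (first capture group). The sample text and the names $nMatches and $mMatches are assumptions made for this example, and the patterns are simplified stand-ins rather than the asker's originals.

```php
<?php
// Hypothetical sample text standing in for the fetched page.
$text = "Nokia Lumia 820, Maximum music";

// One result array per call, so no call overwrites another's results.
preg_match_all('/N(\w*?)a/', $text, $nMatches);
preg_match_all('/M(\w*?)m/', $text, $mMatches);

// [0] = full matches, [1] = what the first (...) group captured.
print_r($nMatches[0]); // Array ( [0] => Nokia )
print_r($nMatches[1]); // Array ( [0] => oki )
print_r($mMatches[0]); // Array ( [0] => Maxim )
print_r($mMatches[1]); // Array ( [0] => axi )
```

Note that the lazy \w*? stops at the first "m" it can, which is why "Maximum" yields "Maxim" here.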

Answer 1 (score: 0)

Try this: '~N(.*?)a~si'

 preg_match_all('~N(.*?)a~si', $result, $matches[0]);
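In that pattern, ~ is the delimiter, .*? is a lazy (non-greedy) quantifier, the s modifier lets . match newlines, and the i modifier makes the match case-insensitive. A small sketch of the difference between greedy and lazy matching follows; the sample text and the names $greedy and $lazy are assumptions for illustration, and the results go into plain variables rather than $matches[0].

```php
<?php
// Hypothetical sample text standing in for the fetched page.
$text = "Nokia Lumia";

// Greedy: .* grabs as much as possible, so the match runs to the LAST "a".
preg_match_all('~N(.*)a~si', $text, $greedy);

// Lazy: .*? stops at the first "a" that lets the pattern succeed.
preg_match_all('~N(.*?)a~si', $text, $lazy);

print_r($greedy[0]); // Array ( [0] => Nokia Lumia )
print_r($lazy[0]);   // Array ( [0] => Nokia )
```

Because . also matches spaces, even the lazy version can run across several words when the next "a" is far away.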

Answer 2 (score: 0)

preg_match_all('/N\w*?a/i', $result, $matches1);
preg_match_all('/M\w*?m/i', $result, $matches2);
preg_match_all('/M\w*?c/i', $result, $matches3);
echo "<pre>";
print_r($matches1[0]);
print_r($matches2[0]);
print_r($matches3[0]);
echo "</pre>";

Output:

Array
(
    [0] => Nokia
    [1] => Nokia
    [2] => na
    [3] => na
    [4] => nitia
    [5] => na
    [6] => ngua
    [7] => nokia
    [8] => nokia
    [9] => nonica
    [10] => nokia
    [11] => na
    [12] => Nokia
    [13] => na
    [14] => Nokia
    [15] => na
    [16] => ncsea
    [17] => Nokia
    [18] => na
    [19] => ncsea
    [20] => na
    [21] => ncsea
    [22] => na
    [23] => ncsea
    [24] => na
    [25] => ncsea
    [26] => na
    [27] => ncsea
    [28] => na
    [29] => ncsea
    [30] => na
    [31] => ncsea
    [32] => nchda
    [33] => na
    [34] => ncsea
    [35] => nokia
    [36] => Nokia
    [37] => na
    [38] => ncsea
    [39] => nokia
    [40] => Nokia
    [41] => na
    [42] => ncsea
    [43] => nokia
    [44] => nokia
    [45] => na
    [46] => Na
    [47] => Na
    [48] => nloa
    [49] => na
    [50] => nta
    [51] => nokia
    [52] => nokia
    [53] => nokia
    [54] => Nokia
    [55] => na
    [56] => na
    [57] => na
    [58] => na
    [59] => na
    [60] => Nokia
    [61] => Nokia
    [62] => Nokia
    [63] => Nokia
    [64] => nnova
    [65] => nnova
    [66] => nnova
    [67] => nokia
    [68] => nokia
    [69] => nokia
    [70] => nokia
    [71] => na
    [72] => ncia
    [73] => nokia
    [74] => na
    [75] => ncia
    [76] => na
    [77] => nokia
    [78] => na
    [79] => na
    [80] => na
    [81] => nokia
    [82] => na
    [83] => na
    [84] => na
    [85] => na
    [86] => nokia
    [87] => nokia
    [88] => Nokia
    [89] => Nokia
    [90] => Nokia
    [91] => nokia
    [92] => Nokia
    [93] => Nokia
    [94] => nokia
    [95] => na
    [96] => nokia
    [97] => Nokia
    [98] => nokia
    [99] => Nokia
    [100] => nokia
    [101] => Nokia
    [102] => nokia
    [103] => Nokia
    [104] => nokia
    [105] => Nokia
    [106] => nokia
    [107] => Nokia
    [108] => Nokia
    [109] => Nokia
    [110] => nokia
    [111] => na
    [112] => nokia
    [113] => nokia
    [114] => nokia
    [115] => nokia
    [116] => nokia
    [117] => nokia
    [118] => Nokia
    [119] => nokia
    [120] => na
    [121] => nokia
    [122] => Nokia
    [123] => nokia
    [124] => Nokia
    [125] => nokia
    [126] => Nokia
    [127] => nokia
    [128] => Nokia
    [129] => nokia
    [130] => Nokia
    [131] => nokia
    [132] => Nokia
    [133] => Nokia
    [134] => Nokia
    [135] => Nokia
    [136] => nokia
    [137] => Nokia
    [138] => Nokia
    [139] => Nokia
    [140] => nokia
    [141] => Nokia
    [142] => nokia
    [143] => Nokia
    [144] => Nokia
    [145] => Nokia
    [146] => na
    [147] => na
    [148] => ndedA
    [149] => na
    [150] => na
    [151] => nokia
    [152] => nokia
    [153] => nokia
    [154] => nokia
    [155] => Nokia
    [156] => nokia
    [157] => Nokia
    [158] => nokia
    [159] => na
    [160] => nokia
    [161] => nokia
    [162] => nokia
    [163] => nokia
    [164] => nokia
    [165] => nokia
    [166] => Nokia
    [167] => Nokia
    [168] => Nokia
    [169] => nokia
    [170] => Nokia
    [171] => nokia
    [172] => na
    [173] => nokia
    [174] => nokia
    [175] => nokia
    [176] => nokia
    [177] => nokia
    [178] => nokia
    [179] => Nokia
    [180] => Nokia
    [181] => Nokia
    [182] => nokia
    [183] => nokia
    [184] => na
    [185] => nokia
    [186] => nokia
    [187] => nokia
    [188] => nokia
    [189] => nokia
    [190] => nokia
    [191] => nokia
    [192] => nokia
    [193] => na
    [194] => nokia
    [195] => nokia
    [196] => nokia
    [197] => nokia
    [198] => nokia
    [199] => nokia
    [200] => Nokia
    [201] => Nokia
    [202] => Nokia
    [203] => nokia
    [204] => nokia
    [205] => Nokia
    [206] => nokia
    [207] => na
    [208] => nokia
    [209] => nokia
    [210] => nokia
    [211] => nokia
    [212] => nokia
    [213] => nokia
    [214] => nokia
    [215] => nokia
    [216] => nokia
    [217] => nokia
    [218] => nokia
    [219] => nokia
    [220] => Nokia
    [221] => Nokia
    [222] => Nokia
    [223] => nokia
    [224] => nokia
    [225] => na
    [226] => nokia
    [227] => nokia
    [228] => nokia
    [229] => nokia
    [230] => nokia
    [231] => nokia
    [232] => na
    [233] => na
    [234] => na
    [235] => na
    [236] => Nokia
    [237] => nokia
    [238] => Nokia
    [239] => Nokia
    [240] => nokia
    [241] => na
    [242] => nokia
    [243] => Nokia
    [244] => nokia
    [245] => Nokia
    [246] => nokia
    [247] => Nokia
    [248] => nokia
    [249] => Nokia
    [250] => nokia
    [251] => Nokia
    [252] => Nokia
    [253] => nokia
    [254] => Nokia
    [255] => Nokia
    [256] => nokia
    [257] => na
    [258] => nokia
    [259] => Nokia
    [260] => nokia
    [261] => Nokia
    [262] => nokia
    [263] => Nokia
    [264] => nokia
    [265] => Nokia
    [266] => nokia
    [267] => Nokia
    [268] => ndedA
    [269] => na
    [270] => Nokia
    [271] => Nokia
    [272] => Nokia
    [273] => nokia
    [274] => Nokia
    [275] => nokia
    [276] => na
    [277] => nokia
    [278] => nokia
    [279] => nokia
    [280] => Nokia
    [281] => na
    [282] => Nokia
    [283] => Nokia
    [284] => Nokia
    [285] => nokia
    [286] => Nokia
    [287] => nokia
    [288] => na
    [289] => nokia
    [290] => nokia
    [291] => nokia
    [292] => Nokia
    [293] => na
    [294] => Nokia
    [295] => Nokia
    [296] => Nokia
    [297] => nokia
    [298] => Nokia
    [299] => nokia
    [300] => na
    [301] => nokia
    [302] => nokia
    [303] => nokia
    [304] => Nokia
    [305] => nokia
    [306] => na
    [307] => na
    [308] => na
    [309] => ne_lumia
    [310] => Nokia
    [311] => nokia
    [312] => Nokia
    [313] => Nokia
    [314] => nokia
    [315] => na
    [316] => nokia
    [317] => Nokia
    [318] => na
    [319] => na
    [320] => na
    [321] => ne_lumia
    [322] => nokia
    [323] => nokia
    [324] => na
    [325] => nokia
    [326] => Nokia
    [327] => nnova
    [328] => nokia
    [329] => na
    [330] => na
    [331] => na
    [332] => ne_lumia
    [333] => na
    [334] => nokia
    [335] => na
    [336] => na
    [337] => nokia
    [338] => na
    [339] => nokia
    [340] => na
    [341] => na
    [342] => na
    [343] => na
    [344] => ne_lumia
    [345] => nokia
    [346] => na
    [347] => nokia
    [348] => na
    [349] => nokia
    [350] => na
    [351] => na
    [352] => na
    [353] => na
    [354] => na
    [355] => Nokia
    [356] => Nokia
    [357] => Nokia
    [358] => Nokia
    [359] => nnova
    [360] => nnova
    [361] => nnova
    [362] => nokia
    [363] => nokia
    [364] => nokia
    [365] => nokia
    [366] => na
    [367] => ncia
    [368] => nokia
    [369] => na
    [370] => ncia
    [371] => na
    [372] => nokia
    [373] => na
    [374] => na
    [375] => na
    [376] => nokia
    [377] => na
    [378] => na
    [379] => na
    [380] => nokia
    [381] => nokia
    [382] => Nokia
    [383] => na
    [384] => nda
    [385] => na
    [386] => nokia
    [387] => nventwithnokia
    [388] => nokia
    [389] => Nokia
    [390] => nta
    [391] => nta
    [392] => nokia
    [393] => na
    [394] => Nokia
    [395] => na
    [396] => na
    [397] => nokia
    [398] => nokia
    [399] => nokia
    [400] => nokia
)
Array
(
    [0] => maxim
    [1] => milylum
    [2] => mm
    [3] => mobileXhtm
    [4] => mobileXhtm
    [5] => mobileXhtm
    [6] => mobileXhtm
    [7] => mobileXhtm
    [8] => mobileXhtm
    [9] => mobileXhtm
    [10] => mobileXhtm
    [11] => mobileXhtm
    [12] => mobileXhtm
    [13] => mobileXhtm
    [14] => mobileXhtm
    [15] => managem
    [16] => Maxim
    [17] => Maxim
    [18] => Maxim
    [19] => Maxim
    [20] => mm
    [21] => mm
    [22] => mm
    [23] => mm
    [24] => mm
    [25] => mm
    [26] => mm
    [27] => mobileXhtm
    [28] => mobileXhtm
    [29] => mobileXhtm
    [30] => mobileXhtm
    [31] => mobileXhtm
    [32] => mobileXhtm
    [33] => mobileXhtm
    [34] => mobileXhtm
    [35] => milylum
)
Array
(
    [0] => m07AC
    [1] => m07AC
    [2] => mmendedAc
    [3] => magic
    [4] => mZYuf6c
    [5] => mZYuf6c
    [6] => Mic
    [7] => Music
    [8] => Music
    [9] => music
    [10] => music
    [11] => Music
    [12] => music
    [13] => Music
    [14] => music
    [15] => music
    [16] => music
    [17] => music
    [18] => music
    [19] => music
    [20] => Music
    [21] => Music
    [22] => Music
    [23] => mmendedAc
    [24] => matc
)