I need to parse a text file containing HTML-like tags, such as the following:
<item>
<value4="L5u9eDNV40_val4">
<value6="xcE90l2HyN_val6">
<value3="hJyVXoE4YQ_val3">
<value5="K68yGpDsTR_val5">
<value2="umrVvR8Tfe_val2">
<value1="y6Ms2E5BHe_val1">
</item>
<item>
<value4="T4PFOipm3u_val4">
<value2="upLkW2r8nq_val2">
<value3="3h7lV6CaHP_val3">
<value5="4pETv3bt5c_val5">
<value1="iEPZCnzxjs_val1">
<value6="fWjg1Ueo5M_val6">
</item>
I need to do this in PHP, and the result should be an array like this:
array (size=10000)
  0 => array (size=3)
    'value1' => string 'L5u9eDNV40_val4',
    'value2' => string 'umrVvR8Tfe_val2',
    'value4' => string 'T4PFOipm3u_val4'
I tried this with SimpleHTMLDOM, but I couldn't get anything useful out of it.
Answer 0 (score: 1)
Answer 1 (score: 0)
It isn't clear what final data structure you want, but this code builds an array of arrays, $v_arr, in which each sub-array holds the values from one <item>:
# $text is assumed to hold the contents of the input file
$v_arr = array();
# split the string up into an array with one <item> per array element
$items = explode("<item>", $text);
foreach ($items as $i) {
    # only parse entries that have <value... tags
    if (strpos($i, '<value') !== false) {
        # parse the value tags, save the matches in $matches
        if (preg_match_all('#<(value\d)="(.+?)">#', $i, $matches)) {
            # create a new array with valueX as keys, the other string as values.
            # push this array on to a results array
            $v_arr[] = array_combine( $matches[1], $matches[2] );
        }
    }
}
print_r($v_arr);
Output for the text you posted:
Array
(
    [0] => Array
        (
            [value4] => L5u9eDNV40_val4
            [value6] => xcE90l2HyN_val6
            [value3] => hJyVXoE4YQ_val3
            [value5] => K68yGpDsTR_val5
            [value2] => umrVvR8Tfe_val2
            [value1] => y6Ms2E5BHe_val1
        )
    [1] => Array
        (
            [value4] => T4PFOipm3u_val4
            [value2] => upLkW2r8nq_val2
            [value3] => 3h7lV6CaHP_val3
            [value5] => 4pETv3bt5c_val5
            [value1] => iEPZCnzxjs_val1
            [value6] => fWjg1Ueo5M_val6
        )
)
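The expected output in the question lists only value1, value2 and value4 for each item, so it may be that only a subset of the keys is wanted. The following is a minimal sketch of that variation, not part of the original answer: `parse_items` and its `$wanted` parameter are hypothetical names, the pattern is loosened to `\d+` on the assumption that multi-digit names such as value10 could occur, and `[^"]*` (unlike the `.+?` above) also accepts empty values.

```php
<?php
// Hypothetical helper: parse every <item> block and keep only the
// requested keys, in the order they are requested.
function parse_items(string $text, array $wanted): array
{
    $result = [];
    // Capture the body of each <item>...</item> block (s flag: dot matches newlines).
    preg_match_all('#<item>(.*?)</item>#s', $text, $blocks);
    foreach ($blocks[1] as $block) {
        // Collect tag names in $m[1] and quoted values in $m[2].
        if (!preg_match_all('#<(value\d+)="([^"]*)">#', $block, $m)) {
            continue; // no value tags in this block
        }
        $all = array_combine($m[1], $m[2]);
        $row = [];
        foreach ($wanted as $key) {
            // Skip keys that this item does not contain.
            if (array_key_exists($key, $all)) {
                $row[$key] = $all[$key];
            }
        }
        $result[] = $row;
    }
    return $result;
}

// Sample input: the first <item> from the question.
$text = <<<'TXT'
<item>
<value4="L5u9eDNV40_val4">
<value6="xcE90l2HyN_val6">
<value3="hJyVXoE4YQ_val3">
<value5="K68yGpDsTR_val5">
<value2="umrVvR8Tfe_val2">
<value1="y6Ms2E5BHe_val1">
</item>
TXT;

print_r(parse_items($text, ['value1', 'value2', 'value4']));
```

For the sample item this yields a single sub-array with value1, value2 and value4, in that order, regardless of the order in which the tags appear in the file.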