Sometime in the dark ages, a script was built that outputs the following HTML:
...
<TABLE BORDER=0 FRAME=ALL_FRAMES RULES=ALL_RULES ALIGN=CENTER BGCOLOR="ffffe5">
<CAPTION ALIGN=TOP>
<FONT COLOR=009594 SIZE=-1><B>Access Information</B></FONT>
</CAPTION>
<TR>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
<FONT COLOR=black SIZE=-1><B>Access Circuit(s):</B></FONT>
</TD>
<TD ALIGN=LEFT VALIGN=MIDDLE>
**DATA TO COLLECT 111**
</TD>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
<FONT COLOR=black SIZE=-1><B>Other Circuit(s):</B></FONT>
</TD>
<TD ALIGN=LEFT VALIGN=MIDDLE>
 
</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
 
</TD>
<TD ALIGN=LEFT VALIGN=MIDDLE>
**DATA TO COLLECT AAA**
</TD>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
 
</TD>
<TD ALIGN=LEFT VALIGN=MIDDLE>
 
</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
 
</TD>
<TD ALIGN=LEFT VALIGN=MIDDLE>
**DATA TO COLLECT BBB**
</TD>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
 
</TD>
<TD ALIGN=LEFT VALIGN=MIDDLE>
 
</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
 
</TD>
<TD ALIGN=LEFT VALIGN=MIDDLE>
**DATA TO COLLECT CCC**
</TD>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
 
</TD>
<TD ALIGN=LEFT VALIGN=MIDDLE>
 
</TD>
</TR>
<TR>
<TD ALIGN=RIGHT VALIGN=MIDDLE>
<FONT COLOR=black SIZE=-1><B>Customer:</B></FONT>
</TD>
...
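Sorry, I would show you the rendered table layout, but I don't know how to do that without <table> on SO.
How can I use XPath (in PHP) to collect only each of the DATA TO COLLECT sections? So far I have been able to retrieve the first row using //*[*='Access Circuit(s):']/following-sibling::td[1]
Notes: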
Answer 0 (score: 1)
The expression I came up with is:
//TR[(.//B[.='Access Circuit(s):']) or ((./preceding-sibling::TR//B[.='Access Circuit(s):']) and (./following-sibling::TR//B[.='Customer:']))]//TD[2]
which returns
<TD ALIGN="LEFT" VALIGN="MIDDLE">**DATA TO COLLECT 111**</TD>
<TD ALIGN="LEFT" VALIGN="MIDDLE">**DATA TO COLLECT AAA**</TD>
<TD ALIGN="LEFT" VALIGN="MIDDLE">**DATA TO COLLECT BBB**</TD>
<TD ALIGN="LEFT" VALIGN="MIDDLE">**DATA TO COLLECT CCC**</TD>
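It relies on knowing that the first row to include contains Access Circuit(s): and that the first row after the block contains Customer:. If you cannot be sure of either of these, then I don't think it can be done with a single XPath expression.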
Step-by-step
1. //TR[
2. (.//B[.="Access Circuit(s):"])
3. or ( (./preceding-sibling::TR//B[.="Access Circuit(s):"])
4. and (./following-sibling::TR//B[.="Customer:"]) )
5. ]//TD[2]
Means
1. all TR nodes
2. that either contain "Access Circuit(s):"
3. or are positioned after the row containing "Access Circuit(s):"
4. and are also positioned before the row containing "Customer:"
5. and, within those rows, every TD that is the second TD child of its parent
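Since the question asks about PHP, here is a minimal sketch of running the expression with DOMDocument and DOMXPath. The sample markup is a trimmed-down, hypothetical copy of the table (shortened cell values, an invented "ACME" customer cell). One assumption to check in your own setup: as far as I know, DOMDocument::loadHTML lowercases element names, so the sketch writes tr, td and b in lowercase rather than the uppercase names used above.

```php
<?php
// Hypothetical, shortened version of the table from the question.
$html = <<<'HTML'
<TABLE>
<TR><TD><FONT><B>Access Circuit(s):</B></FONT></TD><TD>DATA 111</TD><TD><FONT><B>Other Circuit(s):</B></FONT></TD><TD>&nbsp;</TD></TR>
<TR><TD>&nbsp;</TD><TD>DATA AAA</TD><TD>&nbsp;</TD><TD>&nbsp;</TD></TR>
<TR><TD>&nbsp;</TD><TD>DATA BBB</TD><TD>&nbsp;</TD><TD>&nbsp;</TD></TR>
<TR><TD><FONT><B>Customer:</B></FONT></TD><TD>ACME</TD></TR>
</TABLE>
HTML;

// Parse the legacy HTML; collect parser warnings instead of printing them.
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Same expression as in the answer, with lowercase element names
// (assumption: the HTML parser has lowercased TR/TD/B).
$query = "//tr[(.//b[.='Access Circuit(s):']) or "
       . "((./preceding-sibling::tr//b[.='Access Circuit(s):']) and "
       . "(./following-sibling::tr//b[.='Customer:']))]//td[2]";

$xpath = new DOMXPath($doc);

// Collect the text of every matched second cell.
$values = [];
foreach ($xpath->query($query) as $td) {
    $values[] = trim($td->textContent);
}

print_r($values); // DATA 111, DATA AAA, DATA BBB
```

The Customer: row itself is not matched, because it has no following sibling row containing "Customer:". If the query returns nothing on your real page, the element-name case is the first thing I would check.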