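I am trying to extract a specific value from a web page so that I can pull it into a Google Sheets spreadsheet. The problem is that the page's structure does not make the value easy to pull.
Given the HTML below, can anyone suggest a way to pull "$4,586" from the TD element that follows the one containing "Prop Taxes"? There are many TDs on the page with the class "d97m50", and also many tables with the class "d97m2".
I tried two approaches but could not get either to work. For the first, I could not work out how to iterate over the TDs on the page, find the TD after the one containing "Prop Taxes", and extract its text. The second failed because I could not come up with a regular expression that does the same thing.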
<TABLE class="d97m2" cellSpacing=0 cellPadding=0 sizset="false" sizcache06358115873960983="276 82 150">
<!-- A bunch of other rows -->
<TR>
<TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>
<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$4,586</span></TD>
<TD class="d97m43"><span class="label d97m29">Garbage:</SPAN></TD>
<TD class="d97m26"><SPAN class="wrapped-field">$0</span></TD>
<TD class="d97m44"><span class="label">Parking Inc:</SPAN></TD>
<TD class="d97m45"><SPAN class="wrapped-field">$0</span></TD>
<TD class="d97m46"><span class="label">TOE:</SPAN></TD>
<TD class="d97m47"><SPAN class="wrapped-field">$10,248</span></TD></TR>
<TR>
<!-- a bunch more rows -->
</TABLE>
Answer 0 (score: 0)
A fairly simple way to pull a table is to use the IMPORTHTML function in Google Sheets, for example:
=importhtml("http://www.tradingeconomics.com/zambia/rating","table",1)
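Once a table has been imported, the remaining step from the question is picking out the cell that follows the "Prop Taxes" label. The sketch below does that on a 2D array of cell values, which is the shape a sheet range has when it is passed to script code. It assumes the imported rows keep each label and its value in adjacent cells; how IMPORTHTML lays out this particular page (including the colSpan cell) has not been verified. `valueAfterLabel` and `sampleRows` are hypothetical names used only for illustration.

```javascript
/**
 * Returns the cell that follows the first cell whose text contains `label`.
 * `rows` is a 2D array of cell values, one inner array per table row.
 */
function valueAfterLabel(rows, label) {
  for (const row of rows) {
    // Stop one short of the end so row[i + 1] always exists.
    for (let i = 0; i < row.length - 1; i++) {
      if (String(row[i]).indexOf(label) !== -1) {
        return row[i + 1];
      }
    }
  }
  return null; // label not found, or it was the last cell in its row
}

// Hypothetical rows mirroring the label/value layout of the question's table.
const sampleRows = [
  ['Prop Taxes:', '$4,586', 'Garbage:', '$0'],
  ['Parking Inc:', '$0', 'TOE:', '$10,248']
];
console.log(valueAfterLabel(sampleRows, 'Prop Taxes'));
```

Matching on the label text rather than on a class name sidesteps the problem that many cells on the page share the class "d97m50".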
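Answer 1 (score: 0)
If you can get the HTML you want to process as a JavaScript String object, you can use a regular expression to identify the specific string you are after.
For example, given the test text: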
<TABLE class="d97m2" cellSpacing=0 cellPadding=0 sizset="false" sizcache06358115873960983="276 82 150">
<!-- A bunch of other rows -->
<TR>
<TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>
<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$4,586</span></TD>
<TD class="d97m43"><span class="label d97m29">Garbage:</SPAN></TD>
<TD class="d97m26"><SPAN class="wrapped-field">$0</span></TD>
<TD class="d97m44"><span class="label">Parking Inc:</SPAN></TD>
<TD class="d97m45"><SPAN class="wrapped-field">$0</span></TD>
<TD class="d97m46"><span class="label">TOE:</SPAN></TD>
<TD class="d97m47"><SPAN class="wrapped-field">$10,248</span></TD></TR>
<TR>
<!-- a bunch more rows -->
</TABLE>
the following regular expression:
/.*?Prop\sTaxes(.|\s)*?d97m50.*?\$(.*?)<\/span/mg
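will yield the value "4,586" in its second capture group, which you can then process however you like.
Here is a sample answer showing how to get multiple matches and process them: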
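As a minimal sketch of reading that capture group, the snippet below runs the same pattern once (without the `g` flag) against a shortened, hypothetical two-line sample rather than the full page. The variable names are illustrative only.

```javascript
// Hypothetical two-cell sample; the real page contains many more rows.
const sampleHtml =
  '<TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>\n' +
  '<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$4,586</span></TD>';

const propTaxRegex = /.*?Prop\sTaxes(.|\s)*?d97m50.*?\$(.*?)<\/span/m;
const result = propTaxRegex.exec(sampleHtml);

// result[0] is the whole matched text, result[1] is the last character
// consumed by the repeated (.|\s) group, and result[2] is the amount
// without the dollar sign.
const propTaxes = result ? result[2] : null;
console.log(propTaxes);
```

Group 1 exists only because `(.|\s)` is written with capturing parentheses, which is why the amount ends up in group 2 rather than group 1.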
Javascript Regular Expression multiple match
This code works for me:
function regExTest() {
var s = '<TABLE class="d97m2" cellSpacing=0 cellPadding=0 sizset="false" sizcache06358115873960983="276 82 150">' +
'<!-- A bunch of other rows -->' +
'<TR>' +
'<TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>' +
'<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$1,986</span></TD>' +
'<TD class="d97m43"><span class="label d97m29">Garbage:</SPAN></TD>' +
'<TD class="d97m26"><SPAN class="wrapped-field">$0</span></TD>' +
'<TD class="d97m44"><span class="label">Parking Inc:</SPAN></TD>' +
'<TD class="d97m45"><SPAN class="wrapped-field">$0</span></TD>' +
'<TD class="d97m46"><span class="label">TOE:</SPAN></TD>' +
'<TD class="d97m47"><SPAN class="wrapped-field">$10,248</span></TD></TR>' +
'<TR>' +
'<TR>' +
'<TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>' +
'<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$4,586</span></TD>' +
'<TD class="d97m43"><span class="label d97m29">Garbage:</SPAN></TD>' +
'<TD class="d97m26"><SPAN class="wrapped-field">$0</span></TD>' +
'<TD class="d97m44"><span class="label">Parking Inc:</SPAN></TD>' +
'<TD class="d97m45"><SPAN class="wrapped-field">$0</span></TD>' +
'<TD class="d97m46"><span class="label">TOE:</SPAN></TD>' +
'<TD class="d97m47"><SPAN class="wrapped-field">$10,248</span></TD></TR>' +
'<TR>' +
'<TR>' +
'<TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>' +
'<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$2,514</span></TD>' +
'<TD class="d97m43"><span class="label d97m29">Garbage:</SPAN></TD>' +
'<TD class="d97m26"><SPAN class="wrapped-field">$0</span></TD>' +
'<TD class="d97m44"><span class="label">Parking Inc:</SPAN></TD>' +
'<TD class="d97m45"><SPAN class="wrapped-field">$0</span></TD>' +
'<TD class="d97m46"><span class="label">TOE:</SPAN></TD>' +
'<TD class="d97m47"><SPAN class="wrapped-field">$10,248</span></TD></TR>' +
'<TR>' +
'<TR>' +
'<TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>' +
'<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$3,312</span></TD>' +
'<TD class="d97m43"><span class="label d97m29">Garbage:</SPAN></TD>' +
'<TD class="d97m26"><SPAN class="wrapped-field">$0</span></TD>' +
'<TD class="d97m44"><span class="label">Parking Inc:</SPAN></TD>' +
'<TD class="d97m45"><SPAN class="wrapped-field">$0</span></TD>' +
'<TD class="d97m46"><span class="label">TOE:</SPAN></TD>' +
'<TD class="d97m47"><SPAN class="wrapped-field">$10,248</span></TD></TR>' +
'<TR>' +
'<!-- a bunch more rows -->' +
'</TABLE>';
var qualityRegex = /.*?Prop\sTaxes(.|\s)*?d97m50.*?\$(.*?)<\/span/mg,
matches = [];
var match = qualityRegex.exec(s);
while (match != null) {
matches.push(match[2]);
match = qualityRegex.exec(s);
}
// matches now holds the captured amounts: ["1,986", "4,586", "2,514", "3,312"]
return matches;
}
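A possible variation, offered as a sketch rather than as part of the answer above: the pattern can be tied to the cell that immediately follows the label instead of to the class "d97m50", and the label can be passed in as a parameter. This assumes the label and value are each wrapped in a span inside adjacent TD elements, as in the sample, and that the value starts with a dollar sign. `escapeRegExp`, `amountsAfterLabel`, and `listingHtml` are hypothetical names.

```javascript
// Escapes characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns, as numbers, every dollar amount found in the cell directly after
// a cell whose label contains `label`. Tag names are matched
// case-insensitively because the sample mixes <span> and </SPAN>.
function amountsAfterLabel(html, label) {
  const pattern = new RegExp(
    escapeRegExp(label) +
      '[^<]*</span>\\s*</td>\\s*<td[^>]*>\\s*<span[^>]*>\\s*\\$([\\d,]+(?:\\.\\d+)?)',
    'gi'
  );
  const amounts = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // Drop thousands separators before converting to a number.
    amounts.push(Number(match[1].replace(/,/g, '')));
  }
  return amounts;
}

// Hypothetical two-row excerpt in the same shape as the question's HTML.
const listingHtml =
  '<TR><TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>\n' +
  '<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$4,586</span></TD>\n' +
  '<TD class="d97m43"><span class="label d97m29">Garbage:</SPAN></TD>\n' +
  '<TD class="d97m26"><SPAN class="wrapped-field">$0</span></TD></TR>\n' +
  '<TR><TD class="d97m40"><span class="label">Prop Taxes:</SPAN></TD>\n' +
  '<TD class="d97m50" colSpan=2><SPAN class="wrapped-field">$2,514</span></TD></TR>';

console.log(amountsAfterLabel(listingHtml, 'Prop Taxes'));
```

In Google Apps Script the HTML string would typically come from something like `UrlFetchApp.fetch(url).getContentText()`; that step is left out here so the sketch stays self-contained. Regular expressions remain brittle against markup changes, so the pattern would need adjusting if the page's cell structure differs from the sample.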