Question

Good day,

How can I print the text of an HTML element using WWW::Mechanize::Firefox?

I have tried:

    print $_->text, '/n' for $mech->selector('td.dataCell');

    print $_->text(), '/n' for $mech->selector('td.dataCell');


    print $_->{text}, '/n' for $mech->selector('td.dataCell');

    print $_->content, '/n' for $mech->selector('td.dataCell');

Keep in mind that I don't want {innerHTML}, although that does work.

    print $_->{text}, '/n' for $mech->selector('td.dataCell');

The line above does run, but the output is nothing but a series of /n.

Answer 1

    my $node = $mech->xpath('//td[@class="dataCell"]/text()');

    print $node->{nodeValue};

Note that if you are retrieving text that is interspersed with other tags, such as "Test_1" and "Test_3" in this example...

    <html>
      <body>
        <form name="input" action="demo_form_action.asp" method="get">
          <input name="testRadioButton" value="test 1" type="radio">Test_1<br>
          <input name="testRadioButton" value="test 3" type="radio">Test_3<br>
          <input value="Submit" type="submit">
        </form>
      </body>
    </html>

...you need to refer to them by their position in the markup (taking any line breaks into account):

    $node = $mech->xpath("//form/text()[2]", single => 1);

    print $node->{nodeValue};

This prints "Test_1".
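
To illustrate why the index is 2 rather than 1, here is a minimal sketch that runs without a browser. It assumes the text-node children of the form arrive in document order, and it models them as plain hash refs with a nodeValue key. The helper name non_blank_values and the stand-in data are hypothetical, not part of WWW::Mechanize::Firefox.

```perl
use strict;
use warnings;

# Hypothetical helper: return the nodeValue of every text node that
# contains something other than whitespace.
sub non_blank_values {
    my @nodes = @_;
    return grep { /\S/ } map { $_->{nodeValue} // '' } @nodes;
}

# Stand-ins for the text-node children of <form> in the markup above.
# The whitespace-only entries are the line breaks and indentation
# between tags; they count as text nodes too.
my @text_nodes = map { +{ nodeValue => $_ } } (
    "\n      ",   # text()[1]: line break after <form ...>
    'Test_1',     # text()[2]
    "\n      ",   # text()[3]: line break after the first <br>
    'Test_3',     # text()[4]
    "\n      ",   # text()[5]: line break after the second <br>
    "\n    ",     # text()[6]: line break before </form>
);

print "$_\n" for non_blank_values(@text_nodes);
```

With a live session, the same helper could presumably be fed the list returned by $mech->xpath('//form/text()') in list context, which avoids counting positions by hand.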

Answer 2

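I would use:

    print $_->{nodeValue}, "\n" for $mech->xpath('//td[@class="dataCell"]/text()');

using an XPath expression; the text of each returned node is read from nodeValue.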

Answer 3

The only solution I have is to use:

    my $element = $mech->selector('td.dataCell');

    my $string = $element->{innerHTML};

and then format the HTML inside each dataCell.

Answer 4

Either:

    $element->{textContent};

or

    $element->{innerText};

will work.
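
As a rough sketch of how this fits the loop from the question: the attempts there print '/n', which in single quotes is a literal slash followed by n, whereas a newline is written "\n" in double quotes. The example below reads textContent from each cell and prints one cell per line. It uses plain hash refs as stand-ins for DOM elements so that it runs without a browser. The helper name cell_texts and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the trimmed textContent of each cell,
# skipping any node that has no textContent at all.
sub cell_texts {
    my @cells = @_;
    my @texts;
    for my $cell (@cells) {
        my $text = $cell->{textContent};
        next unless defined $text;
        $text =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        push @texts, $text;
    }
    return @texts;
}

# Stand-ins for the elements that $mech->selector('td.dataCell')
# would return in list context.
my @cells = (
    { textContent => " first cell\n" },
    { textContent => 'second cell' },
);

# Double quotes, so "\n" is a real newline.
print "$_\n" for cell_texts(@cells);

# With a live browser session this would be called roughly as
# (not exercised here):
#   print "$_\n" for cell_texts($mech->selector('td.dataCell'));
```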

How can WWW::Mechanize::Firefox extract the text inside an HTML element's tags?
