Question

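My product website has HTML pages, and I want to parse the document and extract the product versions from an HTML page.

The HTML page looks like this: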

<html>
.......
.......
<body>
.......
.......
<div id='version_info'>
    <div class="product-version">
        <div class="product-title">Name of the product 1:</div><div class="product-value">ver_123</div>
    </div>
    <div class="product-version">
        <div class="product-title">Name of the product 2:</div><div class="product-value">ver_456</div>
    </div>
    <div class="product-version">
        <div class="product-title">Name of the product 3:</div><div class="product-value">ver_845</div>
    </div>
    <div class="product-version">
        <div class="product-title">Name of the product 4:</div><div class="product-value">ver_146</div>
    </div>  
</div>
.......
.......
</body>
.......
.......
</html>

How can I grep the document and build a string like this? productname1 = ver_123, productname2 = ver_456, productname3 = ver_845, and so on.

Answer 1

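I worked through this specific HTML file, and as a result I got a dictionary of the required values in the variable result.

Note:

1. Please change the path to the HTML file in the playbook.

2. This particular playbook is tailored to this HTML sample. For further requirements and improvements, please provide the HTML.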

---
- hosts: localhost
  name: "Getting variables from HTML"
  vars:
    result: {}
  tasks:
    - name: "Getting content of the file"
      command: cat /path/to/html/file
      register: search

    - name: "Creating dictionary while Looping over file"
      ignore_errors: true
      vars:
        # key: the text of the product-title div, with leading whitespace removed
        key: "{{item | replace('<div class=\"product-title\">','') | replace('</div>','') | regex_replace('<div.*','') | regex_replace('^\\s*','')}}"
        # value: the text of the product-value div, with trailing whitespace removed
        value: "{{item | replace('<div class=\"product-title\">','') | replace('</div>','') | regex_replace('^[\\w\\s\\:]*','') | replace('<div class=\"product-value\">','') | regex_replace('\\s*$','')}}"
      set_fact:
        result: "{{ result | combine( { key: value } ) }}"
      when: "'product-title' in item"
      with_items: "{{search.stdout_lines}}"

    - name: "Getting register"
      debug:
        msg: "{{result}}"
...
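To make the key and value expressions easier to follow, here is a rough Python sketch of what the filter chain does to a single line. It is not part of the playbook; the function name parse_line and the sample variable are made up for illustration, and it assumes that the title and value divs sit on the same line, as in the sample HTML.

```python
import re


def parse_line(line):
    """Return (key, value) for one product-title line, mirroring the filter chain."""
    # Shared first steps: drop the opening title tag and every closing </div>.
    stripped = line.replace('<div class="product-title">', '').replace('</div>', '')

    # key: cut everything from the next "<div" onward, then trim leading whitespace.
    key = re.sub(r'<div.*', '', stripped)
    key = re.sub(r'^\s*', '', key)

    # value: drop the leading run of word characters, whitespace and colons
    # (the title text), then the opening value tag, then trailing whitespace.
    value = re.sub(r'^[\w\s:]*', '', stripped)
    value = value.replace('<div class="product-value">', '')
    value = re.sub(r'\s*$', '', value)
    return key, value


# One line copied from the sample HTML in the question.
sample = ('        <div class="product-title">Name of the product 1:</div>'
          '<div class="product-value">ver_123</div>')
print(parse_line(sample))  # ('Name of the product 1:', 'ver_123')
```

Because the logic is line-oriented, a page that puts the two divs on separate lines would need a different approach.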

Output of the playbook

ok: [localhost] => {
    "msg": {
        "Name of the product 1:": "ver_123",
        "Name of the product 2:": "ver_456",
        "Name of the product 3:": "ver_845",
        "Name of the product 4:": "ver_146"
    }
}
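The question asks for a single string of "name = version" pairs rather than a dictionary. As a minimal sketch, assuming the dictionary has the shape shown in the output above, the pairs could be joined like this in Python. The helper name format_versions is hypothetical, and stripping the trailing colon from each title is my own choice, not something the playbook does.

```python
# Dictionary copied from the playbook output above.
result = {
    "Name of the product 1:": "ver_123",
    "Name of the product 2:": "ver_456",
    "Name of the product 3:": "ver_845",
    "Name of the product 4:": "ver_146",
}


def format_versions(versions):
    """Join a {title: version} dict into 'title = version, title = version, ...'."""
    pairs = []
    for name, version in versions.items():
        # Remove surrounding whitespace and the trailing colon from the title.
        clean_name = name.strip().rstrip(":")
        pairs.append(f"{clean_name} = {version}")
    return ", ".join(pairs)


print(format_versions(result))
```

Dictionaries keep insertion order in Python 3.7 and later, so the pairs come out in the order they appear in the page.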

Parsing and grepping HTML in Ansible
