My product website has an HTML page, and I want to parse that document and get the product versions from it.
The HTML page looks like this:
<html>
.......
.......
<body>
.......
.......
<div id='version_info'>
<div class="product-version">
<div class="product-title">Name of the product 1:</div><div class="product-value">ver_123</div>
</div>
<div class="product-version">
<div class="product-title">Name of the product 2:</div><div class="product-value">ver_456</div>
</div>
<div class="product-version">
<div class="product-title">Name of the product 3:</div><div class="product-value">ver_845</div>
</div>
<div class="product-version">
<div class="product-title">Name of the product 4:</div><div class="product-value">ver_146</div>
</div>
</div>
.......
.......
</body>
.......
.......
</html>
How can I grep the document and build a string like this? productname1 = ver_123,productname2 = ver_456,productname3 = ver_845, and so on.
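Answer 0 (score: 1)
I worked with this specific HTML file. The playbook below collects the required values into a dictionary stored in the variable result.
Note:
1. Change the path to the HTML file in the playbook.
2. This playbook is tailored to this HTML sample. For further requirements or improvements, please provide the HTML.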
---
- hosts: localhost
  name: "Getting variables from HTML"
  vars:
    result: {}
  tasks:
    # Read the HTML file; each line ends up in search.stdout_lines.
    - name: "Getting content of the file"
      command: cat /path/to/html/file
      register: search
    # For every line containing a product-title div, strip the markup
    # to get the title (key) and the version (value), then merge the
    # pair into the result dictionary.
    - name: "Creating dictionary while Looping over file"
      ignore_errors: true
      vars:
        key: "{{item | replace('<div class=\"product-title\">','') | replace('</div>','') | regex_replace('<div.*','') | regex_replace('^\\s*','')}}"
        value: "{{item | replace('<div class=\"product-title\">','') | replace('</div>','') | regex_replace('^[\\w\\s\\:]*','') | replace('<div class=\"product-value\">','') | regex_replace('\\s*$','')}}"
      set_fact:
        result: "{{ result | combine( { key: value } ) }}"
      when: "'product-title' in item"
      with_items: "{{search.stdout_lines}}"
    - name: "Getting register"
      debug:
        msg: "{{result}}"
...
Output
ok: [localhost] => {
    "msg": {
        "Name of the product 1:": "ver_123",
        "Name of the product 2:": "ver_456",
        "Name of the product 3:": "ver_845",
        "Name of the product 4:": "ver_146"
    }
}
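The question asks for a single comma-separated string rather than a dictionary. As a rough sketch of that last step outside Ansible, the same title/value pairing can be tried in plain Python. This is not part of the playbook above. The names `PAIR`, `extract_versions` and `format_versions` are made up for illustration, and the pattern assumes each product-title div is directly followed by its product-value div on the same line, as in the sample HTML.

```python
import re

# Two entries copied from the question's sample markup.
HTML = """
<div id='version_info'>
<div class="product-version">
<div class="product-title">Name of the product 1:</div><div class="product-value">ver_123</div>
</div>
<div class="product-version">
<div class="product-title">Name of the product 2:</div><div class="product-value">ver_456</div>
</div>
</div>
"""

# Assumption: the title div is immediately followed by the value div.
# The optional ':' drops the trailing colon from the title.
PAIR = re.compile(
    r'<div class="product-title">\s*(.*?)\s*:?\s*</div>\s*'
    r'<div class="product-value">\s*(.*?)\s*</div>'
)


def extract_versions(html):
    """Return a dict mapping each product title to its version string."""
    return {title: value for title, value in PAIR.findall(html)}


def format_versions(versions):
    """Join the pairs as 'name = version', separated by commas."""
    return ",".join(f"{name} = {ver}" for name, ver in versions.items())


versions = extract_versions(HTML)
print(format_versions(versions))
```

A regex like this only holds up while the markup stays as regular as the sample. If the page layout varies, a real HTML parser is probably the safer choice.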