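I need to extract specific kinds of rules from a filter list and count how many rules fall into each category.
I am trying to extract the following types of rules from the filter list. The first rule pattern is:
/example.com$script,domain=example.com
The second, an exception rule, is:
@@/example.com$script,domain=example.com
The third, a rule with a domain anchor, is:
||example.com
The fourth, a rule with a domain anchor and a domain tag, is:
||jizz.best^$popup,domain=vivo.sx
The fifth is:
@@||pagead2.googlesyndication.com/pagead/js/adsbygoogle.js$script,domain=quebeccoupongratuit.com
The sixth, an element hiding rule restricted to a domain, is:
example.com###examplebanner
The seventh, an element hiding rule with no domain restriction, is:
###examplebanner
The eighth, an element hiding exception rule, is:
example.com#@##examplebanner
These are the different categories of rules that I have to count separately.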
with open('1-19-16anti-adblock-killer-filters.txt', 'r') as a:
    text = a.read()

line_starts_with_2pipes_no_domain = 0
line_starts_with_2pipes_with_domain = 0
line_starts_with_2ats_with_domain = 0
line_with_domain = 0

for line in text.split("\n"):
    if line.startswith("||"):
        if ",domain" in line:
            line_starts_with_2pipes_with_domain += 1
        else:
            line_starts_with_2pipes_no_domain += 1
    elif line.startswith("@@") and ",domain" in line:
        line_starts_with_2ats_with_domain += 1
    elif ",domain" in line:
        line_with_domain += 1
    elif line.strip():
        print(f"No idea what to do with :{line}")

print("2pipes_no_group", line_starts_with_2pipes_no_domain)
print("2pipes_with_group", line_starts_with_2pipes_with_domain)
print("2@_with_group", line_starts_with_2ats_with_domain)
print("line_with_domain", line_with_domain)
I am now trying to count the 5th, 6th, 7th and 8th kinds of rule as well. Thank you for your replies.
Answer 0 (score: 1)
Your regular expression does not allow for the "," before "domain=":
"\/[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+domain="
#                 ^^^^^^^^^^^^^ no "," allowed
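To see the mismatch concretely, here is a minimal sketch that runs the quoted pattern against the first example rule from the question. In that rule the `$script` option and the comma both sit between the host and `domain=`, and neither `$` nor `,` is in the character class, so the pattern cannot reach `domain=`. The `loose` pattern is only an illustrative assumption of one way to relax it, not the asker's code.

```python
import re

# The pattern quoted above, unchanged: the class before "domain=" has no "$" or ",".
strict = re.compile(r"\/[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+domain=")

# Hypothetical loosened variant: allow any characters before a literal ",domain=".
loose = re.compile(r"/[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+.*,domain=")

rule = "/example.com$script,domain=example.com"
print(bool(strict.search(rule)))            # False
print(bool(loose.search(rule)))             # True
print(bool(loose.search("||example.com")))  # False
```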
You can also simplify this a lot:
with open("easylist.txt") as f:
    print('Total rules with a domain tag =', f.read().count(",domain="))
This should tell you how often ',domain=' occurs. If the file is large, you can also count line by line:
domain_rule_count = 0
with open("easylist.txt") as f:
    for line in f:
        domain_rule_count += 1 if ",domain=" in line else 0
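One caveat with a plain substring count: in Adblock Plus style filter syntax the `domain=` option can also be the first option directly after `$`, and lines starting with `!` are comments. The sketch below is one possible stricter check under those assumptions. The helper name `has_domain_option` and the sample lines are hypothetical, chosen only for illustration.

```python
def has_domain_option(rule: str) -> bool:
    """Return True if the option list after the last '$' contains a domain= option."""
    rule = rule.strip()
    # Skip comment lines and rules that have no option list at all.
    if rule.startswith("!") or "$" not in rule:
        return False
    options = rule.rsplit("$", 1)[1]
    return any(opt.startswith("domain=") for opt in options.split(","))


# Hypothetical sample lines; in practice these would come from the filter list file.
lines = [
    "! a comment mentioning ,domain= should not count",
    "/example.com$script,domain=example.com",
    "||example.com^$domain=vivo.sx",
    "||example.com",
]
domain_rule_count = sum(has_domain_option(line) for line in lines)
print(domain_rule_count)  # 2
```

The same function can be used in the line-by-line loop by replacing the `",domain=" in line` test with `has_domain_option(line)`.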
Edit after the question in the comments: you can simply test for exactly what you need:
text = """ some text
/example.com $script,domain=example.com
@@/example.com $script,domain=example.com
||example.com
||jizz.best^$popup,domain=vivo.sx
"""

line_starts_with_2pipes_no_domain = 0
line_starts_with_2pipes_with_domain = 0
line_starts_with_2ats_with_domain = 0
line_with_domain = 0

for line in text.split("\n"):
    if line.startswith("||"):
        if ",domain" in line:
            line_starts_with_2pipes_with_domain += 1
        else:
            line_starts_with_2pipes_no_domain += 1
    elif line.startswith("@@") and ",domain" in line:
        line_starts_with_2ats_with_domain += 1
    elif ",domain" in line:
        line_with_domain += 1
    elif line.strip():
        print(f"No idea what to do with '{line}'")

print("2pipes_no_group", line_starts_with_2pipes_no_domain)
print("2pipes_with_group", line_starts_with_2pipes_with_domain)
print("2@_with_group", line_starts_with_2ats_with_domain)
print("line_with_domain", line_with_domain)
Output:
No idea what to do with ' some text'
2pipes_no_group 1
2pipes_with_group 1
2@_with_group 1
line_with_domain 1
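
The same "test for what you need" idea can be extended to the remaining categories from the question (the anchored exception rule and the three element hiding forms). The following is a rough sketch under a few assumptions: element hiding rules are recognised by the `##` and `#@#` separators, lines starting with `!` or `[` are skipped as comments or headers, and the label names are made up for illustration. A real filter list may contain rule forms that this simple ordering of checks does not handle.

```python
from collections import Counter


def classify(rule: str) -> str:
    """Assign a filter rule to one of the categories from the question."""
    rule = rule.strip()
    if not rule or rule.startswith(("!", "[")):
        return "skipped"
    # Element hiding: test the exception separator first,
    # because "#@##id" also contains "##".
    if "#@#" in rule:
        return "hide_exception"
    if "##" in rule:
        domains = rule.split("##", 1)[0]
        return "hide_with_domain" if domains else "hide_generic"
    has_domain = "$domain=" in rule or ",domain=" in rule
    if rule.startswith("@@"):
        if rule.startswith("@@||") and has_domain:
            return "exception_anchor_with_domain"
        return "exception_with_domain" if has_domain else "exception_no_domain"
    if rule.startswith("||"):
        return "anchor_with_domain" if has_domain else "anchor_no_domain"
    return "plain_with_domain" if has_domain else "other"


# The eight example rules from the question, one per category.
sample = """\
/example.com$script,domain=example.com
@@/example.com$script,domain=example.com
||example.com
||jizz.best^$popup,domain=vivo.sx
@@||pagead2.googlesyndication.com/pagead/js/adsbygoogle.js$script,domain=quebeccoupongratuit.com
example.com###examplebanner
###examplebanner
example.com#@##examplebanner
"""

counts = Counter(classify(line) for line in sample.splitlines())
for label, n in sorted(counts.items()):
    print(label, n)
```

Each of the eight sample rules lands in its own category, so every printed count is 1. To run it on a real list, iterate over the open file instead of `sample.splitlines()`.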