Using RE to insert conditional values into a dict from its keys

Time: 2018-08-13 14:30:49

Tags: python regex python-3.x dictionary dictionary-comprehension

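My problem is simple. My conditions are:

  1. If the key contains "__", insert 'class=\"\" ' into the opening tag of the <tag></tag> value so that it becomes <tag class=""></tag>, then insert the values that follow "__" in the key
  2. Otherwise, leave <tag></tag> as it is

I have done 95% of the work, but I am missing one step. Here are my dictionary and code: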
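
The key convention behind these two rules can be sketched as a small helper. The name `split_key` is hypothetical (it does not appear in the original code), and the sketch assumes that "__" separates the tag from the class names and that a single "_" separates the class names from each other.

```python
def split_key(key):
    """Split a key such as 'h1__mt-5_kakabum' into ('h1', ['mt-5', 'kakabum'])."""
    # partition() returns an empty separator when '__' is absent.
    tag, sep, rest = key.partition('__')
    # Keep only non-empty class names; a plain key yields an empty list.
    classes = [c for c in rest.split('_') if c] if sep else []
    return tag, classes


plain = split_key('h1')
nested = split_key('h1__display-3_text-white_text-center')
print(plain)
print(nested)
```

A key with no class names (rule 2) gives an empty list, so the template can be returned untouched in that case.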

import json
import re

d1 = {
  "h1": "<h1 class=\"my-heading\">[]</h1>",
  "h1__display-3": "<h1 class=\" \">[]</h1>",
  "h1__display-3_text-white_text-center": "<h1 class=\" \">[]</h1>",
  "h1__mt-4": "<h1 class=\"\">[]</h1>",
  "h1__mt-5": "<h1 class=\" \">[]</h1>",
  "h1__mt-5_kakabum": "<h1 class=\" \">[]</h1>",
  "h1__my-4": "<h1 class=\" \">[]</h1>",
  "h2": "<h2>[]</h2>",
  "h2__card-title": "<h2 class=\"\">[]</h2>",
  "h2__mt-4": "<h2 class=\"\">[]</h2>",
  "h2__my-4": "<h2 class=\"\">[]</h2>"
}

d2 = {k:k for k, v in d1.items()}
res = {}
for k in d2:
    key = k.split('__')[0]
    if key in d1:
        res[k] = d1[key]
        res2 = {k: re.sub(r'class="([\w\- ]+)"', r'class="\1 ' + d2[k] + '"', v) for k,v in res.items()}
        res3 = {k: re.sub(r'\w\w\__', r'', v).replace('_', ' ') for k, v in res2.items()}

print('Res', json.dumps(res3, indent= 2))

With the above, I get this unwanted result:

 {
  "h1": "<h1 class=\"my-heading h1\">[]</h1>", # notice the unnecessary 'h1' in class
  "h1__display-3": "<h1 class=\"my-heading display-3\">[]</h1>",
  "h1__display-3_text-white_text-center": "<h1 class=\"my-heading display-3 text-white text-center\">[]</h1>",
  "h1__mt-4": "<h1 class=\"my-heading mt-4\">[]</h1>",
  "h1__mt-5": "<h1 class=\"my-heading mt-5\">[]</h1>",
  "h1__mt-5_kakabum": "<h1 class=\"my-heading mt-5 kakabum\">[]</h1>",
  "h1__my-4": "<h1 class=\"my-heading my-4\">[]</h1>",
  "h2": "<h2>[]</h2>",
  "h2__card-title": "<h2>[]</h2>",
  "h2__mt-4": "<h2>[]</h2>",
  "h2__my-4": "<h2>[]</h2>"
}

instead of the desired:

 {
  "h1": "<h1 class=\"my-heading\">[]</h1>", 
  "h1__display-3": "<h1 class=\"my-heading display-3\">[]</h1>",
  "h1__display-3_text-white_text-center": "<h1 class=\"my-heading display-3 text-white text-center\">[]</h1>",
  "h1__mt-4": "<h1 class=\"my-heading mt-4\">[]</h1>",
  "h1__mt-5": "<h1 class=\"my-heading mt-5\">[]</h1>",
  "h1__mt-5_kakabum": "<h1 class=\"my-heading mt-5 kakabum\">[]</h1>",
  "h1__my-4": "<h1 class=\"my-heading my-4\">[]</h1>",
  "h2": "<h2>[]</h2>", 
  "h2__card-title": "<h2 class=\"card-title\">[]</h2>", 
  "h2__mt-4": "<h2 class=\"mt-4\">[]</h2>",
  "h2__my-4": "<h2 class=\"my-4\">[]</h2>"
}

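^ Note: the "h2": "<h2>[]</h2>" entry in d1 has no class attribute, but because the key contains '__', a class attribute has been inserted: "h2__card-title": "<h2 class=\"card-title\">[]</h2>"

I am missing a step between res2 and res3. Can anyone help me?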

1 answer:

Answer 0 (score: 0)

Well, I solved it by reading the Python regex documentation in detail, since regex is my weak spot: https://docs.python.org/3.3/howto/regex.html
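
One likely reason the h2 entries came out unchanged in the question: `re.sub` returns the input string as is when the pattern finds no match, and `<h2>[]</h2>` has no class attribute for the pattern to match. A minimal check, using the class pattern from the question (the `mt-4` value is only an illustrative class name):

```python
import re

class_regex = r'class="([\w\- ]+)"'

with_class = '<h1 class="my-heading">[]</h1>'
without_class = '<h2>[]</h2>'

# The pattern matches, so the extra class is appended.
out_with = re.sub(class_regex, r'class="\1 mt-4"', with_class)
# No class attribute to match, so the string comes back unchanged.
out_without = re.sub(class_regex, r'class="\1 mt-4"', without_class)

print(out_with)
print(out_without)
```

So a template without a class attribute needs a separate step that inserts one into the opening tag.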

The solution is as follows (using the same d1 as the OP):

import json
import re

# d1 is the dictionary from the question.
# Reminder: re.sub(pattern, repl, string, count=0, flags=0)
class_regex = r'class="([\w\- ]+)"'
underscore_regex = r'\w+__'

# Give every key the template of its base tag (the part before '__').
mydict = {}
for k in d1:
    key = k.split('__')[0]
    if key in d1:
        mydict[k] = d1[key]

# Template already has a class attribute: append the key, but only when it contains '__'.
res2 = {k: re.sub(class_regex, r'class="\1' + (' ' + k if re.search(underscore_regex, k) else '') + '"', v) for k, v in mydict.items()}
# Template has no class attribute: insert one into the opening tag (left empty for plain keys).
res3 = {k: (v if re.search(underscore_regex, v) or re.search(class_regex, v) else re.sub(r'>', ' class="' + (k if re.search(underscore_regex, k) else '') + '">', v, count=1)) for k, v in res2.items()}
# Drop the 'tag__' prefix, then turn the remaining '_' separators into spaces.
res4 = {k: re.sub(r'\w+?__', '', v).replace('_', ' ') for k, v in res3.items()}
# Remove the empty class attribute left on plain keys.
res5 = {k: v.replace(' class=""', '') for k, v in res4.items()}

print(json.dumps(res5, indent=2, sort_keys=True))

This produces the correct output:

 {
  "h1": "<h1 class=\"my-heading\">[]</h1>", 
  "h1__display-3": "<h1 class=\"my-heading display-3\">[]</h1>",
  "h1__display-3_text-white_text-center": "<h1 class=\"my-heading display-3 text-white text-center\">[]</h1>",
  "h1__mt-4": "<h1 class=\"my-heading mt-4\">[]</h1>",
  "h1__mt-5": "<h1 class=\"my-heading mt-5\">[]</h1>",
  "h1__mt-5_kakabum": "<h1 class=\"my-heading mt-5 kakabum\">[]</h1>",
  "h1__my-4": "<h1 class=\"my-heading my-4\">[]</h1>",
  "h2": "<h2>[]</h2>", 
  "h2__card-title": "<h2 class=\"card-title\">[]</h2>", 
  "h2__mt-4": "<h2 class=\"mt-4\">[]</h2>",
  "h2__my-4": "<h2 class=\"my-4\">[]</h2>"
}
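
As an alternative to chaining several substitutions, the same result can be reached by rewriting the opening tag once with a replacement function. This is only a sketch under stated assumptions: `add_classes` and `build` are hypothetical names, each template is assumed to begin with a single opening tag, and a shortened d1 is used to keep the example small.

```python
import json
import re


def add_classes(template, key):
    """Return template with the class names encoded in key added to its opening tag."""
    _, sep, rest = key.partition('__')
    extra = [c for c in rest.split('_') if c] if sep else []
    if not extra:
        return template  # plain key: leave the template as it is

    def repl(match):
        tag, attrs = match.group(1), match.group(2)
        existing = re.search(r'class="([^"]*)"', attrs)
        if existing:
            # Merge with the classes already present.
            classes = existing.group(1).split() + extra
            attrs = (attrs[:existing.start()]
                     + 'class="' + ' '.join(classes) + '"'
                     + attrs[existing.end():])
        else:
            # No class attribute yet: add one.
            attrs += ' class="' + ' '.join(extra) + '"'
        return '<' + tag + attrs + '>'

    # Only the first opening tag is rewritten.
    return re.sub(r'<(\w+)([^>]*)>', repl, template, count=1)


def build(templates):
    """Apply add_classes to every key whose base tag exists in templates."""
    result = {}
    for key in templates:
        base = key.split('__')[0]
        if base in templates:
            result[key] = add_classes(templates[base], key)
    return result


# Shortened version of d1 from the question (assumption: same conventions).
d1 = {
    "h1": '<h1 class="my-heading">[]</h1>',
    "h1__mt-5_kakabum": '<h1 class=" ">[]</h1>',
    "h2": "<h2>[]</h2>",
    "h2__card-title": '<h2 class="">[]</h2>',
}

result = build(d1)
print(json.dumps(result, indent=2, sort_keys=True))
```

Working on the parsed key rather than on the substituted string avoids the later clean-up passes for the tag prefix and the empty class attribute.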