Question

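I am trying to work with Slack's API, which sends user names as strings such as: <@UCH65RHRC>

So the text in the API's JSON body may contain several of these patterns on a single line, for example:

"Hi <@UCH65RHRC> and <@UCH65RHRF>, thanks for all the great work!"

How can I use a regular expression in Python to find all matching strings with this predefined pattern, i.e. <@#########>, where each # (9 in total) can be 0-9 or A-Z?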

Answer 1

This is a fairly simple task. A regular expression should do what you need. For example:

<@([0-9A-Z]{9})>

It can be used like this, printing each captured ID on its own line:

import re

body = "Hi <@UCH65RHRC> and <@UCH65RHRF>, thanks for all the great work!"
# findall returns only the captured group, i.e. the ID without "<@" and ">"
id_search = re.findall(r"<@([0-9A-Z]{9})>", body)

for user_id in id_search:
    print(user_id)
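
If the same pattern is needed in more than one place, it may help to compile it once and wrap it in a small function. The sketch below is only an illustration: the names MENTION_RE and extract_user_ids are hypothetical, and it assumes, as the question states, that every ID is exactly 9 characters drawn from 0-9 and A-Z.

```python
import re

# Hypothetical helper, not part of the original answer.
# Assumption (from the question): IDs are exactly 9 characters, each 0-9 or A-Z.
MENTION_RE = re.compile(r"<@([0-9A-Z]{9})>")


def extract_user_ids(text):
    """Return the IDs of all <@XXXXXXXXX> mentions in text, in order of appearance."""
    return MENTION_RE.findall(text)


body = "Hi <@UCH65RHRC> and <@UCH65RHRF>, thanks for all the great work!"
print(extract_user_ids(body))  # ['UCH65RHRC', 'UCH65RHRF']
```

Text with no mentions yields an empty list, so the result can be looped over without a separate check. If the IDs you receive turn out to have a different length, the {9} quantifier is the part to adjust.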
