Python regex: match groups of the same length within a match, where the length can vary between matches

Time: 2015-10-19 17:40:10

Tags: python regex

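I want to match the sequence (G{x})([ACGT]{1,7})(G{x})([ACGT]{1,7})(G{x})([ACGT]{1,7})(G{x}), where x is a number between 2 and 5 that may vary from one match to the next but must be the same for every group within a single match. Is it possible to do this with a single regular expression?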

1 answer:

Answer 0: (score: 3)

You can use a backreference:

(G{2,5})([ACGT]{1,7})\1([ACGT]{1,7})\1([ACGT]{1,7})\1

Working example: https://regex101.com/r/yL5tE6/1

Note that this does allow a run of Gs in the input to be longer than the first group, because the [ACGT] groups on either side of \1 can also match G.
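As a rough illustration of the backreference approach in Python, here is a minimal sketch using the standard `re` module. The helper name `find_motifs` and the sample sequences are made up for this example and are not part of the original answer.

```python
import re

# Pattern from the answer: \1 must repeat exactly the text captured by
# group 1, so all four G-runs in one match have the same length (2 to 5).
PATTERN = re.compile(r"(G{2,5})([ACGT]{1,7})\1([ACGT]{1,7})\1([ACGT]{1,7})\1")


def find_motifs(seq):
    """Return (matched text, G-run length) for each non-overlapping match."""
    return [(m.group(0), len(m.group(1))) for m in PATTERN.finditer(seq)]


# Runs of two Gs.
print(find_motifs("TTGGAGGCGGTGGTT"))  # [('GGAGGCGGTGG', 2)]

# Runs of three Gs, matched by the same pattern.
print(find_motifs("GGGAGGGCGGGTGGG"))  # [('GGGAGGGCGGGTGGG', 3)]

# The caveat from the answer: the second run in the input has three Gs,
# but the pattern still matches with x = 2 because the first loop group
# absorbs the extra G.
m = PATTERN.search("GGAGGGCGGTGG")
print(m.group(1), m.group(2))  # GG AG
```

If runs must be exactly x long with no extra G next to them, the loop groups would need to be restricted further (for example, by forbidding them from starting or ending with G); that is beyond what the answer's pattern does.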