I want to get some letters using a regular expression

Date: 2019-05-16 08:34:18

Tags: regex numbers

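As the title says, I want to extract some values using a regular expression, but I don't know how to do it. This is what I have tried so far: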

re.findall("\d*\.?\d+[^Successful 50/50s]", a)

'Defence\nClean sheets\n53\nGoals conceded\n118\nTackles\n186\nTackle success %\n75%\nLast man tackles\n2\nBlocked shots\n24\nInterceptions\n151\nClearances\n805\nHeaded Clearance\n380\nClearances off line\n3\nRecoveries\n666\nDuels won\n435\nDuels lost\n330\nSuccessful 50/50s\n25\nAerial battles won\n206\nAerial battles lost\n193\nOwn goals\n1\nErrors leading to goal\n1Team Play\nAssists\n2\nPasses\n7,979\nPasses per match\n56.19\nBig chances created\n3\nCrosses\n48\nCross accuracy %\n25%\nThrough balls\n10\nAccurate long balls\n936Discipline\nYellow cards\n13\nRed cards\n0\nFouls\n48\nOffsides\n2Attack\nGoals\n6\nHeaded goals\n4\nGoals with right foot\n1\nGoals with left foot\n1\nHit woodwork\n3'

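I only want to get the numbers, including floats and percentages, but not the ones in "Successful 50/50s". I also want to keep thousands separators, as in 7,979.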

1 Answer:

Answer 0: (score: 0)

You can use this regex, which matches every number except those with a slash directly before or after them (such as the numbers in 50/50):

(?<!/)\d*(?:,\d+)*\.?\d+\b(?!/)
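
To make the individual parts of the pattern easier to follow, here is a sketch of the same regex written with `re.VERBOSE`, with one comment per component. The name `NUMBER_RE` and the shortened sample string are assumptions made for illustration only; they do not come from the question.

```python
import re

# Same pattern as above, spread out with re.VERBOSE so each part can be commented.
# NUMBER_RE is a hypothetical name used only for this illustration.
NUMBER_RE = re.compile(r"""
    (?<!/)      # not directly preceded by a slash (rejects the second 50 in 50/50s)
    \d*         # optional leading digits
    (?:,\d+)*   # optional thousands groups such as ,979
    \.?         # optional decimal point
    \d+         # at least one digit
    \b          # must end at a word boundary (so the 50 in 50s does not match)
    (?!/)       # not directly followed by a slash (rejects the first 50 in 50/50s)
""", re.VERBOSE)

# Shortened sample built from a few entries of the data in the question.
sample = "Passes\n7,979\nPasses per match\n56.19\nSuccessful 50/50s\n25\nTackle success %\n75%"
print(NUMBER_RE.findall(sample))
```

The two lookarounds handle the slash on either side, while `\b` is what stops the trailing `50` from matching by starting at its second digit, because that digit is followed by the letter `s`.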


Your updated Python code:

import re

s = '''Defence\nClean sheets\n53\nGoals conceded\n118\nTackles\n186\nTackle success %\n75%\nLast man tackles\n2\nBlocked shots\n24\nInterceptions\n151\nClearances\n805\nHeaded Clearance\n380\nClearances off line\n3\nRecoveries\n666\nDuels won\n435\nDuels lost\n330\nSuccessful 50/50s\n25\nAerial battles won\n206\nAerial battles lost\n193\nOwn goals\n1\nErrors leading to goal\n1',
 'Team Play\nAssists\n2\nPasses\n7,979\nPasses per match\n56.19\nBig chances created\n3\nCrosses\n48\nCross accuracy %\n25%\nThrough balls\n10\nAccurate long balls\n936',
 'Discipline\nYellow cards\n13\nRed cards\n0\nFouls\n48\nOffsides\n2',
 'Attack\nGoals\n6\nHeaded goals\n4\nGoals with right foot\n1\nGoals with left foot\n1\nHit woodwork\n3'''

print(re.findall(r'(?<!/)\d*(?:,\d+)*\.?\d+\b(?!/)', s))

This prints all the numbers except those in 50/50:

['53', '118', '186', '75', '2', '24', '151', '805', '380', '3', '666', '435', '330', '25', '206', '193', '1', '1', '2', '7,979', '56.19', '3', '48', '25', '10', '936', '13', '0', '48', '2', '6', '4', '1', '1', '3']
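
Note that this output drops the percent sign from values such as 75% and 25%. If the sign should be kept, as the question seems to ask, one possible variation is to append an optional `%` to the end of the pattern. This is a suggested tweak under that assumption, not part of the pattern above; the names `PERCENT_AWARE_RE` and `sample` are hypothetical.

```python
import re

# Variation of the pattern above with an optional trailing percent sign.
# PERCENT_AWARE_RE is a hypothetical name used only for this sketch.
PERCENT_AWARE_RE = re.compile(r'(?<!/)\d*(?:,\d+)*\.?\d+\b(?!/)%?')

# Shortened sample built from a few entries of the data in the question.
sample = "Tackle success %\n75%\nSuccessful 50/50s\n25\nPasses\n7,979\nPasses per match\n56.19"
print(PERCENT_AWARE_RE.findall(sample))
```

The `%?` comes after the lookahead, so the slash check still applies to the character that immediately follows the digits.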