I have an object that looks like this:
block = [{'id':'10001', 'date':'2016-01-11', 'text':'this is some text. grab 40'},{'id':'10002', 'date':'2014-03-12', 'text':'this is some more text. grab 60'}]
I want to grab the number that follows "grab" in text and reformat my object so that it looks like:
block = [{'id':'10001', 'date':'2016-01-11', 'text':'this is some text. grab 40', 'grabbed': '40'},{'id':'10002', 'date':'2014-03-12', 'text':'this is some more text. grab 60', 'grabbed': '60'}]
I tried
for item in block:
    if "grab" in item['text']:
        m=re.search('grab (..)',line)
        print m
but got the error
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 146, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or buffer
Answer 0 (score: 1)
No regex is needed. You can do this:
for b in block:
    # take the last whitespace-separated word of the text
    b["grabbed"] = b["text"].rstrip().rsplit(" ",1)[-1]
In [205]: block
Out[205]:
[{'date': '2016-01-11',
  'grabbed': '40',
  'id': '10001',
  'text': 'this is some text. grab 40'},
 {'date': '2014-03-12',
  'grabbed': '60',
  'id': '10002',
  'text': 'this is some more text. grab 60'}]
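One caveat, as a minimal sketch: the rsplit approach takes whatever word ends the string, whether or not it follows "grab". The sample string below is made up for illustration and is not from the question.

```python
# Hypothetical input where the number is not the last word
text = 'this is some text. grab 40 please'

# Same expression as in the answer above: strip trailing whitespace,
# split once from the right, keep the final piece
last_word = text.rstrip().rsplit(" ", 1)[-1]

print(last_word)
```

If the number is always the final token, as in the question's data, this does not matter.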
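Answer 1 (score: 0)
Assuming there are always exactly 2 digits after "grab", and only one "grab xx" per string:
for item in block:
    if "grab" in item['text']:
        # the capture group returns only the digits, not the word "grab"
        m = re.findall(r'grab (\d{2})',item['text'])[0]
        print(m)
Or, assuming there is always at least one digit after "grab":
for item in block:
    if "grab" in item['text']:
        # \d+ matches one or more digits
        m = re.findall(r'grab (\d+)',item['text'])[0]
        print(m)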
Answer 2 (score: 0)
Hi, it looks like the input to your regex is off:
m=re.search('grab (..)',line)
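Where does "line" come from? Is it a string? Don't you want to search "item['text']" instead? Also note that "re.search" does not return the matched text itself; it returns a match object (or None), so read the captured text with m.group(1), or use e.g. re.findall().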
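As a minimal sketch of the two calls side by side, using one text value from the question (the names match, grabbed, and found are only illustrative):

```python
import re

text = 'this is some text. grab 40'

# re.search returns a match object, or None if the pattern is not found
match = re.search(r'grab (\d+)', text)
grabbed = match.group(1) if match else None

# re.findall returns a list of the captured strings directly
found = re.findall(r'grab (\d+)', text)

print(grabbed)
print(found)
```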
Answer 3 (score: 0)
This program modifies block as you described in the question:
from pprint import pprint
import re
block = [{'id':'10001', 'date':'2016-01-11', 'text':'this is some text. grab 40'},{'id':'10002', 'date':'2014-03-12', 'text':'this is some more text. grab 60'}]
print("Before:")
pprint(block)
for item in block:
    # look for "grab", some whitespace, then one or more digits
    grab = re.search(r"grab\s+(\d+)", item['text'])
    if grab:
        # store only the captured digits
        item['grabbed'] = grab.group(1)
print("After:")
pprint(block)