I have the following string:
__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; domain=.coinmarketcap.com; HttpOnly, _version=a90f44e909c03fdad3caed1ec676a98472deb0f6; path=/, __session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974; domain=.coinmarketcap.com path=/
But I need to strip the junk out of it, such as
expires=Sat, 09-Mar-19 03:35:03 GMT
or
domain=.coinmarketcap.com path=/
so that I am left with only the three values:
__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503; _version=a90f44e909c03fdad3caed1ec676a98472deb0f6; __session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974
Answer 0 (score: 0)
Specify the keys you want to keep:
In [193]: keys = ['__cfduid', '_version', '__session']
Now call re.findall (after import re):
In [194]: ' '.join(re.findall(r'(?:{}).*?;'.format('|'.join(keys)), text))
Out[194]: '__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503; _version=a90f44e909c03fdad3caed1ec676a98472deb0f6; __session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974;'
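The regex (?:{}).*?; matches only the key-value pairs for the selected keys; everything else is discarded. This works as long as your string has a consistent structure, (key=value;)+, so that every pair you want to keep is terminated by a semicolon.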
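As a rough sketch of the same idea wrapped in a function: the name keep_cookies and the shortened sample string are made up for illustration, and the pattern is a variation on the one above that stops the value before the next ; or , instead of requiring a trailing semicolon. It also escapes the keys with re.escape in case one contains a regex metacharacter.

```python
import re

def keep_cookies(text, keys):
    """Return 'k=v; k=v' containing only the selected keys (hypothetical helper)."""
    # re.escape guards against keys that contain regex metacharacters.
    names = "|".join(re.escape(k) for k in keys)
    # (?<!\w) stops a key from matching as the tail of a longer name.
    # The value is everything up to the next ';' or ','.
    pattern = r"(?<!\w)(?:{})=[^;,]*".format(names)
    return "; ".join(re.findall(pattern, text))

# Shortened, made-up sample with the same layout as the string in the question.
sample = (
    "__cfduid=abc123; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; "
    "domain=.coinmarketcap.com; HttpOnly, _version=def456; path=/, "
    "__session=xyz%3D%3D--789; domain=.coinmarketcap.com path=/"
)

result = keep_cookies(sample, ["__cfduid", "_version", "__session"])
print(result)
```

Because the semicolon is not part of the match, the joined result has no trailing ; after the last pair.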
Answer 1 (score: 0)
Here is a more general solution that works for any key starting with an underscore.
import re
# The value runs up to the next ';', ',' or whitespace, so characters such as '%' and '-' are kept.
str_list = re.findall(r"_\w+=[^;,\s]+", your_string)
Output:
['__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503',
'_version=a90f44e909c03fdad3caed1ec676a98472deb0f6',
'__session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974']
re.findall returns a list, which you can join with "; ".join(str_list) to get the desired output.
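If you would rather look values up by name than carry one joined string around, the matched pairs can also be split into a dict. This is only a sketch: the sample string is a shortened, made-up stand-in for the one in the question, and the value pattern (run up to the next ;, , or whitespace) is my assumption about where a value ends.

```python
import re

# Shortened, made-up sample with the same layout as the string in the question.
sample = (
    "__cfduid=abc123; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; "
    "domain=.coinmarketcap.com; HttpOnly, _version=def456; path=/, "
    "__session=xyz%3D%3D--789; domain=.coinmarketcap.com path=/"
)

# Keys start with an underscore; values stop at ';', ',' or whitespace.
pairs = re.findall(r"_\w+=[^;,\s]+", sample)

# Split each pair on the first '=' only, in case a value contains '='.
cookies = dict(pair.split("=", 1) for pair in pairs)

# The joined form, suitable for a Cookie request header.
header = "; ".join(pairs)

print(cookies["_version"])
print(header)
```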
Answer 2 (score: 0)
Another approach:
keys = ('__cfduid', '_version', '__session')
' '.join([x for x in text.split() if x.startswith(keys)])
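Note that splitting on whitespace leaves the trailing ; attached to each token, so the joined string ends with a semicolon. A small variation, sketched here on a shortened, made-up sample string, strips that punctuation and joins with "; " instead. It still assumes every cookie you want is preceded by whitespace (or starts the string), and startswith would also accept a longer name that merely begins with one of the keys.

```python
# Shortened, made-up sample with the same layout as the string in the question.
sample = (
    "__cfduid=abc123; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; "
    "domain=.coinmarketcap.com; HttpOnly, _version=def456; path=/, "
    "__session=xyz%3D%3D--789; domain=.coinmarketcap.com path=/"
)

# str.startswith accepts a tuple of prefixes; the '=' avoids matching
# a longer cookie name that merely begins with one of these keys.
prefixes = ("__cfduid=", "_version=", "__session=")

# Keep the matching whitespace-separated tokens, minus trailing ';' or ','.
tokens = [tok.rstrip(";,") for tok in sample.split() if tok.startswith(prefixes)]

result = "; ".join(tokens)
print(result)
```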
Answer 3 (score: 0)
It looks like you are parsing a cookie string. In that case you should use the standard library's cookie parsing module: https://docs.python.org/2/library/cookie.html#Cookie.BaseCookie.load
>>> from Cookie import SimpleCookie
>>> s = SimpleCookie()
>>> s.load("__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; domain=.coinmarketcap.com; HttpOnly, _version=a90f44e909c03fdad3caed1ec676a98472deb0f6; path=/, __session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974; domain=.coinmarketcap.com path=/")
>>> [(k, s[k].value) for k in s.keys()]
[('__cfduid', 'dc3c9f85f65d39a5947d5f4850618237f1520566503'),
('_version', 'a90f44e909c03fdad3caed1ec676a98472deb0f6'),
('__session', 'NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974')]
>>> s['__cfduid'].value
'dc3c9f85f65d39a5947d5f4850618237f1520566503'
(This is Python 2; in Python 3 the import is different, since the module is http.cookies.)
This is much better than attempting your own cookie parsing.
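For Python 3, a sketch along the same lines might look like the following. As far as I can tell, http.cookies is stricter than the Python 2 module and may discard the whole input when it meets a token such as "HttpOnly,", so this version assumes the string is several Set-Cookie values joined by commas and splits them apart before loading each one. The function name parse_cookie_values, the splitting regex, and the shortened sample string are my own assumptions, not part of the library.

```python
import re
from http.cookies import SimpleCookie

def parse_cookie_values(header):
    """Return {name: value} for each cookie in a comma-joined header (hypothetical helper)."""
    # Split only at commas followed by "name=", so the comma inside
    # "expires=Sat, 09-Mar-19 ..." is left alone.
    parts = re.split(r",\s*(?=[^;,=\s]+=)", header)
    values = {}
    for part in parts:
        cookie = SimpleCookie()
        cookie.load(part)  # attributes such as path/domain end up on the morsel
        for name, morsel in cookie.items():
            values[name] = morsel.value
    return values

# Shortened, made-up sample with the same layout as the string in the question.
sample = (
    "__cfduid=abc123; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; "
    "domain=.coinmarketcap.com; HttpOnly, _version=def456; path=/, "
    "__session=xyz%3D%3D--789; domain=.coinmarketcap.com path=/"
)

values = parse_cookie_values(sample)
print("; ".join("{}={}".format(k, v) for k, v in values.items()))
```

The values come back exactly as they appear in the string; percent-encoded sequences such as %3D are not decoded.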