Using Python

Time: 2017-02-27 01:45:22

Tags: python regex

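I send a POST request, and it returns a string response with the following structure: |hiddenField|field_name|field_value|

How can I extract these values, field_name and field_value, in Python?

I tried using a regular expression, but could not get it to work.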

  

|hiddenField|__VIEWSTATE|/wEPDwUKLTUzNjYxMTI2OA8WCB4IdndHcnVwb3MyiQYAAQAAAP////8BAAAAAAAAAAwCAAAASUJTQS5OZXRGb3JjZS5Nb2RlbCwgVmVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPW51bGwEAQAAAJ8BU3lzdGVtLkNvbGxlY3Rpb25zLkdlbmVyaWMuTGlzdGAxW1tCU0EuTmV0Rm9yY2UuTW9kZWwuQ29yZS5FbnRpdGllcy5HcnVwb1Byb21vdG9yYSwgQlNBLk5ldEZvcmNlLk1vZGVsLCBWZXJzaW9uPTEuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49bnVsbF1dAwAAAAZfaXRlbXMFX3NpemUIX3ZlcnNpb24EAAAxQlNBLk5ldEZvcmNlLk1vZGVsLkNvcmUuRW50aXRpZXMuR3J1cG9Qcm9tb3RvcmFbXQIAAAAICAkDAAAA|512|hiddenField|__VIEWSTATE1|AQAAAAEAAAAHAwAAAAABAAAABAAAAAQvQlNBLk5ldEZvcmNlLk1vZGVsLkNvcmUuRW50aXRpZXMuR3J1cG9Qcm9tb3RvcmECAAAACQQAAAANAwUEAAAAL0JTQS5OZXRGb3JjZS5Nb2RlbC5Db3JlLkVudGl0aWVzLkdydXBvUHJvbW90b3JhBgAAABw8SWRHcnVwb3Byb20+a19fQmFja2luZ0ZpZWxkFTxOb21lPmtfX0JhY2tpbmdGaWVsZCA8Q2RVc3VhcmlvR2VzdG9yPmtfX0JhY2tpbmdGaWVsZCA8SWRVc3VhcmlvR2VzdG9yPmtfX0JhY2tpbmdGaWVsZBo8RHRDcmlhY2FvPmtfX0JhY2tpbmdGaWVsZBg8RG9taW5pbz5rX19CYWNraW5nRmllbGQAAQEDAwEFDlN5c3RlbS5EZWNpbWFsD1N5c3RlbS5EYXRlVGltZQIAAAABMgYFAAAAEUdSVVBPIFBSSU1FQ09SQkFOBgYAAAAKMDAw|512|hiddenField|__VIEWSTATE2|MDEwNDk4MQgFAzEzMQgNgCrlF50A0AgGBwAAAAtQUklNRUNPUkJBTgseCWNvZEdlc3RvcgUKMDAwMDEwNDk4MR4HQ1JDUGFnZSgpWlN5c3RlbS5VSW50MzIsIG1zY29ybGliLCBWZXJzaW9uPTQuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49Yjc3YTVjNTYxOTM0ZTA4OQoyOTAyOTk0ODc3HgpDUkNDb250ZW50BRJkZGxHcnVwb3NGaWx0cm8yX18WAmYPZBYCZg9kFgICAw9kFgICBQ9kFggCAw8PFgIeD0NvbW1hbmRBcmd1bWVudAURNC4xLjAuMDgxMy4wODAwLjBkZAIFDw8WAh4EVGV4dAVwVm9jw6ogZXN0w6EgZW0gPiA8c3Ryb25nPkhvbWU8L3N0cm9uZz4gID4gPHN0cm9uZz5SZWxhdMOzcmlvczwvc3Ryb25nPiA+IDxzdHJvbmc+UG9zacOnw6NvIGRl|512|

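3 Answers:

Answer 0 (score: 3)

If you can assume the data will always be in the given format, you can use the following function to convert it into a dictionary that maps field names to field values: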

def parse(big_string_blob):
  split_input = big_string_blob.split("|")     # (1)
  field_names = split_input[2::4]              # (2)
  field_values = split_input[3::4]             # (3)
  return dict(zip(field_names, field_values))  # (4)

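1. Convert the text string into a list of strings by splitting on the | character. See str.split.
2. Create a list containing every fourth element of split_input, starting from the 3rd element. These correspond to the field names.
3. Create a list containing every fourth element of split_input, starting from the 4th element. These correspond to the field values.
4. Create a dictionary that uses the elements of the first list as keys and maps them to the corresponding elements of the second list. See zip.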

You can also try it out here: https://repl.it/Fyog/0
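
As a quick check of the slicing logic, here is a minimal sketch that runs the same function on a shortened, made-up response. The values `AAAA`, `BBBB` and `CCCC` are placeholders chosen for illustration, not real view state:

```python
def parse(big_string_blob):
    # The leading "|" produces an empty string at index 0, so each record
    # occupies four slots: hiddenField, name, value, digits.
    split_input = big_string_blob.split("|")
    field_names = split_input[2::4]    # indices 2, 6, 10, ...
    field_values = split_input[3::4]   # indices 3, 7, 11, ...
    return dict(zip(field_names, field_values))


# Hypothetical sample in the same layout as the real response.
sample = (
    "|hiddenField|__VIEWSTATE|AAAA|512"
    "|hiddenField|__VIEWSTATE1|BBBB|512"
    "|hiddenField|__VIEWSTATE2|CCCC|512|"
)

fields = parse(sample)
print(fields)
# {'__VIEWSTATE': 'AAAA', '__VIEWSTATE1': 'BBBB', '__VIEWSTATE2': 'CCCC'}
```

The slicing only lines up if every record has exactly four fields and no value contains a `|`. Base64 text does not use `|`, so that should hold for values like the ones in the question.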

Answer 1 (score: 1)

You can parse the data with a regex, for example:

Assuming your data is:

a = "|hiddenField|__VIEWSTATE|/wEPDwUKLTUzNjYxMTI2OA8WCB4IdndHcnVwb3MyiQYAAQAAAP////8BAAAAAAAAAAwCAAAASUJTQS5OZXRGb3JjZS5Nb2RlbCwgVmVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPW51bGwEAQAAAJ8BU3lzdGVtLkNvbGxlY3Rpb25zLkdlbmVyaWMuTGlzdGAxW1tCU0EuTmV0Rm9yY2UuTW9kZWwuQ29yZS5FbnRpdGllcy5HcnVwb1Byb21vdG9yYSwgQlNBLk5ldEZvcmNlLk1vZGVsLCBWZXJzaW9uPTEuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49bnVsbF1dAwAAAAZfaXRlbXMFX3NpemUIX3ZlcnNpb24EAAAxQlNBLk5ldEZvcmNlLk1vZGVsLkNvcmUuRW50aXRpZXMuR3J1cG9Qcm9tb3RvcmFbXQIAAAAICAkDAAAA|512|hiddenField|__VIEWSTATE1|AQAAAAEAAAAHAwAAAAABAAAABAAAAAQvQlNBLk5ldEZvcmNlLk1vZGVsLkNvcmUuRW50aXRpZXMuR3J1cG9Qcm9tb3RvcmECAAAACQQAAAANAwUEAAAAL0JTQS5OZXRGb3JjZS5Nb2RlbC5Db3JlLkVudGl0aWVzLkdydXBvUHJvbW90b3JhBgAAABw8SWRHcnVwb3Byb20+a19fQmFja2luZ0ZpZWxkFTxOb21lPmtfX0JhY2tpbmdGaWVsZCA8Q2RVc3VhcmlvR2VzdG9yPmtfX0JhY2tpbmdGaWVsZCA8SWRVc3VhcmlvR2VzdG9yPmtfX0JhY2tpbmdGaWVsZBo8RHRDcmlhY2FvPmtfX0JhY2tpbmdGaWVsZBg8RG9taW5pbz5rX19CYWNraW5nRmllbGQAAQEDAwEFDlN5c3RlbS5EZWNpbWFsD1N5c3RlbS5EYXRlVGltZQIAAAABMgYFAAAAEUdSVVBPIFBSSU1FQ09SQkFOBgYAAAAKMDAw|512|hiddenField|__VIEWSTATE2|MDEwNDk4MQgFAzEzMQgNgCrlF50A0AgGBwAAAAtQUklNRUNPUkJBTgseCWNvZEdlc3RvcgUKMDAwMDEwNDk4MR4HQ1JDUGFnZSgpWlN5c3RlbS5VSW50MzIsIG1zY29ybGliLCBWZXJzaW9uPTQuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49Yjc3YTVjNTYxOTM0ZTA4OQoyOTAyOTk0ODc3HgpDUkNDb250ZW50BRJkZGxHcnVwb3NGaWx0cm8yX18WAmYPZBYCZg9kFgICAw9kFgICBQ9kFggCAw8PFgIeD0NvbW1hbmRBcmd1bWVudAURNC4xLjAuMDgxMy4wODAwLjBkZAIFDw8WAh4EVGV4dAVwVm9jw6ogZXN0w6EgZW0gPiA8c3Ryb25nPkhvbWU8L3N0cm9uZz4gID4gPHN0cm9uZz5SZWxhdMOzcmlvczwvc3Ryb25nPiA+IDxzdHJvbmc+UG9zacOnw6NvIGRl|512|"

Then you can do this:

import re

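# Raw string, so the backslashes reach the regex engine unchanged.
obj = re.findall(r'\|hiddenField|\|(.*?)\|\d+\|', a)

# Skip the empty matches, then split each "name|value" string into a dict entry.
final = {k[0]:k[1] for k in [k.split('|') for k in obj if k != '']}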

for k in final.items():
    print(k)

Output:

('__VIEWSTATE1', 'AQAAAAEAAAAHAwAAAAABAAAABAAAAAQvQlNBLk5ldEZvcmNlLk1vZGVsLkNvcmUuRW50aXRpZXMuR3J1cG9Qcm9tb3RvcmECAAAACQQAAAANAwUEAAAAL0JTQS5OZXRGb3JjZS5Nb2RlbC5Db3JlLkVudGl0aWVzLkdydXBvUHJvbW90b3JhBgAAABw8SWRHcnVwb3Byb20+a19fQmFja2luZ0ZpZWxkFTxOb21lPmtfX0JhY2tpbmdGaWVsZCA8Q2RVc3VhcmlvR2VzdG9yPmtfX0JhY2tpbmdGaWVsZCA8SWRVc3VhcmlvR2VzdG9yPmtfX0JhY2tpbmdGaWVsZBo8RHRDcmlhY2FvPmtfX0JhY2tpbmdGaWVsZBg8RG9taW5pbz5rX19CYWNraW5nRmllbGQAAQEDAwEFDlN5c3RlbS5EZWNpbWFsD1N5c3RlbS5EYXRlVGltZQIAAAABMgYFAAAAEUdSVVBPIFBSSU1FQ09SQkFOBgYAAAAKMDAw')
('__VIEWSTATE', '/wEPDwUKLTUzNjYxMTI2OA8WCB4IdndHcnVwb3MyiQYAAQAAAP////8BAAAAAAAAAAwCAAAASUJTQS5OZXRGb3JjZS5Nb2RlbCwgVmVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPW51bGwEAQAAAJ8BU3lzdGVtLkNvbGxlY3Rpb25zLkdlbmVyaWMuTGlzdGAxW1tCU0EuTmV0Rm9yY2UuTW9kZWwuQ29yZS5FbnRpdGllcy5HcnVwb1Byb21vdG9yYSwgQlNBLk5ldEZvcmNlLk1vZGVsLCBWZXJzaW9uPTEuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49bnVsbF1dAwAAAAZfaXRlbXMFX3NpemUIX3ZlcnNpb24EAAAxQlNBLk5ldEZvcmNlLk1vZGVsLkNvcmUuRW50aXRpZXMuR3J1cG9Qcm9tb3RvcmFbXQIAAAAICAkDAAAA')
('__VIEWSTATE2', 'MDEwNDk4MQgFAzEzMQgNgCrlF50A0AgGBwAAAAtQUklNRUNPUkJBTgseCWNvZEdlc3RvcgUKMDAwMDEwNDk4MR4HQ1JDUGFnZSgpWlN5c3RlbS5VSW50MzIsIG1zY29ybGliLCBWZXJzaW9uPTQuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49Yjc3YTVjNTYxOTM0ZTA4OQoyOTAyOTk0ODc3HgpDUkNDb250ZW50BRJkZGxHcnVwb3NGaWx0cm8yX18WAmYPZBYCZg9kFgICAw9kFgICBQ9kFggCAw8PFgIeD0NvbW1hbmRBcmd1bWVudAURNC4xLjAuMDgxMy4wODAwLjBkZAIFDw8WAh4EVGV4dAVwVm9jw6ogZXN0w6EgZW0gPiA8c3Ryb25nPkhvbWU8L3N0cm9uZz4gID4gPHN0cm9uZz5SZWxhdMOzcmlvczwvc3Ryb25nPiA+IDxzdHJvbmc+UG9zacOnw6NvIGRl')
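
The `if k != ''` filter is needed because the pattern is an alternation. When the left branch `\|hiddenField` matches, the capture group in the right branch takes no part in the match, and `re.findall` reports it as an empty string. A minimal sketch on a shortened, made-up string (the values `AAAA` and `BBBB` are placeholders):

```python
import re

# Hypothetical sample in the same layout as the real response.
sample = "|hiddenField|__VIEWSTATE|AAAA|512|hiddenField|__VIEWSTATE1|BBBB|512|"

matches = re.findall(r'\|hiddenField|\|(.*?)\|\d+\|', sample)
print(matches)
# ['', '__VIEWSTATE|AAAA', '__VIEWSTATE1|BBBB']

# Drop the empty entry and turn each "name|value" string into a dict item.
pairs = dict(m.split('|') for m in matches if m)
print(pairs)
# {'__VIEWSTATE': 'AAAA', '__VIEWSTATE1': 'BBBB'}
```

Only the first `hiddenField` yields an empty entry here: each later one has already lost its leading `|` to the preceding match, so the left branch cannot match it.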

If you also want to capture the trailing digits, that is, parse every record as |hiddenField|field_name|field_value|digits, you can do the following:

import re

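# Two capture groups, so findall returns (name|value, digits) tuples.
obj = re.findall(r'\|hiddenField|\|(.*?)\|(\d+)\|', a)

# Skip the empty ('', '') matches, then build name -> {field_value, digits}.
final = {k[0]:{'field_value': k[1], 'digits': k[2]} for k in [k[0].split("|") + [k[1]] for k in obj if k != ('','')]}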

for k in final.items():
    print(k)

Output:

('__VIEWSTATE', {'field_value': '/wEPDwUKLTUzNjYxMTI2OA8WCB4IdndHcnVwb3MyiQYAAQAAAP////8BAAAAAAAAAAwCAAAASUJTQS5OZXRGb3JjZS5Nb2RlbCwgVmVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPW51bGwEAQAAAJ8BU3lzdGVtLkNvbGxlY3Rpb25zLkdlbmVyaWMuTGlzdGAxW1tCU0EuTmV0Rm9yY2UuTW9kZWwuQ29yZS5FbnRpdGllcy5HcnVwb1Byb21vdG9yYSwgQlNBLk5ldEZvcmNlLk1vZGVsLCBWZXJzaW9uPTEuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49bnVsbF1dAwAAAAZfaXRlbXMFX3NpemUIX3ZlcnNpb24EAAAxQlNBLk5ldEZvcmNlLk1vZGVsLkNvcmUuRW50aXRpZXMuR3J1cG9Qcm9tb3RvcmFbXQIAAAAICAkDAAAA', 'digits': '512'})
('__VIEWSTATE2', {'field_value': 'MDEwNDk4MQgFAzEzMQgNgCrlF50A0AgGBwAAAAtQUklNRUNPUkJBTgseCWNvZEdlc3RvcgUKMDAwMDEwNDk4MR4HQ1JDUGFnZSgpWlN5c3RlbS5VSW50MzIsIG1zY29ybGliLCBWZXJzaW9uPTQuMC4wLjAsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49Yjc3YTVjNTYxOTM0ZTA4OQoyOTAyOTk0ODc3HgpDUkNDb250ZW50BRJkZGxHcnVwb3NGaWx0cm8yX18WAmYPZBYCZg9kFgICAw9kFgICBQ9kFggCAw8PFgIeD0NvbW1hbmRBcmd1bWVudAURNC4xLjAuMDgxMy4wODAwLjBkZAIFDw8WAh4EVGV4dAVwVm9jw6ogZXN0w6EgZW0gPiA8c3Ryb25nPkhvbWU8L3N0cm9uZz4gID4gPHN0cm9uZz5SZWxhdMOzcmlvczwvc3Ryb25nPiA+IDxzdHJvbmc+UG9zacOnw6NvIGRl', 'digits': '512'})
('__VIEWSTATE1', {'field_value': 'AQAAAAEAAAAHAwAAAAABAAAABAAAAAQvQlNBLk5ldEZvcmNlLk1vZGVsLkNvcmUuRW50aXRpZXMuR3J1cG9Qcm9tb3RvcmECAAAACQQAAAANAwUEAAAAL0JTQS5OZXRGb3JjZS5Nb2RlbC5Db3JlLkVudGl0aWVzLkdydXBvUHJvbW90b3JhBgAAABw8SWRHcnVwb3Byb20+a19fQmFja2luZ0ZpZWxkFTxOb21lPmtfX0JhY2tpbmdGaWVsZCA8Q2RVc3VhcmlvR2VzdG9yPmtfX0JhY2tpbmdGaWVsZCA8SWRVc3VhcmlvR2VzdG9yPmtfX0JhY2tpbmdGaWVsZBo8RHRDcmlhY2FvPmtfX0JhY2tpbmdGaWVsZBg8RG9taW5pbz5rX19CYWNraW5nRmllbGQAAQEDAwEFDlN5c3RlbS5EZWNpbWFsD1N5c3RlbS5EYXRlVGltZQIAAAABMgYFAAAAEUdSVVBPIFBSSU1FQ09SQkFOBgYAAAAKMDAw', 'digits': '512'})
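
For comparison, one possible alternative (not part of the answer above) is a single pattern with one capture group per field, which avoids both the alternation and the follow-up `split`. This is only a sketch: the name `RECORD`, the group names, and the shortened sample string are made up for illustration, and it assumes that field values never contain `|`.

```python
import re

# One record: hiddenField|<name>|<value>|<digits>|
RECORD = re.compile(
    r'hiddenField\|(?P<name>[^|]+)\|(?P<value>[^|]*)\|(?P<digits>\d+)\|'
)

# Hypothetical sample in the same layout as the real response.
sample = "|hiddenField|__VIEWSTATE|AAAA|512|hiddenField|__VIEWSTATE1|BBBB|512|"

parsed = {
    m.group('name'): {'field_value': m.group('value'), 'digits': m.group('digits')}
    for m in RECORD.finditer(sample)
}

for item in parsed.items():
    print(item)
```

Because every match carries all three fields, no empty results need to be filtered out.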

Answer 2 (score: 1)

You can also try the following approach. It only picks up the fields whose names start with __:

import re

# `data` is assumed to hold the raw response string.
obj = [x.group() for x in re.finditer(r'__.*?\|\d+\|', data)]
final = {k[0]:k[1] for k in [k.split('|') for k in obj if k != '']}
for k in final.items():
    print(k)
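
To illustrate what "only the fields that start with __" means in practice, here is a small sketch on a made-up string. The field names `otherField` and `__EVENTVALIDATION` and all the values are hypothetical and serve only to show which records get picked up:

```python
import re

# Hypothetical sample: the middle record's name does not start with "__".
sample = (
    "|hiddenField|__VIEWSTATE|AAAA|512"
    "|hiddenField|otherField|BBBB|512"
    "|hiddenField|__EVENTVALIDATION|CCCC|64|"
)

# Each match looks like "name|value|digits|".
obj = [x.group() for x in re.finditer(r'__.*?\|\d+\|', sample)]
final = {k[0]: k[1] for k in [k.split('|') for k in obj]}
print(final)
# {'__VIEWSTATE': 'AAAA', '__EVENTVALIDATION': 'CCCC'}
```

The `otherField` record is skipped because the pattern anchors on the literal `__` prefix. The same anchoring means a value that itself contained `__` could confuse the match, so this approach fits best when the field names are known to follow that convention.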