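Parsing and cleaning a text file in Python?

Time: 2018-03-19 11:12:36

Tags: json python-3.x

I have a text file containing raw data. I want to parse this data and clean it so that it can be used further. Below is the raw data.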

"{\x0A    \x22identifier\x22: {\x0A        \x22company_code\x22: \x22TSC\x22,\x0A        \x22product_type\x22: \x22airtime-ctg\x22,\x0A        \x22host_type\x22: \x22android\x22\x0A    },\x0A    \x22id\x22: {\x0A        \x22type\x22: \x22guest\x22,\x0A        \x22group\x22: \x22guest\x22,\x0A        \x22uuid\x22: \x221a0d4d6e-0c00-11e7-a16f-0242ac110002\x22,\x0A        \x22device_id\x22: \x22423e49efa4b8b013\x22\x0A    },\x0A    \x22stats\x22: [\x0A        {\x0A            \x22timestamp\x22: \x222017-03-22T03:21:11+0000\x22,\x0A            \x22software_id\x22: \x22A-ACTG\x22,\x0A            \x22action_id\x22: \x22open_app\x22,\x0A            \x22values\x22: {\x0A                \x22device_id\x22: \x22423e49efa4b8b013\x22,\x0A                \x22language\x22: \x22en\x22\x0A            }\x0A        }\x0A    ]\x0A}"

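I want to remove all the hex characters. I tried parsing the data, storing it in an array, and cleaning it with re.sub(), but it gives back the same data.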

for line in f:
    new_data = re.sub(r'[^\x00-\x7f],\x22',r'', line)
    data.append(new_data)

1 answer:

Answer 0: (score: 0)

\x0A is the hex code for a newline character, and \x22 is the hex code for a double quote. Once those escapes are decoded, the string is ordinary indented JSON.

You should use the json module's load() (from a file) or loads() (from a string) function to parse this. You will get a dictionary containing 2 dictionaries and a list containing a dictionary.
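As a rough sketch of that approach, assuming the file holds the escapes as literal backslash text (the four characters \x0A rather than a real newline) and that every escape is plain ASCII, the sequences can be decoded first and the result handed to json.loads(). The helper names decode_hex_escapes and parse_raw_line are made up for this illustration, and the sample is a shortened version of the data in the question. Under the same assumption, the pattern in the question would leave the text unchanged, since it looks for a non-ASCII character followed by a comma and a quote, which never occurs in this data.

```python
import json
import re


def decode_hex_escapes(text):
    """Replace each literal \\xNN sequence with the character it encodes.

    Assumes the escapes are ASCII only; bytes above 0x7F would need
    proper UTF-8 decoding instead.
    """
    return re.sub(
        r'\\x([0-9A-Fa-f]{2})',
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


def parse_raw_line(line):
    """Strip the surrounding quotes, decode the escapes, and parse the JSON."""
    cleaned = line.strip()
    if len(cleaned) >= 2 and cleaned[0] == '"' and cleaned[-1] == '"':
        cleaned = cleaned[1:-1]
    return json.loads(decode_hex_escapes(cleaned))


# Shortened sample in the same shape as the raw data in the question.
sample = r'"{\x0A    \x22id\x22: {\x0A        \x22type\x22: \x22guest\x22\x0A    },\x0A    \x22stats\x22: [\x0A        {\x0A            \x22action_id\x22: \x22open_app\x22\x0A        }\x0A    ]\x0A}"'

record = parse_raw_line(sample)
print(record["id"]["type"])
print(record["stats"][0]["action_id"])
```

For a file, the same helper could be applied to each line inside the existing `for line in f:` loop, appending the parsed dictionary instead of the raw string.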