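I have data like the sample below, and I don't know how to split it and convert it into a table.
I am using pandas with sep="|", but I don't know how to split on both | and = at the same time in this case.
A sample of the data (txt) looks like this: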
SPK_VOLUME=|DEVICE_STATUS=|WAKE_UP=|SCS_STATUS=|SCS_CLASS=||MUSIC_URL_STATUS=|MUSIC_LOGIN_STATUS=|MUSIC_STREAMING_CONNECT_STATUS=|MUSIC_STREAMING_STATUS=|PLAYER_PLAYING_TIME=|TTS_STATUS=|TTS_CLASS=|ALARM_STATUS=|ALARM_END_REASON=|FOTA_STATUS=|FOTA_FAIL_REASON=
....
I load the data with pandas:
import pandas as pd
log_file = pd.read_csv("./log_file.txt",
sep = "|")
However, I also want to split on "=" and build a table from the values, like this:
SPK_VOLUME DEVICE_STATUS WAKE_UP
5 22221 0
2 42241 2
3 125214 1
Thanks for your help.
Answer 0 (score: 2)
Try passing sep=r'\=\|'; this works for me:
In [189]:
import io
import pandas as pd
t="""SPK_VOLUME=|DEVICE_STATUS=|WAKE_UP=|SCS_STATUS=|SCS_CLASS=||MUSIC_URL_STATUS=|MUSIC_LOGIN_STATUS=|MUSIC_STREAMING_CONNECT_STATUS=|MUSIC_STREAMING_STATUS=|PLAYER_PLAYING_TIME=|TTS_STATUS=|TTS_CLASS=|ALARM_STATUS=|ALARM_END_REASON=|FOTA_STATUS=|FOTA_FAIL_REASON="""
# a multi-character sep is treated as a regex, which needs the Python parsing engine
df = pd.read_csv(io.StringIO(t), sep=r'\=\|', engine='python')
df.columns.tolist()
Out[189]:
['SPK_VOLUME',
'DEVICE_STATUS',
'WAKE_UP',
'SCS_STATUS',
'SCS_CLASS',
'|MUSIC_URL_STATUS',
'MUSIC_LOGIN_STATUS',
'MUSIC_STREAMING_CONNECT_STATUS',
'MUSIC_STREAMING_STATUS',
'PLAYER_PLAYING_TIME',
'TTS_STATUS',
'TTS_CLASS',
'ALARM_STATUS',
'ALARM_END_REASON',
'FOTA_STATUS',
'FOTA_FAIL_REASON=']
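Note that two names in the list above still carry separator characters: '|MUSIC_URL_STATUS' (from the doubled || in the data) and 'FOTA_FAIL_REASON=' (the line has no trailing |). A minimal sketch of one way to clean them up, using a shortened header line as an assumed example rather than the full sample:

```python
import io
import pandas as pd

# Shortened header line for illustration; it keeps the doubled "||" and the trailing "=".
t = "SCS_STATUS=|SCS_CLASS=||MUSIC_URL_STATUS=|FOTA_FAIL_REASON="

# A multi-character separator is treated as a regex, which needs the Python engine.
df = pd.read_csv(io.StringIO(t), sep=r"=\|", engine="python")

# Strip any leftover "|" or "=" from both ends of each column name.
df.columns = df.columns.str.strip("|=")
cols = df.columns.tolist()
print(cols)
```

Stripping both characters in one pass avoids special-casing the two affected columns.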
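Alternatively, you can read the data with sep='|' as in the question and just call .str.rstrip on the .columns attribute as a post-processing step:
In [192]:
df = pd.read_csv(io.StringIO(t), sep='|')
df.columns = df.columns.str.rstrip('=')
df.columns.tolist()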
Out[192]:
['SPK_VOLUME',
'DEVICE_STATUS',
'WAKE_UP',
'SCS_STATUS',
'SCS_CLASS',
'Unnamed: 5',
'MUSIC_URL_STATUS',
'MUSIC_LOGIN_STATUS',
'MUSIC_STREAMING_CONNECT_STATUS',
'MUSIC_STREAMING_STATUS',
'PLAYER_PLAYING_TIME',
'TTS_STATUS',
'TTS_CLASS',
'ALARM_STATUS',
'ALARM_END_REASON',
'FOTA_STATUS',
'FOTA_FAIL_REASON']
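Both approaches above only tidy the column names; they do not separate the values that follow each =. If every line of the log repeats the KEY=value|KEY=value layout (an assumption, since the sample in the question shows only empty values), one option is to parse each line into a dict and build the DataFrame from those. In this sketch the sample lines reuse the values from the desired table in the question, and parse_line is a hypothetical helper name:

```python
import pandas as pd

def parse_line(line):
    """Turn 'A=1|B=2||C=' into {'A': '1', 'B': '2', 'C': ''}."""
    record = {}
    for field in line.strip().split("|"):
        if not field:  # skip the empty field produced by "||"
            continue
        key, _, value = field.partition("=")
        record[key] = value
    return record

# Assumed sample lines; real data would be read line by line from the log file.
lines = [
    "SPK_VOLUME=5|DEVICE_STATUS=22221||WAKE_UP=0",
    "SPK_VOLUME=2|DEVICE_STATUS=42241||WAKE_UP=2",
    "SPK_VOLUME=3|DEVICE_STATUS=125214||WAKE_UP=1",
]

df = pd.DataFrame([parse_line(line) for line in lines])

# Values are parsed as strings; this cast works here only because
# every sample value is an integer.
df = df.astype(int)
print(df)
```

For a real file, the list of lines could come from iterating over open("./log_file.txt"); columns with empty or non-numeric values would have to stay as strings or be converted individually.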