I am using the following code to build a DataFrame from command-line output:
df_output_lines = [s.split() for s in os.popen("my command linecode").read().splitlines()]
df_output_lines = list(filter(None, df_output_lines))
and converting it to a DataFrame:
df=pd.DataFrame(df_output_lines)
df
The data is in the following format:
abc = pd.DataFrame([['time:"08:59:38.000"', 'instance:"(null)"','id:"3214039276626790405"'],['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"','id:"3214039276626790405"'],['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"','id:"3214039276626790405"']])
abc
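I want to transform it so that the text before the : becomes the column name and the text inside the quotes " " becomes the value, and likewise for every column. The output should look like:
So far I have been doing it the hard way: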
abc.rename(columns={0:'time',1:'instance',2:'id'},inplace=True)
Then:
abc['time'] = abc['time'].map(lambda x: str(x)[:-1])
abc['time'] = abc['time'].map(lambda x: str(x)[6:])
abc['instance'] = abc['instance'].map(lambda x: str(x)[:-1])
abc['instance'] = abc['instance'].map(lambda x: str(x)[10:])
abc['id'] = abc.id.str.extract(r'(\d+)', expand=True).astype(int)
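Any suggestions for doing this with a lambda expression or some kind of one-liner would be appreciated.
My original log output looks like this: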
time:"11:22:20.000" instance:"(null)" id:"723927731576482920" channel:"sip:confctl.com" type:"control" elapsedtime:"0.000631" level:"info" operation:"Init" message:"Initialize (version 4.9.0002.30618) ... "
time:"11:22:21.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl.com" type:"control" elapsedtime:"0.067122" level:"info" operation:"Connect" message:"Connecting to https://hrpd.www.vivox.com/api2/"
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.685700" level:"info" operation:"Connect" message:"Connected to https://hrpd.www.vivox.com/api2/"
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.814268" level:"info" operation:"Login" message:"Logged in .tester_food."
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.912255" level:"error" operation:"Call" message:".tester_food. failed to join sip:confctl-2@hrpd.vivox.com error:Access token has invalid signature(403)"
time:"12:30:41.000" instance:"Ops-MacBook-Pro.local" id:"10316899144153251411" channel:"sip:confctl-2@hrpd.vivox.com" type:"media" sampleperiod:"0.000000" incomingpktsreceived:"0" incomingpktsexpected:"0" incomingpktsloss:"0" incomingpktssoutoftime:"0" incomingpktsdiscarded:"0" outgoingpktssent:"0" predictedmos:"3" latencypktssent:"0" latencycount:"0" latencysum:"0.000000" latencymin:"0.000000" latencymax:"0.000000" callid:"2477580077" r_factor:"0.000000"
Answer 0: (score: 0)
The pd.DataFrame constructor accepts a list of dictionaries directly. You can use str.rstrip and str.split within a list comprehension:
res = pd.DataFrame([dict(i.rstrip('"').split(':"') for i in row) for row in abc.values])
print(res)
                    id                 instance          time
0  3214039276626790405                   (null)  08:59:38.000
1  3214039276626790405  (Ops-MacBook-Pro.local)  08:59:38.000
2  3214039276626790405  (Ops-MacBook-Pro.local)  08:59:38.000
It is not clear what logic you use to determine that only some of the strings are enclosed in parentheses.
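As a self-contained sketch of the approach above, the following rebuilds a two-row version of the sample abc frame from the question and applies the same kind of comprehension. The helper name cell_to_pair and the extra strip("()") step are assumptions for illustration: the question does not confirm that parentheses should be dropped everywhere.

```python
import pandas as pd

# Sample frame from the question: every cell is a string of the form key:"value".
abc = pd.DataFrame([
    ['time:"08:59:38.000"', 'instance:"(null)"', 'id:"3214039276626790405"'],
    ['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"', 'id:"3214039276626790405"'],
])

def cell_to_pair(cell):
    # Drop the closing quote, then split once on the first :" separator.
    key, value = cell.rstrip('"').split(':"', 1)
    # Assumption: surrounding parentheses are unwanted in the final value.
    return key, value.strip("()")

# One dict per row; the keys become the column names.
res = pd.DataFrame([dict(cell_to_pair(c) for c in row) for row in abc.values])
print(res)
```

All values stay as strings here, which avoids any integer overflow or precision problems with the long id values.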
Answer 1: (score: 0)
Although an answer has already been given, I would like to add a regex-based approach that achieves the same result:
>>> abc
                  time                            instance                        id
0  time:"08:59:38.000"                   instance:"(null)"  id:"3214039276626790405"
1  time:"08:59:38.000"  instance:"(Ops-MacBook-Pro.local)"  id:"3214039276626790405"
2  time:"08:59:38.000"  instance:"(Ops-MacBook-Pro.local)"  id:"3214039276626790405"
Just apply replace on the DataFrame with regex=True.
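Regex explanation:
1st alternative instance: matches the characters instance: literally (case sensitive)
2nd alternative id: matches the characters id: literally (case sensitive)
3rd alternative time: matches the characters time: literally (case sensitive)
4th alternative \" matches the character " literally (case sensitive)
5th alternative [()] matches a single character present in the list () (case sensitive)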
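A minimal sketch of such a replace call, assuming the pattern is just the five alternatives described above joined with |. The names pattern and cleaned are illustrative and do not come from the answer.

```python
import pandas as pd

# Two rows of the sample frame, with the columns already renamed as in the question.
abc = pd.DataFrame(
    [['time:"08:59:38.000"', 'instance:"(null)"', 'id:"3214039276626790405"'],
     ['time:"08:59:38.000"', 'instance:"(Ops-MacBook-Pro.local)"', 'id:"3214039276626790405"']],
    columns=['time', 'instance', 'id'],
)

# Assumed pattern: the three key prefixes, the double quote, and either parenthesis.
pattern = r'instance:|id:|time:|"|[()]'

# With regex=True, every match inside each string cell is replaced by ''.
cleaned = abc.replace(pattern, '', regex=True)
print(cleaned)
```

This relies on the columns already having their final names, since the pattern only strips the prefixes and does not derive column names from them. It also hard-codes the three keys, so it would need extending for the longer log lines in the question.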
Answer 2: (score: 0)
Given the following sample input:
time:"11:22:20.000" instance:"(null)" id:"723927731576482920" channel:"sip:confctl.com" type:"control" elapsedtime:"0.000631" level:"info" operation:"Init" message:"Initialize (version 4.9.0002.30618) ... "
time:"11:22:21.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl.com" type:"control" elapsedtime:"0.067122" level:"info" operation:"Connect" message:"Connecting to https://hrpd.www.vivox.com/api2/"
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.685700" level:"info" operation:"Connect" message:"Connected to https://hrpd.www.vivox.com/api2/"
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.814268" level:"info" operation:"Login" message:"Logged in .tester_food."
time:"11:22:23.000" instance:"Ops-MacBook-Pro.local" id:"723927731576482920" channel:"sip:confctl-.com" type:"control" elapsedtime:"2.912255" level:"error" operation:"Call" message:".tester_food. failed to join sip:confctl-2@hrpd.vivox.com error:Access token has invalid signature(403)"
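This comes from your os.popen command. We then filter out blank lines and shlex.split each line, so that spaces inside quoted items are preserved (but the quotes themselves are removed), e.g.: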
import os
import shlex
import pandas as pd
rows = [shlex.split(line) for line in os.popen("my command linecode").read().splitlines() if line.strip()]
This gives you rows[0] as, for example:
['time:11:22:20.000',
'instance:(null)',
'id:723927731576482920',
'channel:sip:confctl.com',
'type:control',
'elapsedtime:0.000631',
'level:info',
'operation:Init',
'message:Initialize (version 4.9.0002.30618) ... ']
You then partition each item on : to separate the identifier from the value, and feed the result into pd.DataFrame, e.g.:
df = pd.DataFrame(dict(col.partition(':')[::2] for col in row) for row in rows)
This gives you a df of:
channel elapsedtime id instance level message operation time type
0 sip:confctl.com 0.000631 723927731576482920 (null) info Initialize (version 4.9.0002.30618) ... Init 11:22:20.000 control
1 sip:confctl.com 0.067122 723927731576482920 Ops-MacBook-Pro.local info Connecting to https://hrpd.www.vivox.com/api2/ Connect 11:22:21.000 control
2 sip:confctl-.com 2.685700 723927731576482920 Ops-MacBook-Pro.local info Connected to https://hrpd.www.vivox.com/api2/ Connect 11:22:23.000 control
3 sip:confctl-.com 2.814268 723927731576482920 Ops-MacBook-Pro.local info Logged in .tester_food. Login 11:22:23.000 control
4 sip:confctl-.com 2.912255 723927731576482920 Ops-MacBook-Pro.local error .tester_food. failed to join sip:confctl-2@hrp... Call 11:22:23.000 control
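The steps above can be tried without the real command. The sketch below is only illustrative: it replaces the os.popen call with a literal string, raw, holding two lines abridged from the sample log (several fields are omitted for brevity), and includes a blank line to exercise the filter.

```python
import shlex
import pandas as pd

# Stand-in for os.popen(...).read(): two abridged log lines and one blank line.
raw = '''time:"11:22:20.000" instance:"(null)" id:"723927731576482920" type:"control" message:"Initialize (version 4.9.0002.30618) ... "

time:"12:30:41.000" instance:"Ops-MacBook-Pro.local" id:"10316899144153251411" type:"media" predictedmos:"3"
'''

# shlex.split keeps quoted text together and drops the quotes; blank lines are skipped.
rows = [shlex.split(line) for line in raw.splitlines() if line.strip()]

# partition(':') splits on the first colon only, so values such as 11:22:20.000
# stay intact; [::2] keeps (key, value) and drops the separator.
df = pd.DataFrame([dict(item.partition(':')[::2] for item in row) for row in rows])

print(df[['time', 'type', 'predictedmos']])
```

Because the control and media lines carry different keys, the resulting frame has the union of all keys as columns, with NaN where a line lacks a field. Note also that the second id, 10316899144153251411, is larger than the int64 maximum of 9223372036854775807, so it seems safer to keep ids as strings than to call astype(int) on them.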