Creating a new table with Python

Time: 2019-06-30 17:34:01

Tags: python dataframe

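I am trying to extract data from a CNC machine.

Events occur every millisecond, and I need to filter out some of the variables, which are separated by a pipe ("|") delimiter. The log file is generated by PuTTY.exe.

I tried reading it with pandas, but the columns end up in different positions on each row.

df = pd.read_table('data.log', sep='|')

Part of the log file looks like this.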

=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2019.05.24 19:47:51 =~=~=~=~=~=~=~=~=~=~=~=
2019-05-24T22:47:50.894Z|message||PLACA ABERTA-ESQ
2019-05-24T22:47:50.894Z|avail|AVAILABLE|part_count|0|SspeedOvr|50|Fovr|100|tool_id|100|program|51.51|program_comment|UNAVAILABLE|line|0|block|O0051(C1S-LADO2)|path_feedrate|0|path_position|13.9260000000 0.0000000000 5.0000000000|active_axes|X Z C|mode|AUTOMATIC
2019-05-24T22:47:50.894Z|servo|NORMAL||||
2019-05-24T22:47:50.894Z|comms|NORMAL||||
2019-05-24T22:47:50.894Z|logic|NORMAL||||
2019-05-24T22:47:50.894Z|motion|NORMAL||||
2019-05-24T22:47:50.894Z|system|NORMAL||||
2019-05-24T22:47:50.894Z|execution|STOPPED|f_command|0|estop|ARMED|Xact|-182.561|Xload|20
2019-05-24T22:47:50.894Z|Xtravel|NORMAL||||
2019-05-24T22:47:50.894Z|Xoverheat|NORMAL||||
2019-05-24T22:47:50.894Z|Xservo|NORMAL||||
2019-05-24T22:47:50.894Z|Zact|-297.913|Zload|8
2019-05-24T22:47:50.894Z|Ztravel|NORMAL||||
2019-05-24T22:47:50.894Z|Zoverheat|NORMAL||||
2019-05-24T22:47:50.894Z|Zservo|NORMAL||||
2019-05-24T22:47:50.894Z|Cact|0|Cload|0
2019-05-24T22:47:50.894Z|Ctravel|NORMAL||||
2019-05-24T22:47:50.894Z|Coverheat|NORMAL||||
2019-05-24T22:47:50.894Z|Cservo|NORMAL||||
2019-05-24T22:47:50.894Z|S1speed|0|S1load|0
2019-05-24T22:47:50.894Z|S1servo|NORMAL||||
2019-05-24T22:47:50.894Z|S2speed|0|S2load|0
2019-05-24T22:47:50.894Z|S2servo|NORMAL||||
2019-05-24T22:47:51.261Z|S2load|1
2019-05-24T22:47:51.712Z|Zload|9|S2load|0
2019-05-24T22:47:53.056Z|line|650|block|N630G21G40G90G95|path_feedrate|14142|path_position|37.9260000000 0.0000000000 17.0000000000|execution|ACTIVE|Xact|-158.561|Xload|88|Zact|-285.913|Zload|60
2019-05-24T22:47:53.497Z|block|N650G28U0W0|path_position|187.2590000000 0.0000000000 91.6670000000|Xact|-9.228|Xload|49|Zact|-211.246|Zload|20
2019-05-24T22:47:53.932Z|path_feedrate|10000|path_position|196.4870000000 0.0000000000 166.3330000000|Xact|0|Xload|43|Zact|-136.58|Zload|17
2019-05-24T22:47:54.428Z|path_position|196.4870000000 0.0000000000 246.3330000000|Xload|38|Zact|-56.58|Zload|14
2019-05-24T22:47:54.892Z|tool_id|101|path_feedrate|0|path_position|196.4870000000 0.0000000000 302.9130000000|Zact|0|Zload|40
2019-05-24T22:47:55.360Z|line|680|block|N680G92S2500M4|f_command|25|Xload|36|Zload|5|S1speed|402|S1load|110
2019-05-24T22:47:55.852Z|line|690|block|N690G0X68Z5.8M8|path_feedrate|10000|path_position|68.0000000000 0.0000000000 222.9130000000|Xact|-128.487|Xload|64|Zact|-80|Zload|17|S1speed|701|S1load|5
2019-05-24T22:47:56.348Z|path_position|68.0000000000 0.0000000000 142.9130000000|Xload|20|Zact|-160|Zload|16|S1load|2
2019-05-24T22:47:56.812Z|path_position|68.0000000000 0.0000000000 62.9130000000|Xload|21|Zact|-240|Zload|19|S1speed|700
2019-05-24T22:47:57.308Z|path_feedrate|0|path_position|68.0000000000 0.0000000000 5.8000000000|Zact|-297.113|Zload|21|S1speed|701
2019-05-24T22:47:57.772Z|line|700|block|N700G75X-2R1Z0.2P35000Q800F0.25|path_feedrate|180|path_position|65.3420000000 0.0000000000 5.8000000000|Xact|-131.145|Xload|12|Zload|10|S1speed|733|S1load|3
2019-05-24T22:47:58.268Z|path_feedrate|189|path_position|62.3680000000 0.0000000000 5.8000000000|Xact|-134.119|Xload|13|S1speed|768
2019-05-24T22:47:58.704Z|path_feedrate|199|path_position|59.4610000000 0.0000000000 5.8000000000|Xact|-137.026|Xload|15|Zload|9|S1speed|806|S1load|4
2019-05-24T22:47:59.199Z|path_feedrate|209|path_position|56.1810000000 0.0000000000 5.8000000000|Xact|-140.306|Xload|16|Zload|10|S1speed|854|S1load|5
2019-05-24T22:47:59.665Z|path_feedrate|223|path_position|52.6980000000 0.0000000000 5.8000000000|Xact|-143.789|Zload|9|S1speed|915
2019-05-24T22:48:00.188Z|path_feedrate|241|path_position|48.7150000000 0.0000000000 5.8000000000|Xact|-147.772|Xload|12|S1speed|985|S1load|6
2019-05-24T22:48:00.681Z|path_feedrate|263|path_position|44.6650000000 0.0000000000 5.8000000000|Xact|-151.822|Xload|14|Zload|10|S1speed|1077|S1load|7
2019-05-24T22:48:01.148Z|path_feedrate|288|path_position|40.2160000000 0.0000000000 5.8000000000|Xact|-156.271|Xload|16|S1speed|1208|S1load|10
2019-05-24T22:48:01.641Z|path_feedrate|312|path_position|35.3040000000 0.0000000000 5.8000000000|Xact|-161.183|Xload|14|S1speed|1246|S1load|2
2019-05-24T22:48:02.109Z|path_position|30.3130000000 0.0000000000 5.8000000000|Xact|-166.174|Xload|15|Zload|9|S1speed|1248|S1load|3
2019-05-24T22:48:02.573Z|path_position|25.3230000000 0.0000000000 5.8000000000|Xact|-171.164|Xload|11|Zload|10
2019-05-24T22:48:03.040Z|path_position|20.6660000000 0.0000000000 5.8000000000|Xact|-175.821|Zload|9|S1load|2
2019-05-24T22:48:03.481Z|path_position|16.0080000000 0.0000000000 5.8000000000|Xact|-180.479|Xload|15

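I need to filter each row by date and time, and pick out the variables and their values to build a new table in a ".csv" file.

The variables I need are: date and time, Xload, Zload, S1load and S1speed.

I don't know how to read this file and create a new table containing only the variables I need.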

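2 Answers:

Answer 0: (score: 0)

This should get you started. Read the log, skip the header, and split each line on the pipe character, which gives you a list of lists: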

import csv

# Read the log and split every line on the pipe character
with open("data.log", "r", newline="") as file:
    csvreader = csv.reader(file, delimiter='|')
    next(csvreader)  # skip the PuTTY header line
    csvFile = list(csvreader)

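You then need to pick the column values you want out of each row (where they exist). Finally, although the log already appears to be in order, you can sort csvFile by passing a key to the sorted() function; for details see "i need to sort a python list of lists by date". For example:

from datetime import datetime, timedelta, timezone
csvFile = sorted(csvFile, key=lambda x: datetime.strptime(x[0], "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone(timedelta(0))))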
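The "pick the column values you want" step could look like the following minimal sketch. It assumes, as the log sample suggests, that every data row is a timestamp followed by alternating name|value fields; `WANTED` and `pick_fields` are illustrative names that do not appear in the answer above.

```python
# Hypothetical helper: names below are illustrative, not from the original answer.
WANTED = ("Xload", "Zload", "S1load", "S1speed")

def pick_fields(row, wanted=WANTED):
    """Return a dict with the timestamp plus any wanted name/value pairs in one split row."""
    record = {"DateTime": row[0]}
    # Assumption: fields after the timestamp come as name, value, name, value, ...
    for name, value in zip(row[1::2], row[2::2]):
        if name in wanted:
            record[name] = value
    return record

row = ("2019-05-24T22:47:55.360Z|line|680|block|N680G92S2500M4|f_command|25"
       "|Xload|36|Zload|5|S1speed|402|S1load|110").split("|")
print(pick_fields(row))
```

Variables that are absent from a row are simply missing from the returned dict, so the caller decides how to represent them (for example as an empty CSV cell).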

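Answer 1: (score: 0)

First, we read the file line by line, split each line, and store the result. We assume that the value of "Xload" and of each other parameter immediately follows its name.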

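import pandas as pd

# Split every line of the log on the pipe character
data = []
with open('data.log', 'r') as file:
    for row in file:
        data.append(row.rstrip('\n').split('|'))

columns = ['DateTime', 'Xload', 'Zload', 'S1load', 'S1speed']

data_dic = []
for row in data:
    tmp = {}
    tmp['DateTime'] = row[0]  # the timestamp is always the first field
    # A wanted name at position i has its value at position i + 1
    for i in range(1, len(row) - 1):
        if row[i] in columns:
            tmp[row[i]] = row[i + 1]
    for c in columns:
        if c not in tmp:
            tmp[c] = ''  # rows that do not contain this variable
    data_dic.append(tmp)

# Passing columns keeps the column order fixed
df = pd.DataFrame(data_dic, columns=columns)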

Delete the first line (the PuTTY header) from data.log beforehand; this can also be done programmatically.
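One way to drop the header programmatically is to filter lines while reading, as in this small sketch. The test used here (the line contains a pipe and does not start with "=~") is an assumption based on the sample log, and `is_data_line` is a hypothetical name.

```python
def is_data_line(line):
    """True for lines that look like 'timestamp|name|value...' records."""
    # Assumption: PuTTY header lines start with "=~" and data lines contain a pipe.
    return "|" in line and not line.startswith("=~")

lines = [
    "=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2019.05.24 19:47:51 =~=~=~=~=~=~=~=~=~=~=~=\n",
    "2019-05-24T22:47:51.261Z|S2load|1\n",
    "\n",
]
data = [line.rstrip("\n").split("|") for line in lines if is_data_line(line)]
print(data)  # [['2019-05-24T22:47:51.261Z', 'S2load', '1']]
```

Filtering every line, rather than skipping only the first one, also copes with blank lines and with a log that happens to contain more than one header.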

Sorting by DateTime does not require any extra library. The timestamps are already in ISO format, so they can be compared directly as strings.
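A quick check of that claim, using three timestamps taken from the log: sorting the raw strings gives the same order as sorting the parsed datetimes. This holds as long as every timestamp has the same fixed-width layout and the same "Z" suffix, which is an assumption about the rest of the file; `parse` is an illustrative helper.

```python
from datetime import datetime

stamps = [
    "2019-05-24T22:47:53.056Z",
    "2019-05-24T22:47:51.712Z",
    "2019-05-24T22:48:00.188Z",
]

def parse(ts):
    """Parse one log timestamp into a (naive) datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")

by_string = sorted(stamps)               # plain lexicographic order
by_datetime = sorted(stamps, key=parse)  # chronological order
print(by_string == by_datetime)
```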

sorted_dic = sorted(data_dic, key=lambda x:x['DateTime'])

Moreover, the input data will always be in chronological order already, so sorting is not actually needed.
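Putting the pieces together, the whole job (read the log, keep the wanted variables, write a ".csv") can be sketched with only the standard library. This is one possible arrangement under stated assumptions, not the method of either answer: fields are assumed to alternate name|value after the timestamp, rows containing none of the wanted variables are dropped (a design choice), and `log_to_csv` and `COLUMNS` are hypothetical names.

```python
import csv
import io

COLUMNS = ["DateTime", "Xload", "Zload", "S1load", "S1speed"]

def log_to_csv(lines, out):
    """Write one CSV row per log line that carries at least one wanted variable."""
    # restval="" leaves an empty cell for variables missing from a row
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    written = 0
    for line in lines:
        fields = line.rstrip("\r\n").split("|")
        if len(fields) < 3 or line.startswith("=~"):
            continue  # PuTTY header, blank line, or too short to hold a pair
        record = {"DateTime": fields[0]}
        # Assumption: name, value, name, value, ... after the timestamp
        for name, value in zip(fields[1::2], fields[2::2]):
            if name in COLUMNS[1:]:
                record[name] = value
        if len(record) > 1:  # design choice: skip rows with no wanted variable
            writer.writerow(record)
            written += 1
    return written

sample = [
    "=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2019.05.24 19:47:51 =~=~=~=~=~=~=~=~=~=~=~=\n",
    "2019-05-24T22:47:50.894Z|servo|NORMAL||||\n",
    "2019-05-24T22:47:55.360Z|line|680|block|N680G92S2500M4|f_command|25"
    "|Xload|36|Zload|5|S1speed|402|S1load|110\n",
    "2019-05-24T22:47:56.812Z|path_position|68.0000000000 0.0000000000 62.9130000000"
    "|Xload|21|Zact|-240|Zload|19|S1speed|700\n",
]
buf = io.StringIO()
count = log_to_csv(sample, buf)
print(buf.getvalue())
```

To work on real files, the same function can be given two open file objects, for example `open("data.log")` as the input and `open("out.csv", "w", newline="")` as the output, where "out.csv" is a placeholder name. If a DataFrame is preferred, the records could instead be collected in a list and handed to `pd.DataFrame(records, columns=COLUMNS).to_csv("out.csv", index=False)`.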