Updated sample CSV data:
c1,c2,v1,v2,p1,p2,r1,a1,f1,f2,f3,Time_Stamp
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:00
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:01
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:02
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:03
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:04
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:05
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:06
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:07
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:08
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:09
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:10
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:11
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:12
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:13
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:14
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:15
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:16
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:17
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:18
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:19
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:20
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:21
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:22
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:23
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:24
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:25
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:26
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:27
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:28
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:29
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:30
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:31
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:32
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:33
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:34
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:35
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:36
Edited - Python code for reading the CSV:
import numpy as np
from datetime import date, time, datetime
import pandas as pd

def readcsv(x):  # read a CSV file given its path
    # dayfirst=True because the timestamps are written as dd/mm/yyyy
    Data = pd.read_csv(x, parse_dates=['Time_Stamp'], dayfirst=True)
    Data['Date'] = Data.Time_Stamp.dt.date  # adds a Date column to the DataFrame (the .csv file itself is not modified)
    Data['Time'] = Data.Time_Stamp.dt.time  # adds a Time column to the DataFrame (the .csv file itself is not modified)
    #print(Data)  # <-- prints the output
    #print(Data['Time_Stamp'][6000:7000])  # <-- prints rows 6000 to 7000 (the DataFrame has over 150,000 rows)
    Data['Time_Stamp'] = pd.to_datetime(Data['Time_Stamp'], dayfirst=True)  # make sure Time_Stamp is a datetime column
    print(Data[1:6])
    return Data

Data = readcsv('data.csv')  # Data = contents of the csv file

def getMask(start, end, Data):
    mask = (Data['Time_Stamp'] > start) & (Data['Time_Stamp'] <= end)
    return mask

start = '2017-06-13 16:00:00'
end = '2017-06-13 16:40:00'
timerange = Data.loc[getMask(start, end, Data)]
pspike = timerange.loc[timerange['c1'] <= 5.0]
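Output of pspike:
For example, after printing pspike, one printed row has a time value of 16:38:15 and the next printed row has a time value of 16:38:17, which means the row with time value 16:38:16 was skipped, as shown below: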
13/06/2017 16:38:12
13/06/2017 16:38:13
13/06/2017 16:38:14
13/06/2017 16:38:15
13/06/2017 16:38:17
13/06/2017 16:38:18
After running the code below, it prints the skipped rows (Time_Stamp values only), with time values 16:38:16, 16:38:22 and 16:38:32, based on the pspike output:
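for i in range(pspike.shape[0] - 1):
    row1 = pspike.iloc[i]
    row2 = pspike.iloc[i + 1]
    # look up Time_Stamp by name; a gap of more than 1 s means a row was filtered out
    if (row2['Time_Stamp'] - row1['Time_Stamp']).total_seconds() > 1:
        print(row1['Time_Stamp'] + pd.Timedelta('1s'))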
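What I want is to print the entire rows whose Time_Stamp values are:
2017-06-13 16:38:16
2017-06-13 16:38:22
2017-06-13 16:38:32
These are the only rows whose c1 value exceeds 5.0; in this case (based on the sample data) it is 415.7.
Instead of:
13/06/2017 16:38:16
I want to print like this:
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:16
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:22
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:32
After printing those rows, I have to replace the c1 value above (415.7) directly with 0.0. How can I do that?
Edit
What to replace it with: 0.0, for the rows missing from the pspike output.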
Answer 0 (score: 1)
I'm a bit confused here, because you can simply do this:
pspike = timerange[timerange['c1'].gt(5.0)]  # gt = greater than, lt = less than
This returns a DataFrame containing:
16 415.7 12.5 30.2 154.6 4675.2 1 -1 5199.4 0 50 0 2017-06-13 16:38:16
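Set the value of column "c1" to 0.0:
pspike = pspike.assign(c1=0.0)  # returns a new frame, so no SettingWithCopyWarning on the filtered slice
Create a string from the first row (index = 0):
','.join(pspike.astype(str).values.tolist()[0])
Prints: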
'0.0,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,2017-06-13 16:38:16'
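When more than one row matches, the same comma-separated string can be produced for every row in one call. The following is only a minimal sketch: the two-row, three-column frame named sample is a hypothetical stand-in for the filtered pspike frame, not the real data, and it assumes DataFrame.to_csv with header=False and index=False is an acceptable way to format the rows.

```python
import pandas as pd

# Hypothetical stand-in for the filtered frame (reduced to three columns)
sample = pd.DataFrame({
    "c1": [415.7, 415.7],
    "c2": [12.5, 12.5],
    "Time_Stamp": pd.to_datetime(["2017-06-13 16:38:16", "2017-06-13 16:38:22"]),
})

# Replace c1 with 0.0, then render every row as one CSV line
sample = sample.assign(c1=0.0)
lines = sample.to_csv(
    header=False, index=False, date_format="%Y-%m-%d %H:%M:%S"
).splitlines()

for line in lines:
    print(line)
```

This avoids indexing into values.tolist() one row at a time; the join-based version above is still fine for a single row.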
Update
string = """c1,c2,v1,v2,p1,p2,r1,a1,f1,f2,f3,Time_Stamp
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:00
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:01
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:02
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:03
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:04
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:05
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:06
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:07
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:08
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:09
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:10
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:11
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:12
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:13
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:14
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:15
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:16
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:17
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:18
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:19
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:20
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:21
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:22
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:23
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:24
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:25
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:26
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:27
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:28
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:29
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:30
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:31
415.7,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,13/06/2017 16:38:32
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:33
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:34
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:35
0,2.3,0.6,-0.9,-0.5,1,-1,941.0,0,50,0,13/06/2017 16:38:36"""
import io
import pandas as pd

df = pd.read_csv(io.StringIO(string))  # reads the data from the string above
# For a CSV file (large data sets), use: df = pd.read_csv('filename.csv')
df["Time_Stamp"] = pd.to_datetime(df["Time_Stamp"], dayfirst=True)  # convert to datetime (dates are dd/mm/yyyy)
df_filter = df[df["c1"].le(5.0)]  # rows with c1 less than or equal to 5.0 (threshold from the question)
# A gap of more than 1 second between kept rows means a row was filtered out;
# the timestamp 1 s before the row that follows the gap is the one to look up.
gaps = df_filter["Time_Stamp"].diff().dt.total_seconds() > 1
where = df_filter.loc[gaps, "Time_Stamp"] - pd.Timedelta("1s")
df_filter2 = df[df["Time_Stamp"].isin(where)].copy()  # copy so the assignment below does not warn
df_filter2["c1"] = 0.0  # set c1 to 0.0
for index, row in df_filter2.iterrows():
    values = row.astype(str).tolist()
    print(','.join(values))
Returns:
0.0,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,2017-06-13 16:38:16
0.0,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,2017-06-13 16:38:22
0.0,12.5,30.2,154.6,4675.2,1,-1,5199.4,0,50,0,2017-06-13 16:38:32
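The code above sets c1 on a separate frame (df_filter2), so the original df still holds 415.7. If the goal is to replace the value directly in the original data, a boolean mask with .loc is one option. This is a sketch under stated assumptions: csv_text is a trimmed three-row, three-column sample made up for illustration, and it assumes the rows of interest can be identified by c1 > 5.0 directly, which matches the gap-based approach only when every missing timestamp corresponds to a spike row.

```python
import io
import pandas as pd

# Trimmed, illustrative sample (not the full file)
csv_text = """c1,c2,Time_Stamp
0,2.3,13/06/2017 16:38:15
415.7,12.5,13/06/2017 16:38:16
0,2.3,13/06/2017 16:38:17
"""

df = pd.read_csv(io.StringIO(csv_text))
df["Time_Stamp"] = pd.to_datetime(df["Time_Stamp"], format="%d/%m/%Y %H:%M:%S")

# Boolean mask of the spike rows (threshold taken from the question)
spike = df["c1"] > 5.0

# Print the full spike rows before changing anything
printed = df.loc[spike].to_csv(
    header=False, index=False, date_format="%Y-%m-%d %H:%M:%S"
).strip()
print(printed)

# Overwrite c1 in place in the original frame
df.loc[spike, "c1"] = 0.0
```

Because the assignment goes through df.loc on the original frame, no copy is involved and the change is visible in df itself.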