I have a dataset with 6250 rows and 275 columns.
Suppose I have a column like this:
Time
0.0
0.0
0.0
18:56.5
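I am trying to print the first non-zero value in this column as the start time.
I tried several times without any result.
I tried the following: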
df = pd.read_clipboard(sep=',')
t = record['Time'].iloc[0]
t1 = record['Time'].iloc[1]
t2 = record['Time'].iloc[2]
t3 = record['Time'].iloc[3]
if t != 0:
    print('Start_time: ', t)
elif t1 != 0:
    print('Start_time: ', t1)
elif t2 != 0:
    print('Start_time: ', t2)
else:
    print('Start_time: ', t3)
It shows no error, but the result only shows the value of t. It does not go through the conditions.
I also tried a comparison such as t1 <= 0.
It says:
TypeError: '<=' not supported between instances of 'str' and 'int'
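Answer 0 (score: 0)
Loop with if statements
In an if/elif/else chain, once any one condition is satisfied the remaining conditions are not checked. That is why the code above shows only the value of t. Because the column holds strings, '0.0' != 0 is always True, so the first branch fires every time. You have to check every value with its own if statement.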
for i in range(4):
    t = record['Time'].iloc[i]
    # t is a str (hence the TypeError you mentioned). float('18:56.5') would
    # raise ValueError, so compare against the string '0.0' instead of converting.
    if str(t) != '0.0':
        print("At t{}, start_time : {}".format(i, t))
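Condition in pandas
record[record['Time'] != '0.0']  # compare with the string '0.0', since the column holds strings
It will show the records you want.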
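As a minimal sketch of the pandas condition above applied to the original goal (printing the first non-zero value as the start time): the `record` frame below is a hypothetical stand-in built from the four sample values in the question, and the column is assumed to hold strings, as the TypeError suggests.

```python
import pandas as pd

# Hypothetical stand-in for the asker's `record` frame; the values are strings.
record = pd.DataFrame({'Time': ['0.0', '0.0', '0.0', '18:56.5']})

# A string never equals the integer 0, so this is True even for '0.0'.
# This is why the first `if` branch in the question always fires.
first_branch_fires = record['Time'].iloc[0] != 0

# Compare against the string '0.0' to build a boolean mask instead.
non_zero = record.loc[record['Time'] != '0.0', 'Time']

# Take the first surviving value, or None when every entry is '0.0'.
start_time = non_zero.iloc[0] if not non_zero.empty else None
print('Start_time:', start_time)
```

The mask keeps the original index labels, so `non_zero.index[0]` also tells you which row the start time came from.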
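Answer 1 (score: 0)
See if this works for you. It only works when the time format uses colons instead of a dot, so I changed 18:56.5 to 18:56:5 (a colon in place of the dot after 56).
# Values that do not match the format (such as '0.0') become NaT and are then filled with 0.
df['Time'] = pd.to_datetime(df['Time'], format='%H:%M:%S', errors='coerce').fillna(0)
for index, row in df.iterrows():
    if row['Time'] != 0:
        print(row['Time'].strftime("%H:%M:%S"))
    else:
        pass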
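If editing the data by hand is not practical, one possible alternative is to parse the values as durations. This is only a sketch under an assumption the question does not confirm: that a value such as 18:56.5 means minutes:seconds.fraction. The `Elapsed` column name is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Time': ['0.0', '0.0', '0.0', '18:56.5']})

# Assumption: values containing a colon are minutes:seconds.fraction.
colon_rows = df['Time'].str.contains(':')

# Prefix '00:' so those strings read as HH:MM:SS.f, then parse them as durations.
as_hms = '00:' + df.loc[colon_rows, 'Time']
parsed = pd.to_timedelta(as_hms)

# Assignment aligns on the index; rows without a colon become NaT.
df['Elapsed'] = parsed

# The first parsed value is the start time, expressed as a Timedelta.
start = df['Elapsed'].dropna().iloc[0]
print('Start_time:', start)
```

A Timedelta can be compared and sorted numerically, which avoids the str versus int comparison error from the question.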
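Answer 2 (score: 0)
Using if, elif and else to walk through a pandas DataFrame is a bad idea because it is not efficient.
Use vectorized operations on DataFrames instead, for example a != '0.0' comparison. See Pandas: Essential basic functionality.
0.0 and 13:15.3 are not a time format; they are of object or str type, so t1 <= 0 will not work. pd.to_datetime will not work either.
import pandas as pd
df = pd.DataFrame({'Time': ['0.0', '0.0', '0.0', '18:56.5', '0.0', '0.0', '07:45.4'],
                   'Time2': ['0.0', '13:15.3', '0.0', '17:03.0', '0.0', '0.0', '07:45.4']})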
Create a dict of DataFrames:
Add each filtered column to df_dict. The keys are the column names in df, and the values are all entries that are != '0.0'.
df_dict = dict()
for x in df.columns:
    df_dict[x] = df[x][df[x] != '0.0']  # .reset_index(drop=True) can be added here
df_dict
Output:
df_dict['Time']
3    18:56.5
6    07:45.4
Name: Time, dtype: object
df_dict['Time2']
1    13:15.3
3    17:03.0
6    07:45.4
Name: Time2, dtype: object
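Combine into a single DataFrame:
df_all = pd.concat([v for _, v in df_dict.items()], axis=1)
Adding .reset_index(drop=True) in the loop solves the index number problem. Shorter columns in df_all will then be padded with NaN values at the end.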
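Putting the steps of this answer together, with .reset_index(drop=True) applied inside the loop, might look like the sketch below. It reuses the sample df from this answer; the dict comprehension is just a compact form of the same loop.

```python
import pandas as pd

df = pd.DataFrame({'Time': ['0.0', '0.0', '0.0', '18:56.5', '0.0', '0.0', '07:45.4'],
                   'Time2': ['0.0', '13:15.3', '0.0', '17:03.0', '0.0', '0.0', '07:45.4']})

# Keep the non-'0.0' entries of each column and renumber them from 0.
df_dict = {col: df[col][df[col] != '0.0'].reset_index(drop=True)
           for col in df.columns}

# concat aligns the columns on the new 0..n index; shorter columns get NaN at the end.
df_all = pd.concat(list(df_dict.values()), axis=1)
print(df_all)
```

Here Time has two surviving values and Time2 has three, so the last row of the Time column is NaN.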