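I'm working on speeding up my script and came across this answer: Iterrows Performance Issues. It says that iterrows is rarely needed.
I used iterrows in my code because it is simple and intuitive, but it is also very slow, so I'd like to replace the snippets that use it. Here are two examples I couldn't find a solution for. In both, the column values are datetimes in the format %Y-%m-%d %H:%M:%S: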
for index, row in df.iterrows():
    df.loc[index, 'Time_Between'] = row['Time_Begin'] + timedelta(seconds=row['Some_Integer_Seconds_In_A_Column'])
    df.loc[index, 'Time_Required'] = row['Time_End'] - timedelta(seconds=SomeIntegerSecondsAsAVariable)
    df.loc[index, 'Tota_Time'] = ((row['Time_Begin'] - row['Time_First']).total_seconds()) / 60

for index, row in df.iterrows():
    if row['Time_Required'] > row['Time_Between']:
        df.loc[index, 'Check'] = 0
    else:
        df.loc[index, 'Check'] = 1
How can I vectorize this? I tried masking and apply, but couldn't get anything to work. Most of the time I get: TypeError: Cannot change data-type for object array.
I don't know what to make of it...
Answer 0 (score: 2)
I think you can use:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Time_End': {0: pd.Timestamp('2015-11-15 00:00:00'), 1: pd.Timestamp('2015-10-18 00:00:00'), 2: pd.Timestamp('2015-10-17 00:00:00'), 3: pd.Timestamp('2015-10-16 00:00:00')}, 'Int_Sec': {0: 4, 1: 2, 2: 7, 3: 10}, 'Time_First': {0: pd.Timestamp('2015-10-15 00:00:00'), 1: pd.Timestamp('2015-10-15 00:00:00'), 2: pd.Timestamp('2015-12-15 00:00:00'), 3: pd.Timestamp('2015-12-15 00:00:00')}, 'Time_Begin': {0: pd.Timestamp('2015-10-15 10:00:00'), 1: pd.Timestamp('2015-10-15 12:00:00'), 2: pd.Timestamp('2015-12-15 10:00:00'), 3: pd.Timestamp('2015-12-15 10:00:00')}})
print (df)
   Int_Sec          Time_Begin   Time_End Time_First
0        4 2015-10-15 10:00:00 2015-11-15 2015-10-15
1        2 2015-10-15 12:00:00 2015-10-18 2015-10-15
2        7 2015-12-15 10:00:00 2015-10-17 2015-12-15
3       10 2015-12-15 10:00:00 2015-10-16 2015-12-15
Sec_Var = 20
df['Time_Between'] = df['Time_Begin'] + pd.to_timedelta(df['Int_Sec'], unit='s')
df['Time_Required'] = df['Time_End'] - pd.to_timedelta(Sec_Var, unit='s')
df['Tota_Time'] = ((df['Time_Begin'] - df['Time_First']).dt.total_seconds()) / 60
df['Check'] = np.where(df['Time_Required'] > df['Time_Between'], 0, 1)
print (df)
   Int_Sec          Time_Begin   Time_End Time_First        Time_Between  \
0        4 2015-10-15 10:00:00 2015-11-15 2015-10-15 2015-10-15 10:00:04
1        2 2015-10-15 12:00:00 2015-10-18 2015-10-15 2015-10-15 12:00:02
2        7 2015-12-15 10:00:00 2015-10-17 2015-12-15 2015-12-15 10:00:07
3       10 2015-12-15 10:00:00 2015-10-16 2015-12-15 2015-12-15 10:00:10

        Time_Required  Tota_Time  Check
0 2015-11-14 23:59:40      600.0      0
1 2015-10-17 23:59:40      720.0      0
2 2015-10-16 23:59:40      600.0      1
3 2015-10-15 23:59:40      600.0      1
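The vectorized arithmetic above relies on the time columns having a datetime64 dtype. If they are actually stored as strings (object dtype), which is only a guess at what might be behind the TypeError in the question, they would need converting first. A minimal sketch, assuming string input in the stated format; the frame `raw` and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical data: timestamps stored as plain strings (object dtype).
raw = pd.DataFrame({
    'Time_Begin': ['2015-10-15 10:00:00', '2015-10-15 12:00:00'],
    'Time_First': ['2015-10-15 00:00:00', '2015-10-15 00:00:00'],
})

# Convert each column once, using the format given in the question.
fmt = '%Y-%m-%d %H:%M:%S'
for col in ['Time_Begin', 'Time_First']:
    raw[col] = pd.to_datetime(raw[col], format=fmt)

# With datetime64 columns, the subtraction yields a timedelta Series,
# so the .dt accessor is available.
raw['Tota_Time'] = (raw['Time_Begin'] - raw['Time_First']).dt.total_seconds() / 60

print(raw.dtypes)
print(raw)
```

Checking `df.dtypes` before doing any arithmetic is a quick way to see whether this conversion is needed at all.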
Answer 1 (score: 0)
First one:
for index, row in df.iterrows():
    if row['Time_Required'] > row['Time_Between']:
        df.loc[index, 'Check'] = 0
    else:
        df.loc[index, 'Check'] = 1

# Vectorized equivalent: compare whole columns and assign the result once
# (requires: import numpy as np)
df['Check'] = np.where(df['Time_Required'] > df['Time_Between'], 0, 1)
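As a sanity check on the np.where approach, the sketch below compares it with a row-by-row loop and with a plain boolean comparison cast to int. The two-row frame `demo` is made up for illustration; the column names follow the question.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame using the column names from the question.
demo = pd.DataFrame({
    'Time_Required': pd.to_datetime(['2015-11-14 23:59:40', '2015-10-16 23:59:40']),
    'Time_Between': pd.to_datetime(['2015-10-15 10:00:04', '2015-12-15 10:00:07']),
})

# Row-by-row reference result, mirroring the original loop.
check_loop = [0 if row['Time_Required'] > row['Time_Between'] else 1
              for _, row in demo.iterrows()]

# Column-wise comparison with np.where.
later = demo['Time_Required'] > demo['Time_Between']
check_where = np.where(later, 0, 1)

# Same result without np.where: negate the boolean mask and cast to int.
check_bool = (~later).astype(int)

demo['Check'] = check_where
print(demo)
```

Negating the mask with `~` rather than switching to `<=` keeps the behaviour identical to the loop when a value is NaT, since every comparison against NaT evaluates to False.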