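I'm having trouble calculating the average time between payment dates in a CSV file. I have tried several approaches I found online (converting to data.table, using ddply) without success.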
WorkerID PaymentDate
1 2015-07-18
1 2015-08-18
3 2015-09-18
4 2015-10-18
4 2015-11-18
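This is a sample of my dataset. I would like to calculate the average time between PaymentDate values (in days) in the simplest way possible, grouped by WorkerID. Thanks!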
Answer 0: (score: 0)
This is a perfect job for aggregate(). It groups PaymentDate by WorkerID and applies the function mean(diff(.)) to each group.
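The aggregate() approach described above could look roughly like the following. This is a minimal sketch rather than the answerer's original code: the data frame name `df`, the result name `res`, the `MeanDays` column name, the explicit sort, and the `as.numeric()` conversion are all assumptions added for illustration.

```r
# Sample data from the question
df <- read.table(text = "WorkerID PaymentDate
1 2015-07-18
1 2015-08-18
3 2015-09-18
4 2015-10-18
4 2015-11-18", header = TRUE, stringsAsFactors = FALSE)
df$PaymentDate <- as.Date(df$PaymentDate)

# Assumption: sort by worker and date so diff() measures consecutive payments
df <- df[order(df$WorkerID, df$PaymentDate), ]

# Group PaymentDate by WorkerID and apply mean(diff(.)) to each group;
# as.numeric() turns the dates into day counts so the result is a plain number
res <- aggregate(PaymentDate ~ WorkerID, data = df,
                 FUN = function(x) mean(diff(as.numeric(x))))
names(res)[2] <- "MeanDays"  # hypothetical column name
print(res)
```

With the sample data, workers 1 and 4 each have a mean gap of 31 days. Worker 3 has only one payment, so diff() returns an empty vector and the mean is NaN.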
Answer 1: (score: 0)
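As an alternative to AkselA's answer, if you prefer data.table over base R, you can use the data.table package.
This is similar to using aggregate, but it can sometimes be faster. In the example below, I handle workers with a single payment by setting the difference to 0, to illustrate how that can be done.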
library(lubridate)
library(data.table)
df <- fread("WorkerID PaymentDate
1 2015-07-18
1 2015-08-18
3 2015-09-18
4 2015-10-18
4 2015-11-18")
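# Make sure PaymentDate is stored as a Date
df[, PaymentDate := as.Date(PaymentDate)]
# Mean gap in days per worker; workers with a single payment get 0
df[, {
  if (length(PaymentDate) > 1) {
    mean(diff(as.numeric(PaymentDate)))
  } else {
    0
  }
}, by = WorkerID]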