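Hello, I am using pandas to import data from two Excel files; a sample of the data in one of the files is shown below. Basically, I am trying to find the timestamps that occur in both files and then sort the data from, for example, the "Power" column at those matching timestamps into bins. The bins in this example are 0-50, 50-100, and so on in steps of 50, up to, e.g., 1000.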
Location    UnitName  Timestamp            Power         Windspeed    Yaw
Bull Creek  F10       01/11/2014 00:00:00  7,563641548   3,957911002  280,5478821
Bull Creek  F10       01/11/2014 00:20:00  60,73444748   4,24157236   280,4075012
Bull Creek  F10       01/11/2014 00:30:00  63,15441132   4,241089859  280,3903809
Bull Creek  F10       01/11/2014 00:40:00  59,09280396   4,38904965   280,4152527
Bull Creek  F10       01/11/2014 00:50:00  69,26197052   4,374599175  280,3750916
Bull Creek  F10       01/11/2014 01:00:00  101,0624237   5,343887005  280,5173035
Bull Creek  F10       01/11/2014 01:10:00  122,7936935   5,183885235  280,4681702
Bull Creek  F10       01/11/2014 01:20:00  86,57110596   5,046733923  280,3834534
Bull Creek  F10       01/11/2014 01:40:00  16,74042702   3,024427626  280,1408386
Bull Creek  F10       01/11/2014 01:50:00  12,5870142    2,931351769  280,1185913
Bull Creek  F10       01/11/2014 02:00:00  -1,029753685  3,116549245  279,9686279
Bull Creek  F10       01/11/2014 02:10:00  13,35998058   3,448055706  279,8687134
Bull Creek  F10       01/11/2014 02:20:00  17,42461395   2,943588415  280,1383057
Bull Creek  F10       01/11/2014 02:30:00  -9,614940643  2,744164819  280,6514893
Bull Creek  F10       01/11/2014 02:50:00  -11,01966286  3,554833538  283,1451416
Bull Creek  F10       01/11/2014 03:00:00  -4,383010387  4,279259377  283,3281555
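I would like to know whether there is a smarter way to do this than what I have done so far, because the bin size and the maximum value may change. Here is my code; it works, but it is not very smart.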
import pandas as pd

# Read both workbooks (sheet_name replaces the older sheetname keyword)
fileREF = 'FilterDataREF.xlsx'
dataREF = pd.read_excel(fileREF, sheet_name='Sheet1')
filePCU = 'FilterDataPCU.xlsx'
dataPCU = pd.read_excel(filePCU, sheet_name='Ark1')

dateREF = dataREF['Timestamp']
datePCU = dataPCU['Timestamp']

# Bin width and upper power limit
n = 50
PowerLim = 1500
nBins = PowerLim/n
bins = range(0, PowerLim+1, n)

# Compare every row of one file with every row of the other
for i in range(len(dataREF)):
    for j in range(len(dataPCU)):
        if (dataREF['Timestamp'][i] == dataPCU['Timestamp'][j] and
                dataREF['Power'][i] > 0 and dataPCU['Power'][j] > 0):
            data_common = [dataREF.loc[i], dataPCU.loc[j]]
            data_power = [data_common[0]['Power'], data_common[1]['Power']]
            power_dif = data_common[1]['Power']-data_common[0]['Power']
            power_REF = data_power[:][0]
            power_PCU = data_power[:][1]

# One hand-written line per bin
bin1 = power_REF[power_REF < 50]
bin2 = power_REF[power_REF > 50 and power_REF < 100]
bin3 = power_REF[power_REF > 100 and power_REF < 150]
Answer 0 (score: 0)
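You can use the pd.cut function. Assuming data_common is a DataFrame whose power_REF column holds the matched power values, build the bin edges once and label each bin by its lower edge (so there is one fewer label than there are edges):
edges = list(range(0, int(data_common['power_REF'].max()) + 51, 50))
data_common['bin'] = pd.cut(data_common['power_REF'], bins=edges, labels=edges[:-1])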
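To show how this could fit together with the timestamp matching, here is a minimal sketch, not a definitive implementation. It assumes the nested loop can be replaced by an inner merge on the Timestamp column. The function name bin_common_power, its parameters bin_width and power_limit, the suffixed column names Power_REF and Power_PCU, and the two small DataFrames are all made up for illustration; in practice the frames would come from pd.read_excel.

```python
import pandas as pd


def bin_common_power(data_ref, data_pcu, bin_width=50, power_limit=1500):
    """Match rows by Timestamp, keep positive power, and bin the REF power.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # An inner merge keeps only timestamps that occur in both frames.
    common = pd.merge(
        data_ref[["Timestamp", "Power"]],
        data_pcu[["Timestamp", "Power"]],
        on="Timestamp",
        suffixes=("_REF", "_PCU"),
    )
    # Keep rows where both power readings are positive, as in the question.
    positive = (common["Power_REF"] > 0) & (common["Power_PCU"] > 0)
    common = common[positive].copy()
    common["power_dif"] = common["Power_PCU"] - common["Power_REF"]

    # Bin edges 0, 50, 100, ... up to power_limit; each bin is labelled
    # by its lower edge, so there is one fewer label than edges.
    edges = list(range(0, power_limit + 1, bin_width))
    common["bin"] = pd.cut(common["Power_REF"], bins=edges, labels=edges[:-1])
    return common


# Tiny made-up frames standing in for the two Excel sheets.
ref = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2014-11-01 00:00", "2014-11-01 00:20", "2014-11-01 01:00"]
    ),
    "Power": [7.56, 60.73, 101.06],
})
pcu = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2014-11-01 00:20", "2014-11-01 01:00", "2014-11-01 01:10"]
    ),
    "Power": [65.0, -3.0, 120.0],
})

result = bin_common_power(ref, pcu)
# Mean power difference per bin; observed=True leaves out empty bins.
summary = result.groupby("bin", observed=True)["power_dif"].mean()
print(summary)
```

Changing the bin size or the maximum then only means passing different bin_width and power_limit values. Power values above the last edge are not assigned to any bin (pd.cut returns NaN for them), so power_limit should be at least as large as the highest expected reading.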