I have this code:
gg=df_met[['Less','Middle','Greater']].resample('h').mean()
Filtered_mean=Filtered[['Conc']].resample('h').mean()
result = pd.concat([Filtered_mean, gg], axis=1, join_axes=[df1.index])
Reduced_result=result.dropna(axis=0,how='any')
gg is a DataFrame that looks like this:
Less Middle Greater
Date
2004-02-27 00:00:00 0.000000 1.000000 0.000000
2004-02-27 01:00:00 0.000000 1.000000 0.000000
2004-02-27 02:00:00 0.000000 1.000000 0.000000
2004-02-27 03:00:00 0.083333 0.916667 0.000000
2004-02-27 04:00:00 0.583333 0.416667 0.000000
2004-02-27 05:00:00 0.083333 0.916667 0.000000
2004-02-27 06:00:00 0.666667 0.333333 0.000000
2004-02-27 07:00:00 0.750000 0.250000 0.000000
2004-02-27 08:00:00 0.250000 0.750000 0.000000
2004-02-27 09:00:00 1.000000 0.000000 0.000000
2004-02-27 10:00:00 0.250000 0.750000 0.000000
2004-02-27 11:00:00 1.000000 0.000000 0.000000
2004-02-27 12:00:00 0.916667 0.083333 0.000000
2004-02-27 13:00:00 0.000000 1.000000 0.000000
2004-02-27 14:00:00 0.000000 1.000000 0.000000
2004-02-27 15:00:00 0.000000 1.000000 0.000000
2004-02-27 16:00:00 0.000000 1.000000 0.000000
2004-02-27 17:00:00 0.000000 1.000000 0.000000
2004-02-27 18:00:00 0.000000 1.000000 0.000000
2004-02-27 19:00:00 0.083333 0.916667 0.000000
2004-02-27 20:00:00 0.000000 0.500000 0.500000
2004-02-27 21:00:00 0.000000 0.000000 1.000000
2004-02-27 22:00:00 0.000000 0.000000 1.000000
2004-02-27 23:00:00 0.000000 0.000000 1.000000
2004-02-28 00:00:00 0.000000 0.666667 0.333333
2004-02-28 01:00:00 0.000000 0.833333 0.166667
2004-02-28 02:00:00 0.000000 0.166667 0.833333
2004-02-28 03:00:00 0.000000 0.000000 1.000000
2004-02-28 04:00:00 0.000000 0.000000 1.000000
2004-02-28 05:00:00 0.000000 0.000000 1.000000
etc.
Filtered_mean is:
Conc
2004-02-27 15:00 30.166667
2004-02-27 16:00 24.218182
2004-02-27 17:00 44.781818
2004-02-27 18:00 15.200000
2004-02-27 19:00 33.490000
2004-02-27 20:00 17.100000
2004-02-27 21:00 15.470000
2004-02-27 22:00 13.100000
2004-02-27 23:00 17.736364
2004-02-28 00:00 19.225000
2004-02-28 01:00 9.760000
2004-02-28 02:00 2.737500
2004-02-28 03:00 4.175000
2004-02-28 04:00 2.990000
2004-02-28 05:00 4.983333
2004-02-28 06:00 3.370000
2004-02-28 07:00 2.983333
2004-02-28 08:00 3.508333
2004-02-28 09:00 2.641667
2004-02-28 10:00 4.916667
2004-02-28 11:00 7.100000
2004-02-28 12:00 11.609091
2004-02-28 13:00 5.540000
2004-02-28 14:00 3.025000
2004-02-28 15:00 5.127273
2004-02-28 16:00 11.660000
2004-02-28 17:00 5.833333
2004-02-28 18:00 8.183333
2004-02-28 19:00 -0.158333
2004-02-28 20:00 6.575000
When I concatenate them, I get:
Conc Less Middle Greater
Date
2004-02-27 15:00 30.166667 NaN NaN NaN
2004-02-27 15:00 30.166667 NaN NaN NaN
2004-02-27 15:00 30.166667 NaN NaN NaN
2004-02-27 16:00 24.218182 NaN NaN NaN
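I think this is because the index of Filtered_mean is an integer:
dtype='int64', length=34342, freq='H')
while the index of gg is a datetime:
dtype='datetime64[ns]', name='Date', length=42479, freq='H')
If that is the cause, how do I convert the index of one frame to match the other?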
Full code:
import pandas as pd
import datetime as dt
import io
import numpy as np
names=['Date','Wind Speed','Wind Direction']
df2 = pd.read_csv('Met_12_13.csv', index_col=0, names=names, parse_dates=[0])
df_met=df2
df_met.insert(2,'Less','Nan')
df_met.insert(3,'Middle','Nan')
df_met.insert(4,'Greater','Nan')
for line in df2:
    flag1=(df2['Wind Speed']<4)
    flag1=flag1.astype(int)
    flag2=(df2['Wind Speed']>=4 ) & (df2['Wind Speed']<=10)
    flag2=flag2.astype(int)
    flag3=(df2['Wind Speed']>10)
    flag3=flag3.astype(int)
df_met['Less']=flag1
df_met['Middle']=flag2
df_met['Greater']=flag3
aethalometer=['Date','Chanel0','Chanel1','Chanel2','Chanel3','Chanel4','Chanel5','Chanel6','Chanel7']
#df1=pd.read_csv('result.txt', index_col=0,sep='\n', names=aethalometer, parse_dates=[0])
df1 = pd.read_csv('Ath_12_13.csv', sep=',', names=aethalometer ) #Spirows=1
df1['Date'] = pd.to_datetime(df1['Date'], errors='coerce')
for y in range (0,6):
    x=y+1
    df1[aethalometer[x]]= pd.to_numeric(df1[aethalometer[x]], errors='coerce')
    df1=df1[df1[aethalometer[x]]>-250]
    df1=df1[df1[aethalometer[x]]<500]
df1['Date'] = pd.to_datetime(df1['Date'], errors='coerce')
df1.index
print(len(df1))
#df1 = pd.read_csv(io.StringIO('Output14.csv'), parse_dates=[0], names=['Date','A','B','C','D','E','F','G', 'H'])
#df_mean = df1[['Conc']].resample('h').mean()
print("here")
#df1.index = df1.index.to_period('h')
df_met['per'] = df_met.index.to_period('h')
#df_mean.index=df_mean.index.to_period('h')
#print(len(df_mean))
pers = df_met.loc[(df2['Wind Direction'] > 340) | (df_met['Wind Direction'] < 12) , 'per'].unique()
print (pers)
print("here")
#%%
Filtered=df1.drop(pers)
#del Filtered['Date']
a=Filtered['Chanel1']
a.index = pd.to_datetime(a.index, errors='coerce')
b=Filtered['Chanel2']
b.index = pd.to_datetime(b.index, errors='coerce')
c=Filtered['Chanel3']
c.index = pd.to_datetime(c.index, errors='coerce')
d=Filtered['Chanel4']
d.index = pd.to_datetime(d.index, errors='coerce')
e=Filtered['Chanel5']
e.index = pd.to_datetime(e.index, errors='coerce')
f=Filtered['Chanel0']
f.index = pd.to_datetime(f.index, errors='coerce')
g=Filtered['Chanel7']
g.index = pd.to_datetime(g.index, errors='coerce')
a=a.resample('h').mean()
a_median=a.resample('h').median() #This is how you would make it median
b=b.resample('h').mean()
c=c.resample('h').mean()
d=d.resample('h').mean()
e=e.resample('h').mean()
f=f.resample('h').mean()
g = pd.to_numeric(g, errors='coerce')
g=g.resample('h').mean()
Series=pd.concat([a,b,c,d,e,f,g],join='outer',axis=1)
gg=df_met[['Less','Middle','Greater']].resample('h').mean()
result_mean = pd.concat([Series, gg], axis=1, join_axes=[gg.index])
Reduced_result_mean=result_mean.dropna(axis=0,how='any')
Reduced_result_mean.to_csv("Final2012-13.csv")
Answer 0 (score: 2)
Use:
filtered_mean.reset_index(inplace=True)
filtered_mean['date']=pd.to_datetime(filtered_mean['date'])
filtered_mean.set_index('date',inplace=True)
Now filtered_mean and gg should both have a datetime index.
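To illustrate the idea end to end, here is a minimal sketch, not the asker's actual pipeline. The frames `raw` and `gg` below are small made-up stand-ins for the question's data. It assumes the timestamps live in a column called 'Date', as they do in df1. Note also that `join_axes` was removed from `pd.concat` in pandas 1.0, so the sketch uses `reindex` to pick the rows instead.

```python
import pandas as pd

# Hypothetical stand-in for df1: a default integer index and a 'Date' column
# that holds the timestamps as strings.
raw = pd.DataFrame({
    "Date": ["2004-02-27 15:05", "2004-02-27 15:35", "2004-02-27 16:10"],
    "Conc": [30.0, 32.0, 24.0],
})

# Hypothetical stand-in for gg: hourly data that already has a DatetimeIndex.
gg = pd.DataFrame(
    {"Less": [0.0, 0.0], "Middle": [1.0, 1.0], "Greater": [0.0, 0.0]},
    index=pd.DatetimeIndex(["2004-02-27 15:00", "2004-02-27 16:00"], name="Date"),
)

# Move the timestamps into the index before resampling, so the hourly means
# are keyed by datetime rather than by row number.
raw["Date"] = pd.to_datetime(raw["Date"], errors="coerce")
filtered_mean = raw.set_index("Date")[["Conc"]].resample("h").mean()

# Both indexes are datetime now, so concat aligns the rows by timestamp.
# reindex(gg.index) plays the role that join_axes=[gg.index] used to play.
result = pd.concat([filtered_mean, gg], axis=1).reindex(gg.index)
reduced_result = result.dropna(axis=0, how="any")
print(reduced_result)
```

If the two frames still fail to line up after this, comparing `filtered_mean.index.dtype` with `gg.index.dtype` is a quick way to check whether one of them is still integer-indexed.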