Using pandas

Time: 2016-08-24 14:09:45

Tags: python pandas dataframe merge

I have two DataFrames:

df_energy.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 34673 entries, 1 to 43228
Data columns (total 6 columns):
TIMESTAMP        34673 non-null datetime64[ns]
P_ACT_KW         34673 non-null float64
PERIODE_TARIF    34673 non-null object
P_SOUSCR         34673 non-null float64
SITE             34673 non-null object
TARIF            34673 non-null object
dtypes: datetime64[ns](1), float64(2), object(3)
memory usage: 1.9+ MB

and df1:

df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38840 entries, 0 to 38839
Data columns (total 7 columns):
TIMESTAMP                 38840 non-null datetime64[ns]
ACT_TIME_AERATEUR_1_F1    38696 non-null float64
ACT_TIME_AERATEUR_1_F3    38697 non-null float64
ACT_TIME_AERATEUR_1_F5    38695 non-null float64
ACT_TIME_AERATEUR_1_F6    38695 non-null float64
ACT_TIME_AERATEUR_1_F7    38693 non-null float64
ACT_TIME_AERATEUR_1_F8    38696 non-null float64
dtypes: datetime64[ns](1), float64(6)
memory usage: 2.1 MB

I tried to merge these two DataFrames on the TIMESTAMP column:

merged_df_energy = pd.merge(df_energy.set_index('TIMESTAMP'), 
                     df1,
                     right_index=True,
                     left_index =True)

But I got this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-190-34cd0916eb6a> in <module>()
      2                      df1,
      3                      right_index=True,
----> 4                      left_index =True)
      5 merged_df_energy.info()

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\tools\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator)
     37                          right_index=right_index, sort=sort, suffixes=suffixes,
     38                          copy=copy, indicator=indicator)
---> 39     return op.get_result()
     40 if __debug__:
     41     merge.__doc__ = _merge_doc % '\nleft : DataFrame'

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\tools\merge.py in get_result(self)
    215                 self.left, self.right)
    216 
--> 217         join_index, left_indexer, right_indexer = self._get_join_info()
    218 
    219         ldata, rdata = self.left._data, self.right._data

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\tools\merge.py in _get_join_info(self)
    337         if self.left_index and self.right_index:
    338             join_index, left_indexer, right_indexer = \
--> 339                 left_ax.join(right_ax, how=self.how, return_indexers=True)
    340         elif self.right_index and self.how == 'left':
    341             join_index, left_indexer, right_indexer = \

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\tseries\index.py in join(self, other, how, level, return_indexers)
   1072         this, other = self._maybe_utc_convert(other)
   1073         return Index.join(this, other, how=how, level=level,
-> 1074                           return_indexers=return_indexers)
   1075 
   1076     def _maybe_utc_convert(self, other):

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\indexes\base.py in join(self, other, how, level, return_indexers)
   2480             this = self.astype('O')
   2481             other = other.astype('O')
-> 2482             return this.join(other, how=how, return_indexers=return_indexers)
   2483 
   2484         _validate_join_method(how)

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\indexes\base.py in join(self, other, how, level, return_indexers)
   2493             else:
   2494                 return self._join_non_unique(other, how=how,
-> 2495                                              return_indexers=return_indexers)
   2496         elif self.is_monotonic and other.is_monotonic:
   2497             try:

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\indexes\base.py in _join_non_unique(self, other, how, return_indexers)
   2571         left_idx, right_idx = _get_join_indexers([self.values],
   2572                                                  [other._values], how=how,
-> 2573                                                  sort=True)
   2574 
   2575         left_idx = com._ensure_platform_int(left_idx)

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\tools\merge.py in _get_join_indexers(left_keys, right_keys, sort, how)
    544 
    545     # get left & right join labels and num. of levels at each location
--> 546     llab, rlab, shape = map(list, zip(* map(fkeys, left_keys, right_keys)))
    547 
    548     # get flat i8 keys from label lists

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\tools\merge.py in _factorize_keys(lk, rk, sort)
    718     if sort:
    719         uniques = rizer.uniques.to_array()
--> 720         llab, rlab = _sort_labels(uniques, llab, rlab)
    721 
    722     # NA group

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\tools\merge.py in _sort_labels(uniques, left, right)
    741     uniques = Index(uniques).values
    742 
--> 743     sorter = uniques.argsort()
    744 
    745     reverse_indexer = np.empty(len(sorter), dtype=np.int64)

pandas\tslib.pyx in pandas.tslib._Timestamp.__richcmp__ (pandas\tslib.c:18619)()

TypeError: Cannot compare type 'Timestamp' with type 'int'

Can you help me resolve this problem?

Thank you

2 Answers:

Answer 0: (score: 2)

Could you try this and tell me the output? It should work:

merged_inner = pd.merge(left=df_energy, right=df1, 
                       left_on='TIMESTAMP', right_on='TIMESTAMP')
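
A likely reading of the traceback: in the question, only df_energy was re-indexed by TIMESTAMP, while df1 kept its default integer RangeIndex, so the index-on-index join ended up comparing Timestamp values with ints. The sketch below uses small made-up frames (the values are hypothetical; only the column names come from the question) to show two ways of keeping the keys consistent: merging on the column, or indexing both frames by TIMESTAMP first.

```python
import pandas as pd

# Hypothetical toy data standing in for df_energy and df1
df_energy = pd.DataFrame(
    {
        "TIMESTAMP": pd.to_datetime(
            ["2016-01-01 00:00", "2016-01-01 00:10", "2016-01-01 00:20"]
        ),
        "P_ACT_KW": [10.0, 12.5, 11.0],
    },
    index=[1, 2, 4],  # non-contiguous integer index, as in the question
)
df1 = pd.DataFrame(
    {
        "TIMESTAMP": pd.to_datetime(
            ["2016-01-01 00:00", "2016-01-01 00:10", "2016-01-01 00:30"]
        ),
        "ACT_TIME_AERATEUR_1_F1": [5.0, None, 7.0],
    }
)

# Option A: merge on the shared column, as in this answer
merged_on_column = pd.merge(
    left=df_energy, right=df1, left_on="TIMESTAMP", right_on="TIMESTAMP"
)

# Option B: keep the index-based merge, but index BOTH frames by TIMESTAMP
merged_on_index = pd.merge(
    df_energy.set_index("TIMESTAMP"),
    df1.set_index("TIMESTAMP"),
    left_index=True,
    right_index=True,
)

print(merged_on_column)
print(merged_on_index)
```

Both options perform an inner join by default, so only timestamps present in both frames are kept.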

Answer 1: (score: 2)

Try this:

import pandas

result = pandas.merge(df_energy, df1, on='TIMESTAMP')

If you want to save it:

result.to_csv(path_or_buf='result.csv', sep=',')

Or to check the columns:

result_fields = result.columns.tolist()
print(result_fields)
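
Since the two frames in the question have different row counts, it may also be worth checking how many timestamps actually match. The following is a minimal sketch on made-up data (the values are hypothetical) that uses the `how` and `indicator` parameters of `pandas.merge` to label each row by where it came from.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question
df_energy = pd.DataFrame(
    {
        "TIMESTAMP": pd.to_datetime(
            ["2016-01-01 00:00", "2016-01-01 00:10", "2016-01-01 00:20"]
        ),
        "P_ACT_KW": [10.0, 12.5, 11.0],
    }
)
df1 = pd.DataFrame(
    {
        "TIMESTAMP": pd.to_datetime(
            ["2016-01-01 00:00", "2016-01-01 00:10", "2016-01-01 00:30"]
        ),
        "ACT_TIME_AERATEUR_1_F1": [5.0, 6.0, 7.0],
    }
)

# An outer merge keeps unmatched rows from both sides; indicator=True adds
# a "_merge" column with the values "both", "left_only" or "right_only".
check = pd.merge(df_energy, df1, on="TIMESTAMP", how="outer", indicator=True)

matched = int((check["_merge"] == "both").sum())
only_energy = int((check["_merge"] == "left_only").sum())
only_df1 = int((check["_merge"] == "right_only").sum())
print(matched, only_energy, only_df1)
```

The default inner merge shown in the answer keeps only the rows counted as `matched` here.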