Broadcasting two DataFrames

Time: 2019-06-25 09:17:07

Tags: python pandas datetime

I have 2 DataFrames, as shown below:

The first DataFrame, data

             2019-06-19     2019-06-20     2019-06-21     2019-06-22     2019-06-23     2019-06-24     2019-06-25
 currency                                                                                                         
BCH          485.424079     485.424079      57.574609      57.559609      57.559609      57.559609      57.559609
BTC          202.204572     256.085103     197.291801     177.359726     177.359726     177.359726     252.859726
BTG         4065.370000    4065.370000    4065.370000    4065.370000    4065.370000    4065.370000    4065.370000
ETC        40001.000000   40001.000000   40001.000000   40001.000000   40001.000000   40001.000000       0.000000
ETH         4092.917231    4092.917231    1497.655594    1497.655594    1497.655594    1497.655594    1497.655594

The second DataFrame, sys_bal

created_at  2019-06-19  2019-06-20  2019-06-21  2019-06-22  2019-06-23  2019-06-24  2019-06-25
 currency                                                                                      
1WO            1997308     1996908     1996908     1996908     1996908     1996908     1996908
ABX             241444      241444      241444      241444      241444      241444      241444
ADH            5981797     5981797     5981797     5981797     5981797     5981797     5981797
ALX             385466      385466      385466      385466      385466      385466      385466
AMLT           4749604     4749604     4749604     4687869     4687869     4687869     4687869
BCH               4547        4547        4483        4463        4465        4467        4403
BRC            1231312     1231312     1231312     1231312     1231312     1231312     1231142
BTC               7366        7342        7287        7307        8292        8635        7772
BTRN          15236038    15236038    15236038    15236038    15236038    15236233    15236233

I tried to add the two together with pos_bal = sys_bal + data. They are the same size, but I get an error.

Error:

pos_bal = sys_bal + data
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/ops.py", line 1547, in f
other = _align_method_FRAME(self, other, axis)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/ops.py", line 1481, in _align_method_FRAME
right = to_series(right)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/ops.py", line 1456, in to_series
given_len=len(right)))
ValueError: Unable to coerce to Series, length must be 7: given 2

I also printed the dtypes of both DataFrames and got the following:

First DataFrame:

2019-06-19    float64
2019-06-20    float64
2019-06-21    float64
2019-06-22    float64
2019-06-23    float64
2019-06-24    float64
2019-06-25    float64
dtype: object

Second DataFrame:

   created_at
0  2019-06-19    int64
   2019-06-20    int64
   2019-06-21    int64
   2019-06-22    int64
   2019-06-23    int64
   2019-06-24    int64
   2019-06-25    int64
 dtype: object

Output of data.info():

<class 'pandas.core.frame.DataFrame'>
Index: 12 entries, BCH to XRP
Data columns (total 7 columns):
2019-06-20    12 non-null float64
2019-06-21    12 non-null float64
2019-06-22    12 non-null float64
2019-06-23    12 non-null float64
2019-06-24    12 non-null float64
2019-06-25    12 non-null float64
2019-06-26    12 non-null float64
dtypes: float64(7)
memory usage: 768.0+ bytes
None

Output of sys_bal.info():

<class 'pandas.core.frame.DataFrame'>
 Index: 126 entries, 1WO to ZPR
 Data columns (total 7 columns):
 2019-06-20    126 non-null int64
 2019-06-21    126 non-null int64
 2019-06-22    126 non-null int64
 2019-06-23    126 non-null int64
 2019-06-24    126 non-null int64
 2019-06-25    126 non-null int64
 2019-06-26    126 non-null int64
 dtypes: int64(7)
 memory usage: 7.9+ KB
 None

2 Answers:

Answer 0: (score: 0)

import pandas as pd
# Small sample frames; the values are kept as strings here
data = pd.DataFrame({'currency': ['BCH', 'BTC'], '2019-06-19': ['485.424079', '202.204572'], '2019-06-20': ['485.424079', '256.085103']})
sys_bal = pd.DataFrame({'currency': ['1WO', 'ABX'], '2019-06-19': ['1997308', '241444'], '2019-06-20': ['1996908', '241444']})

Edit: if you get 'dict' object has no attribute 'set_index', it means you are not working with DataFrames as I assumed. Try converting your data first:

data=pd.DataFrame.from_dict(data)
sys_bal=pd.DataFrame.from_dict(sys_bal)
data=data.set_index('currency')
sys_bal=sys_bal.set_index('currency')

df=pd.concat([data,sys_bal])
print(df)
         2019-06-19   2019-06-20
currency                        
BCH       485.424079  485.424079
BTC       202.204572  256.085103
1WO          1997308     1996908
ABX           241444      241444

This should work for you. If it does not, take a closer look at your DataFrames: I noticed that sys_bal has an additional header name, created_at.
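One way to check for that extra header is to compare the number of column levels on each frame. The sketch below is only an illustration: the two small frames are hypothetical stand-ins built to mirror the dtypes output in the question (an unnamed outer level holding 0 above a level named created_at), not the asker's actual data.

```python
import pandas as pd

dates = ["2019-06-19", "2019-06-20"]

# Hypothetical stand-in for sys_bal: an unnamed outer column level (0)
# above a level named "created_at", as the dtypes output suggests.
sys_bal = pd.DataFrame(
    [[1997308, 1996908], [4547, 4547]],
    index=pd.Index(["1WO", "BCH"], name="currency"),
    columns=pd.MultiIndex.from_product([[0], dates], names=[None, "created_at"]),
)

# Hypothetical stand-in for data: plain, single-level columns.
data = pd.DataFrame(
    [[485.424079, 485.424079]],
    index=pd.Index(["BCH"], name="currency"),
    columns=dates,
)

# Compare the column structure of the two frames.
print(sys_bal.columns.nlevels)      # number of column levels in sys_bal
print(data.columns.nlevels)         # number of column levels in data
print(list(sys_bal.columns.names))  # names of the column levels in sys_bal
```

If the two frames report a different number of column levels, their column labels cannot line up one to one until the extra level is removed.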

Answer 1: (score: 0)

The problem is that the columns of the second DataFrame are a MultiIndex, so you first need to remove the extra level with Index.droplevel:

# sys_bal (the second DataFrame) carries the extra column level, so drop it there
sys_bal.columns = sys_bal.columns.droplevel(0)
pos_bal = sys_bal + data
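A minimal end-to-end sketch of this approach follows. The frames are small hypothetical stand-ins for the ones in the question. The last step, DataFrame.add with fill_value=0, is not part of the answer above; it is an optional extra that may be useful because the two frames do not contain the same currencies, and a plain + leaves NaN for any currency present in only one of them.

```python
import pandas as pd

dates = ["2019-06-19", "2019-06-20"]

# Hypothetical stand-in for data: single-level columns, float values.
data = pd.DataFrame(
    {"2019-06-19": [485.424079, 202.204572],
     "2019-06-20": [485.424079, 256.085103]},
    index=pd.Index(["BCH", "BTC"], name="currency"),
)

# Hypothetical stand-in for sys_bal: two-level columns (0, date).
sys_bal = pd.DataFrame(
    [[1997308, 1996908], [4547, 4547], [7366, 7342]],
    index=pd.Index(["1WO", "BCH", "BTC"], name="currency"),
    columns=pd.MultiIndex.from_product([[0], dates], names=[None, "created_at"]),
)

# Drop the outer column level so both frames have plain date columns.
sys_bal.columns = sys_bal.columns.droplevel(0)

# Label-aligned addition: currencies missing from one frame become NaN.
pos_bal = sys_bal + data

# Optional variant: treat a missing currency as 0 instead of NaN.
pos_bal_filled = sys_bal.add(data, fill_value=0)

print(pos_bal)
print(pos_bal_filled)
```

Whether NaN or 0 is the right result for a currency that appears in only one frame depends on what the balances mean, so the choice between + and add(..., fill_value=0) is left to the reader.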