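Joining two dataframes whose key columns appear to have the same dtype raises "ValueError: You are trying to merge on object and int64 columns"

Time: 2019-10-13 03:56:45

Tags: python pandas

I want to join two dataframes, sessions1 and sessions2, on the field 'ga:dimension1'.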

sessions1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15775 entries, 0 to 15774
Data columns (total 9 columns):
ga:dimension1                15775 non-null object
ga:date                      15775 non-null object
ga:deviceCategory            15775 non-null object
ga:landingPagePath           15775 non-null object
ga:userType                  15775 non-null object
ga:operatingSystem           15775 non-null object
ga:operatingSystemVersion    15775 non-null object
ga:sessions                  15775 non-null int64
ga:bounces                   15775 non-null int64
dtypes: int64(2), object(7)
memory usage: 1.1+ MB
sessions2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15774 entries, 0 to 15773
Data columns (total 9 columns):
ga:dimension1         15774 non-null object
ga:source             15774 non-null object
ga:medium             15774 non-null object
ga:campaign           15774 non-null object
ga:adContent          15774 non-null object
ga:keyword            15774 non-null object
ga:channelGrouping    15774 non-null object
ga:sessions           15774 non-null int64
ga:bounces            15774 non-null int64
dtypes: int64(2), object(7)
memory usage: 1.1+ MB

Judging by the first few rows at least, the key values look the same:

sessions1.head()
            ga:dimension1   ga:date  ... ga:sessions ga:bounces
0  1567331564026.evxjzuot  20190901  ...           1          1
1  1567331572999.vtnsczsj  20190901  ...           1          1
2  1567331693070.fkdbmcj6  20190901  ...           1          1
3  1567335919816.ctz12xcl  20190901  ...           1          0
4  1567345181556.b3yowmbh  20190901  ...           1          1

sessions2.head()
            ga:dimension1 ga:source  ... ga:sessions ga:bounces
0  1567331564026.evxjzuot  (direct)  ...           1          1
1  1567331572999.vtnsczsj  (direct)  ...           1          1
2  1567331693070.fkdbmcj6  (direct)  ...           1          1
3  1567335919816.ctz12xcl  (direct)  ...           1          0
4  1567345181556.b3yowmbh  (direct)  ...           1          1

However, when I try this:

sessions_combined = sessions1.join(sessions2,
                                   on = 'ga:dimension1',
                                   how = 'left')

I get an error message:

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

Why is this, and how should I join the two dataframes?

1 Answer:

Answer 0: (score: 0)

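Use merge. DataFrame.join with on= matches the named column of sessions1 against the index of sessions2, not against its 'ga:dimension1' column. That index is the default RangeIndex (int64), so pandas sees an object key on one side and an int64 key on the other. merge with on= matches the column in both frames: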

sessions_combined = sessions1.merge(sessions2,
                                    on = 'ga:dimension1',
                                    how = 'left')
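
For illustration, here is a minimal sketch that reproduces the problem and shows two ways around it. The tiny frames below are hypothetical stand-ins for sessions1 and sessions2 (only a few of the real columns, two rows each), and the suffixes "_1" and "_2" are arbitrary names chosen here. Because both frames carry a 'ga:sessions' column, merge would otherwise rename the clashing columns with its default "_x"/"_y" suffixes, while join needs lsuffix/rsuffix to be given explicitly.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the frames in the question.
sessions1 = pd.DataFrame({
    "ga:dimension1": ["1567331564026.evxjzuot", "1567331572999.vtnsczsj"],
    "ga:date": ["20190901", "20190901"],
    "ga:sessions": [1, 1],
})
sessions2 = pd.DataFrame({
    "ga:dimension1": ["1567331564026.evxjzuot", "1567331572999.vtnsczsj"],
    "ga:source": ["(direct)", "(direct)"],
    "ga:sessions": [1, 1],
})

# join(on=...) compares the column in sessions1 with the *index* of
# sessions2 (a default integer RangeIndex), so the key dtypes clash.
try:
    sessions1.join(sessions2, on="ga:dimension1", how="left")
    join_error = None
except ValueError as exc:
    join_error = str(exc)

# Option 1: merge matches the 'ga:dimension1' column in both frames.
merged = sessions1.merge(
    sessions2,
    on="ga:dimension1",
    how="left",
    suffixes=("_1", "_2"),  # arbitrary suffixes for the shared 'ga:sessions'
)

# Option 2: keep join, but move the key into the index of sessions2 first.
joined = sessions1.join(
    sessions2.set_index("ga:dimension1"),
    on="ga:dimension1",
    how="left",
    lsuffix="_1",
    rsuffix="_2",
)

print(join_error)
print(merged)
print(joined)
```

Both options should give the same rows here. merge is usually the simpler choice when the key is an ordinary column in both frames, while set_index plus join is handy when the right-hand frame is already indexed by the key.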