I want to join two DataFrames, sessions1 and sessions2, on the field 'ga:dimension1'.
sessions1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15775 entries, 0 to 15774
Data columns (total 9 columns):
ga:dimension1 15775 non-null object
ga:date 15775 non-null object
ga:deviceCategory 15775 non-null object
ga:landingPagePath 15775 non-null object
ga:userType 15775 non-null object
ga:operatingSystem 15775 non-null object
ga:operatingSystemVersion 15775 non-null object
ga:sessions 15775 non-null int64
ga:bounces 15775 non-null int64
dtypes: int64(2), object(7)
memory usage: 1.1+ MB
sessions2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15774 entries, 0 to 15773
Data columns (total 9 columns):
ga:dimension1 15774 non-null object
ga:source 15774 non-null object
ga:medium 15774 non-null object
ga:campaign 15774 non-null object
ga:adContent 15774 non-null object
ga:keyword 15774 non-null object
ga:channelGrouping 15774 non-null object
ga:sessions 15774 non-null int64
ga:bounces 15774 non-null int64
dtypes: int64(2), object(7)
memory usage: 1.1+ MB
Here are the first few rows of each:
sessions1.head()
ga:dimension1 ga:date ... ga:sessions ga:bounces
0 1567331564026.evxjzuot 20190901 ... 1 1
1 1567331572999.vtnsczsj 20190901 ... 1 1
2 1567331693070.fkdbmcj6 20190901 ... 1 1
3 1567335919816.ctz12xcl 20190901 ... 1 0
4 1567345181556.b3yowmbh 20190901 ... 1 1
sessions2.head()
ga:dimension1 ga:source ... ga:sessions ga:bounces
0 1567331564026.evxjzuot (direct) ... 1 1
1 1567331572999.vtnsczsj (direct) ... 1 1
2 1567331693070.fkdbmcj6 (direct) ... 1 1
3 1567335919816.ctz12xcl (direct) ... 1 0
4 1567345181556.b3yowmbh (direct) ... 1 1
However, when I try this:
sessions_combined = sessions1.join(sessions2,
on = 'ga:dimension1',
how = 'left')
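I get an error message:
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
Why does this happen, and how should I join the two DataFrames?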
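Answer 0 (score: 0)
Use merge. DataFrame.join with on= matches the 'ga:dimension1' column of sessions1 against the index of sessions2, not against its 'ga:dimension1' column. sessions2 has a default RangeIndex (int64), so pandas is asked to match object keys against int64 keys and raises the ValueError. merge with on= matches the column of that name in both frames: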
sessions_combined = sessions1.merge(sessions2,
on = 'ga:dimension1',
how = 'left')
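As a rough illustration of the difference, here is a minimal sketch on toy data. The rows, the reduced set of columns, and the "_s1"/"_s2" suffixes are assumptions made for the example, not values from the question. It shows join failing against the integer index, merge succeeding on the column, and join working once the key is moved into the index of sessions2.

```python
import pandas as pd

# Toy stand-ins for the two frames in the question (values are made up).
sessions1 = pd.DataFrame({
    "ga:dimension1": ["a.1", "b.2", "c.3"],
    "ga:date": ["20190901", "20190901", "20190901"],
    "ga:sessions": [1, 1, 1],
})
sessions2 = pd.DataFrame({
    "ga:dimension1": ["a.1", "b.2", "c.3"],
    "ga:source": ["(direct)", "(direct)", "google"],
    "ga:sessions": [1, 1, 1],
})

# join(on=...) compares sessions1["ga:dimension1"] (strings) with the
# index of sessions2, which is a default integer RangeIndex.
try:
    sessions1.join(sessions2, on="ga:dimension1", how="left")
    join_error = None
except ValueError as exc:
    join_error = exc

# merge(on=...) compares the column of that name in both frames.
# Both frames also share "ga:sessions", so suffixes keep the two apart
# (the suffix names here are arbitrary).
merged = sessions1.merge(
    sessions2, on="ga:dimension1", how="left", suffixes=("_s1", "_s2")
)

# join also works once the key becomes the index of the right frame.
joined = sessions1.join(
    sessions2.set_index("ga:dimension1"),
    on="ga:dimension1",
    how="left",
    lsuffix="_s1",
    rsuffix="_s2",
)

print(join_error)
print(merged)
```

Because both frames in the question contain 'ga:sessions' and 'ga:bounces', a plain merge on 'ga:dimension1' will return those columns twice, with the default '_x' and '_y' suffixes, unless suffixes are given or the duplicates are dropped from one frame first.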