Matching a dataframe against another dataframe with isin in Pandas

Time: 2017-01-11 11:48:13

Tags: python pandas

I have 2 dataframes:

local_PC_user_filer_OpCode_sum:

   client_op  clienthostid  eventSum     feeling  usersidid
0       5030             1         1       Happy          5
1       5030             1         2         Mad          5
2       5030             1         8        Sick          6
3       5030             3         9  GoingCrazy          8

df_old_enough_users:

    client_op   clienthostid    eventSum    filerid timestamp   usersidid
0   5030              1             1           1     1/11/2015    5

Now, what I'm trying to do is get all rows of local_PC_user_filer_OpCode_sum whose ['usersidid', 'clienthostid'] values match those in df_old_enough_users, so what I expect to find is:

      client_op  clienthostid  eventSum    feeling       usersidid
0       5030             1         1        Happy          5

I tried doing this with isin:

local_PC_user_filer_OpCode_sum[local_PC_user_filer_OpCode_sum.clienthostid.isin(df_old_enough_users.loc[:,['usersidid','clienthostid']])].reset_index(drop=True)

But I get an empty dataframe :( What am I doing wrong, and is there a (better) way to do what I need?

Thanks,

2 Answers:

Answer 0: (score: 2)

You can use join:

cols = ['usersidid', 'clienthostid']
a = local_PC_user_filer_OpCode_sum.set_index(cols)
print (df_old_enough_users.join(a, on=cols, lsuffix='_x')[local_PC_user_filer_OpCode_sum.columns].reset_index(drop=True))

   client_op  clienthostid  eventSum  filerid feeling  usersidid
0       5030             1         1        1   Happy          5
1       5030             1         2        1     Mad          5

The isin solution doesn't work because, with isin, both the columns and the index of the two DataFrames need to match.
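If you still want an isin-style filter, one option is to test whole (usersidid, clienthostid) pairs rather than a single column. The following is only a sketch, assuming the sample data from the question; the short names `local` and `old` are hypothetical stand-ins for the two dataframes, and pd.MultiIndex.from_frame requires a reasonably recent pandas.

```python
import pandas as pd

# Sample data rebuilt from the question (names `local` and `old` are illustrative).
local = pd.DataFrame({
    "client_op": [5030, 5030, 5030, 5030],
    "clienthostid": [1, 1, 1, 3],
    "eventSum": [1, 2, 8, 9],
    "feeling": ["Happy", "Mad", "Sick", "GoingCrazy"],
    "usersidid": [5, 5, 6, 8],
})
old = pd.DataFrame({
    "client_op": [5030],
    "clienthostid": [1],
    "eventSum": [1],
    "filerid": [1],
    "timestamp": ["1/11/2015"],
    "usersidid": [5],
})

cols = ["usersidid", "clienthostid"]

# Turn the key columns of each frame into a MultiIndex of (usersidid, clienthostid) pairs.
local_keys = pd.MultiIndex.from_frame(local[cols])
old_keys = pd.MultiIndex.from_frame(old[cols])

# MultiIndex.isin compares whole pairs and returns a boolean array aligned with `local`.
mask = local_keys.isin(old_keys)
matched = local[mask].reset_index(drop=True)
print(matched)
```

With this sample data, every row of `local` whose key pair is (5, 1) is kept, so both the Happy and the Mad rows come back.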

Answer 1: (score: 1)

If you're interested in a variation on @jezrael's answer, this may give you a cleaner solution.

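import pandas as pd

# The right frame is reduced to the key columns, so it acts purely as a filter.
# The final selection uses a list of column names (double brackets) from the left frame.
df = pd.merge(local_PC_user_filer_OpCode_sum,
              df_old_enough_users[['usersidid','clienthostid']],
              on=['usersidid','clienthostid'],
              how="right")[["client_op", "clienthostid", "eventSum", "feeling", "usersidid"]]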

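df will have exactly the columns of the original local_PC_user_filer_OpCode_sum dataframe, and the returned rows will be only those whose keys appear in the right table that you use as a filter.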
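A self-contained sketch of the merge-as-filter idea, assuming the question's sample data under the hypothetical short names `local` and `old`. Two things here are my own additions rather than part of the answer above: the filter frame deliberately contains a repeated key pair, and the keys are de-duplicated before an inner merge so that repeated keys on the right do not multiply the matching rows.

```python
import pandas as pd

# Sample data modelled on the question (names are illustrative).
local = pd.DataFrame({
    "client_op": [5030, 5030, 5030, 5030],
    "clienthostid": [1, 1, 1, 3],
    "eventSum": [1, 2, 8, 9],
    "feeling": ["Happy", "Mad", "Sick", "GoingCrazy"],
    "usersidid": [5, 5, 6, 8],
})
# The (5, 1) pair is repeated on purpose to show why de-duplication matters.
old = pd.DataFrame({
    "usersidid": [5, 5],
    "clienthostid": [1, 1],
})

cols = ["usersidid", "clienthostid"]

# Keep each key pair once so the merge filters rows instead of duplicating them.
keys = old[cols].drop_duplicates()

# Inner merge keeps only the rows of `local` whose key pair appears in `keys`.
filtered = local.merge(keys, on=cols, how="inner")
print(filtered)
```

Because the right-hand frame contains only the key columns, the result keeps the columns of `local` and no column selection is needed afterwards.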