How do I compare the values of a list with a DataFrame column in a nested for loop in Python?

Asked: 2017-01-04 14:03:13

Tags: python pandas

I have a list:

c=['8','24','17','12','15']

and a DataFrame that has 'id' as its index:

    email             first name    last name     company     address
id
 8  adi@xyz.com       adi           dominic       kpmg        farrer park
21  kendric@xyz.com   kendrick      lamar         adidas      boon keng
14  keisha@xyz.com    keisha        jinga         lenovo      chinatown

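I am trying to build a nested for loop that compares the elements of the list c with those of the DataFrame active_users. But when I compare them with a nested for loop and an if statement, I get a KeyError. This is the code I am using: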

for user_id in c:
    for ids in active_users['id']:
        if user_id == ids:
            print(user_id)

I think the problem is that 'id' is the index of the 'active_users' DataFrame rather than a column. Below is the complete code, including the function:

 def get_inactive_users(self):
    df = self.all_sent.set_index('created')
    now = dt.datetime.now()
    a=(df.resample(now.strftime('W')).sender_user_id.unique()
         .reset_index(name='user_ids'))
    c=a['user_ids'].iloc[-1]
    e=list(set(self.active_user_ids) - set(c))
    for user_id in c:
        for ids in self.active_users['id']:
            if user_id==ids:
                print(user_id)
    return e

 cd = ClientData(1)
 a=cd.get_inactive_users()
 a

Here is the stack trace:

KeyError                                  Traceback (most recent call last)
C:\Users\aditya\Anaconda3\lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
   1944             try:
-> 1945                 return self._engine.get_loc(key)
   1946             except KeyError:

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()

KeyError: 'id'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-48-e5f62de3e239> in <module>()
      1 cd = ClientData(1)
----> 2 a=cd.get_inactive_users()
      3 a

<ipython-input-47-25d3329cdfd2> in get_inactive_users(self)
    315         e=list(set(self.active_user_ids) - set(c))
    316         for user_id in c:
--> 317             for ids in self.active_users['id']:
    318                 if user_id==ids:
    319                     print(user_id)

C:\Users\aditya\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   1995             return self._getitem_multilevel(key)
   1996         else:
-> 1997             return self._getitem_column(key)
   1998 
   1999     def _getitem_column(self, key):

C:\Users\aditya\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   2002         # get column
   2003         if self.columns.is_unique:
-> 2004             return self._get_item_cache(key)
   2005 
   2006         # duplicate columns & possible reduce dimensionality

C:\Users\aditya\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1348         res = cache.get(item)
   1349         if res is None:
-> 1350             values = self._data.get(item)
   1351             res = self._box_item_values(item, values)
   1352             cache[item] = res

C:\Users\aditya\Anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   3288 
   3289             if not isnull(item):
-> 3290                 loc = self.items.get_loc(item)
   3291             else:
   3292                 indexer = np.arange(len(self.items))[isnull(self.items)]

C:\Users\aditya\Anaconda3\lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
   1945                 return self._engine.get_loc(key)
   1946             except KeyError:
-> 1947                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   1948 
   1949         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()

KeyError: 'id'

1 Answer:

Answer 0 (score: 0)

UPDATE:

In [113]: df
Out[113]:
              email first name last name company      address
id
8       adi@xyz.com        adi   dominic    kpmg  farrer park
21  kendric@xyz.com   kendrick     lamar  adidas    boon keng
14   keisha@xyz.com     keisha     jinga  lenovo    chinatown

In [114]: c = [8,24,17]

In [115]: df[df.index.isin(c)]
Out[115]:
          email first name last name company      address
id
8   adi@xyz.com        adi   dominic    kpmg  farrer park

In [116]: df.query('index in @c')
Out[116]:
          email first name last name company      address
id
8   adi@xyz.com        adi   dominic    kpmg  farrer park
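
To connect this to the KeyError in the question: `active_users['id']` looks for a column labelled 'id', but 'id' here is the name of the index, so the lookup fails. The following is a minimal sketch on a made-up stand-in for `active_users` (only a few of the columns are reproduced, and the variable names `raised`, `matches_loop`, `ids_as_column` and `matches_vectorized` are illustrative, not from the question). It shows two ways to reach the ids and the vectorized form that can replace the nested loop.

```python
import pandas as pd

# Hypothetical stand-in for the question's active_users frame.
active_users = pd.DataFrame(
    {
        "id": [8, 21, 14],
        "email": ["adi@xyz.com", "kendric@xyz.com", "keisha@xyz.com"],
        "company": ["kpmg", "adidas", "lenovo"],
    }
).set_index("id")

c = [8, 24, 17]

# 'id' is the index name, not a column label, so [] lookup raises KeyError.
try:
    active_users["id"]
    raised = False
except KeyError:
    raised = True

# Option 1: test membership against the index directly.
matches_loop = [user_id for user_id in c if user_id in active_users.index]

# Option 2: turn the index back into an ordinary column first.
ids_as_column = active_users.reset_index()["id"]

# Vectorized equivalent of the nested loop.
matches_vectorized = active_users.index[active_users.index.isin(c)].tolist()

print(raised, matches_loop, matches_vectorized)
```

Either option avoids the KeyError; the `isin` form also avoids the quadratic cost of looping over both collections.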

Old answer:

Is this what you are after?

df.index.isin(c)

or:

df.query('index in @c')

This assumes that c and id (the index) have the same dtype.
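
The dtype caveat matters for the data in the question, where c holds strings ('8', '24', ...) while the index appears to be integer. The sketch below assumes an integer index (the question does not state the dtype explicitly) and uses a reduced, made-up frame. It shows that the string list matches nothing until one side is converted.

```python
import pandas as pd

# Reduced stand-in for the question's frame; an integer index is an assumption.
df = pd.DataFrame(
    {"email": ["adi@xyz.com", "kendric@xyz.com", "keisha@xyz.com"]},
    index=pd.Index([8, 21, 14], name="id"),
)

c = ['8', '24', '17', '12', '15']  # strings, as in the question

# Comparing integer labels against strings finds no matches.
mask_str = df.index.isin(c)

# Fix 1: convert the list to integers before comparing.
c_int = [int(x) for x in c]
mask_int = df.index.isin(c_int)

# Fix 2: compare a string view of the index against the original list.
mask_as_str = df.index.astype(str).isin(c)

found = df[mask_int]      # rows whose id appears in c
missing = df[~mask_int]   # rows whose id does not appear in c

print(found.index.tolist(), missing.index.tolist())
```

Converting the list is usually the cheaper choice, since it leaves the frame's index untouched.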