How to convert two "iterrows" loops in pandas into pandas methods

Time: 2019-03-27 14:20:03

Tags: pandas

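I am trying to convert two nested for loops, each using iterrows, into pandas methods in order to improve performance.

Originally I solved this by taking the two DataFrames, iterating over both in nested loops, comparing values with conditionals, and then using the index to set values inside the nested loop. However, since this is slow and not a proper use of pandas, I tried methods such as apply and merge, but could not solve the problem. This link gave me some guidance, but not much.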

import pandas as pd

# Columns of poa whose name contains 'CODIGO_' (the site-code column).
poa_col = [col for col in poa.columns if 'CODIGO_' in col]

for idx, row in df_non_dup.iterrows():
    for sub_idx, sub_row in poa.iterrows():
        # Only handle poa rows whose site code matches this alarm row.
        if row['CODIGO_SITE'] != sub_row[poa_col[0]]:
            continue
        # split('/') also covers a single technology: '2G'.split('/') == ['2G'].
        for g_name in row['g_names'].split('/'):
            for tech in ('2G', '3G', '4G'):
                if tech in g_name:
                    col = 'ALARMAS ' + tech
                    # Write with .loc[row, col] instead of chained indexing
                    # (poa[col].loc[...]), which may assign to a copy.
                    current = poa.loc[sub_idx, col]
                    if pd.isnull(current):
                        poa.loc[sub_idx, col] = row['Name']
                    else:
                        poa.loc[sub_idx, col] = str(current) + '/' + row['Name']
                    break  # like the if/elif chain: first matching technology only

The above is my original attempt; it works, but it takes a long time.

Below is some sample data:

poa.head(1)
Out[230]: 
   CODIGO_Elemento Red  ALARMAS 4G  ALARMAS 3G  ALARMAS 2G
   DAF                  NaN         NaN         NaN


df_non_dup.head(2)
Out[231]: 
      Name              CODIGO_SITE    g_names
0  -  Clapham           DAF            2G
1  -  Brixton           DAF            2G

With the data shown, I would like to append df_non_dup['Name'] to ALARMAS 2G, because both values in df_non_dup['g_names'] are 2G, so that poa.head(1) looks like this:

Out[230]: 
   CODIGO_Elemento Red  ALARMAS 4G  ALARMAS 3G  ALARMAS 2G
   DAF                  NaN         NaN         Clapham/Brixton
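
One possible way to get this result without iterrows is sketched below. It is not a confirmed answer, only an illustration under some assumptions: the helper name fill_alarms is hypothetical, the sample frames are rebuilt by hand from the data shown (with Name taken to be 'Clapham' and 'Brixton'), each '/'-separated entry in g_names is assumed to name a single technology, and poa is assumed to have a unique index. The idea is to split g_names into one row per technology, join the alarm names per site with groupby, and then look the joined text up by site code with map.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt by hand from the question's data (an assumption).
poa = pd.DataFrame({
    'CODIGO_Elemento Red': ['DAF'],
    'ALARMAS 4G': [np.nan],
    'ALARMAS 3G': [np.nan],
    'ALARMAS 2G': [np.nan],
})
df_non_dup = pd.DataFrame({
    'Name': ['Clapham', 'Brixton'],
    'CODIGO_SITE': ['DAF', 'DAF'],
    'g_names': ['2G', '2G'],
})


def fill_alarms(poa, alarms, techs=('2G', '3G', '4G')):
    """Hypothetical helper: return a copy of poa with alarm names
    joined by '/' per site and technology."""
    out = poa.copy()
    # First column whose name contains 'CODIGO_', as in the original loop.
    code_col = [c for c in out.columns if 'CODIGO_' in c][0]

    # One row per (alarm, technology entry): '2G/3G' becomes two rows.
    long = alarms.assign(g_name=alarms['g_names'].str.split('/')).explode('g_name')

    for tech in techs:
        col = 'ALARMAS ' + tech
        hits = long[long['g_name'].str.contains(tech, regex=False, na=False)]
        if hits.empty:
            continue  # no alarm mentions this technology
        # Join all alarm names of one site into 'A/B/...'.
        joined = hits.groupby('CODIGO_SITE')['Name'].agg('/'.join)
        new = out[code_col].map(joined)  # NaN where the site has no alarm
        old = out[col]
        appended = old.astype(str) + '/' + new  # only used where both are set
        # old missing -> new; new missing -> old; both set -> 'old/new'
        out[col] = new.where(old.isna(), appended.where(new.notna(), old))
    return out


result = fill_alarms(poa, df_non_dup)
print(result)  # ALARMAS 2G for site DAF is 'Clapham/Brixton'
```

Unlike the nested loops, this sketch returns a new DataFrame instead of modifying poa in place, and it writes to every technology column whose label appears in an entry rather than stopping at the first match. Whether that difference matters depends on what g_names can actually contain.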

0 answers:

No answers