Question

I have the following script:

import pandas as pd

ls = [
      ['A', 1, 'A1', 9],
      ['A', 1, 'A1', 6],
      ['A', 1, 'A1', 3],
      ['A', 2, 'A2', 7],
      ['A', 3, 'A3', 9],
      ['B', 1, 'B1', 7],
      ['B', 1, 'B1', 3],
      ['B', 2, 'B2', 7],
      ['B', 2, 'B2', 8],
      ['C', 1, 'C1', 9],

      ]

# convert to a DataFrame
df = pd.DataFrame(ls, columns=["Main_Group", "Sub_Group", "Concat_GRP_Name", "X_1"])

# get the sum and count for each concatenated group
df_sum = df.groupby('Concat_GRP_Name')['X_1'].agg(['sum', 'count']).reset_index()

# generate all permutations of the concatenated group names
import itertools as it
perms = it.permutations(df_sum.Concat_GRP_Name)


def combute_combinations(df, colname):
    l = []
    perms = it.permutations(df[colname])

    for perm_pairs in perms:
        # take only the first three items of each permutation and make sure
        # the first contains A, the second B, and the third C
        if 'A' in perm_pairs[0] and 'B' in perm_pairs[1] and 'C' in perm_pairs[2]:
            l.append([perm_pairs[0], perm_pairs[1], perm_pairs[2]])
    return l

# apply the function; this generates a list of all matching permutation triples
t = combute_combinations(df_sum, 'Concat_GRP_Name')

# convert to a DataFrame and drop duplicate rows
df2 = pd.DataFrame(t, columns=["Item1", "Item2", "Item3"]).drop_duplicates()

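I'm not sure how to build the parts of the IF statement from a loop. In the example above, I know I have three distinct Main_Group values. Suppose I don't know how many unique values the Main_Group column contains. How can I update the following IF statement to handle that?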

if 'A' in perm_pairs[0] and 'B' in perm_pairs[1] and 'C' in perm_pairs[2]:
    l.append([perm_pairs[0], perm_pairs[1], perm_pairs[2]])

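I want each variable in its own column. If I had 5 main groups, my IF statement would cover perm_pairs[0] through perm_pairs[4]. I was thinking of extracting the values in Main_Group and converting them to a set. I would then iterate over each value and use the set's length to build the IF statement, but so far I haven't worked out the logic. How can I iterate over the set and then update the IF statement?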

Answer 1

To make the condition more dynamic, you can restructure the function like this:

import itertools as it
import numpy as np

def combute_combinations(df, colname, main_group_series):
    l = []
    perms = it.permutations(df[colname])

    # np.unique returns the sorted unique values of the Series
    unique_groups = np.unique(main_group_series)

    for perm_pairs in perms:
        # keep only the first len(unique_groups) items of each permutation, and
        # require the item at position ind to contain the ind-th main group
        if all(main_group in perm_pairs[ind] for ind, main_group in enumerate(unique_groups)):
            l.append([perm_pairs[ind] for ind in range(len(unique_groups))])
    return l

You can then call the function as before, but also pass in the Series for the main group column:

t = combute_combinations(df_sum, 'Concat_GRP_Name', df['Main_Group'])
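As a possible alternative, here is a minimal sketch that avoids generating every permutation and filtering afterwards. It collects the distinct names belonging to each main group and takes their Cartesian product with itertools.product, which picks exactly one name per group. The function name group_combinations and the plain-list inputs are assumptions made for illustration, not part of the original answer. For the question's data it should yield the same six rows, although it matches on the actual group value rather than on a substring of the name.

```python
import itertools as it

def group_combinations(groups, names):
    """Return every way to pick one name from each distinct group."""
    names_per_group = {}
    for group, name in zip(groups, names):
        names_per_group.setdefault(group, set()).add(name)
    # Sort groups and names so the output order is deterministic.
    ordered = [sorted(names_per_group[g]) for g in sorted(names_per_group)]
    # Cartesian product: one name from each group, in sorted group order.
    return [list(combo) for combo in it.product(*ordered)]

# Same Main_Group / Concat_GRP_Name pairs as in the question's data.
groups = ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C']
names = ['A1', 'A1', 'A1', 'A2', 'A3', 'B1', 'B1', 'B2', 'B2', 'C1']

combos = group_combinations(groups, names)
for combo in combos:
    print(combo)
```

With the question's DataFrame, the two arguments could be df['Main_Group'] and df['Concat_GRP_Name'], and the column names for the result can be generated instead of hardcoded, for example pd.DataFrame(combos, columns=[f"Item{i + 1}" for i in range(len(combos[0]))]). Because each row is built directly, no drop_duplicates() step is needed.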
