Question

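I want to populate columns with the numbers 1 to 16 based on the values in 2 other columns. I can start from the column headers provided or create new columns (it doesn't matter to me).

I tried to write a function that iterates over the numbers 1-10 and assigns a value to the z variable based on the values of b and y. I then want to apply this function to every row in the dataframe.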

import pandas as pd

import numpy as np

data = pd.read_csv('Nuc.csv')

def write_Pcolumns(df):

    """populates a column in the given dataframe, df, based on the values in two other columns in the same dataframe"""

    #create string of numbers for each nucleotide position 
    positions = ('1','2','3','4','5','6','7','8','9','10')
    a = "Po "
    x = "O.Po "
    #for each position create a variable for the nucleotide in the sequence (Po) and opposite to the sequence(o. Po)
    for each in positions:
        b = a + each
        y = x + each
        z = 'P' + each
        #assign a value to z based on the nucleotide identities in the sequence and opposite position
        if df[b] == 'A' and df[y]=='A':
            df[z]==1
        elif df[b] == 'A' and df[y]=='C':
            df[z]==2
        elif df[b] == 'A' and df[y]=='G':
            df[z]==3
        elif df[b] == 'A' and df[y]=='T':
            df[z]==4
        ...
        elif df[b] == 'T' and df[y]=='G':
            df[z]==15
        else:
            df[z]==16
    return(df)

data.apply(write_Pcolumns(data), axis=1)

I get the following error message: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answer 1

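This happens because df[index]=='value' returns a Series of booleans (one per row) rather than a single boolean, so it cannot be used directly as the condition of an if statement.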
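To make the cause concrete, here is a minimal sketch that reproduces the error on a toy frame. The three-row DataFrame is hypothetical and only borrows the 'Po 1' / 'O.Po 1' naming from the question. Note that in the question, write_Pcolumns(data) is called on the whole DataFrame before apply ever runs, so df[b] is an entire column, not a single cell.

```python
import pandas as pd

# Hypothetical toy frame; column names follow the question's 'Po 1' / 'O.Po 1' pattern.
df = pd.DataFrame({"Po 1": ["A", "A", "T"], "O.Po 1": ["A", "C", "G"]})

# Comparing a column to a scalar is element-wise and yields a boolean Series.
mask = df["Po 1"] == "A"
print(mask.tolist())

# Using that Series as an `if` condition raises ValueError,
# because a Series has no single truth value.
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# Element-wise conditions are combined with & (not `and`), and values are
# assigned with = through .loc (the question's df[z]==1 only compares).
both = (df["Po 1"] == "A") & (df["O.Po 1"] == "C")
df.loc[both, "P1"] = 2
```

Rows where the combined condition is False are left as NaN in the new column, so a full solution still needs one such assignment per pair or a lookup-based approach.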

See: Pandas error when using if-else to create new column: The truth value of a Series is ambiguous
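One possible way to avoid the if/elif chain altogether is a lookup table applied column-wise. The sketch below is not taken from the answer. It assumes the columns are named 'Po 1'...'Po 10' and 'O.Po 1'...'O.Po 10' as in the question, and it assumes the elided branches follow the visible pattern (A, C, G, T order, so A/A is 1 and T/G is 15). The names write_p_columns, PAIR_CODES and the demo frame are hypothetical.

```python
import itertools
import pandas as pd

# Assumption: codes follow the order A, C, G, T for both columns, so
# (A, A) -> 1, (A, C) -> 2, ..., (T, G) -> 15, (T, T) -> 16.
BASES = "ACGT"
PAIR_CODES = {
    a + b: code
    for code, (a, b) in enumerate(itertools.product(BASES, repeat=2), start=1)
}

def write_p_columns(df, n_positions=10):
    """Return a copy of df with P1..Pn filled from the 'Po n' / 'O.Po n' pairs."""
    out = df.copy()
    for pos in range(1, n_positions + 1):
        # Concatenate the two nucleotide columns element-wise, e.g. 'A' + 'C' -> 'AC'.
        pairs = out[f"Po {pos}"] + out[f"O.Po {pos}"]
        # Pairs missing from the table fall back to 16, mirroring the final else branch.
        out[f"P{pos}"] = pairs.map(PAIR_CODES).fillna(16).astype(int)
    return out

# Hypothetical two-row, one-position frame standing in for Nuc.csv.
demo = pd.DataFrame({"Po 1": ["A", "T"], "O.Po 1": ["C", "G"]})
result = write_p_columns(demo, n_positions=1)
print(result["P1"].tolist())
```

Because the mapping works on whole columns, no call to apply with axis=1 is needed, and the truth-value error cannot occur.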

How do I create multiple new columns and populate them based on the values in 2 other columns using pandas / python?
