Python, Pandas: create a new column based on other columns

Time: 2017-05-21 13:01:41

Tags: python-3.x pandas dataframe

I have a dataframe like this:

        nt
12062   Python Pandas: Create new column out of other columns where value is not null
12063   Python Pandas Create New Column with Groupby().Sum()
12064   
12065   Python - Pandas - create “first fail” column from other column data
12066   
12067   
12068   Creating new column in pandas based on value of other column
12070   Merge with pandas creating new columns?

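What I want to get:

If the nt column contains the word "Create", create a new column (named CreateC) whose value is 1 for that row, and 0 otherwise. Like this: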

        nt                                                                              CreateC
12062   Python Pandas: Create new column out of other columns where value is not null   1
12063   Python Pandas Create New Column with Groupby().Sum()                            1
12064                                                                                   0
12065   Python - Pandas - create “first fail” column from other column data             1
12066                                                                                   0
12067                                                                                   0
12068   Creating new column in pandas based on value of other column                    0
12070   Merge with pandas creating new columns?                                         0

What I did:

I created a new column from the index, then found the rows containing 'Create', then got the index labels of those rows:

df['index1'] = df.index
dfCreate = df[df['dataframe'].str.contains("Create", na = False)]
dfCreateIndex = dfCreate.index.tolist()

def CreateCs (row):
    RowIndex1 = pd.to_numeric(row['index1'], errors='coerce')
    for i in dfCreateIndex:
        y = dfCreateIndex
        if RowIndex1 == y:
            return '1'
        else:
            return '0'
df['CreateC'] = df.apply(lambda row: CreateCs(row), axis=1)

But all I got was:

ValueError: ('The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0')

Is there a simpler way to do this?
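The index-based idea in the attempt above can be sketched without apply. In the loop, RowIndex1 is compared against the whole list y rather than the element i, and both branches return during the first iteration, so the check never works as a membership test. The sketch below is only an illustration under assumptions: the text column is taken to be nt (the attempt refers to df['dataframe']), blank rows are assumed to be empty strings, and the sample frame and the name create_index are rebuilt or introduced here for the example.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumption: blank rows are empty strings).
df = pd.DataFrame(
    {"nt": [
        "Python Pandas: Create new column out of other columns where value is not null",
        "Python Pandas Create New Column with Groupby().Sum()",
        "",
        "Python - Pandas - create “first fail” column from other column data",
        "",
        "",
        "Creating new column in pandas based on value of other column",
        "Merge with pandas creating new columns?",
    ]},
    index=[12062, 12063, 12064, 12065, 12066, 12067, 12068, 12070],
)

# Index labels of the rows whose text contains "Create". case=False is used so
# that row 12065 (lowercase "create") matches, as in the desired output.
mask = df["nt"].str.contains("Create", case=False, na=False)
create_index = df.loc[mask].index

# A membership test on the index replaces the row-by-row loop.
df["CreateC"] = df.index.isin(create_index).astype(int)
print(df["CreateC"].tolist())
```

Since the mask already marks the matching rows, the detour through the index is not strictly needed; it is kept here only to mirror the original attempt.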

1 Answer:

Answer 0: (score: 1)

You can use str.contains to build a boolean mask, then convert True and False to 1 and 0 with astype(int), and then to str with another astype (if necessary):

df['CreateC'] = df['nt'].str.contains('Create', case=False).astype(int).astype(str)
print(df)
                                                      nt CreateC
12062  Python Pandas: Create new column out of other ...       1
12063  Python Pandas Create New Column with Groupby()...       1
12064                                                          0
12065  Python - Pandas - create “first fail” column f...       1
12066                                                          0
12067                                                          0
12068  Creating new column in pandas based on value o...       0
12070            Merge with pandas creating new columns?       0

Another solution is to use numpy.where on the same mask.
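A numpy.where variant might look like the following minimal sketch. It is an illustration rather than the answerer's exact code. The sample frame is hypothetical, shortened data modeled on the question, with None standing in for a missing value, and na=False is an added assumption so that missing values count as no match.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data, shortened from the question);
# None stands in for a missing value.
df = pd.DataFrame(
    {"nt": [
        "Python Pandas: Create new column",
        "",
        "create “first fail” column",
        None,
        "Creating new column in pandas",
    ]},
    index=[12062, 12064, 12065, 12066, 12068],
)

# Boolean mask: True where the text contains "create" in any letter case.
# na=False makes missing values count as "no match" instead of NaN.
mask = df["nt"].str.contains("Create", case=False, na=False)

# np.where picks '1' where the mask is True and '0' elsewhere,
# so the result is already made of strings.
df["CreateC"] = np.where(mask, "1", "0")
print(df)
```

If integers are preferred over strings, np.where(mask, 1, 0) or mask.astype(int) gives the same flags without the string conversion.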