Question

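In my pandas DataFrame, I want to add a new column (NewCol) based on some conditions applied to the data in another column (OldCol).

More specifically, my column OldCol contains three types of strings: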

BB_sometext
sometext1
sometext 1

I want to distinguish between these three types of strings. Right now, I do this with the following code:

df['NewCol'] = pd.Series()
for i in range(0, len(df)):
    if str(df.loc[i, 'OldCol']).split('_')[0] == "BB":
        df.loc[i, 'NewCol'] = "A"
    elif len(str(df.loc[i, 'OldCol']).split(' ')) == 1:
        df.loc[i, 'NewCol'] = "B"
    else:
        df.loc[i, 'NewCol'] = "C"

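Even though this code seems to work, I'm sure there is a better way to do this, because looping over the rows one at a time looks very inefficient. Does anyone know a better way? Thanks in advance.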

Answer 1

In general, you want something like this:

>>> df.loc[boolean_test, 'NewCol'] = desired_result

Or, for multiple conditions (note the parentheses around each condition, and the use of & rather than and):

>>> df.loc[(boolean_test1) & (boolean_test2), 'NewCol'] = desired_result

Example

Let's start with an example DataFrame:

>>> df = pd.DataFrame(dict(OldCol=['sometext1', 'sometext 1', 'BB_ccc', 'sometext1']))

Then you do:

>>> df.loc[df['OldCol'].str.split('_').str[0] == 'BB', 'NewCol'] = "A"

to set NewCol to A for all rows whose OldCol starts with BB_. You can even (optionally, for readability) put the boolean condition on its own line:

>>> oldcol_starts_BB = df['OldCol'].str.split('_').str[0] == 'BB'
>>> df.loc[oldcol_starts_BB, 'NewCol'] = "A"

I like this approach because the reader doesn't have to work out the logic hidden inside the split('_').str[0] part.

Then, set all the rows that contain no space and are still unset (i.e. where isnull is true):

>>> oldcol_has_no_space = df['OldCol'].str.find(' ') < 0
>>> newcol_is_null = df['NewCol'].isnull()
>>> df.loc[(oldcol_has_no_space) & (newcol_is_null), 'NewCol'] = 'B'

Finally, set all remaining values of NewCol to C:

>>> df.loc[df['NewCol'].isnull(), 'NewCol'] = 'C'
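
Put together, the steps in this answer might look like the sketch below. It assumes the labels from the question (A for strings starting with BB_, B for strings with no space, C for everything else). Creating NewCol up front as an empty object column is an addition made here so that isnull() has something to check; it is not part of the answer itself.

```python
import pandas as pd

# Sample data from the answer's example DataFrame.
df = pd.DataFrame(dict(OldCol=['sometext1', 'sometext 1', 'BB_ccc', 'sometext1']))

# Assumption: start with an all-missing column so unset rows can be found with isnull().
df['NewCol'] = pd.Series(index=df.index, dtype=object)

# Step 1: rows whose text before the first underscore is "BB" get "A".
oldcol_starts_BB = df['OldCol'].str.split('_').str[0] == 'BB'
df.loc[oldcol_starts_BB, 'NewCol'] = 'A'

# Step 2: rows with no space that are still unset get "B".
oldcol_has_no_space = df['OldCol'].str.find(' ') < 0
newcol_is_null = df['NewCol'].isnull()
df.loc[oldcol_has_no_space & newcol_is_null, 'NewCol'] = 'B'

# Step 3: every row that is still unset gets "C".
df.loc[df['NewCol'].isnull(), 'NewCol'] = 'C'

print(df)
```

Each assignment works on the whole column at once, so there is no Python-level loop over the rows. The order of the steps matters: a BB_ string also contains no space, which is why step 2 only touches rows that are still null.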
