Question

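I have a DataFrame df1 in which several columns share the same substring in their names. How can I apply changes to just those columns, for example removing the last three characters of the name or changing the column dtype? I simply want a more convenient way to modify all columns that share a substring (for example "Session", as shown below).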

First example:

df1:

Session1    Session2    Session3    Total
3           4           5           5.0
2           1           4           NaN

df2 (Intended Out):

Sessi    Sessi    Sessi    Total
3        4        5        5.0
2        1        4        NaN

Second example:

{{1}}

Answer 1

Regarding your first point (changing the dtype):

n_columns_with_session = 3
# create the names of the target columns
cols = ["Session{}".format(i) for i in range(1,n_columns_with_session+1)]

# change the dtype of the target columns
df1[cols] = df1[cols].astype('int64')

For the second point (renaming):

# create the new names
new_names_cols = ["Sess{}".format(i) for i in range(1,n_columns_with_session+1)]

# append "Total" name since you do not want to change this
new_names_cols.append('Total')    

# rename the columns
df1.columns = new_names_cols
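
To make the two steps above easier to try, here is a minimal end-to-end sketch. The sample frame is an assumption: it is rebuilt from the question's first example, with the Session values stored as floats so that the dtype change is visible. Note that assigning to df1.columns relies on the columns being in exactly this order.

```python
import numpy as np
import pandas as pd

# Sample frame modelled on the question's first example (floats are an assumption)
df1 = pd.DataFrame({
    "Session1": [3.0, 2.0],
    "Session2": [4.0, 1.0],
    "Session3": [5.0, 4.0],
    "Total": [5.0, np.nan],
})

n_columns_with_session = 3
cols = ["Session{}".format(i) for i in range(1, n_columns_with_session + 1)]

# Change the dtype of the Session columns only; "Total" stays float64
df1[cols] = df1[cols].astype("int64")

# Build the new names and keep "Total" unchanged at the end
new_names_cols = ["Sess{}".format(i) for i in range(1, n_columns_with_session + 1)]
new_names_cols.append("Total")

# Positional rename: the list must match the existing column order and count
df1.columns = new_names_cols
```

If the column order is not guaranteed, a dict-based rename is less fragile than overwriting df1.columns.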

Answer 2

The first step is to select all the target columns. You can use

target_cols = [col for col in df if col.startswith('Session')]

You can then apply whatever operation you need to those columns. For example, to change the dtype you can do the following

df[target_cols] = df[target_cols].astype('int64')
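
One caveat worth sketching: astype('int64') raises an error if a target column contains NaN. The example below is a variant, not part of the original answer. It selects the columns with the .str accessor on the column index and casts them to pandas' nullable "Int64" dtype. The missing value in Session2 is a hypothetical addition for illustration; the question's data has NaN only in Total.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Session1": [3.0, 2.0],
    "Session2": [4.0, np.nan],  # hypothetical missing value, not in the question's data
    "Total": [5.0, np.nan],
})

# Boolean mask over the column labels; selects the same columns as the list comprehension
target_cols = df.columns[df.columns.str.startswith("Session")]

# astype with a dict converts only the listed columns.
# "Int64" (capital I) is the nullable integer dtype, so missing values survive as <NA>.
df = df.astype({col: "Int64" for col in target_cols})
```

Columns not named in the dict, such as Total here, keep their original dtype.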

Edit: For the second point, renaming the columns by dropping the last three characters, you can use rename as follows:

new_cols = [i[:-3] for i in target_cols]
df.rename(columns=dict(zip(target_cols, new_cols)), inplace=True)
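
As an alternative sketch, rename also accepts a function, which avoids building the mapping by hand and returns a new frame instead of modifying df in place. The helper name strip_session_suffix and the sample data are assumptions for illustration. Be aware that dropping three characters from "Session1", "Session2" and "Session3" yields the same label "Sessi" three times, as in the question's intended output.

```python
import pandas as pd

df = pd.DataFrame({
    "Session1": [3, 2],
    "Session2": [4, 1],
    "Session3": [5, 4],
    "Total": [5.0, None],
})

def strip_session_suffix(name):
    """Hypothetical helper: drop the last three characters of Session columns only."""
    return name[:-3] if name.startswith("Session") else name

# rename applies the function to every column label and returns a new DataFrame
renamed = df.rename(columns=strip_session_suffix)

# With duplicate labels, selecting "Sessi" returns a DataFrame of all three columns
sessions = renamed["Sessi"]
```

Duplicate column labels are allowed, but label-based selection then returns several columns at once, so distinct names are usually easier to work with.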

Pandas: rename specific columns and change dtype
