Question

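I have a basic background in data manipulation with R, but I am new to Python. I came across this code snippet in a tutorial on Coursera.

Can someone explain to me what columns={col:'Gold'+col[4:]}, inplace=True means?

(1) As I understand it, df.rename renames an existing column to a new name (for the first line, Gold), but why is + col[4:] needed after it?

(2) Does setting inplace to True mean that the resulting df is assigned back to the original df?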

import pandas as pd

df = pd.read_csv('olympics.csv', index_col=0, skiprows=1)

for col in df.columns:
    if col[:2]=='01':
        df.rename(columns={col:'Gold'+col[4:]}, inplace=True)
    if col[:2]=='02':
        df.rename(columns={col:'Silver'+col[4:]}, inplace=True)
    if col[:2]=='03':
        df.rename(columns={col:'Bronze'+col[4:]}, inplace=True)
    if col[:1]=='№':
        df.rename(columns={col:'#'+col[1:]}, inplace=True)

Thanks.

Answer 1

It means:

# for each column name
for col in df.columns:
    # check whether the first 2 characters are '01'
    if col[:2]=='01':
        # new name is 'Gold' plus everything after the first 4 characters
        df.rename(columns={col:'Gold'+col[4:]}, inplace=True)
    # same as above, for '02'
    if col[:2]=='02':
        df.rename(columns={col:'Silver'+col[4:]}, inplace=True)
    # same as above, for '03'
    if col[:2]=='03':
        df.rename(columns={col:'Bronze'+col[4:]}, inplace=True)
    # check the first character
    if col[:1]=='№':
        # replace the first character with '#', keep the rest
        df.rename(columns={col:'#'+col[1:]}, inplace=True)

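Does setting inplace to True mean that the result is assigned back to the original DataFrame?

Yes, you are right. The column names are replaced in place: df itself is modified and no new DataFrame is returned.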
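
To see the effect end to end, here is a minimal sketch that runs the same loop on a tiny frame. The column names below are made up for illustration (they are assumptions, not read from olympics.csv), picked only so that every branch of the loop fires.

```python
import pandas as pd

# Hypothetical column names, chosen only to exercise each branch.
df = pd.DataFrame([[1, 2, 3, 4, 5]],
                  columns=['№ Summer', '01 !', '02 !', '03 !', '01 !.1'])

for col in df.columns:
    if col[:2] == '01':
        # keep whatever follows the first 4 characters, e.g. the '.1' suffix
        df.rename(columns={col: 'Gold' + col[4:]}, inplace=True)
    if col[:2] == '02':
        df.rename(columns={col: 'Silver' + col[4:]}, inplace=True)
    if col[:2] == '03':
        df.rename(columns={col: 'Bronze' + col[4:]}, inplace=True)
    if col[:1] == '№':
        # swap the leading '№' for '#'
        df.rename(columns={col: '#' + col[1:]}, inplace=True)

print(list(df.columns))
# ['# Summer', 'Gold', 'Silver', 'Bronze', 'Gold.1']
```

The + col[4:] part is what preserves a suffix such as '.1', so two columns that both start with '01' do not end up with the same name.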

Answer 2

if col[:2]=='01':
    # new name is 'Gold' plus everything after the first 4 characters
    df.rename(columns={col:'Gold'+col[4:]}, inplace=True)

(1) Suppose the column name col is '01xx1234':
1. col[:2] == '01' is True
2. 'Gold' + col[4:] => 'Gold' + '1234' => 'Gold1234'
3. So the column '01xx1234' is renamed to 'Gold1234'.
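
These steps can be checked in plain Python without pandas, because the slicing only operates on the column name string. A small sketch using the example name '01xx1234':

```python
col = '01xx1234'

prefix = col[:2]           # the first two characters
rest = col[4:]             # everything from index 4 on (first four characters dropped)
new_name = 'Gold' + rest   # string concatenation

print(prefix)     # 01
print(rest)       # 1234
print(new_name)   # Gold1234
```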

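(2) inplace=True applies the change directly to the DataFrame and returns None instead of a new DataFrame.
Without this option, you have to assign the result back yourself: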
df = df.rename(columns={col:'Gold'+col[4:]})
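
A quick way to confirm the return value, as a sketch on a throwaway frame (the frame and its column name are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# With inplace=True the call mutates df and hands back None.
result = df.rename(columns={"A": "a"}, inplace=True)

print(result)            # None
print(list(df.columns))  # ['a']
```

This is also why mixing the two styles, as in df = df.rename(..., inplace=True), is a mistake: df would end up being None.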

Answer 3

inplace=True means that the columns are renamed in the original DataFrame (df).

Your case (inplace=True):

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df.rename(columns={"A": "a", "B": "c"}, inplace=True)

print(df.columns)
# Index(['a', 'c'], dtype='object')
# df already has the renamed columns, because inplace=True.

If you do not use inplace=True, the rename method returns a new DataFrame and leaves the original untouched, like this:

import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
new_frame = df.rename(columns={"A": "a", "B": "c"})

print(df.columns) 
# Index(['A', 'B'], dtype='object') 
# It contains the old column names

print(new_frame.columns)
# Index(['a', 'c'], dtype='object') 
# It's a new dataframe and has renamed columns

Note: in this case, the better approach is to assign the new DataFrame back to the original variable (df):

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df = df.rename(columns={"A": "a", "B": "c"})

Explaining pandas column reference syntax
