I have a DataFrame like this:
col1 col2 col3 col4 col5 col6 col7 col8
0 5345 rrf rrf rrf rrf rrf rrf
1 2527 erfr erfr erfr erfr erfr erfr
2 2727 f f f f f f
I want to rename all of the columns except col1 and col2.
So I tried writing a loop:
print(df.columns)
for col in df.columns:
    if col != 'col1' and col != 'col2':
        col.rename = str(col) + '_x'
But it isn't efficient... and it doesn't work!
Answer 0 (score: 13)
You can use the DataFrame.rename() method:
new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
df.rename(columns = dict(new_names), inplace=True)
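As a variation on the same idea, rename also accepts a function for the columns argument, which avoids building the dict by hand. The following is only a sketch: the sample frame is modeled on the question's data, and the names `keep` and `add_suffix_unless_kept` are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample frame modeled on the question's data (values are illustrative).
df = pd.DataFrame(
    [[0, 5345, "rrf", "rrf"], [1, 2527, "erfr", "erfr"], [2, 2727, "f", "f"]],
    columns=["col1", "col2", "col3", "col4"],
)

keep = {"col1", "col2"}  # hypothetical name: columns to leave untouched


def add_suffix_unless_kept(name):
    """Return the label unchanged if it is in `keep`, else append '_x'."""
    return name if name in keep else name + "_x"


# rename(columns=callable) applies the function to every column label
# and returns a new DataFrame; df itself is not modified.
renamed = df.rename(columns=add_suffix_unless_kept)
print(list(renamed.columns))
```

Because this version does not use inplace=True, the original frame keeps its column names and the result has to be assigned to a variable.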
Answer 1 (score: 4)
If col1 and col2 are the first and second column names, the simplest solution is:
df.columns = df.columns[:2].union(df.columns[2:] + '_x')
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
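One caveat about the union approach: Index.union returns a sorted result, so it only preserves the layout when the new names already sort into the existing column order, as they do in this example. The sketch below uses made-up column names ('b' and 'a') to show the difference, with Index.append as an order-preserving alternative.

```python
import pandas as pd

# Hypothetical labels whose alphabetical order differs from their position.
cols = pd.Index(["col1", "col2", "b", "a"])

# union() sorts the combined labels, so positional order is lost.
via_union = cols[:2].union(cols[2:] + "_x")

# append() simply concatenates the two Index objects, keeping the order.
via_append = cols[:2].append(cols[2:] + "_x")

print(list(via_union))
print(list(via_append))
```

Assigning a reordered Index to df.columns would silently attach the wrong labels to the data, so append is probably the safer choice when the names are not known to be sorted.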
Another solution, using isin or a list comprehension:
cols = df.columns[~df.columns.isin(['col1','col2'])]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']
df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
cols = [col for col in df.columns if col not in ['col1', 'col2']]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']
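# cols is a plain list here, so cols + '_x' would raise a TypeError; build the mapping per element
df.rename(columns = {col: col + '_x' for col in cols}, inplace=True)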
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
The fastest is the list comprehension:
df.columns = [col+'_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
Timings:
In [350]: %timeit (akot(df))
1000 loops, best of 3: 387 µs per loop
In [351]: %timeit (jez(df1))
The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 207 µs per loop
In [363]: %timeit (jez3(df2))
The slowest run took 6.41 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 75.7 µs per loop
df1 = df.copy()
df2 = df.copy()
def jez(df):
    df.columns = df.columns[:2].union(df.columns[2:] + '_x')
    return df
def akot(df):
    new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
    df.rename(columns = dict(new_names), inplace=True)
    return df
def jez3(df):
    df.columns = [col + '_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
    return df
print (akot(df))
print (jez(df1))
print (jez3(df2))
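Note that the functions above modify the frame in place, so repeated %timeit runs keep appending '_x' to names that were already renamed. A rough way to time each approach on a fresh copy is sketched below; the helper `best_time`, the function names, and the sample frame are assumptions made for illustration, and the numbers will vary by machine and pandas version.

```python
import time

import pandas as pd

# Small sample frame modeled on the question's data (values are illustrative).
base = pd.DataFrame(
    [[0, 5345, "rrf", "rrf"], [1, 2527, "erfr", "erfr"], [2, 2727, "f", "f"]],
    columns=["col1", "col2", "col3", "col4"],
)


def with_rename(df):
    """Rename via a dict passed to DataFrame.rename."""
    cols = df.columns[2:]
    df.rename(columns=dict(zip(cols, cols + "_x")), inplace=True)
    return df


def with_comprehension(df):
    """Rename by assigning a list comprehension to df.columns."""
    df.columns = [c if c in ("col1", "col2") else c + "_x" for c in df.columns]
    return df


def best_time(func, frame, runs=50):
    """Best wall-clock time of func over `runs` calls, each on a fresh copy."""
    best = float("inf")
    for _ in range(runs):
        fresh = frame.copy()  # the copy is made outside the timed region
        start = time.perf_counter()
        func(fresh)
        best = min(best, time.perf_counter() - start)
    return best


results = {f.__name__: best_time(f, base) for f in (with_rename, with_comprehension)}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} µs")
```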
Answer 2 (score: 3)
You can use str.contains with a regex pattern to filter the columns of interest, then use zip to build a dict and pass it as the argument to rename:
In [94]:
cols = df.columns[~df.columns.str.contains('col1|col2')]
df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)
df
Out[94]:
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
So filtering with the negated str.contains returns the columns that do not match the pattern, which means the column order doesn't matter.
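One thing to watch for with this approach: str.contains does substring matching, so the pattern 'col1|col2' also matches a label such as 'col10'. The sketch below adds a hypothetical 'col10' column to show the effect, and uses an anchored pattern as one possible way to require an exact match.

```python
import pandas as pd

# 'col10' is a made-up label added to demonstrate the substring issue.
cols = pd.Index(["col1", "col2", "col3", "col10"])

# Unanchored pattern: 'col10' contains 'col1', so it counts as a match
# and is wrongly left out of the columns to rename.
loose = cols[~cols.str.contains("col1|col2")]

# Anchored pattern: ^ and $ force the whole label to match, and the
# non-capturing group (?:...) keeps the alternation inside the anchors.
strict = cols[~cols.str.contains(r"^(?:col1|col2)$")]

print(list(loose))   # ['col3']
print(list(strict))  # ['col3', 'col10']
```

For exact names, the isin approach from the previous answer gives the same result as the anchored pattern without needing a regex.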