Creating dummy variables in Python using a loop

Date: 2017-10-19 09:21:30

Tags: python pandas loops dummy-variable

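I am trying to create a set of new binary variables for the columns whose names contain a certain word (I want to name the new binary variables BINARY_ + column name). This is what I have tried, but it does not work: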

# create empty list
List_of_dummy_names = [] 

# word
string = "WORD"

for col in list(df.columns.values):
    if string in df.columns.values[col]:
        List_of_dummy_names.append('BINARY_'+col)

2 answers:

Answer 0 (score: 0)

In your case, col looks like some kind of collection. You probably meant to do this:

List_of_dummy_names.append('BINARY_'+string)
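
For comparison, here is a minimal sketch of the loop from the question with the membership test applied to the column name itself. It assumes the column labels are plain strings; the sample DataFrame and its column names are invented for illustration. In the original loop, `df.columns.values[col]` indexes a NumPy array with a label, and NumPy arrays accept integer positions rather than labels, which is the likely reason the loop fails.

```python
import pandas as pd

# hypothetical sample data; the column names are invented for illustration
df = pd.DataFrame({"WORD_a": [1, 0], "b_WORD": [0, 2], "other": [3, 4]})

List_of_dummy_names = []
string = "WORD"

for col in df.columns:
    # col is already the column name (a string), so test it directly
    if string in col:
        List_of_dummy_names.append("BINARY_" + col)

print(List_of_dummy_names)
```

This only builds the list of new names; it does not add any columns to the DataFrame.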

Answer 1 (score: 0)

If you want to rename the pandas DataFrame columns using the newly created List_of_dummy_names (elements of the form 'BINARY_' + column_name), you can follow my answer.

Let's say:

cv = list(df.columns.values)
# e.g. cv = ['aword', 'bword', 'c']
search_String = 'word'
# map every column name to its new name; names without the word map to themselves
replace_dict = dict(zip(cv, ['BINARY_' + x if search_String in x else x for x in cv]))
# {'aword': 'BINARY_aword', 'bword': 'BINARY_bword', 'c': 'c'}
# then pass this dictionary to the DataFrame rename method via the columns keyword
new_df = df.rename(columns=replace_dict)
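
As a possible alternative to building the dictionary, `DataFrame.rename` also accepts a function for `columns`, which it applies to each label. The sketch below assumes string column labels and uses an invented three-column DataFrame matching the example names above.

```python
import pandas as pd

# hypothetical sample data using the example column names
df = pd.DataFrame({"aword": [1, 2], "bword": [3, 4], "c": [5, 6]})
search_String = "word"

# the function is called once per column label; rename returns a new DataFrame
new_df = df.rename(
    columns=lambda x: ("BINARY_" + x) if search_String in x else x
)

print(list(new_df.columns))
```

Because `rename` returns a copy by default, `df` keeps its original column names.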

Also check whether the following works for you:

List_of_dummy_names = ['BINARY_'+x for x in cv if search_String in x ]
#['BINARY_aword', 'BINARY_bword']  #filters the element having 'word' in them and prefixed with 'BINARY_'

Check whether this is what you need (I am not sure what you are looking for):

# df has only one column, named 'col_to_replace'
# col_to_replace
#    aword
#    bword
#    c
df['col_to_replace'] = ['BINARY_' + x if search_String in x else x for x in df['col_to_replace']]
# col_to_replace
#    BINARY_aword  # prefixed
#    BINARY_bword  # prefixed
#    c             # word not found, so left as it was

Now you have the new column names in a list.

List_of_dummy_names  # ['BINARY_aword', 'BINARY_bword']
# loop over it and create new columns in the existing dataframe
for col_Name in List_of_dummy_names:
    # the first iteration creates column "BINARY_aword", the second creates "BINARY_bword";
    # every row in each new column gets the string 'default_value_1'
    df[col_Name] = 'default_value_1'

If you already have a list of values with len(list) == len(df), assign that list instead: df[col_Name] = list_of_values_having_same_length_as_DF
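
If the goal is an actual 0/1 indicator for each matching column rather than a placeholder string, one possible sketch follows. The rule used here (1 where the source value is non-zero, 0 otherwise) is purely an assumption, since the question does not say how the binary value should be derived; the sample data is invented as well.

```python
import pandas as pd

# hypothetical sample data using the example column names
df = pd.DataFrame({"aword": [0, 2, 5], "bword": [1, 0, 0], "c": [7, 8, 9]})
search_String = "word"

# collect the matching names first so the newly added columns are not revisited
matching = [c for c in df.columns if search_String in c]

for col in matching:
    # assumed rule: 1 if the source value is non-zero, else 0
    df["BINARY_" + col] = (df[col] != 0).astype(int)

print(df)
```

Swap the comparison for whatever condition defines the dummy in your data, for example `df[col].notna()` or a string test such as `df[col].str.contains(...)`.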