import re
# Creating several new columns with a for loop and adding them to the original df.
# Creating permutations for a second level of binary variables for df
for i in list_ib:
    for j in list_ib:
        if i == j:
            break
        else:
            bina = df[i]*df[j]
            print(i,j)
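Here i is a binary column of the DataFrame (df), and j runs over the same columns. I have computed the product of each column with every other column. My question is: how do I add all of the new binary product columns to the original df?
I tried: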
df = df + df[i,j,bina]
But I did not get the result I need. Any suggestions?
Answer 0: (score: 2)
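As I understand it, i, j, bina are not part of your df. Build an array for each of them, where each array element represents one row. Once all the rows for i, j, bina are ready, you can concatenate them like this: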
>>> new_df = pd.DataFrame(data={'i':i, 'j':j, 'bina':bina}, columns=['i','j','bina'])
>>> pd.concat([df, new_df], axis=1)
Alternatively, once you have collected all the data for 'i', 'j' and 'bina', and assuming you hold each of them in a separate array, you can do this:
>>> df['i'] = i
>>> df['j'] = j
>>> df['bina'] = bina
This only works if each of the three arrays has exactly as many elements as the DataFrame df has rows.
I hope this helps!
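Applied to the product columns from the question, the concat approach might look like the sketch below. The sample data, the contents of list_ib, and the `_x_` column naming are assumptions made for illustration; they are not from the original post.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df and list_ib.
df = pd.DataFrame({"a": [1, 0, 1], "b": [1, 1, 0], "c": [0, 1, 1]})
list_ib = ["a", "b", "c"]

# Collect every product column in a dict first, keyed by an
# illustrative name, instead of overwriting a single variable.
products = {}
for i in list_ib:
    for j in list_ib:
        if i == j:
            break  # same pairing rule as the question's loop
        products[f"{i}_x_{j}"] = df[i] * df[j]

# Build one DataFrame from the collected columns and attach it
# to the original with a single concat along the column axis.
new_df = pd.DataFrame(products, index=df.index)
result = pd.concat([df, new_df], axis=1)
print(result)
```

Collecting the columns first and concatenating once leaves df untouched until the end, which keeps the loop from iterating over columns it has just created.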
Answer 1: (score: 0)
Typically, you add columns to a DataFrame through its built-in __setitem__(), which you invoke with []. For example:
import pandas as pd
df = pd.DataFrame()
df["one"] = 1, 1, 1
df["two"] = 2, 2, 2
df["three"] = 3, 3, 3
print(df)
# Output:
# one two three
# 0 1 2 3
# 1 1 2 3
# 2 1 2 3
list_ib = df.columns.values  # column names captured before any new columns are added
for i in list_ib:
    for j in list_ib:
        if i == j:
            break
        else:
            bina = df[i] * df[j]
            df['bina_' + str(i) + '_' + str(j)] = bina  # Add new column which is the result of multiplying columns i and j together
print(df)
# Output:
# one two three bina_two_one bina_three_one bina_three_two
# 0 1 2 3 2 3 6
# 1 1 2 3 2 3 6
# 2 1 2 3 2 3 6
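As a possible variation on the loop above, the same set of unordered column pairs can be generated with itertools.combinations from the standard library, which removes the need for the break. This is only a sketch under the same sample data; note that each name comes out in the opposite order from the output above (bina_one_two rather than bina_two_one), since combinations yields the earlier column first.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({"one": [1, 1, 1], "two": [2, 2, 2], "three": [3, 3, 3]})

# Snapshot the original column names so the newly added product
# columns are not themselves paired up.
base_cols = list(df.columns)

# combinations(base_cols, 2) yields each unordered pair exactly once.
for a, b in combinations(base_cols, 2):
    df[f"bina_{a}_{b}"] = df[a] * df[b]

print(df)
```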