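I have an empty array that I am trying to append to based on the values in Col1 and Col2. If either column holds a value greater than zero, the array should be extended according to that value.
For example: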
Col1 Col2
1 2
2 3
0 0
4 2
The output should be:
Col1 Col2 Col3
1 2 [1,2,2]
2 3 [1,1,2,2,2]
0 0 []
4 2 [1,1,1,1,2,2]
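So far, the code raises 'The truth value of a Series is ambiguous'. I am familiar with this error from other contexts, but I can't work out how it applies to my case.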
df = pd.read_csv('rawdata.csv')
x_array = []
for x in df['emails_opened'], df['emails_clicked']:
    if (x > 0 & pd.notnull(x) & x != '' & x in df['emails_opened']):
        x_array == np.append(x_array, x * [2])
    elif (x > 0 & pd.notnull(x) & x != '' & x in df['emails_clicked']):
        x_array == np.append(x_array, x * [3])
    else: 0
print x_array
Any help is much appreciated!
Answer 0 (score: 0)
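On the error itself: the loop in the question iterates over the two columns, so each `x` is a whole Series, and `if` cannot reduce a Series of booleans to a single True or False. In addition, `&` binds more tightly than `>` and `!=`, so each comparison needs its own parentheses. A minimal sketch of both points, using a made-up Series `s` rather than the real data:

```python
import pandas as pd

s = pd.Series([1, 2, 0, 4])  # illustrative data, not from rawdata.csv

# Using a whole Series as an `if` condition raises ValueError,
# because pandas will not collapse several booleans into one.
try:
    if s > 0:
        pass
except ValueError as exc:
    message = str(exc)

# Element-wise conditions: parenthesise each comparison, then combine with `&`.
mask = (s > 0) & s.notnull()

# Reduce explicitly when a single boolean is really what is wanted.
any_positive = mask.any()
all_positive = mask.all()
```

Note also that `x_array == np.append(...)` in the question is a comparison, not an assignment, so `x_array` would stay empty even without the error.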
I think you need:
import numpy as np
import pandas as pd

# keep only values that are positive, non-null and non-empty; replace the rest with 0
m1 = (df['Col1'] > 0) & (df['Col1'].notnull()) & (df['Col1'].astype(str) != '')
m2 = (df['Col2'] > 0) & (df['Col2'].notnull()) & (df['Col2'].astype(str) != '')
# np.repeat needs integer counts, so cast after the missing values are replaced
col1 = df['Col1'].where(m1, 0).astype(int)
col2 = df['Col2'].where(m2, 0).astype(int)
# repeat 1 (or 2) once per unit of the filtered value, collect the repeats into
# one list per row, and give rows without any repeats an empty list
a = pd.Series(np.repeat([1] * len(col1), col1),
              index=np.repeat(col1.index, col1))
a = a.groupby(level=0).apply(list).reindex(df.index, fill_value=[])
b = pd.Series(np.repeat([2] * len(col2), col2),
              index=np.repeat(col2.index, col2))
b = b.groupby(level=0).apply(list).reindex(df.index, fill_value=[])
# adding two Series of lists concatenates the lists row by row
df['Col3'] = a + b
print(df)
Col1 Col2 Col3
0 1 2 [1, 2, 2]
1 2 3 [1, 1, 2, 2, 2]
2 0 0 []
3 4 2 [1, 1, 1, 1, 2, 2]
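If the frame is not huge, a plain list comprehension may be easier to follow than the repeat-and-group approach. The following is only a sketch under a few assumptions: missing values count as 0, negative values count as 0, and `build_labels` is a hypothetical helper name introduced here for illustration.

```python
import pandas as pd

# Sample data matching the example in the question
df = pd.DataFrame({'Col1': [1, 2, 0, 4], 'Col2': [2, 3, 0, 2]})

def build_labels(n1, n2):
    """Return n1 copies of 1 followed by n2 copies of 2 (hypothetical helper)."""
    # max(..., 0) makes negative counts behave like 0
    return [1] * max(int(n1), 0) + [2] * max(int(n2), 0)

# Assumption: a missing value means "no repeats", so fill it with 0 first
counts = df[['Col1', 'Col2']].fillna(0)

# Build one list per row; wrapping in a Series keeps the row alignment explicit
df['Col3'] = pd.Series(
    [build_labels(n1, n2) for n1, n2 in zip(counts['Col1'], counts['Col2'])],
    index=df.index,
)
```

This loops in Python rather than in NumPy, so it will likely be slower on large inputs, but each row's list is built in one obvious step.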