Question

For example, I have a long list of indices: {1, 3, 7, 9, ...}.

My numpy array / pandas DataFrame looks like this:

Col1   Col2   
1       99
2       95
3       91
4       97
...
n       86

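I want to append a column whose value is 0 or 1, depending on whether the leftmost column's value can be found in the index list (1 if it can).

How can I do this without looping over the index list? I have tried different approaches without success.

Thanks a lot!

P.S. I know that a numpy array is made up of an array of arrays, so each column just corresponds to one index in the inner arrays.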

Answer 1

Assuming col1 and col2 are in a pandas DataFrame named df...

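selected_indices = [1, 3, 7, 9]

# set index as col1, since that seems to be the point of column1
# (set_index returns a new DataFrame, so assign the result back)
df = df.set_index('col1')

# define col3 value as 0 or 1 based on selected_indices list
df['col3'] = 0
# a boolean mask skips labels that are not in the index (plain .loc would raise KeyError)
df.loc[df.index.isin(selected_indices), 'col3'] = 1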

Answer 2

Setup

import numpy as np
import pandas as pd

l = [1, 3, 7, 9]
df = pd.DataFrame({'Col1': {0: 1, 1: 2, 2: 3, 3: 4}, 'Col2': {0: 99, 1: 95, 2: 91, 3: 97}})
df
Out[190]: 
   Col1  Col2
0     1    99
1     2    95
2     3    91
3     4    97

Solution

You can use np.in1d to check whether each value of Col1 is present in the list of indices, then convert the boolean result to int.

df['indicator'] = np.in1d(df.Col1,l).astype(int)

df
Out[186]: 
   Col1  Col2  indicator
0     1    99          1
1     2    95          0
2     3    91          1
3     4    97          0
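
A minimal sketch of the same membership test using `Series.isin` and `np.isin`, which NumPy's documentation recommends over `np.in1d` for new code. The second half covers the plain numpy case mentioned in the question. The array `arr` and the names `flags` and `arr_with_flag` are illustrative assumptions, not part of the original posts.

```python
import numpy as np
import pandas as pd

l = [1, 3, 7, 9]
df = pd.DataFrame({'Col1': [1, 2, 3, 4], 'Col2': [99, 95, 91, 97]})

# pandas: vectorized membership test on Col1, then bool -> int
df['indicator'] = df['Col1'].isin(l).astype(int)

# plain numpy: test the first column, then append the result as a new column
arr = np.array([[1, 99], [2, 95], [3, 91], [4, 97]])
flags = np.isin(arr[:, 0], l).astype(arr.dtype)
arr_with_flag = np.column_stack([arr, flags])

print(df)
print(arr_with_flag)
```

Neither version loops over the index list in Python, and values in `l` that never occur in the column (7 and 9 here) are simply ignored.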

numpy / pandas: How to add a new column with values based on a list of indices?

2 answers: