Question

I have two DataFrames:

df1

ID   Key
1    A
2    B
3    C
4    D

df2

ID   Key
1    D
2    C
3    B
4    E

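Now, if a key from df1 is found in df2, the new column should contain the value Found; otherwise it should contain Not Found.

With that output column added, df1 becomes: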

ID   Key   Result
1    A     Not Found
2    B     Found
3    C     Found
4    D     Found

How can we do this with pandas? This is not a join/concat/merge on ID.

Answer 1

Use numpy.where with isin:

import numpy as np

df1['Result'] = np.where(df1['Key'].isin(df2['Key']), 'Found', 'Not Found')
print(df1)
   ID Key     Result
0   1   A  Not Found
1   2   B      Found
2   3   C      Found
3   4   D      Found
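For anyone who wants to run this end to end, here is a minimal self-contained sketch. The two frames are rebuilt from the tables in the question, and the intermediate name `found` is only introduced here for readability.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["A", "B", "C", "D"]})
df2 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["D", "C", "B", "E"]})

# Boolean mask: True where a Key from df1 appears anywhere in df2["Key"].
# The ID columns play no part in the comparison.
found = df1["Key"].isin(df2["Key"])

# Turn the mask into the two labels requested in the question.
df1["Result"] = np.where(found, "Found", "Not Found")
print(df1)
```

Because isin only tests membership, repeated keys in df2 do not change the number of rows in df1.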

Answer 2

Another solution, using merge:

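import pandas as pd
import numpy as np

# Left merge on Key keeps every row of df1; df2's ID column arrives as ID_
res = pd.merge(df1, df2, how="left", on="Key", suffixes=('', '_'))

# ID_ is NaN wherever the key had no match in df2
res["Result"] = np.where(pd.isna(res["ID_"]), "Not Found", "Found")

# Drop the helper column
del res["ID_"]

res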
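One caveat with the merge approach: if df2 contains the same key more than once, a left merge repeats the matching rows of df1. The sketch below guards against that by de-duplicating the keys first. The extra "B" row in df2 is hypothetical data added only to illustrate the point, and `df2_unique` is a name made up for this example.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["A", "B", "C", "D"]})
# Hypothetical df2 with a repeated key "B" (not part of the question's data).
df2 = pd.DataFrame({"ID": [1, 2, 3, 4, 5], "Key": ["D", "C", "B", "E", "B"]})

# Keep one row per key so the left merge cannot multiply rows of df1.
df2_unique = df2.drop_duplicates(subset="Key")

res = pd.merge(df1, df2_unique, how="left", on="Key", suffixes=("", "_"))

# ID_ (df2's ID) is NaN for keys that had no match in df2.
res["Result"] = np.where(res["ID_"].isna(), "Not Found", "Found")
res = res.drop(columns="ID_")
print(res)
```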

Answer 3

Another way, using np.where and merge.
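The code for this answer did not survive in the thread, so the following is only one possible way to combine merge and np.where, and may differ from what the answerer wrote. It relies on the `indicator=True` option of merge, which adds a `_merge` column recording whether each row matched.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["A", "B", "C", "D"]})
df2 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["D", "C", "B", "E"]})

# Merge against df2's keys only (de-duplicated), so no extra ID column
# is created and rows of df1 are never repeated.
res = df1.merge(
    df2[["Key"]].drop_duplicates(),
    on="Key",
    how="left",
    indicator=True,  # adds a "_merge" column: "both" or "left_only" here
)

# "both" means the key was present in df2 as well.
res["Result"] = np.where(res["_merge"] == "both", "Found", "Not Found")
res = res.drop(columns="_merge")
print(res)
```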

Python - Check whether values in a df1 column exist in a df2 column
