I have two data frames:
df1
ID Key
1 A
2 B
3 C
4 D
df2
ID Key
1 D
2 C
3 B
4 E
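Now, if a Key from df1 is found in df2, a new column should hold the value Found; otherwise it should hold Not Found.
The output data frame, df1, becomes: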
ID Key Result
1 A Not Found
2 B Found
3 C Found
4 D Found
How can we do this with pandas? This is not a join/merge/concat on ID.
Answer 0 (score: 6)
Use numpy.where together with isin:
import numpy as np
# isin tests each Key in df1 for membership among the Key values of df2
df1['Result'] = np.where(df1['Key'].isin(df2['Key']), 'Found', 'Not Found')
print(df1)
ID Key Result
0 1 A Not Found
1 2 B Found
2 3 C Found
3 4 D Found
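For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds df1 and df2 from the tables in the question; the intermediate name `found` is introduced here for readability and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frames copied from the question
df1 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["A", "B", "C", "D"]})
df2 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["D", "C", "B", "E"]})

# Boolean Series: True where the Key from df1 occurs anywhere in df2["Key"].
# The comparison is by value only; ID and row position play no role.
found = df1["Key"].isin(df2["Key"])

# Map the boolean mask to the two labels
df1["Result"] = np.where(found, "Found", "Not Found")
print(df1)
```

Because isin only looks at values, duplicate keys in df2 do not change the number of rows in df1.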
Answer 1 (score: 1)
Another solution, using merge:
import pandas as pd
import numpy as np
res = pd.merge(df1, df2, how="left", left_on="Key", right_on="Key", suffixes=('', '_'))
res["Result"] = np.where(pd.isna(res.ID_), "Not Found", "Found")
del res["ID_"]
res
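One caveat with the merge approach: a left merge repeats a row of df1 for every matching row in df2, so duplicate keys in df2 would inflate the result. The sketch below is a self-contained variant that guards against this. The extra duplicate "B" row in df2 and the name `right` are assumptions added for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["A", "B", "C", "D"]})
# Assumption for illustration: "B" appears twice in df2
df2 = pd.DataFrame({"ID": [1, 2, 3, 4, 5], "Key": ["D", "C", "B", "E", "B"]})

# Keep one row per Key so the left merge cannot multiply rows of df1
right = df2.drop_duplicates(subset="Key")

# df2's ID column is renamed to "ID_" by the suffixes argument
res = pd.merge(df1, right, how="left", on="Key", suffixes=("", "_"))

# A missing ID_ means the Key had no match in df2
res["Result"] = np.where(res["ID_"].isna(), "Not Found", "Found")
res = res.drop(columns="ID_")
print(res)
```

Without the drop_duplicates step, the row for "B" would appear twice in res with this df2.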
Answer 2 (score: 1)
Another way, using merge and np.where.
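One way to combine merge and np.where is sketched below. It is not necessarily the answerer's exact code; it relies on the `indicator=True` option of merge, which adds a `_merge` column reporting whether each row was matched. The name `merged` is chosen here for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["A", "B", "C", "D"]})
df2 = pd.DataFrame({"ID": [1, 2, 3, 4], "Key": ["D", "C", "B", "E"]})

# Merge only on the de-duplicated Key column of df2, so no extra ID column
# is created and no rows of df1 are repeated.
merged = df1.merge(
    df2[["Key"]].drop_duplicates(),
    on="Key",
    how="left",
    indicator=True,  # adds "_merge": "both" if matched, "left_only" if not
)

merged["Result"] = np.where(merged["_merge"] == "both", "Found", "Not Found")
merged = merged.drop(columns="_merge")
print(merged)
```

Compared with checking a suffixed ID column for NaN, the indicator column also works when df2 has no columns besides Key.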