I have two DataFrames:
DF1:
name
abc
lmn
pqr
DF2:
m_name n_name loc
abc tyu IND
bcd abc RSA
efg poi SL
lmn ert AUS
nne bnm ENG
pqr lmn NZ
xyz asd BAN
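I want to build a new DataFrame from DF2 under the following conditions:
keep a row if df2.m_name == df1.name or df2.n_name == df1.name
eliminate duplicate rows
This is the expected output: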
m_name n_name loc
abc tyu IND
bcd abc RSA
lmn ert AUS
pqr lmn NZ
Any suggestions on how to achieve this?
Answer 0 (score: 2)
Filter with a boolean mask, or use query.
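The condition from the question can be sketched in pandas in two equivalent ways. This is a minimal illustration rather than the answerer's original code: the sample frames are rebuilt from the question's data, and the variable names (names, by_mask, by_query) are assumptions made for the example.

```python
import pandas as pd

# Sample data rebuilt from the question
df1 = pd.DataFrame({"name": ["abc", "lmn", "pqr"]})
df2 = pd.DataFrame({
    "m_name": ["abc", "bcd", "efg", "lmn", "nne", "pqr", "xyz"],
    "n_name": ["tyu", "abc", "poi", "ert", "bnm", "lmn", "asd"],
    "loc": ["IND", "RSA", "SL", "AUS", "ENG", "NZ", "BAN"],
})

names = df1["name"].tolist()

# Option 1: one isin mask per column, combined with | (element-wise OR)
mask = df2["m_name"].isin(names) | df2["n_name"].isin(names)
by_mask = df2[mask].drop_duplicates()

# Option 2: the same condition as a query string;
# @names refers to the local Python variable
by_query = df2.query("m_name in @names or n_name in @names").drop_duplicates()

print(by_mask)
```

Both options keep the rows at index 0, 1, 3 and 5 of this sample, which matches the expected output in the question.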
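Answer 1 (score: 2)
Use isin with any to build a row mask, then drop_duplicates. The sample df2 below contains one duplicated row (index 1) to show the deduplication:
print(df2)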
m_name n_name loc
0 abc tyu IND
1 abc tyu IND
2 bcd abc RSA
3 efg poi SL
4 lmn ert AUS
5 nne bnm ENG
6 pqr lmn NZ
7 xyz asd BAN
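# select every column whose label contains 'name'
df3 = df2.filter(like='name')
# alternative: select the columns by listing their names
#df3 = df2[['m_name','n_name']]
# keep rows where at least one name column holds a value from df1['name']
df = df2[df3.isin(df1['name'].tolist()).any(axis=1)]
# drop rows that repeat the same values in the name columns
df = df.drop_duplicates(subset=df3.columns)
print(df)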
m_name n_name loc
0 abc tyu IND
2 bcd abc RSA
4 lmn ert AUS
6 pqr lmn NZ
Details:
First, use filter to select all columns whose label contains name:
print(df2.filter(like='name'))
m_name n_name
0 abc tyu
1 abc tyu
2 bcd abc
3 efg poi
4 lmn ert
5 nne bnm
6 pqr lmn
7 xyz asd
Compare the values against df1['name'] with DataFrame.isin:
print(df2.filter(like='name').isin(df1['name'].tolist()))
m_name n_name
0 True False
1 True False
2 False True
3 False False
4 True False
5 False False
6 True True
7 False False
Check for at least one True per row with any:
print(df2.filter(like='name').isin(df1['name'].tolist()).any(axis=1))
0 True
1 True
2 True
3 False
4 True
5 False
6 True
7 False
dtype: bool
Filter by boolean indexing:
df = df2[df2.filter(like='name').isin(df1['name'].tolist()).any(axis=1)]
print(df)
m_name n_name loc
0 abc tyu IND
1 abc tyu IND
2 bcd abc RSA
4 lmn ert AUS
6 pqr lmn NZ
Finally, remove duplicates with drop_duplicates (if you need to remove duplicates based on all the name columns only, add the subset parameter):
df = df.drop_duplicates(subset=df3.columns)
print(df)
m_name n_name loc
0 abc tyu IND
2 bcd abc RSA
4 lmn ert AUS
6 pqr lmn NZ
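The steps above can be collected into one self-contained script. This is a sketch under stated assumptions: the helper name rows_matching_names and its like parameter are hypothetical, introduced only for this example, and the sample df2 includes the duplicated row from this answer.

```python
import pandas as pd

def rows_matching_names(df, names, like="name"):
    """Hypothetical helper: keep rows where any column whose label
    contains `like` holds a value from `names`, then drop rows that
    repeat the same values in those columns."""
    name_cols = df.filter(like=like)                # e.g. m_name, n_name
    mask = name_cols.isin(list(names)).any(axis=1)  # True if any column matches
    return df[mask].drop_duplicates(subset=name_cols.columns.tolist())

df1 = pd.DataFrame({"name": ["abc", "lmn", "pqr"]})
df2 = pd.DataFrame({
    "m_name": ["abc", "abc", "bcd", "efg", "lmn", "nne", "pqr", "xyz"],
    "n_name": ["tyu", "tyu", "abc", "poi", "ert", "bnm", "lmn", "asd"],
    "loc": ["IND", "IND", "RSA", "SL", "AUS", "ENG", "NZ", "BAN"],
})

result = rows_matching_names(df2, df1["name"])
print(result)
```

Passing subset restricts the duplicate check to the name columns, so two rows with the same m_name and n_name but a different loc would still collapse to the first one. Omit subset if rows should count as duplicates only when every column matches.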