I have a DataFrame with column names:
DF1:
+------------+
|   colsNames|
+------------+
|col1        |
|col2        |
|col3        |
+------------+
I also have a DataFrame with arrays of values:
DF2:
+------------+
|         set|
+------------+
|[11, 20]    |
|[1]         |
|[10, 17, 54]|
+------------+
Finally, there is another DataFrame whose arrays of numbers are usually larger than those in DF2:
DF3:
+--------------------+
|       origin       |
+--------------------+
|[11, 17, 1, 2, 3]   |
|[1, 17, 54, 66, 1]  |
|[11, 20, 10, 20]    |
+--------------------+
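I want to compare each row of DF2 with all the values in the origin column of DF3, and create one column per DF2 row (named after the corresponding row of DF1) holding the total number of matching elements. The result would look like this: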
resultDF:
+--------------------+------------+------------+------------+
|       origin       |        col1|        col2|        col3|
+--------------------+------------+------------+------------+
|[11, 17, 1, 2, 3]   |           1|           1|           1|
|[1, 17, 54, 66, 1]  |           0|           2|           2|
|[11, 20, 10, 20]    |           3|           0|           1|
+--------------------+------------+------------+------------+
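To make the counting rule concrete, here is a minimal plain-Python sketch (not Spark code) that reproduces the table above. It assumes that the rows of DF1 and DF2 correspond by position, and that an element repeated in origin counts once per occurrence, which is what the expected output suggests. The helper names count_matches and build_result are made up for illustration.

```python
from typing import Dict, List


def count_matches(origin: List[int], values: List[int]) -> int:
    """Count the elements of `origin` (duplicates included) that occur in `values`."""
    wanted = set(values)
    return sum(1 for x in origin if x in wanted)


def build_result(col_names: List[str],
                 sets: List[List[int]],
                 origins: List[List[int]]) -> List[Dict[str, object]]:
    """Build one output row per `origin` array, with one count column per DF2 row."""
    rows = []
    for origin in origins:
        row: Dict[str, object] = {"origin": origin}
        # Assumption: the i-th name in DF1 labels the i-th set in DF2.
        for name, values in zip(col_names, sets):
            row[name] = count_matches(origin, values)
        rows.append(row)
    return rows


# Sample data copied from DF1, DF2 and DF3.
col_names = ["col1", "col2", "col3"]
sets = [[11, 20], [1], [10, 17, 54]]
origins = [[11, 17, 1, 2, 3], [1, 17, 54, 66, 1], [11, 20, 10, 20]]

result = build_result(col_names, sets, origins)
for row in result:
    print(row)
```

For the third origin array, col1 is 3 because 11 matches once and 20 matches twice. What I am looking for is the equivalent of this logic expressed on the DataFrames themselves.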
Explanation of the first row of resultDF (the other rows follow the same logic):