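My Excel spreadsheet contains the data set below. As you can see, some entries are exact duplicates, while others have similar names. I want to find both the identical and the similar duplicates. The rule is that entries sharing three or more keywords also count as duplicates. I have the following Excel formula, but how can I extend it to find similar duplicates as well?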
=IF(COUNTIF($C$2:C2,C2)>1, "Duplicate!","Original")
Spreadsheet:
The Power by Naomi Alderman
Grant by Ron Chernow*********
Exit West by Mohsin Hamid
Janesville: An American Story by Amy Goldstein
Exit West by Mohsin Hamid
Five-Carat Soul by James McBride
Anything Is Possible by Elizabeth Strout
Dying: A Memoir by Cory Taylor
A Gentleman in Moscow by Amor Towles
Janesville: An American Story by Amy Goldstein
Exit West by Mohsin Hamid
Five-Carat Soul by James McBride
Janesville: An Story by Amy
Exit West by Mohsin Hamid
Five-Carat Soul by James McBride
Evicted: Poverty and Profit in the American City Matthew Desmond
Exit West by Mohsin Hamid
An American Story by Amy Goldstein
Poverty and Profit American City Matthew
Grant by Ron*********
Grant by Ron Chernow
As you can see, Grant by Ron Chernow has multiple exact duplicates, while another entry reads only Grant by Ron, without the Chernow. Please help.
Here is a screenshot: Link
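Answer 0 (score: 0)
If the order of your data can't be changed, I can't think of a way to do what you need using only Excel functions (though given the ingenuity on this site, I can't be sure of that). If you can sort the data, however, the formula below may work.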
=IF(COUNTIF(C$2:C2,C2 & "*")>1,"Duplicate!","Original")
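But it only works after the data has been sorted in descending order, so that each shortened title sits below its longer form and the wildcard match can find it.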
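To make the logic of that formula easier to follow, here is a rough Python sketch of the same idea outside Excel. The function name `flag_prefix_duplicates` is made up for illustration, and the sketch assumes plain case-insensitive string ordering, which may not match Excel's sort order exactly. It also treats the text literally, whereas COUNTIF interprets `*` and `?` in the criterion as wildcards.

```python
def flag_prefix_duplicates(titles):
    """Rough equivalent of =IF(COUNTIF(C$2:C2, C2 & "*")>1, ...) on sorted data."""
    # Sort descending so a longer title comes before its shortened form.
    ordered = sorted(titles, key=str.casefold, reverse=True)
    flags = []
    for i, title in enumerate(ordered):
        prefix = title.casefold()
        # Count rows from the top through the current row that start with this title.
        matches = sum(
            1 for row in ordered[: i + 1] if row.casefold().startswith(prefix)
        )
        flags.append((title, "Duplicate!" if matches > 1 else "Original"))
    return flags


sample = [
    "Grant by Ron",
    "Exit West by Mohsin Hamid",
    "Grant by Ron Chernow",
    "Exit West by Mohsin Hamid",
]
for title, flag in flag_prefix_duplicates(sample):
    print(f"{flag:10} {title}")
```

Note that this only catches entries that are a leading part of an earlier row. A row such as An American Story by Amy Goldstein does not start the same way as Janesville: An American Story by Amy Goldstein, so it would still be marked as original.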
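Answer 1 (score: 0)
If you don't want to sort the data, this should work. You can change the number 11 to set how many characters from the left must match.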
=IF(COUNTIFS(C$2:C2, LEFT(C2,11)& "*")>1, "Duplicate!","Original")
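Neither formula above checks the "three or more shared keywords" rule from the question directly, so here is a minimal Python sketch of that rule, in case processing the column outside Excel is an option. The names `keywords` and `flag_similar` and the `min_shared` parameter are hypothetical. The sketch assumes every word counts as a keyword, including short ones like "by", because otherwise Grant by Ron would share only two words with Grant by Ron Chernow. Punctuation such as colons and trailing asterisks is ignored.

```python
import re


def keywords(title):
    """Split a title into a set of lower-cased words (assumption: every word counts)."""
    # Hyphens and apostrophes stay inside words, so "Five-Carat" is one keyword.
    return set(re.findall(r"[a-z0-9'-]+", title.casefold()))


def flag_similar(titles, min_shared=3):
    """Flag a row when it shares at least min_shared keywords with any earlier row."""
    seen = []   # keyword sets of the rows processed so far
    flags = []
    for title in titles:
        words = keywords(title)
        is_duplicate = any(len(words & earlier) >= min_shared for earlier in seen)
        flags.append("Duplicate!" if is_duplicate else "Original")
        seen.append(words)
    return flags


sample = [
    "Grant by Ron Chernow",
    "Exit West by Mohsin Hamid",
    "An American Story by Amy Goldstein",
    "Janesville: An American Story by Amy Goldstein",
    "Grant by Ron",
    "The Power by Naomi Alderman",
]
for title, flag in zip(sample, flag_similar(sample)):
    print(f"{flag:10} {title}")
```

Like the running `C$2:C2` range in the formulas, each row is only compared with the rows above it, so the first occurrence stays "Original" and later matches become "Duplicate!". Unlike the prefix formulas, the order of the rows and the position of the shared words within the title do not matter.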