Remove duplicate words in a cell only if they are longer than a certain number of characters

Time: 2019-01-24 08:40:17

Tags: google-sheets

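I need to remove duplicate text within a cell, but only when the duplicated text is longer than 4 characters.

I have a formula, but it removes duplicate words of any length: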

=join(" ",unique(transpose(split(A1,", "))))

With it, if a cell contains:

W3-X500 Samsung Galaxy W3-X500 5 inches and 5 different colors

it returns:

W3-X500 Samsung Galaxy 5 inches and different colors

I lose the second "5", which I need.

How can I do this?

2 Answers:

Answer 0: (score: 2)

I'm sure someone will come up with something shorter and simpler than this, but in the meantime:

=ArrayFormula(substitute(join(" ",unique(if(len(transpose(split(A1,", ")))<=4,
transpose(split(A1,", "))&rept("*",row(indirect("1:"&counta(split(A1,", "))))),transpose(split(A1,", "))))),"*",""))

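The idea is that if a word has fewer than five characters, you append some asterisks to it (as many as the word's position in the cell, so every short word becomes distinct); otherwise you leave it as it is. Then you apply unique, join everything back together, and finally get rid of the asterisks.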


If asterisks might occur in the string itself, you can use a different character instead.
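
For readers who find the nested formula hard to follow, the same logic can be sketched outside Sheets. This is only an illustrative Python version of the approach described above, not part of the original answer; the function name `dedupe_long_words` and its parameters are made up for the example, and the regular expression is an assumption about how SPLIT(A1,", ") treats commas and spaces.

```python
import re


def dedupe_long_words(text, max_short_len=4, marker="*"):
    """Drop repeated words, but only those longer than max_short_len characters.

    Hypothetical helper mirroring the SPLIT / UNIQUE / JOIN / SUBSTITUTE steps.
    """
    # Assumed equivalent of SPLIT(A1, ", "): split on commas or spaces, drop empties.
    words = [w for w in re.split(r"[, ]+", text) if w]

    seen = set()
    kept = []
    for position, word in enumerate(words, start=1):
        # A short word gets `position` markers appended, so no two short words match.
        if len(word) <= max_short_len:
            tagged = word + marker * position
        else:
            tagged = word
        # Like UNIQUE, keep only the first occurrence of each tagged word.
        if tagged not in seen:
            seen.add(tagged)
            kept.append(tagged)

    # JOIN with spaces, then strip the markers (the SUBSTITUTE step).
    return " ".join(kept).replace(marker, "")


print(dedupe_long_words(
    "W3-X500 Samsung Galaxy W3-X500 5 inches and 5 different colors"
))
```

As with the formula, the final replace also removes any marker character that was already in the text, which is why a different marker may be needed.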

Edit

This gets rid of one of the transposes, but unique needs a column, so that still leaves two:

=ArrayFormula(substitute(join(" ",unique(transpose(if(len(split(A1,", "))<=4,
split(A1,", ")&rept("*",transpose(row(indirect("1:"&counta(split(A1,", ")))))),split(A1,", "))))),"*",""))

Edit 2

Both of the above can be simplified further:

=ArrayFormula(substitute(join(" ",unique(transpose(split(A1,", "))&if(len(transpose(split(A1,", ")))<=4,
rept("*",row(indirect("1:"&counta(split(A1,", "))))),""))),"*",""))

=ArrayFormula(substitute(join(" ",unique(transpose(split(A1,", ")&if(len(split(A1,", "))<=4,
rept("*",transpose(row(indirect("1:"&counta(split(A1,", ")))))),"")))),"*",""))

Answer 1: (score: 0)

=REGEXREPLACE(ARRAYFORMULA(JOIN(" ", 
 UNIQUE(IF(LEN((SPLIT(B1, ", "))), 
        IF(LEN((SPLIT(B1, ", ")))>4, 
               (SPLIT(B1, ", ")), 
        IF(LEN((SPLIT(B1, ", ")))<=4, 
               (SPLIT(B1, ", "))&
 "ᅇ"&CHAR(RANDBETWEEN(SIGN(ROW($A:$A))*1041, 1071))&
      CHAR(RANDBETWEEN(SIGN(ROW($A:$A))*1041, 1071))&
      CHAR(RANDBETWEEN(SIGN(ROW($A:$A))*1041, 1071))&"ᅇ", )), )))), 
 "\ᅇ([Б-Я]+)\ᅇ", "")
