Removing the strings between two characters in Excel

Time: 2016-07-11 23:22:02

Tags: excel

I have a large Excel sheet with thousands of rows, and it is filled with HTML. The markup sits between less-than and greater-than signs (tags), and the real information is outside the tags. For example:

    Cables Supported                                      </TD>        <td class="specDetail fs18" data-selenium="specDetail">         RG-6 coax              </TD>       </TR>                               <tr>        <td class="specTopic fs18" data-selenium="specTopic">         Weight Supported                                      </TD>        <td class="specDetail fs18" data-selenium="specDetail">         80 lb (36 kg)              </TD>       </TR>               </TBODY>   </TABLE>      <table class="specTable" data-selenium="specTable">   <tbody data-selenium="specBody">   <tr>    <th class="specHeader Header" colspan="2" data-selenium="specHeader">     <a name="shipping" class="fs32 OpenSans-300-normal" id="shipping" data-selenium="specHeaderLink">Packaging Info</A>    </TH>   </TR>      <tr>    <td class="specTopic fs18" data-selenium="specTopic">     Package Weight    </TD>    <td class="specDetail fs18" data-selenium="specDetail">     0.05 lb    </TD>   </TR>      <tr>    <td class="specTopic fs18" data-selenium="specTopic">     Box Dimensions (LxWxH)    </TD>    <td class="specDetail fs18" data-selenium="specDetail">     3.3 x 2.0 x 0.2"    </TD>   </TR>      </TBODY>  </TABLE>       </DIV>                            </DIV>             <div class="rightPanel">                                                        <div class="video-container">                     <div class="content" data-selenium="content">    <script src="//players.brightcove.net/1661991858001/N1cfLQmFe_default/index.min.js" type="text/javascript"></SCRIPT> </DIV><!-- end content -->              </DIV>             </DIV>                                       </DIV>

I need to remove everything between "<" and ">", including the less-than and greater-than characters themselves. In the end, the result should look like this:

Cables Supported RG-6 coax Weight Supported 80 lb (36 kg) Package Weight 0.05 lb
Box Dimensions (LxWxH)  3.3 x 2.0 x 0.2"

Data > Text to Columns won't work with that much data. It just doesn't fit. I'm stuck.

1 answer:

Answer 0 (score: 0)

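I can't write the formula correctly freehand, and I don't have access to Excel right now, so this may need some function names changed and possibly the argument order adjusted, but it looks something like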

=IFERROR(REPLACE(A1, FIND("<", A1), FIND(">", A1, FIND("<", A1)) - FIND("<", A1) + 1, ""), A1)

which should do the following, in pseudocode:

s = string
m = position of first <
n = position of first > after m
p = the text of s from the m'th letter through the n'th letter
replace p in s with ""
(if s has no <, leave s unchanged)

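Then copy the column that holds the formula and Paste Special (Values) it over the original string column, and repeat until no < or > is left. Alternatively, record a macro that does this, then either spam its shortcut key or edit it and wrap the macro in a loop.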
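To make the "one tag per pass, repeat until none are left" idea concrete, here is a minimal sketch of the same logic in Python rather than Excel. The function names `remove_first_tag` and `strip_tags` are made up for illustration, and the final whitespace clean-up is an assumption based on the desired output in the question, not part of the formula above.

```python
import re


def remove_first_tag(s: str) -> str:
    """One pass of the formula: drop the first "<...>" span, if any."""
    start = s.find("<")
    if start == -1:
        return s  # no "<" at all: nothing to do
    end = s.find(">", start)
    if end == -1:
        return s  # unmatched "<": leave the text alone
    return s[:start] + s[end + 1:]


def strip_tags(s: str) -> str:
    """Repeat the single pass until nothing changes, then tidy whitespace."""
    while True:
        stripped = remove_first_tag(s)
        if stripped == s:
            break
        s = stripped
    # Assumption: collapse the runs of spaces left behind by removed tags.
    return re.sub(r"\s+", " ", s).strip()


# Sample cell text modelled on the question's data.
cell = ('Package Weight    </TD>    <td class="specDetail fs18">'
        '     0.05 lb    </TD>   </TR>')
print(strip_tags(cell))
```

Each call to `remove_first_tag` corresponds to one copy-and-paste-values round in the sheet, and the loop in `strip_tags` plays the role of the macro wrapped in a loop.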