Question

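I am implementing a method that removes certain characters from a string txt, in place. My code is below. The expected result is "bdeg", but the actual result is "bdegfg", so it seems the null terminator is not being set. The strange thing is that when I debug with gdb, after the null terminator has been set, I see: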

(gdb) p txt
$5 = (std::string &) @0xbffff248: {static npos = <optimized out>, 
  _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x804b014 "bdeg"}}

That looks correct to me. So what is the problem here?

#include <iostream>
#include <string>

using namespace std;

void censorString(string &txt, string rem)
{
    // create look-up table
    bool lut[256]={false};
    for (int i=0; i<rem.size(); i++)
    {
        lut[rem[i]] = true;
    }
    int i=0;
    int j=0;

    // iterate txt to remove chars
    for (i=0, j=0; i<txt.size(); i++)
    {
        if (!lut[txt[i]]){
            txt[j]=txt[i];
            j++;
        }
    }

    // set null-terminator
    txt[j]='\0';
}

int main(){
    string txt="abcdefg";
    censorString(txt, "acf");

    // expect: "bdeg"
    std::cout << txt <<endl;
}

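Follow-up question:

If the string is not truncated the way a C string would be, then what happens with txt[j]='\0', and why is the result "bdegfg" rather than 'bdeg'\0'g' or some corrupted string?

Another follow-up: if I use txt.erase(txt.begin()+j, txt.end()); it works fine. So I had better use the string-related APIs. The catch is that I do not know the time complexity of the code underlying these APIs.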

Answer 1

std::string is not null-terminated in the way you think, so you have to do this some other way.

Modify the function to:

void censorString(string &txt, string rem)
{
    // create look-up table, indexed by unsigned char so that
    // negative char values cannot index out of bounds
    bool lut[256]={false};
    for (size_t i=0; i<rem.size(); i++)
    {
        lut[static_cast<unsigned char>(rem[i])] = true;
    }

    // iterate txt to remove chars
    for (std::string::iterator it=txt.begin();it!=txt.end();)
    {
        if(lut[static_cast<unsigned char>(*it)]){
            // erase the character pointed to by it; erase returns
            // an iterator to the next character
            it=txt.erase(it);
            continue;
        }
        // increment the iterator only when nothing was erased
        ++it;
    }
}

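Basically, you have to use the std::string::erase function to erase a character from the string. It takes an iterator as input and returns an iterator to the next character. http://en.cppreference.com/w/cpp/string/basic_string/erase http://www.cplusplus.com/reference/string/string/erase/

The complexity of erase is O(n), so the complexity of the whole function is O(n^2). For a very long string, i.e. > 256 characters, the space complexity will be O(n). There is another way that needs only O(n) time: create another string and, while iterating over txt, append to it every character that is not censored.

The new function will be: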

// requires #include <unordered_set> and #include <utility>
void censorString(string &txt, string rem)
{
    // create look-up set
    std::unordered_set<char> lookUpSet(rem.begin(),rem.end());
    std::string newString;

    // iterate txt and keep only the chars that are not in the set
    for (std::string::iterator it=txt.begin();it!=txt.end();it++)
    {

        if(lookUpSet.find(*it)==lookUpSet.end()){
            newString.push_back(*it);
        }
    }
    txt=std::move(newString);
}

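Now this function has O(n) complexity, because std::unordered_set::find is O(1) on average and std::string::push_back is amortized O(1). If you use the ordinary std::set instead, whose find is O(log n), the complexity of the whole function becomes O(n log n).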

Answer 2

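Embedding a null terminator in a std::string is perfectly valid and does not change the length of the string. It will, however, give you unexpected results if, for example, you try to output the string with the stream insertion operator (operator<<).

What you are trying to achieve can be done much more easily: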

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>

int main()
{
    std::string txt="abcdefg";
    std::string filter = "acf";
    txt.erase(std::remove_if(txt.begin(), txt.end(), [&](char c) 
    { 
        return std::find(filter.begin(), filter.end(), c) != filter.end(); 
    }), txt.end());

    // expect: "bdeg"
    std::cout << txt << std::endl;
}

As in Himanshu's answer, you can achieve O(N) complexity (using additional memory) like this:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <unordered_set>

int main()
{
    std::string txt="abcdefg";
    std::string filter = "acf";

    std::unordered_set<char> filter_set(filter.begin(), filter.end());
    std::string output;

    std::copy_if(txt.begin(), txt.end(), std::back_inserter(output), [&](char c)
    {
        return filter_set.find(c) == filter_set.end();  
    });

    // expect: "bdeg"
    std::cout << output << std::endl;
}
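
For completeness, the look-up table from the question can be combined with the erase/remove_if idiom shown earlier in this answer, which keeps the work in place and linear in the length of txt. This is only a sketch: the function name censorStringInPlace is made up for illustration, and it assumes 8-bit char values so that a 256-entry table covers every character.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: removes from txt every character that occurs in rem.
void censorStringInPlace(std::string &txt, const std::string &rem)
{
    // Look-up table indexed by unsigned char value (assumes 8-bit char).
    bool lut[256] = {false};
    for (unsigned char c : rem)
        lut[c] = true;

    // remove_if shifts the kept characters to the front and returns the
    // new logical end; erase then actually shrinks the string.
    txt.erase(std::remove_if(txt.begin(), txt.end(),
                             [&lut](char c) {
                                 return lut[static_cast<unsigned char>(c)];
                             }),
              txt.end());
}
```

Calling censorStringInPlace(txt, "acf") on "abcdefg" should leave txt equal to "bdeg" with size() 4.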

Answer 3

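You have not told the string that you changed its size. If you remove any characters from the string, you need to update its size with the resize method.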
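
As a rough sketch of that suggestion, the question's compaction loop can stay as it is, with only the last statement changed from writing '\0' to calling resize(j). The name censorStringResize is made up here, and the unsigned char indexing is an added safety measure that is not in the original code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant of the question's function that ends with resize().
void censorStringResize(std::string &txt, const std::string &rem)
{
    // Look-up table indexed by unsigned char value (assumes 8-bit char).
    bool lut[256] = {false};
    for (unsigned char c : rem)
        lut[c] = true;

    // Copy every kept character down to position j.
    std::size_t j = 0;
    for (std::size_t i = 0; i < txt.size(); ++i)
    {
        if (!lut[static_cast<unsigned char>(txt[i])])
            txt[j++] = txt[i];
    }

    // Tell the string its new length instead of writing '\0'.
    txt.resize(j);
}
```

The loop touches each character once and resize to a smaller size only drops the tail, so the whole function stays linear in the length of txt.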

Answer 4

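The problem is that you cannot treat C++ strings like C-style strings. That is, you cannot just insert a 0 as you would in C. To convince yourself of this, add "cout << txt.length() << endl;" to your code - you will get 7. You want to use the erase() method: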

Removes specified characters from the string.
1) Removes min(count, size() - index) characters starting at index.
2) Removes the character at position.
3) Removes the characters in the range [first; last).
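
The three forms listed above might be exercised as in the following sketch. The wrapper function names are invented for illustration; only std::string::erase itself is the real API.

```cpp
#include <cstddef>
#include <string>

// 1) erase(index, count): removes up to `count` characters starting at `index`.
std::string eraseByIndex(std::string s, std::size_t index, std::size_t count)
{
    s.erase(index, count);
    return s;
}

// 2) erase(position): removes the single character the iterator points to.
std::string eraseFirstChar(std::string s)
{
    if (!s.empty())
        s.erase(s.begin());
    return s;
}

// 3) erase(first, last): removes the range [first, last); here, everything
//    from offset j to the end. Assumes j <= s.size().
std::string eraseTail(std::string s, std::size_t j)
{
    s.erase(s.begin() + static_cast<std::string::difference_type>(j), s.end());
    return s;
}
```

The third form is the one that fits the question: once the loop has finished, the kept characters occupy the first j positions, and erasing from offset j to the end drops the leftovers.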

Answer 5

txt is a string, not a character array. This code

// set null-terminator
txt[j]='\0';

does not truncate the string at the j-th position.
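
A small sketch of what the string actually holds may help here. It starts from "bdegefg", which is the state the question's loop leaves txt in (with j == 4), and then performs the same write. The function name afterNullWrite is made up for illustration.

```cpp
#include <string>

// Reproduces the final state of txt from the question.
std::string afterNullWrite()
{
    std::string txt = "bdegefg"; // contents after the compaction loop
    txt[4] = '\0';               // overwrites 'e'; size() is unchanged
    return txt;
}
```

The result still has size() 7 and holds 'b','d','e','g','\0','f','g'. Streaming it with operator<< writes all seven characters, and a terminal usually shows nothing for the null character, which is why the output looks like "bdegfg". Reading the same buffer as a C string, for example through c_str() or the char pointer that gdb prints, stops at the first null and yields "bdeg".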
