I am trying to remove comments from a .txt file. My text file looks like this:
(* Sunspot data collected by Robin McQuinn from *)
(* http://sidc.oma.be/html/sunspot.html *)
(* Month: 1749 01 *) 58
(* Month: 1749 02 *) 63
(* Month: 1749 03 *) 70
(* Month: 1749 04 *) 56
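A comment is everything between (* and *). I only need to keep 58, 63, 70 and 56 from this file.
My code removes some characters, but not the right ones. It looks like this: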
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
#include <fstream>
#include <string>
#include <cctype>
#include <numeric>
#include <iomanip>
using namespace std;
int main() {
int digit = 1;
string filename;
//cout for getting user path
//the compiler parses string literals differently so use a double backslash or a forward slash
cout << "Enter the path of the data file, be sure to include extension." << endl;
cout << "You can use either of the following:" << endl;
cout << "A forwardslash or double backslash to separate each directory." << endl;
getline(cin, filename);
//gets file
ifstream infile{filename};
istream_iterator<char> infile_begin{ infile };
istream_iterator<char> eof{};
vector<char> file{ infile_begin, eof };
for(int i =0; i < file.size(); i++){
if(!isdigit(file[i])) {
if(file[i] != ')') {
file.erase(file.begin(),file.begin()+i);
}
}
}
copy(begin(file), end(file), ostream_iterator<char>(cout, " "));
}
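Should I not be using vector.erase()? I know this code is wrong. If erase() is the problem, what would be a better solution? I know that in C you can read the file into memory and walk through each position; would that be a better approach?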
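Answer 0 (score: 4)
I would first read each line into a string, clean up the string, and only then push the result into the vector. Here I use std::regex to filter your file. It is not the simplest approach, though.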
#include <iostream>
#include <string>
#include <regex>
#include <fstream>
#include <vector>
int main(){
std::string file_name;
std::cout << "Enter name/path of the txt file: ";
std::getline(std::cin, file_name);
std::ifstream file(file_name);
std::vector<int> vec; //here save integers
std::string text; //save current line here
std::smatch match; //the comment that was found is saved here, later to be removed from text
std::regex remove(R"(\(\*.*?\*\)\s*)"); //the expression to search for (raw string, so no double escaping)
//translation
// \(\*  -> the literal opening "(*"
// .*?   -> any characters, as few as possible
// \*\)  -> the literal closing "*)"
// \s*   -> any whitespace after the comment, so a pure comment line ends up empty
while (std::getline(file, text)){ //loop through all lines in file.txt
if (std::regex_search(text, match, remove)){ //if a comment was found
text.erase(match.position(0), match.length(0)); //remove the comment where it was found
}
if (!text.empty()) { //empty, line was a pure comment
vec.push_back(std::stoi(text)); //else add integer to list
}
}
std::cout << "The file contains:" << std::endl;
for (int i = 0; i < vec.size(); i++){
std::cout << vec.at(i) << std::endl;
}
return 0;
}
Output:
Enter name/path of the txt file: file.txt
The file contains:
58
63
70
56
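Of course, this relies on std::stoi, which only gives a sensible result when nothing but the integer is left on the line: it throws if the text does not start with a number, and it silently ignores anything after the digits. Well, this is just an idea and can certainly be adapted.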
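To make the std::stoi caveat concrete, here is a minimal sketch of a stricter conversion. The helper name parse_strict_int is hypothetical and not part of the answer above; it assumes C++17 for std::optional. It uses the optional second argument of std::stoi, which receives the number of characters consumed, to reject trailing junk.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the integer only if the string holds
// nothing but a number, apart from surrounding whitespace.
std::optional<int> parse_strict_int(const std::string& text) {
    try {
        std::size_t used = 0;                // characters consumed by std::stoi
        int value = std::stoi(text, &used);  // skips leading whitespace, stops at the first non-digit
        // Anything other than whitespace after the digits counts as junk.
        if (text.find_first_not_of(" \t\r\n", used) != std::string::npos)
            return std::nullopt;
        return value;
    } catch (const std::invalid_argument&) { // no digits at the start
        return std::nullopt;
    } catch (const std::out_of_range&) {     // value does not fit in an int
        return std::nullopt;
    }
}
```

In the loop above, a call such as parse_strict_int(text) could replace std::stoi(text) if lines with unexpected leftovers should be skipped rather than partially parsed.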
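Answer 1 (score: 2)
Well, as you noticed, the logic is wrong.
Whenever the current character is neither a digit nor ), your code erases everything from the beginning of the vector up to the current position.
You want to remove comments, so why not search for the opening (* and the closing *) and erase everything in between?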
std::vector<std::string> fileContent;
std::string line;
while (std::getline(infile, line))
{
//Find starting character sequence
auto begin = line.find("(*");
if (begin != std::string::npos)
{
//Find matching ending sequence, it's not a comment otherwise
auto end = line.find("*)", begin);
if (end != std::string::npos)
line.erase(line.begin() + begin, line.begin() + end + 2);
}
fileContent.push_back(line);
}
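The snippet above removes at most one comment per line and keeps the rest of the line as a string. As a rough sketch of how the same find/erase idea could be extended, the following handles several comments on one line and then extracts the integers. The function names strip_comments and read_values are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: erase every "(* ... *)" pair found in one line.
std::string strip_comments(std::string line) {
    std::size_t begin = 0;
    while ((begin = line.find("(*", begin)) != std::string::npos) {
        // Look for the terminator after the opener, so "(*)" is not a full comment.
        std::size_t end = line.find("*)", begin + 2);
        if (end == std::string::npos)
            break;                            // unterminated: leave the rest untouched
        line.erase(begin, end + 2 - begin);   // remove "(*", the body, and "*)"
    }
    return line;
}

// Hypothetical helper: collect the integers that remain once comments are gone.
std::vector<int> read_values(std::istream& in) {
    std::vector<int> values;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream rest(strip_comments(line));
        int value;
        while (rest >> value)                 // stops at the first non-integer token
            values.push_back(value);
    }
    return values;
}
```

Passing the question's std::ifstream to read_values should yield the four numbers, since std::ifstream is a std::istream. Comments that span several lines are not handled by this sketch.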
Answer 2 (score: 0)
You can use std::getline to read up to the closing ')' character; you then know that the next thing you read will be your number:
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
int main()
{
std::ifstream ifs("test.txt");
std::string line;
while(std::getline(ifs, line)) // line by line
{
std::string skip;
int value;
// skip data up to and past the first ')', then read the number
if(std::getline(std::istringstream(line), skip, ')') >> value)
std::cout << "found: " << value << '\n';
}
}
Output:
found: 58
found: 63
found: 70
found: 56
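The delimiter trick is compact, but it keys on the first ')' rather than on the full "*)" terminator. Here is a small sketch that wraps the same std::getline call in a function so that the edge cases are easy to try. The name value_after_paren is hypothetical, and std::optional assumes C++17.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: skip everything up to and including the first ')',
// then try to read an integer from what follows.
std::optional<int> value_after_paren(const std::string& line) {
    std::istringstream in(line);
    std::string skipped;
    int value;
    if (std::getline(in, skipped, ')') >> value)
        return value;
    return std::nullopt;   // no ')' before the number, or no number after it
}
```

With this approach a line such as "(* see a) note *) 7" yields nothing, because extraction resumes right after the first ')'. A bare "42" without any comment is skipped as well, since std::getline consumes the whole line while looking for the delimiter. For the sunspot file in the question neither case occurs.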