Need help removing entries from a large text file based on the contents of another text file

Time: 2011-10-18 11:30:31

Tags: file text editing

Good day. I could really use your help with this one. I have a stats text file in the following format.

ID=1000000 
Name=Name1
Field1=Value1 
...(Fields 2 to 25)
Field26=Value26 

ID=1000001
Name=Name2
Field1=Value1 
...(Fields 2 to 25) 
Field26=Value26

ID=1000002
Name=Name2
Field1=Value1 
...(Fields 2 to 25) 
Field26=Value26 

...goes up to 15000

I also have a text file of active people, one name per line.

Name2
Name5
Name11
Name12 
...goes up to 1400 Random Names

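If a name is not found in the active people text file, I need to delete its record (ID, Name, Fields 1 to 26) from the stats text file. In the example above, the record associated with Name1 (ID, Name, Fields 1 to 26) should be deleted, because Name1 is not in the active people text file.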

I tried reformatting the stats file in Notepad++ using TextFX -> Quick -> Find/Replace, converting it to a comma-separated file with one record per line. I rearranged it into:

ID       Name    Field1  ...Fields2 to Fields 25... Field26
1000000  Name1   Value1  ...Value2 to Value 25...   Value26
1000001  Name2   Value1  ...Value2 to Value 25...   Value26
1000002  Name2   Value1  ...Value2 to Value 25...   Value26

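I opened it in Excel and used the CSV files to create two tables in MySQL (a stats table and an active names table). I am not sure how to process this in an automated way. Besides deleting the inactive records, my other problem is rewriting the result back into the old format.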

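I have been trying to figure this out for hours. Is there a solution that does not require me to find, copy, paste, and switch between the two files 1400 times? Unfortunately, I have to keep the stats file in this format.

Please help. Thank you.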

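1 answer:

Answer 0 (score: 1)

Here is a C++ program that will process the files for you. It reads the active names from people.txt, filters stats.txt against them, and writes the records that remain to new_stats.txt: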

#include <algorithm>
#include <fstream>
#include <iostream>
#include <locale>
#include <set>
#include <string>
#include <vector>

//trim functions adapted from:
//http://stackoverflow.com/questions/216823/whats-the-best-way-to-trim-stdstring/217605#217605
//with a slight change because of trouble with ambiguity.
//The predicates take char so that std::isspace uses the std::ctype<char> facet,
//and plain functions replace std::not1/std::ptr_fun, which were removed in C++17.
static bool myIsSpace(char test)
{
    static const std::locale loc;
    return std::isspace(test,loc);
}
static bool myIsNotSpace(char test) {return !myIsSpace(test);}

static std::string &rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(), myIsNotSpace).base(), s.end());
    return s;
}

static std::string &ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), myIsNotSpace));
    return s;
}

static std::string &trim(std::string &s) {return ltrim(rtrim(s));}

int main(int argc,char * argv[])
{
    std::ifstream peopleFile;
    peopleFile.open("people.txt");

    if (!peopleFile.is_open()) {
        std::cout << "Could not open people.txt" << std::endl;
        return -1;
    }

    std::set<std::string> people;

    // Read one name per line; testing getline itself stops cleanly on any read failure.
    std::string somePerson;
    while (std::getline(peopleFile,somePerson)) {
        trim(somePerson);
        if (!somePerson.empty()) {
            people.insert(somePerson);
        }
    }

    peopleFile.close();

    std::ifstream statsFile;
    statsFile.open("stats.txt");

    if (!statsFile.is_open()) {
        std::cout << "could not open stats.txt" << std::endl;
        return -2;
    }

    std::ofstream newStats;
    newStats.open("new_stats.txt");

    if (!newStats.is_open()) {
        std::cout << "could not open new_stats.txt" << std::endl;
        statsFile.close();
        return -3;
    }

    size_t totalRecords=0;
    size_t includedRecords=0;

    bool firstRecord=true;
    bool included=false;
    std::vector<std::string> record;
    std::string recordLine;
    // A blank line ends the current record; loop until getline fails.
    while (std::getline(statsFile,recordLine)) {
        std::string trimmedRecordLine(recordLine);
        trim(trimmedRecordLine);

        if (trimmedRecordLine.empty()) {
            if (!record.empty()) {
                ++totalRecords;

                if (included) {
                    ++includedRecords;

                    if (firstRecord) {
                        firstRecord=false;
                    } else {
                        newStats << std::endl;
                    }

                    for (std::vector<std::string>::iterator i=record.begin();i!=record.end();++i) {
                        newStats << *i << std::endl;
                    }
                    included=false;
                }

                record.clear();
            }
        } else {
            record.push_back(recordLine);
            if (!included) {
                if (0==trimmedRecordLine.compare(0,4,"Name")) {
                    trimmedRecordLine=trimmedRecordLine.substr(4);
                    ltrim(trimmedRecordLine);
                    if (!trimmedRecordLine.empty() && '='==trimmedRecordLine[0]) {
                        trimmedRecordLine=trimmedRecordLine.substr(1);
                        ltrim(trimmedRecordLine);
                        included=people.end()!=people.find(trimmedRecordLine);
                    }
                }
            }
        }
    }

    // Flush the last record when stats.txt does not end with a blank line.
    if (!record.empty()) {
        ++totalRecords;

        if (included) {
            ++includedRecords;

            if (firstRecord) {
                firstRecord=false;
            } else {
                newStats << std::endl;
            }

            for (std::vector<std::string>::iterator i=record.begin();i!=record.end();++i) {
                newStats << *i << std::endl;
            }
            included=false;
        }

        record.clear();
    }

    statsFile.close();
    newStats.close();

    std::cout << "Wrote new_stats.txt with " << includedRecords << " of the " << totalRecords << ((1==totalRecords)?" record":" records") << " found in stats.txt after filtering against the " << people.size() << ((1==people.size())?" person":" people") << " found in people.txt" << std::endl;

    return 0;
}
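
If you want to try the filtering logic without touching real files, the same idea can be sketched as a function over streams. This is only a minimal sketch under assumptions, not a replacement for the program above: the names filterStats, filterStatsText, trimCopy and nameValue are made up for illustration, records are assumed to be separated by blank lines, and a record is kept when its Name value appears in the set of active names. End of input is treated like a blank line, so the last record needs no separate flush.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy without leading/trailing spaces, tabs, CR or LF.
static std::string trimCopy(const std::string &s) {
    const char *ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return std::string();
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: returns the value of a "Name=..." line,
// or an empty string if the line is not a Name line.
static std::string nameValue(const std::string &line) {
    const std::string t = trimCopy(line);
    const std::string::size_type eq = t.find('=');
    if (eq == std::string::npos || trimCopy(t.substr(0, eq)) != "Name") {
        return std::string();
    }
    return trimCopy(t.substr(eq + 1));
}

// Copies each blank-line-separated record whose Name is in `active` from `in`
// to `out`, with one blank line between records. Returns the number written.
std::size_t filterStats(std::istream &in, const std::set<std::string> &active,
                        std::ostream &out) {
    std::vector<std::string> record;
    std::string line;
    bool keep = false;
    std::size_t written = 0;
    for (;;) {
        const bool gotLine = !std::getline(in, line).fail();
        if (!gotLine || trimCopy(line).empty()) {
            // Blank line or end of input: the current record (if any) is complete.
            if (keep) {
                if (written > 0) out << '\n';
                for (std::size_t i = 0; i < record.size(); ++i) {
                    out << record[i] << '\n';
                }
                ++written;
            }
            record.clear();
            keep = false;
            if (!gotLine) break;
        } else {
            record.push_back(line);  // keep the line exactly as read
            if (!keep) {
                const std::string name = nameValue(line);
                keep = !name.empty() && active.count(name) > 0;
            }
        }
    }
    return written;
}

// Convenience wrapper for in-memory text, mainly useful for experimenting.
std::string filterStatsText(const std::string &stats,
                            const std::set<std::string> &active) {
    std::istringstream in(stats);
    std::ostringstream out;
    filterStats(in, active, out);
    return out.str();
}
```

With real files you would pass a std::ifstream opened on stats.txt and a std::ofstream opened on new_stats.txt to filterStats, after loading the names from people.txt into the set. Lines are written back unchanged, so the output stays in the original ID/Name/Field layout.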