要简化具有多个术语的功能,可以使用一个程序在文件中搜索模式并将它们排列在排名列表中。我可以想象这是一个复杂的过程,但是我敢肯定有些人已经建立了这样的东西。 文本示例:
sin(t1)*cos(t1)*t1+t1-sin(t1)*sin(t1-pi)
这应该给我这样的输出(至少2个字母):
6x: "t1"
4x: "(t1"
3x: "n(t1"
3x: "sin"
3x: "sin("
2x: "sin(t1)"
etc.
这个问题有名字吗(我不知道)?是否有已知的算法可以为我解决问题?
答案 0 :(得分:0)
我用QT编写了一个小程序,可以完成任务。方法是尝试一切。要解决我的问题,可能需要几天的时间,因为文本文件很大。 如果我将以下文本作为输入(“ text.txt”):
sin(t1)*cos(t1)*t1+t1-sin(t1)*sin(t1-pi)
我有以下参数:长度2-5,最小出现次数:3
以下结果:
t1 6
(t 4
(t1 4
si 3
in 3
n( 3
1) 3
)* 3
sin 3
in( 3
n(t 3
t1) 3
1)* 3
sin( 3
in(t 3
n(t1 3
(t1) 3
t1)* 3
sin(t 3
in(t1 3
(t1)* 3
代码:
#include <QCoreApplication>
#include <qdebug.h>
#include <qstring.h>
#include <qfile.h>
#include <qtextstream.h>
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
QString * wholefile = new QString;
uint64_t minchar = 2;
uint64_t maxchar = 5;
uint64_t min_occur = 3;
QFile file("text.txt");
if(!file.open(QIODevice::ReadOnly)) {
qDebug()<<"error reading file";
}
QTextStream in(&file);
while(!in.atEnd()) {
QString line = in.readLine();
wholefile->append(line);
}
file.close();
QStringList * allpatterns = new QStringList;
for(uint64_t i=minchar; i<=maxchar;i++){
for(uint64_t pos=0; pos<wholefile->length()-i;pos++){
QString pattern = wholefile->mid(pos,i);
if(allpatterns->contains(pattern)==0){
allpatterns->append(pattern);
}
}
}
uint64_t * strcnt = new uint64_t[allpatterns->length()];
uint64_t maximum_cnt = 0;
QStringList * interestingpatterns = new QStringList;
uint64_t nr_of_patterns = 0;
for(uint64_t i=0; i<allpatterns->length();i++){
QString str = allpatterns->at(i);
strcnt[nr_of_patterns] = wholefile->count(str);
if(strcnt[nr_of_patterns]>=min_occur){
if(strcnt[nr_of_patterns]>maximum_cnt){
maximum_cnt = strcnt[nr_of_patterns];
}
interestingpatterns->append(str);
nr_of_patterns++;
}
}
/* display result*/
QFile file2("out.txt");
if (!file2.open(QIODevice::WriteOnly | QIODevice::Text))
qDebug()<<"error writing file";
QTextStream out(&file2);
uint64_t current_max = maximum_cnt;
while(current_max>=min_occur){
for(uint64_t i=0; i<interestingpatterns->length();i++){
if(strcnt[i]==current_max){
QString str = interestingpatterns->at(i);
qDebug()<<str<<strcnt[i];
out <<str<<" "<< QString::number(strcnt[i])<<"\n";
}
}
current_max--;
}
file2.close();
return a.exec();
}