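Read each word from a file and sort the words lexicographically using a binary search tree

Time: 2017-03-20 19:39:48

Tags: c++ sorting linked-list binary-search-tree lexicographic

Hello fellow programmers,

I'm working on an assignment that asks us to read a file, extract each word from it, and sort the words into a table that shows each word and the line numbers on which it appears.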

Example:

The input file contains:

this is a
text file is

So it should output the following:

a       1
file    2
is      1   2
text    2
this    1

So it takes each word, sorts the words alphabetically, and also prints the lines on which each word appears. So far I have this code (EDITED):

#include <iostream>
#include <fstream>
#include <sstream>
#include <cstdlib>
#include <vector>

using namespace std;

struct TreeNode{
        string word;               //word will store the word from text file
        vector<int>lines;          //for keeping record of lines in which it was found
        TreeNode*left;             //pointer to left subtree
        TreeNode*right;            //pointer to right subtree
        TreeNode*temp;
    }; //end TreeNode

//check function for comparing strings
bool check(string a,string b)
{
    if(a<b)
      return false;
    return true;
}//end check

void insert(TreeNode *root,string word,int lineNumber){
    //Tree is NULL
   if(root==NULL){
      root=new TreeNode();
      root->word=word;
      root->lines.push_back(lineNumber);
   }//end if
    //words match
   if(root->word==word)
      root->lines.push_back(lineNumber);

   //check(a,b)is function that returns 1 if 'string a' is bigger than 'string b' lexographically
   if(check(root->word,word)){ //present word is lexographically bigger than root's word
      if(root->right)          //if right node to root is not null we insert word recursively
        insert(root->right,word,lineNumber);
      else{                    //if right node is NULL a new node is created
        TreeNode*temp=root->right;
        temp=new TreeNode();
        temp->word=word;
        temp->lines.push_back(lineNumber);
     }//end else
    }//end if
    else{ //present word is lexographically smaller than root's word
      if(root->left)
        insert(root->left,word,lineNumber);
      else{
        TreeNode*temp=root->left;
        temp=new TreeNode();
        temp->word=word;
        temp->lines.push_back(lineNumber);
      }//end nested else
    }//end else
}//end insert

//Print tree in In-Order traversal
void InOrder(TreeNode* node)
{
    if(!node) //end if pointing to null
        return;
    InOrder(node->left);        //display the left subtree
    cout << node->word << " ";  //display current node
    InOrder(node->right);        //display the right subtree
}//end InOrder

int main() { //main
	//int lineNumber = 0; //number of lines
	ifstream file("text.txt"); //takes input stream from designated file
	if(file) { //if file is there
		string line, word ; //setting line and word strings
		while(getline(file, line)) { //getting the lines from the file
            //++lineNumber; //incrementing number of lines when a new line is read
			istringstream is(line); //checking for a line
			while(is >> word) { //while a word exists
				InOrder(root); //<< lineNumber << "\n"; //outputting the words and tabbing to then print the line number and then a new line
			}//end word while
		}//end getline while
	}//end file if
	file.close();
	file.clear();
	return 0;
}//end main

My current output:

this    1
is      1
a       1
text    2
file    2
is      2
#

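(The # character only marks the end of the table.)

I have never built a tree before, though. I was also considering having the program first go through the file and put each word into an alphabetically ordered linked list, then search that list for duplicates of each word and send them to a nested linked list, so that I can print the line numbers of both the original word and its duplicates.

I'm just looking for help with this assignment. I'm really confused and don't consider myself a great programmer, which is why I'm in school! Any help is greatly appreciated!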

2 answers:

Answer 0 (score: 0)

I think you are overcomplicating the problem. If I were you, I would define a struct like this:

struct Whatever
{
    string word;
    vector<int> line;
};

and declare vector<Whatever> text.

After that, you can use the sort algorithm with a custom comparison function that orders the text vector lexicographically.
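As a rough illustration of this vector-based approach, here is a minimal sketch. The helper name buildSortedIndex is hypothetical (it is not from the answer), it assumes C++11 or later, and it reads from any std::istream so it works with a file or a string stream alike.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Same shape as the struct above: one word plus every line it appears on.
struct Whatever {
    std::string word;
    std::vector<int> line;
};

// Hypothetical helper: collects the words of `input` into a vector,
// merging repeated words, then sorts the vector lexicographically.
std::vector<Whatever> buildSortedIndex(std::istream& input) {
    std::vector<Whatever> text;
    std::string line, word;
    int lineNumber = 0;
    while (std::getline(input, line)) {
        ++lineNumber;
        std::istringstream words(line);
        while (words >> word) {
            // Linear search for an existing entry; fine for small inputs.
            auto it = std::find_if(text.begin(), text.end(),
                [&word](const Whatever& w) { return w.word == word; });
            if (it == text.end())
                text.push_back({word, {lineNumber}});
            else
                it->line.push_back(lineNumber);
        }
    }
    // Custom comparison: order entries by their word.
    std::sort(text.begin(), text.end(),
        [](const Whatever& a, const Whatever& b) { return a.word < b.word; });
    return text;
}
```

Note that a word repeated on the same line gets that line number recorded more than once, and that std::string comparison is by character value, so uppercase letters sort before lowercase ones.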

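If you really must use a binary search tree for the assignment, take some time to do a little research. The basic idea is that every time you insert a node into the tree, you compare it with the current node (the root at the start) and check whether its value is larger or smaller (in your case, a lexicographic comparison). You then move in the corresponding direction (left for smaller, right for larger) until you reach a NULL node, and add it there. Afterwards, you can produce the lexicographic order with a left-root-right traversal of the tree. Again, I would keep it simple and use the struct.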
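The insert-then-traverse idea described above could be sketched roughly as follows. The names insertWord and collectInOrder are hypothetical, and the node layout is assumed to match the struct in the question. The pointer is passed by reference, which is one way to make sure a newly created node actually gets linked into the tree.

```cpp
#include <string>
#include <vector>

// Node layout assumed for this sketch; it mirrors the struct in the question.
struct TreeNode {
    std::string word;
    std::vector<int> lines;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
};

// Walks down from `node` and attaches a new node at the first null link.
// The pointer is taken by reference so that assigning to it updates the
// parent's left/right member (or the caller's root) rather than a copy.
void insertWord(TreeNode*& node, const std::string& word, int lineNumber) {
    if (node == nullptr) {
        node = new TreeNode;
        node->word = word;
        node->lines.push_back(lineNumber);
    } else if (word == node->word) {
        node->lines.push_back(lineNumber);   // repeated word: record the line
    } else if (word < node->word) {
        insertWord(node->left, word, lineNumber);
    } else {
        insertWord(node->right, word, lineNumber);
    }
}

// Left-root-right traversal: appends the words in lexicographic order.
void collectInOrder(const TreeNode* node, std::vector<std::string>& out) {
    if (node == nullptr) return;
    collectInOrder(node->left, out);
    out.push_back(node->word);
    collectInOrder(node->right, out);
}
```

Starting from TreeNode* root = nullptr and calling insertWord(root, word, lineNumber) for every word fills the tree. The sketch never frees the nodes, which a longer-lived program would need to do.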

Answer 1 (score: 0)

Here is the struct you should use, along with an insert function that solves the problem with a binary tree -

struct TreeNode{
        string word;               //word will store the word from text file
        vector<int>lines;     //for keeping record of lines in which it was found
        TreeNode*left;             //pointer to left subtree
        TreeNode*right;            //pointer to right subtree
    };

In your case you wrapped this struct inside class LexTree, which does not help much and only complicates things.

Here is how to implement the insert function -

TreeNode* insert(TreeNode *root,string word,int lineNumber)
{
    //Tree is NULL
   if(root==NULL){
      root=new TreeNode();    //new TreeNode() leaves left and right as null pointers
      root->word=word;
      root->lines.push_back(lineNumber);
   }
    //words match
   else if(root->word==word)
      root->lines.push_back(lineNumber);

   //check(a,b) is a function that returns true unless 'string a' is lexicographically bigger than 'string b'
   else if(check(root->word,word)){//present word is lexicographically bigger than root's word
      if(root->right)         //if right node to root is not null we insert word recursively
        root->right=insert(root->right,word,lineNumber);
      else{                   //if right node is NULL a new node is created
        TreeNode*temp=new TreeNode();
        temp->word=word;
        temp->lines.push_back(lineNumber);
        root->right=temp;
     }
    }
    else{ //present word is lexicographically smaller than root's word
      if(root->left)
        root->left=insert(root->left,word,lineNumber);
      else{
        TreeNode*temp=new TreeNode();
        temp->word=word;
        temp->lines.push_back(lineNumber);
        root->left=temp;
      }
    }
    return root;              //hand the (possibly new) subtree root back to the caller
}

Edit -

Here is your check function -

bool check(string a,string b)
{
    if(a>b)
      return false;
    return true;
}

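To use insert, just pass the root node of your BST, the word, and its line number as arguments, and assign the returned pointer back to the root.

A few tweaks to InOrder and main will then solve the problem: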

//Print tree in In-Order traversal
void InOrder(TreeNode* node)
{
if(!node) //end if pointing to null
    return;
InOrder(node->left);        //display the left subtree
cout << node->word << " ";  //display current node

for(size_t i=0;i<node->lines.size();i++)  //vector lines contains all the line numbers where we had the word node->word
     cout<<node->lines[i]<<" ";

 cout<<endl;

InOrder(node->right);       //display the right subtree
}//end InOrder

int main() { //main
int lineNumber = 0; //number of lines read so far

TreeNode *root=NULL;   //you need to declare root to make a BST

ifstream file("text.txt"); //takes input stream from designated file
if(file) { //if file is there
    string line, word ; //setting line and word strings
    while(getline(file, line)) { //getting the lines from the file
        ++lineNumber; //incrementing number of lines when a new line is read
        istringstream is(line); //splitting the line into words
        while(is >> word) { //while a word exists
            root=insert(root,word,lineNumber);  // insert word in your BST
        }//end word while
    }//end getline while

    InOrder(root);   //gives required output
}//end file if
file.close();
file.clear();
return 0;
}//end main