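I have two kinds of text files that I want to split into words.
The first kind of file is just words separated by newlines.
The second kind is text from a book that contains only spaces (no commas, question marks, etc.):
And then she tried to run
but she was stunned by the view of
...
Do you know which way is best?
I tried the following two approaches, but it seems I am getting segmentation faults.
For the first kind of text I use:
while (fgets(line, sizeof(line), wordlist) != NULL)
{
    /* Checks Words |
    printf("%s",line);*/
    InsertWord(W, line); /* Function that inserts the word into a tree */
}
For the second kind of text I use:
while (fgets(line, sizeof(line), out) != NULL)
{
    bp = line;
    while (1)
    {
        cp = strtok(bp, " ");
        bp = NULL;
        if (cp == NULL)
            break;
        /*printf("Word by Word : %s \n",cp);*/
        CheckWord(Words, cp); /* Function that checks whether the word from the book matches one in the tree */
    }
}
Can you suggest something better, or correct me if I am doing something wrong here?
InsertWord is a function that inserts a word into a tree. When I use this code:
for (i = 0; i <= 2; i++)
{
    if (i == 0)
        InsertWord(W, "A");
    if (i == 1)
        InsertWord(W, "B");
    if (i == 2)
        InsertWord(W, "c");
}
the words are inserted into the tree and printed correctly, which means my tree and its functions work (they were also given to us by our teacher). But when I try this:
char this_word[15];
while (fscanf(wordlist, "%14s", this_word) == 1)
{
    printf("Latest word that was read: '%s'\n", this_word);
    InsertWord(W, this_word);
}
I get errors from the tree. So I guess it is some kind of segmentation fault. Any ideas?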
Answer 0 (score: 3)
This is the kind of input that fscanf() with a %s conversion handles well:
char this_word[15];
while (fscanf(tsin, "%14s", this_word) == 1) {
    printf("Latest word that was read: '%s'.\n", this_word);
    // Process the word...
}
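As a rough, self-contained sketch of the loop above (read_words and WORD_LEN are names invented for this example, not part of the question's code): since %s skips leading whitespace, including newlines, and stops at the next whitespace character, the same function should work for both kinds of file, and no trailing newline ends up in the stored words.

```c
#include <stdio.h>

#define WORD_LEN 15  /* 14 characters plus the terminating '\0' */

/* Reads up to max whitespace-separated words from fp into words.
 * A word longer than 14 characters is split into several entries,
 * because "%14s" stops after 14 characters.
 * Returns the number of entries stored. */
size_t read_words(FILE *fp, char words[][WORD_LEN], size_t max)
{
    size_t n = 0;

    while (n < max && fscanf(fp, "%14s", words[n]) == 1)
        n++;
    return n;
}
```

In the question's setting, each stored entry could then be passed to InsertWord() or CheckWord(); if the book may contain words longer than 14 characters, the buffer size and the width in the format string would both need to grow together.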
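Answer 1 (score: 1)
You want to read from a file, so fgets() may come to mind.
You want to split the text into tokens on a delimiter (a space), so strtok() should come to mind as well.
So, you could do something like this: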
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *pFile;
    char mystring[100];
    char *pch;

    /* First file: one word per line, so every line read is already a word. */
    pFile = fopen("text_newlines.txt", "r");
    if (pFile == NULL)
        perror("Error opening file");
    else {
        while (fgets(mystring, sizeof(mystring), pFile) != NULL)
            printf("%s", mystring);
        fclose(pFile);
    }

    /* Second file: words separated by spaces, split with strtok(). */
    pFile = fopen("text_wspaces.txt", "r");
    if (pFile == NULL)
        perror("Error opening file");
    else {
        while (fgets(mystring, sizeof(mystring), pFile) != NULL) {
            printf("%s", mystring);
            pch = strtok(mystring, " ");
            while (pch != NULL) {
                printf("%s\n", pch);
                pch = strtok(NULL, " ");
            }
        }
        fclose(pFile);
    }
    return 0;
}
Output:
linux25:/home/users/grad1459>./a.out
Milk
Work
Chair
And then she tried to run
And
then
she
tried
to
run
but she was stunned by the view of
but
she
was
stunned
by
the
view
of
//newline here as well
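One detail worth checking in the program above: fgets() keeps the trailing newline, so with " " as the only delimiter the last token of every line still ends in '\n', and an exact comparison against a stored word would fail for it. A minimal sketch of one way around that, assuming it is acceptable to treat tabs, carriage returns and newlines as separators too (split_line is a hypothetical helper name, not something from the question):

```c
#include <stddef.h>
#include <string.h>

/* Splits line in place on spaces, tabs, CR and LF.
 * Stores up to max token pointers in tokens and returns how many were stored.
 * The pointers refer to the inside of line, which strtok() modifies. */
size_t split_line(char *line, char *tokens[], size_t max)
{
    size_t n = 0;
    char *tok = strtok(line, " \t\r\n");

    while (tok != NULL && n < max) {
        tokens[n++] = tok;
        tok = strtok(NULL, " \t\r\n");
    }
    return n;
}
```

Each tokens[i] could then be handed to CheckWord() in place of cp. Note that strtok() keeps hidden state between calls, so this helper should not be interleaved with another strtok() loop.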
Answer 2 (score: 0)
The simplest way is probably to go character by character:
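char word[50];
char *word_pos = word;
int ch; /* int rather than char, so EOF can be told apart from real characters */

// Discard separators until the first word character
while ((ch = fgetc(out)) != EOF &&
       (ch == '\n' || ch == ' '))
    ;
while (ch != EOF) {
    if (ch == '\n' || ch == ' ') {
        // End of a word: terminate it and check it
        *word_pos = '\0';
        word_pos = word;
        CheckWord(Words, word);
        // Skip any further separators
        while ((ch = fgetc(out)) != EOF &&
               (ch == '\n' || ch == ' '))
            ;
        continue;
    }
    // Store the character if it fits; leave room for the '\0'
    if (word_pos < word + sizeof(word) - 1)
        *word_pos++ = (char)ch;
    ch = fgetc(out);
}
// Check the last word if the file does not end with a separator
if (word_pos != word) {
    *word_pos = '\0';
    CheckWord(Words, word);
}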
You are limited by the size of the word buffer, and you need to add every stop character to both the while and the if conditions.
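To avoid repeating every stop character in several conditions, the separators can live in a single string that is searched with strchr(). The sketch below is one possible way to do that, under the assumption that a word longer than the buffer may simply be truncated; next_word and STOP_CHARS are names made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* All characters that end a word; extend this one string as needed. */
static const char STOP_CHARS[] = " \n\t\r";

/* Reads the next word from fp into buf (capacity size, including '\0').
 * Characters that do not fit are read and dropped.
 * Returns 1 if a word was stored, 0 if end of input was reached first
 * or the buffer is too small to hold any character. */
int next_word(FILE *fp, char *buf, size_t size)
{
    size_t len = 0;
    int ch;

    if (size < 2)
        return 0;

    /* Skip separators in front of the word. */
    while ((ch = fgetc(fp)) != EOF && strchr(STOP_CHARS, ch) != NULL)
        ;

    /* Collect characters until a separator or end of input. */
    while (ch != EOF && strchr(STOP_CHARS, ch) == NULL) {
        if (len + 1 < size)
            buf[len++] = (char)ch;
        ch = fgetc(fp);
    }

    buf[len] = '\0';
    return len > 0;
}
```

With the question's names, the calling loop might look like: while (next_word(out, word, sizeof word)) CheckWord(Words, word);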