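How do I loop over a directory of text files and echo the word count of each file in PHP?

Asked: 2014-12-03 06:14:28

Tags: php text

I want to loop over a directory of text files and echo the word count of each file. For example, suppose the directory contains two text files with the following contents: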

file1.txt -> this is file1.
file2.txt -> this is another file called file2.

Then the output should be:

wordcount: 3
wordcount: 6

I have the following code:

$directory = "C:\\dir";
$files = scandir($directory);
foreach($files as $file) {
$fh = fopen($file, "r");
$contents = fread($fh, filesize($file));
fclose($fh);
echo "wordcount: "; //this should be modified to display the wordcount for each file..
}

The echo statement should be changed so that it prints the word count of each file.

4 answers:

Answer 0 (score: 1)

This should work for you:

$directory = "C:\\xampp\\htdocs\\Sandbox";
foreach (glob("$directory\\*.txt") as $file)  {
    $fh = fopen($file, "r");
    if(filesize($file) > 0) {
        $contents = fread($fh, filesize($file));
        $count = str_word_count($contents, 0); 
    } else {
        $count = 0;
    }
    fclose($fh);
    echo "File: " . basename($file) . "  Wordcount: $count<br />";      
}

The output might look like this:

File: test - Kopie (2).txt  Wordcount: 3
File: test - Kopie.txt  Wordcount: 2
File: test.txt  Wordcount: 7
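
The same loop can also be wrapped in a small helper that returns the counts instead of echoing them, which makes it easier to reuse. The following is only a sketch: the function name `wordCountsInDirectory` and the temporary demo directory are assumptions for illustration, and `file_get_contents()` is swapped in for the `fopen()`/`fread()` pair, which also handles empty files without a special case.

```php
<?php
// Hypothetical helper: maps each .txt file name in $directory to its word count.
function wordCountsInDirectory(string $directory): array
{
    $counts = [];
    // glob() returns false on error, so fall back to an empty list
    $files = glob(rtrim($directory, "/\\") . DIRECTORY_SEPARATOR . "*.txt") ?: [];
    foreach ($files as $file) {
        $contents = file_get_contents($file);
        // An unreadable file is reported as 0 words in this sketch
        $counts[basename($file)] = $contents === false ? 0 : str_word_count($contents);
    }
    return $counts;
}

// Demo with the two files from the question, written to a temporary directory
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . "wordcount_demo_" . uniqid();
mkdir($dir);
file_put_contents($dir . DIRECTORY_SEPARATOR . "file1.txt", "this is file1.");
file_put_contents($dir . DIRECTORY_SEPARATOR . "file2.txt", "this is another file called file2.");

$counts = wordCountsInDirectory($dir);
foreach ($counts as $name => $count) {
    echo "File: $name  Wordcount: $count\n";
}
```

With the two sample files from the question, this reports 3 words for file1.txt and 6 for file2.txt, because `str_word_count()` does not treat digits as part of a word.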

Answer 1 (score: 0)

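<?php

$directory = "/var/www/stackoverflow/jql"; // My directory; change it to yours

foreach (scandir($directory) as $file) {
    // scandir() returns bare names, so build the full path before opening
    $path = $directory . "/" . $file;

    // scandir() also returns "." and ".." plus any subdirectories; skip them
    if (!is_file($path)) {
        continue;
    }

    $contents = file_get_contents($path);

    // ereg_replace() was removed in PHP 7; preg_replace() replaces it
    $contents2 = preg_replace('/[[:space:]]+/', '', $contents);

    // Note: this is the number of non-whitespace characters, not words
    $numChar = strlen($contents2);

    echo "The $file have   ". $numChar . "  Words count <br>";
}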

Output:

The jql.js have 964 Words count
The my.svg have 71721 Words count
The test.csv have 60 Words count 
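
Note that the figures above are character counts: the code strips whitespace and then takes `strlen()`. To make the distinction concrete, here is a small sketch that compares that value with two ways of counting words. The sample string and variable names are made up for illustration.

```php
<?php
$text = "price: 42 USD"; // made-up sample text

// What the snippet above measures: characters left after removing whitespace
$charCount = strlen(preg_replace('/\s+/', '', $text));

// str_word_count() only treats runs of letters (plus ' and -) as words,
// so the number 42 is not counted
$letterWords = str_word_count($text);

// Splitting on whitespace counts every whitespace-separated token instead
$tokens = count(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));

echo "characters: $charCount, str_word_count: $letterWords, tokens: $tokens\n";
```

Which of the two word counts is appropriate depends on whether numbers and similar tokens should count as words; the question's expected output matches `str_word_count()`.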

Answer 2 (score: 0)

Simple code:

$dir = 'dir_name/';
foreach (scandir($dir) as $file) {
    if (!is_file($dir . $file)) continue; // skip ".", ".." and subdirectories
    echo $file . '-' . str_word_count(file_get_contents($dir . $file), 0) . "\n";
}

Answer 3 (score: -1)

<?php

$wordFrequencyArray = array();

function countWords($filename) {
    /* a named function cannot take a use() clause; import the global instead */
    global $wordFrequencyArray;

    /* get content of $filename in $content */
    $content = strtolower(file_get_contents($filename));

    /* split $content into array of substrings of $content i.e wordwise */
    $wordArray = preg_split('/[^a-z]/', $content, -1, PREG_SPLIT_NO_EMPTY);

    /* "stop words", filter them */
    $filteredArray = array_filter($wordArray, function($x){
        return !preg_match("/^(.|a|an|and|the|this|at|in|or|of|is|for|to)$/",$x);
    });

    /* get associative array of values from $filteredArray as keys and their frequency count as value */
    foreach (array_count_values($filteredArray) as $word => $count) {
        if (!isset($wordFrequencyArray[$word])) $wordFrequencyArray[$word] = 0;
        $wordFrequencyArray[$word] += $count;
    }
}
$filenames = array('file1.txt', 'file2.txt', 'file3.txt', 'file4.txt' /* ... */);
foreach ($filenames as $file) {
    countWords($file);
}

print_r($wordFrequencyArray);
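
For reference, `use` is only valid on anonymous functions, and it copies the variable unless the variable is captured by reference. A minimal sketch of that pattern follows; it counts word frequencies in in-memory strings rather than files and omits the stop-word filter, and the sample texts and the `$addWords` name are assumptions for illustration.

```php
<?php
$wordFrequencyArray = [];

// Anonymous function capturing the array by reference, so updates persist
$addWords = function (string $content) use (&$wordFrequencyArray): void {
    $words = preg_split('/[^a-z]+/', strtolower($content), -1, PREG_SPLIT_NO_EMPTY);
    foreach (array_count_values($words) as $word => $count) {
        $wordFrequencyArray[$word] = ($wordFrequencyArray[$word] ?? 0) + $count;
    }
};

// Made-up sample texts standing in for file contents
$addWords("Red apple, green apple");
$addWords("Green tea");

print_r($wordFrequencyArray);
```

Passing the array as a by-reference parameter would work equally well; the closure is shown only because it is the form in which `use` is legal.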