Replacing multiple words in a string with PHP

Asked: 2012-01-30 17:23:01

Tags: php string command-line replace

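I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.

So the program reads in a string, and asks me what I want to replace the first word with, and then the second word, and then the third word, and so on, until all words have been processed.

The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation and spacing.

Is there a proper way to do this?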

2 answers:

Answer 0 (score: 2)

Given some text:

$subject = <<<TEXT
I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.

So the program reads in a string, and asks me what I want to replace the first word with, and then the second word, and then the third word, and so on, until all words have been processed.

The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation and spacing.

Is there a proper way to do this?
TEXT;

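You first tokenize the string into word tokens and "everything else" tokens (call the latter fill, for example). A regular expression helps with this: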

$pattern = '/(?P<fill>\W+)?(?P<word>\w+)?/';
$r = preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
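
To see what that pattern actually yields, here is a minimal sketch that runs it over a short string and lists the resulting tokens. The sample string and the `$seen` array are my own illustration and not part of the original answer; the filtering of numeric keys and negative offsets mirrors what the loop further down does.

```php
<?php
// Hypothetical short sample, not the text used in the answer.
$sample  = 'Hello, world!';
$pattern = '/(?P<fill>\W+)?(?P<word>\w+)?/';
preg_match_all($pattern, $sample, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);

$seen = array(); // illustrative list of (type, string, offset) triples
foreach ($matches as $matched) {
    foreach ($matched as $type => $match) {
        if (is_numeric($type)) continue;   // skip numeric duplicates of the named groups
        list($string, $offset) = $match;
        if ($offset < 0) continue;         // group did not take part in this match
        $seen[] = array($type, $string, $offset);
    }
}

foreach ($seen as $t) {
    printf("%-4s '%s' at offset %d\n", $t[0], $t[1], $t[2]);
}
// word 'Hello' at offset 0
// fill ', ' at offset 5
// word 'world' at offset 7
// fill '!' at offset 12
```

Because every character ends up in either a fill or a word token, concatenating the tokens in order reproduces the input exactly, which is what keeps punctuation and spacing intact.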

The job now is to turn the return value into a more useful data structure, such as an array of tokens and an index of the words used:

$tokens = array(); # token stream
$tokenIndex = 0;
$words = array(); # index of words
foreach($matches as $matched)
{
    foreach($matched as $type => $match)
    {
        # only look at the named groups ('fill' and 'word'), skip their numeric duplicates
        if (is_numeric($type)) continue;
        list($string, $offset) = $match;
        # a negative offset means the group did not take part in this match
        if ($offset < 0) continue;

        $token = new stdClass;
        $token->type = $type;
        $token->offset = $offset;
        $token->length = strlen($string);

        if ($token->type === 'word')
        {
            if (!isset($words[$string]))
            {
                $words[$string] = array('string' => $string, 'tokens' => array());
            }
            # all tokens of the same word share one string by reference,
            # so changing $words[...]['string'] later changes every occurrence
            $words[$string]['tokens'][] = &$token;
            $token->string = &$words[$string]['string'];
        } else {
            $token->string = $string;
        }

        $tokens[$tokenIndex] = &$token;
        $tokenIndex++;
        unset($token); # break the reference before the next iteration
    }
}

As an example, you can output all the words:

# list all words

foreach($words as $word)
{
    printf("Word '%s' used %d time(s)\n", $word['string'], count($word['tokens']));
}

For the sample text, this gives:

Word 'I' used 3 time(s)
Word 'need' used 1 time(s)
Word 'a' used 4 time(s)
Word 'systematic' used 1 time(s)
Word 'way' used 2 time(s)
Word 'of' used 1 time(s)
Word 'replacing' used 1 time(s)
Word 'each' used 2 time(s)
Word 'word' used 5 time(s)
Word 'in' used 3 time(s)
Word 'string' used 3 time(s)
Word 'separately' used 1 time(s)
Word 'by' used 1 time(s)
Word 'providing' used 1 time(s)
Word 'my' used 1 time(s)
Word 'own' used 1 time(s)
Word 'input' used 1 time(s)
Word 'for' used 1 time(s)
Word 'want' used 2 time(s)
Word 'to' used 5 time(s)
Word 'do' used 2 time(s)
Word 'this' used 2 time(s)
Word 'on' used 2 time(s)
Word 'the' used 7 time(s)
Word 'command' used 1 time(s)
Word 'line' used 1 time(s)
Word 'So' used 1 time(s)
Word 'program' used 1 time(s)
Word 'reads' used 1 time(s)
Word 'and' used 5 time(s)
... (and so on)

Then you do the work on the word tokens only. For example, replacing one string with another:

# change one word (and to AND)

$words['and']['string'] = 'AND';
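
The reason a single assignment is enough is that each word token's string is bound by reference to the shared entry in the word index. Below is a stripped-down sketch of just that mechanism; the two token variables `$first` and `$second` are hypothetical stand-ins for the entries of `$tokens`.

```php
<?php
// Hypothetical miniature of the answer's structure: one word entry, two tokens.
$words = array('and' => array('string' => 'and', 'tokens' => array()));

$first = new stdClass;
$first->string = &$words['and']['string'];    // bind by reference, not by value

$second = new stdClass;
$second->string = &$words['and']['string'];

$words['and']['string'] = 'AND';              // a single assignment...

echo $first->string, ' ', $second->string, "\n";  // ...is visible in both tokens
// AND AND
```

Note that the index is keyed by the exact string, so `$words['and']` and `$words['And']` would be separate entries.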

Finally, you concatenate the tokens back into a string:

# output the whole text

foreach($tokens as $token) echo $token->string;

For the sample text, this gives:

I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.

So the program reads in a string, AND asks me what I want to replace the first word with, AND then the second word, AND then the third word, AND so on, until all words have been processed.

The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation AND spacing.

Is there a proper way to do this?

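Job done. Make sure that word tokens are only replaced with valid word tokens: tokenize the user input as well, and give an error if it is not a single word token (that is, if it does not match the word pattern).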
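
One possible way to do that check, as a minimal sketch: anchor the same `\w+` word pattern to the whole input. The function name `isSingleWordToken` is my own and not from the answer's code.

```php
<?php
// Hypothetical helper: true only if $input consists of exactly one word token.
// \A and \z anchor to the very start and end, so a trailing newline is rejected
// as well (a plain $ would accept one).
function isSingleWordToken($input)
{
    return preg_match('/\A\w+\z/', $input) === 1;
}

var_dump(isSingleWordToken('AND'));        // bool(true)
var_dump(isSingleWordToken('two words'));  // bool(false)
var_dump(isSingleWordToken(''));           // bool(false)
```

Input read from the command line usually carries a trailing newline, so it would need to be trimmed before it is passed to this check.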


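Answer 1 (score: 0)

This looks simple once you know the basics of command-line programming with PHP. There are plenty of tutorials on it.

Generally speaking, the basic structure is a continuous loop that keeps asking you for words. On each iteration you then just call str_replace(), which does the basics of what you need.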

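Don't forget to implement a way to break out of the loop, such as typing exit or some other special command of your choosing.

I assume posting a complete code example here isn't the idea? That would fully answer the question, but it would also be a bit like a script request.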
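
A rough sketch of such a loop, under my own assumptions: the function name `replaceLoop`, the prompts, and the choice to pass the input and output streams as parameters are all illustrative, not something the answer specifies.

```php
<?php
// Hypothetical sketch: keep asking for a search word and a replacement
// until the user types "exit" or the input ends.
function replaceLoop($text, $in, $out)
{
    while (true) {
        fwrite($out, "Word to replace (or 'exit'): ");
        $search = fgets($in);
        if ($search === false || trim($search) === 'exit') {
            break;                          // end of input or exit command
        }
        $search = trim($search);
        if ($search === '') {
            continue;                       // ignore empty lines
        }

        fwrite($out, "Replace '$search' with: ");
        $replacement = fgets($in);
        if ($replacement === false) {
            break;                          // input ended before a replacement was given
        }
        $text = str_replace($search, trim($replacement), $text);
    }
    return $text;
}
```

From a CLI script it could be called as `echo replaceLoop($text, STDIN, STDOUT);`. Keep in mind that str_replace() works on substrings rather than whole words, so replacing "and" this way would also change "command".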