Longest common substring for more than two strings in PowerShell?

Time: 2011-11-20 02:50:12

Tags: powershell

How can I find the matching string in an array of strings in PowerShell:

Example:

$Arr = "1 string first",
"2 string second",
"3 string third",
"4 string fourth"

With this example, I want to return:

" string "

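I want to use this to find the matching part of a set of file names and then remove that part from each name (for example, stripping the artist's name from a set of mp3 files), without having to manually specify which part of the file name should be replaced.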

4 Answers:

Answer 0: (score: 6)

$arr =  "qdfbsqds", "fbsqdt", "bsqda" 
$arr | %{

$substr = for ($s = 0; $s -lt $_.length; $s++) {
           for ($l = 1; $l -le ($_.length - $s); $l++) {
            $_.substring($s, $l);
           }
          } 
$substr | %{$_.toLower()} | select -unique

} | group | ?{$_.count -eq $arr.length} | sort {$_.name.length} | select -expand name -l 1
# returns bsqd
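  • make a list of all the unique substrings of each input string
  • keep only the substrings that occur #inputstrings times (i.e. in every input string)
  • sort these filtered substrings by length
  • return the last (i.e. longest) one in this list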
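
The steps above can be sketched as a reusable function and then applied to the example from the question. This is only a minimal sketch: the function name `Get-CommonSubstring` and the `$Strings` parameter are hypothetical and not part of the answer, and matching is assumed to be case-insensitive because every substring is lower-cased first.

```powershell
# Hypothetical helper (name and parameter are assumptions, not from the answer).
function Get-CommonSubstring {
    param([string[]]$Strings)

    $Strings | ForEach-Object {
        $s = $_.ToLower()
        # Step 1: every substring of this input string, de-duplicated
        $subs = for ($start = 0; $start -lt $s.Length; $start++) {
            for ($len = 1; $len -le ($s.Length - $start); $len++) {
                $s.Substring($start, $len)
            }
        }
        $subs | Select-Object -Unique
    } |
        Group-Object |
        # Step 2: keep substrings that occur once per input string
        Where-Object { $_.Count -eq $Strings.Length } |
        # Step 3: sort by length, shortest first
        Sort-Object { $_.Name.Length } |
        # Step 4: take the last (longest) one
        Select-Object -ExpandProperty Name -Last 1
}

$Arr = "1 string first", "2 string second", "3 string third", "4 string fourth"
$common = Get-CommonSubstring $Arr

# Replace the common part with a single space; Escape() guards regex metacharacters
$Arr | ForEach-Object { $_ -replace [regex]::Escape($common), " " }
```

Because every substring of every input is generated, the cost grows quadratically with string length, which is probably fine for file names but slow for long strings.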

Answer 1: (score: 1)

If it (the artist name, etc.) is just a single word:

$Arr = "1 string first", "2 string second", "3 string third", "4 string fourth"
$common = $Arr | %{ $_.split() } | group | sort -property count | select -last 1 | select -expand name
$common = " {0} " -f $common

Update

An implementation that seems to work for multiple words (it finds the longest common run of whole words):

$arr = "1 string a first", "2 string a second", "3 string a third", "4 string a fourth"
$common = $arr | %{
$words = $_.split()
$noOfWords = $words.length
for($i=0;$i -lt $noOfWords;$i++){
    for($j=$i;$j -lt $noOfWords;$j++){
        $words[$i..$j] -join " "
    }
}

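# sort by count, then by phrase length, so the longest of the most frequent phrases ends up last
} | group | sort -property count, {$_.name.length} | select -last 1 | select -expand name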

$common = " {0} " -f $common
$common

Answer 2: (score: 1)

Here is a "longest common substring" function for two strings in PowerShell (based on the wikibooks C# example):

Function get-LongestCommonSubstring
{
Param(
[string]$String1, 
[string]$String2
)
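    # An empty input has no common substring; 'return' is used because 'Break' would also exit a caller's loop
    if((!$String1) -or (!$String2)){return ''}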
    # .Net Two dimensional Array:
    $Num = New-Object 'object[,]' $String1.Length, $String2.Length
    [int]$maxlen = 0
    [int]$lastSubsBegin = 0
    $sequenceBuilder = New-Object -TypeName "System.Text.StringBuilder"

    for ([int]$i = 0; $i -lt $String1.Length; $i++)
    {
        for ([int]$j = 0; $j -lt $String2.Length; $j++)
        {
            if ($String1[$i] -ne $String2[$j])
            {
                    $Num[$i, $j] = 0
            }else{
                if (($i -eq 0) -or ($j -eq 0))
                {
                        $Num[$i, $j] = 1
                }else{
                        $Num[$i, $j] = 1 + $Num[($i - 1), ($j - 1)]
                }
                if ($Num[$i, $j] -gt $maxlen)
                {
                    $maxlen = $Num[$i, $j]
                    [int]$thisSubsBegin = $i - $Num[$i, $j] + 1
                    if($lastSubsBegin -eq $thisSubsBegin)
                    {#if the current LCS is the same as the last time this block ran
                            [void]$sequenceBuilder.Append($String1[$i]);
                    }else{ #this block resets the string builder if a different LCS is found
                        $lastSubsBegin = $thisSubsBegin
                        $sequenceBuilder.Length = 0 #clear it
                        [void]$sequenceBuilder.Append($String1.Substring($lastSubsBegin, (($i + 1) - $lastSubsBegin)))
                    }
                }
            }
        }
    }
    return $sequenceBuilder.ToString()
}

To use it for more than two strings, use it like this:

Function get-LongestCommonSubstringArray
{
Param(
[Parameter(Position=0, Mandatory=$True)][Array]$Array
)
    $PreviousSubString = $Null
    $LongestCommonSubstring = $Null
    foreach($SubString in $Array)
    {
        if($LongestCommonSubstring)
        {
            $LongestCommonSubstring = get-LongestCommonSubstring $SubString $LongestCommonSubstring
            write-verbose "Consecutive diff: $LongestCommonSubstring"
        }else{
            if($PreviousSubString)
            {
                $LongestCommonSubstring = get-LongestCommonSubstring $SubString $PreviousSubString
                write-verbose "first one diff: $LongestCommonSubstring"
            }else{
                $PreviousSubString = $SubString
                write-verbose "No PreviousSubstring yet, setting it to: $PreviousSubString"
            }
        }
    }
    Return $LongestCommonSubstring
}


get-LongestCommonSubstringArray $Arr -verbose
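
Reducing the array pairwise keeps only one candidate after each step, so a shorter substring shared by every input can be lost when the first two strings also share a longer, unrelated one. Below is a minimal sketch of an alternative that checks each candidate against all inputs. The function name `Get-LongestCommonSubstringOfAll` is hypothetical and not part of the answer above, and the sketch assumes case-sensitive matching is acceptable.

```powershell
# Hypothetical helper (name is an assumption, not from the answer above).
function Get-LongestCommonSubstringOfAll {
    param([string[]]$Strings)

    # Any common substring must occur in the shortest input, so take candidates from it
    $shortest = $Strings | Sort-Object Length | Select-Object -First 1

    # Try candidate lengths from longest to shortest and stop at the first hit
    for ($len = $shortest.Length; $len -ge 1; $len--) {
        for ($start = 0; $start -le ($shortest.Length - $len); $start++) {
            $candidate = $shortest.Substring($start, $len)
            # String.Contains(string) is a case-sensitive comparison
            $missing = @($Strings | Where-Object { -not $_.Contains($candidate) })
            if ($missing.Count -eq 0) { return $candidate }
        }
    }
    return ''
}

# "abc" is shared by the first two strings only; "de" is shared by all three
Get-LongestCommonSubstringOfAll "abcXde", "abcYde", "Zde"   # returns de
```

When several substrings tie for the longest length, this sketch returns the one that appears first in the shortest input.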

Answer 3: (score: 0)

If I understand your question:

$Arr = "1 string first", "2 string second", "3 string third", "4 string fourth"
$Arr -match " string " | foreach {$_ -replace " string ", " "}