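AppleScript: finding the differences between two lists of words

Time: 2014-07-09 17:49:10

Tags: applescript

I have two lists of words. The first is the master list: it contains the correct words in the correct order. The second list is missing some words and has others out of order. The second list is still essential, because it holds the position (frame number) of each word in a video.

The first list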

set thePhrase to {"IT", "WAS", "THE", "BEST", "OF", "TIMES", "IT", "WAS", "THE", "WORST", "OF", "TIMES", "IT", "WAS", "THE", "AGE", "OF", "WISDOM"}

The second list

set theValsPhrase to {{280, "IT"}, {449, "WAS"}, {689, "THE"}, {959, "BEST"}, {1360, "OF"}, {1740, "TIMES"}, {2759, "THE"}, {2879, "WORST"}, {3240, "OF"}, {3379, "TIMES"}, {4420, "WAS"}, {4509, "THE"}, {5239, "WISDOM"}, {5440, "OF"}, {6190, "AGE"}}

Goals: (1) fill in the missing and out-of-order words; (2) estimate the frame values for those missing words.

Constant: the first word of both lists always matches.

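Important: if words are out of order, it is best to simply treat them as missing. The master list has the correct order.

If a script could somehow determine where items are missing and how many are missing, then something like the following could insert the missing items with approximate values: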

set thePhrase to {"IT", "WAS", "THE", "BEST", "OF", "TIMES", "IT", "WAS", "THE", "WORST", "OF", "TIMES", "IT", "WAS", "THE", "AGE", "OF", "WISDOM"}
set theValsPhrase to {{280, "IT"}, {449, "WAS"}, {689, "THE"}, {959, "BEST"}, {1360, "OF"}, {1740, "TIMES"}, {2759, "THE"}, {2879, "WORST"}, {3240, "OF"}, {3379, "TIMES"}, {4420, "WAS"}, {4509, "THE"}, {5239, "WISDOM"}, {5440, "OF"}, {6190, "AGE"}}
-- approximate the value of the first missing value for the first missing word
set x to (((item 1 of item 7 of theValsPhrase) - (item 1 of item 6 of theValsPhrase)) / 3)
set approxVal_01 to x + (item 1 of item 6 of theValsPhrase) as integer
set approxVal_and_Word_sublist to {approxVal_01 as list}
-- add the first missing word from the master list to the temp list
set end of item 1 of approxVal_and_Word_sublist to item 7 of thePhrase
-- approximate the value of the second missing value
set approxVal_02 to (item 1 of item 7 of theValsPhrase) - x as integer
set end of approxVal_and_Word_sublist to {approxVal_02}
-- add the second missing word
set end of item 2 of approxVal_and_Word_sublist to item 8 of thePhrase
-- put it all back together (concatenate so the new pairs stay at the top level instead of being nested as a single item)
set tempValsPhrase to items 1 thru 6 of theValsPhrase
set restOfValsPhrase to items 7 thru -1 of theValsPhrase
set theValsPhrase to tempValsPhrase & approxVal_and_Word_sublist & restOfValsPhrase
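
The arithmetic in the script above is hard-coded for one gap of two words. As a rough sketch, it could be generalized to any gap whose position is already known. The handler name `interpolateMissing` and its parameters are hypothetical and not part of the original script; `round` comes from Standard Additions.

```applescript
-- Hypothetical helper: spread the missing words evenly between two known frames.
on interpolateMissing(prevFrame, nextFrame, missingWords)
    set newPairs to {}
    -- n missing words split the span between the two frames into n + 1 steps
    set gapCount to (count of missingWords) + 1
    set stepSize to (nextFrame - prevFrame) / gapCount
    repeat with k from 1 to count of missingWords
        set approxFrame to round (prevFrame + stepSize * k)
        set end of newPairs to {approxFrame, item k of missingWords}
    end repeat
    return newPairs
end interpolateMissing

-- "IT" and "WAS" are missing between {1740, "TIMES"} and {2759, "THE"}
set filledPairs to interpolateMissing(1740, 2759, {"IT", "WAS"})
```

For this gap, `filledPairs` should come out as {{2080, "IT"}, {2419, "WAS"}}, the same estimates the hard-coded version computes.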

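What I really need is a way to find the differences between the text items in the two lists. I need to fill in the words that are missing from the second list. I need to replace the out-of-order words in the second list. Finally, I need to approximate the missing frame values in the second list.

Would a different scripting language be better suited to this task?

2 Answers:

Answer 0: (score: 0)

Try this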

set thePhrase to {"IT", "WAS", "THE", "BEST", "OF", "TIMES", "IT", "WAS", "THE", "WORST", "OF", "TIMES", "IT", "WAS", "THE", "WISDOM", "OF", "AGE"}
set theValsPhrase to {{280, "IT"}, {449, "WAS"}, {689, "THE"}, {959, "BEST"}, {1360, "OF"}, {1740, "TIMES"}, {2759, "THE"}, {2879, "WORST"}, {3240, "OF"}, {3379, "TIMES"}, {4420, "WAS"}, {4509, "THE"}, {5239, "WISDOM"}, {5440, "OF"}, {6190, "AGE"}}

set finalList to {}
set n1 to 1
set tc2 to count theValsPhrase
set tc to count thePhrase
set i to 1
set found to true
repeat until i > tc
    set tw to item i of thePhrase
    if item 2 of list n1 of theValsPhrase = tw then -- the order is OK
        set end of finalList to list n1 of theValsPhrase
        set n1 to n1 + 1
        set i to i + 1
        if n1 > tc2 then exit repeat
    else -- missing word
        set nextWordL2 to item 2 of list n1 of theValsPhrase
        if found then set lastIndex to i
        set found to false
        repeat with j from (i + 1) to tc
            if nextWordL2 = (item j of thePhrase) then
                set nextFram to item 1 of list n1 of theValsPhrase
                set currFram to item 1 of last list of finalList
                -- approximate step: (frame of this item - frame of the last item in finalList) divided by (number of consecutive missing items + 1)
                set nF to (nextFram - currFram) div ((j - lastIndex) + 1)
                my addApproxVal(finalList, nF, thePhrase, lastIndex, j - 1, currFram)
                set i to j -- to start at this index in the first loop
                set lastIndex to i
                set found to true
                exit repeat
            end if
        end repeat
        if not found then -- this word of theValsPhrase has no match further on in thePhrase
            set i to i + 1 -- next item in thePhrase
            set n1 to n1 + 1 -- next item in theValsPhrase
        end if
    end if
    if n1 > tc2 then -- if the last words in the first list are not in the second list.
        set currFram to item 1 of last list of finalList
        set nF to currFram div (count finalList) -- average 
        my addApproxVal(finalList, nF, thePhrase, lastIndex, tc, currFram)
        exit repeat
    end if
end repeat
finalList
on addApproxVal(L, nF, L2, k, h, f)
    repeat with m from k to h
        set f to f + nF
        set end of L to {f, item m of L2}
    end repeat
end addApproxVal
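
The matching step in a script like the one above can also be looked at in isolation. What follows is a simplified sketch, not the same algorithm: it only decides which master words have an in-order match and returns `missing value` for the rest, so the estimates can be filled in afterwards. The handler name `alignFrames` is hypothetical, and the sample data uses the master list as given in the question. The look-ahead is greedy, so repeated words can still lead it astray on other inputs.

```applescript
-- Hypothetical handler: returns one entry per master word, either the frame
-- of its in-order match or missing value when no such match exists.
on alignFrames(masterWords, framePairs)
    set aligned to {}
    set n to 1 -- next unused entry in framePairs
    set pairCount to count of framePairs
    set wordCount to count of masterWords
    repeat with i from 1 to wordCount
        set thisFrame to missing value
        repeat while n <= pairCount
            set pairWord to item 2 of item n of framePairs
            if pairWord = item i of masterWords then
                set thisFrame to item 1 of item n of framePairs
                set n to n + 1
                exit repeat
            else if i < wordCount and (items (i + 1) thru wordCount of masterWords) contains pairWord then
                exit repeat -- the pair belongs to a later master word, so this word is missing
            else
                set n to n + 1 -- the pair matches nothing ahead: treat it as out of order and skip it
            end if
        end repeat
        set end of aligned to thisFrame
    end repeat
    return aligned
end alignFrames

set thePhrase to {"IT", "WAS", "THE", "BEST", "OF", "TIMES", "IT", "WAS", "THE", "WORST", "OF", "TIMES", "IT", "WAS", "THE", "AGE", "OF", "WISDOM"}
set theValsPhrase to {{280, "IT"}, {449, "WAS"}, {689, "THE"}, {959, "BEST"}, {1360, "OF"}, {1740, "TIMES"}, {2759, "THE"}, {2879, "WORST"}, {3240, "OF"}, {3379, "TIMES"}, {4420, "WAS"}, {4509, "THE"}, {5239, "WISDOM"}, {5440, "OF"}, {6190, "AGE"}}
set alignedFrames to alignFrames(thePhrase, theValsPhrase)
```

With this data the entries at positions 7, 8, 13, 16 and 17 come back as `missing value`, and the trailing {5440, "OF"} and {6190, "AGE"} pairs are dropped as out of order.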

Answer 1: (score: 0)

OK, I figured it out :) I will add more comments soon :)

set phrases to {"IT", "WAS", "THE", "BEST", "OF", "TIMES", "IT", "WAS", "THE", "WORST", "OF", "TIMES", "IT", "WAS", "THE", "AGE", "OF", "WISDOM"}
set locations to {{280, "IT"}, {449, "WAS"}, {689, "THE"}, {959, "BEST"}, {1360, "OF"}, {1740, "TIMES"}, {2759, "THE"}, {2879, "WORST"}, {3240, "OF"}, {3379, "TIMES"}, {4420, "WAS"}, {4509, "THE"}, {5239, "WISDOM"}, {5440, "OF"}, {6190, "AGE"}}

set outlist to {}
set locationCount to 1

-- loop through all the phrases
-- each phrase in phrases is already in the correct order and will simply be copied to the outlist

repeat with i from 1 to count of phrases
    set currentPharase to item i of phrases

    -- because there are missing phrases we have to make sure we account for a shorter location list
    if locationCount is less than (count of locations) then
        set currentLocation to item 1 of item locationCount of locations
        set locationPhrase to item 2 of item locationCount of locations
    else
        set locationPhrase to "" -- the phrase list is longer, but we still need to test whether the phrases match, so set this to something a phrase will never equal
    end if

    -- only do this if we have gone past the first phrase
    if i > 1 then
        set previousLocation to item 1 of (item (length of outlist) of outlist)
    end if
    -- check to see if the phrases match
    if currentPharase = locationPhrase then
        -- some work must be done if we have gone beyond the available locations
        if locationCount is greater than (count of locations) then
            set F to item 1 of item 1 of outlist
            set L to item 1 of item -1 of outlist
            set split to round (previousLocation + getAvergeSteps(outlist, F, L))
            copy {split, currentPharase} to end of outlist
        else
            copy {item 1 of item locationCount of locations, currentPharase} to end of outlist
        end if
    else
        if locationCount is less than (count of locations) then
            set nextLocation to item 1 of item (locationCount + 1) of locations
            set split to round ((nextLocation + previousLocation) / 2)
        else
            set F to item 1 of item 1 of outlist
            set L to item 1 of item -1 of outlist
            set split to round (previousLocation + getAvergeSteps(outlist, F, L))
        end if
        copy {split, currentPharase} to end of outlist
        set locationCount to locationCount + 1 -- a location was missing, so bump the location count
    end if

    set locationCount to locationCount + 1 -- location count is increased every round
end repeat

on getAvergeSteps(outlist, firstLocation, lastLocation)
    set LocationTotal to count of outlist
    -- estimate the average step between locations, used to extend the numbers once we have gone beyond the locations available
    return round (((lastLocation - firstLocation) / LocationTotal) / 2)
end getAvergeSteps
return outlist
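
A note on the tail estimate: `getAvergeSteps` above divides the span between the first and last locations by the item count and then halves it. A plainer alternative, offered only as a sketch and an assumption about what is wanted, is the mean spacing between consecutive known frames. The handler name `averageStep` is hypothetical.

```applescript
-- Hypothetical helper: mean spacing between consecutive known frames.
on averageStep(framePairs)
    set pairCount to count of framePairs
    if pairCount < 2 then return 0 -- a spacing needs at least two frames
    set firstFrame to item 1 of item 1 of framePairs
    set lastFrame to item 1 of item -1 of framePairs
    -- pairCount frames are separated by pairCount - 1 intervals
    return (lastFrame - firstFrame) / (pairCount - 1)
end averageStep

set exampleStep to averageStep({{280, "IT"}, {449, "WAS"}, {689, "THE"}, {959, "BEST"}})
```

Here `exampleStep` is (959 - 280) / 3, roughly 226 frames per word, which could be added repeatedly to the last known frame for words missing at the end of the list.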