Question

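I have a simple script that reads a text file into a list. The file lists CMYK values in this format: 00, 100, 64, 33. For some reason, the output replaces the spaces with odd characters: "¬†" (a return and a dagger?).

So this script: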

set cmykList to {}
set eachLine to paragraphs of (read POSIX file "/Users/me/Desktop/cmyk.txt")
repeat with nextLine in eachLine
    if length of nextLine is greater than 0 then
        copy (nextLine as text) to the end of cmykList
    end if
end repeat
choose from list cmykList

returns: 00,¬†100,¬†64,¬†33, 00,¬†00,¬†00,¬†00, 100,¬†72,¬†00, 100,¬†35,¬†00,¬†100

Any ideas why this happens, and how I can avoid it?

The text file is set up like this:

00, 100, 64, 33  
00, 00, 00, 00  
100, 72, 00, 18  
100, 35, 00, 100  
00, 16, 100, 00
00, 100, 63, 29
00, 66, 100, 07
03, 00, 00, 32
100, 35, 00, 100
00, 100, 81, 04
04, 02, 00, 45
00, 00, 00, 00
03, 00, 00, 32
100, 35, 00, 100

Edit: I worked around this with a find/replace:

set cmykList to {}
set eachLine to paragraphs of (read POSIX file "/Users/me/Desktop/cmyk.txt")
repeat with nextLine in eachLine
    if length of nextLine is greater than 0 then
        set theText to (nextLine as text)
        set AppleScript's text item delimiters to "¬†"
        set theTextItems to text items of theText
        set AppleScript's text item delimiters to " "
        set theText to theTextItems as string
        set AppleScript's text item delimiters to {""}
        copy (theText as text) to the end of cmykList
    end if
end repeat
set chooseList to choose from list cmykList

But I'm still very curious why this is happening.

Answer 1

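Those two characters (byte values 194 and 160, i.e. 0xC2 0xA0, which are outside the ASCII range) are the UTF-8 representation of the Unicode NO-BREAK SPACE character (U+00A0).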
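To see where "¬†" comes from, here is a small sketch in Python (used only for illustration, it is not part of the asker's AppleScript). It assumes the legacy encoding applied to the file is MacRoman, which is consistent with the characters shown in the question.

```python
# Encode U+00A0 (NO-BREAK SPACE) as UTF-8 and inspect the raw bytes.
nbsp_bytes = "\u00a0".encode("utf-8")
print(list(nbsp_bytes))  # [194, 160]

# Assumption: the reader interprets those bytes as MacRoman instead of UTF-8.
# Each byte then becomes its own character: 0xC2 -> "¬", 0xA0 -> "†".
misread = nbsp_bytes.decode("mac_roman")
print(misread)  # ¬†

# The same effect on a whole line from the file.
line_bytes = "00,\u00a0100,\u00a064,\u00a033".encode("utf-8")
misread_line = line_bytes.decode("mac_roman")
print(misread_line)  # 00,¬†100,¬†64,¬†33
```

So the file is not corrupted: one two-byte character is simply being displayed as two one-byte characters.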

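You didn't say where the text file came from, but wherever it originated, it uses no-break spaces instead of regular spaces. As you found, you can work around the problem by replacing them with regular spaces when you read the file.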

Answer 2

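Your file contains UTF-8-encoded Unicode text. The Standard Additions read and write commands (stupidly) default to a legacy encoding from the classic Mac OS era, so you need to tell them explicitly to use UTF-8: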

set eachLine to paragraphs of (read POSIX file "/Users/me/Desktop/cmyk.txt" as «class utf8»)
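The same idea, decoding as UTF-8 first and then normalizing the no-break spaces, can be sketched in Python for comparison. The function name `read_cmyk_lines` and the in-memory sample are hypothetical and only stand in for the real file; this is not a replacement for the AppleScript line above.

```python
def read_cmyk_lines(raw: bytes) -> list[str]:
    """Decode raw file bytes as UTF-8 and return the non-empty lines,
    with each NO-BREAK SPACE (U+00A0) replaced by a regular space."""
    text = raw.decode("utf-8")  # explicit encoding, no legacy default
    lines = []
    for line in text.splitlines():
        cleaned = line.replace("\u00a0", " ").strip()
        if cleaned:  # skip blank lines, like the length check in the script
            lines.append(cleaned)
    return lines


# Hypothetical sample standing in for the contents of cmyk.txt.
sample = "00,\u00a0100,\u00a064,\u00a033\n\n100,\u00a035,\u00a000,\u00a0100\n".encode("utf-8")
print(read_cmyk_lines(sample))  # ['00, 100, 64, 33', '100, 35, 00, 100']
```

Decoding with the right encoding removes the "¬†" artifacts. The replacement step is only needed if regular spaces are wanted in the list.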

AppleScript: strange characters ("¬†") replacing spaces
