Question

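I have been using this script on smaller files and it works fine. But when I ran it on a file with about 6,000 records, it started giving me the error "The command exited with a non-zero status."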

set csvfile to (choose file with prompt "Please choose a CSV:" of type "csv")
set the_read_csv to (read csvfile)
set non_standard_chars to (do shell script "echo " & quoted form of the_read_csv & " | tr -d '[:alpha:]''[:cntrl:]''[:space:]''[:digit:]''[:punct:]'")

Is there a limit on the echo command? Or is there a simpler way? I'm basically trying to check a file for non-ASCII characters via a shell script.

Answer 1

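Don't use echo with one giant string when you could use cat with the file name. Better still, just use redirection so that tr takes its input directly from the file: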

set non_standard_chars to (¬
  do shell script "tr -d '[:alpha:][:cntrl:][:space:][:digit:][:punct:]' <" & ¬
      quoted form of POSIX path of csvfile ¬
)
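
The same idea can be tried directly in a shell. This is only a sketch: `non_ascii_bytes` and the sample file are hypothetical, and it swaps the character classes above for a plain byte range under `LC_ALL=C` so that accented letters are not counted as `[:alpha:]`. The `getconf ARG_MAX` line is there on the assumption (not stated in the answer) that the original failure came from the whole file being placed on the command line, which is subject to the system's argument-length limit.

```shell
# Hypothetical helper name; prints whatever bytes in the file are outside 7-bit ASCII.
non_ascii_bytes() {
  # LC_ALL=C makes tr treat input as raw bytes; \000-\177 covers all of ASCII.
  # The file arrives on stdin via redirection, so its size never touches the command line.
  LC_ALL=C tr -d '\000-\177' < "$1"
}

# Throwaway sample file, for illustration only.
sample=$(mktemp)
printf 'id,name\n1,Jos\303\251\n' > "$sample"

# For comparison: the system's limit on argument length versus the file size in bytes.
getconf ARG_MAX
wc -c < "$sample"

if [ -n "$(non_ascii_bytes "$sample")" ]; then
  echo "non-ASCII bytes found"
else
  echo "ASCII only"
fi
rm -f "$sample"
```

An empty result from the helper means the file contains only ASCII bytes.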

Answer 2

You said:

I'm basically trying to check a file for non-ASCII characters

The following:

perl -pe 's/[[:ascii:]]//g;' <<EOF
asciiáščíí
EOF

removes all ASCII characters, so it prints only the non-ASCII ones:

áščíí

To run it on a file:

perl -pe 's/[[:ascii:]]//g;'  filename
#or
perl -pe 's/[[:ascii:]]//g;'  < filename
#or
something | perl -pe 's/[[:ascii:]]//g;'

and

perl -pe 's/[^[:ascii:]]//g;'

will remove all non-ASCII characters and print only the ASCII ones:

ascii
