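So, my code creates an inline diff of two strings (on a per-word basis) using only HTML tags, so that CSS can hide or show the deleted and added strings. In my tests I use () for additions and {} for deletions.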
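As a rough illustration of that marking scheme, here is a minimal sketch that diffs two token arrays and wraps additions in () and deletions in {}. It uses a plain dynamic-programming LCS as a stand-in for Diff::LCS, so the ordering of adjacent additions and deletions may differ from what the gem produces. `inline_diff` is a hypothetical helper name, not part of my real code.

```ruby
# Hypothetical helper: LCS-based diff over two arrays of tokens.
# Additions are wrapped in (), deletions in {}.
def inline_diff(old_tokens, new_tokens)
  rows = old_tokens.length
  cols = new_tokens.length

  # lcs[i][j] = length of the longest common subsequence of
  # old_tokens[i..-1] and new_tokens[j..-1]
  lcs = Array.new(rows + 1) { Array.new(cols + 1, 0) }
  (rows - 1).downto(0) do |i|
    (cols - 1).downto(0) do |j|
      lcs[i][j] = if old_tokens[i] == new_tokens[j]
                    lcs[i + 1][j + 1] + 1
                  else
                    [lcs[i + 1][j], lcs[i][j + 1]].max
                  end
    end
  end

  # Walk both arrays, emitting unchanged, added, and deleted tokens.
  out = []
  i = 0
  j = 0
  while i < rows || j < cols
    if i < rows && j < cols && old_tokens[i] == new_tokens[j]
      out << old_tokens[i]
      i += 1
      j += 1
    elsif j < cols && (i == rows || lcs[i][j + 1] > lcs[i + 1][j])
      out << "(#{new_tokens[j]})"
      j += 1
    else
      out << "{#{old_tokens[i]}}"
      i += 1
    end
  end
  out.join
end

old_tokens = ["e", " ", "<b>", "Zerg", "</b>", " ", "a"]
new_tokens = ["e", " ", "Zerg", " ", "a"]
puts inline_diff(old_tokens, new_tokens)  # prints e {<b>}Zerg{</b>} a
```

The tokens here are already split by hand; in my real code that splitting is done by a separate method.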
Here is my text input:
"e <b><u>Zerg</u></b> a"
"e Zerg a"
Output:
"e(?)(\240){ <b>}{<u>}Zerg(?)(\240){</u>}{</b>}{ }a"
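Now, I'm not doing anything at all that changes the encoding, so... I'm really confused about how a question mark and \240 got in there. o.o
What kind of encoding is this?
I'm using Ruby 1.8.7.
Found the root of the problem. It happens when I convert the new string into the array that Diff::LCS uses. Code: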
def self.convert_html_string_to_html_array(str)
  # Things like &nbsp; (and other char codes), and tags need to be considered the same element.
  # Also handles the decision to diff per char or per word.
  # Also need to take into consideration javascript and css that might be in the middle of a selection.
  result = Array.new
  compare_words = str.has_at_least_one_word?
  i = 0
  while i < str.length do
    cur_char = str[i..i]
    case cur_char
    when "&"
      # for this we have two situations: a stray char code, and a char code preceding a tag
      next_index = str.index(";", i)
      case str[next_index + 1..next_index + 1] # check to see if tag
      when "<"
        next_index = str.index(">", i)
      end
      result << str[i..next_index]
      i = next_index
    when "<"
      next_index = str.index(">", i)
      result << str[i..next_index]
      i = next_index
    when " "
      result << cur_char
    else
      if compare_words
        # in here we need to check the above rules again, because tags can be touching regular text
        next_index = i + 1
        next_index = str.index(" ", next_index)
        next_index = str.length if next_index.nil?
        next_index -= 1 # don't want to include the space
        if i < str.length and str[i..next_index].include?("<") # beginning of a tag
          next_index = str.index(">", i)
        end
        result << str[i..next_index]
        i = next_index
      else
        result << cur_char
      end
    end
    i += 1
  end
  result
end
To clarify, this:
'e Zerg a'
becomes this:
[
[0] "e",
[1] "\302",
[2] "\240",
[3] "Z",
[4] "e",
[5] "r",
[6] "g",
[7] "\302",
[8] "\240",
[9] "a"
]
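For what it's worth, \302\240 is the two-byte UTF-8 encoding of the non-breaking space (U+00A0), written as octal escapes. The array above has 10 elements because the 8-character string is 10 bytes long. A lone \302 is not valid UTF-8 on its own, which is likely why it shows up as a question mark in the diff output. A small sketch that reproduces the numbers, assuming the two spaces in the string are non-breaking spaces:

```ruby
# Build a non-breaking space (U+00A0) from its code point.
nbsp = [0x00A0].pack("U")
str  = "e" + nbsp + "Zerg" + nbsp + "a"

# The raw bytes of the non-breaking space, formatted as octal escapes.
octal = nbsp.unpack("C*").map { |byte| "\\%o" % byte }
puts octal.join        # prints \302\240

# Ruby 1.8.7 indexes strings by byte, so str[i..i] walks 10 bytes
# rather than 8 characters. unpack("C*") exposes the same bytes
# on any Ruby version.
byte_count = str.unpack("C*").length
puts byte_count        # prints 10
```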
Answer 0 (score: 0)
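Update to 1.9.2 or later (I recommend using RVM). 1.8.7 treats a string as a sequence of bytes, so str[i..i] returns one byte rather than one character, and a multibyte UTF-8 character such as the non-breaking space (\302\240) gets split in two. From 1.9 on, strings are encoding-aware and are indexed by character.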
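If upgrading is not an option, one possible workaround is to split the string into characters with a UTF-8 aware regex before walking it, instead of indexing with str[i..i]. This is only a sketch, assuming the input is valid UTF-8. The /u flag is accepted by current Ruby versions and should also work on 1.8.7, but I have not verified it against the method in the question.

```ruby
# Same test string as in the question, with non-breaking spaces (U+00A0).
nbsp = [0x00A0].pack("U")
str  = "e" + nbsp + "Zerg" + nbsp + "a"

# /u makes the regex treat the string as UTF-8, so "." matches a whole
# character; /m lets "." match newlines as well.
chars = str.scan(/./mu)

puts chars.length          # prints 8
puts chars[1] == nbsp      # prints true

# The tokenizer could then loop over `chars` by index
# (cur_char = chars[i]) rather than slicing bytes out of `str`.
```

On 1.9 and later the same list is available directly as `str.chars.to_a`.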