Question

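I wrote the following code to split CharList (a list of integers, i.e. character codes) into a list of words. I have run it with trace: it does all the work and collects everything into one list, but it never binds the WordList variable to that value.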

splitintowords(CharList, WordList):- splitintowords_(CharList, [], [], WordList).
splitintowords_([],[], Res, Res).
splitintowords_([],Word, WordList,_):-
    \+ Word = [],
    append([Word], WordList, NWordList),
    splitintowords_([],[], NWordList,_).
splitintowords_([H|T],Word, WordList,_):-
    alphabetic(H),
    append([H], Word, NWord),
    splitintowords_(T,NWord, WordList,_).
splitintowords_([H|T],Word, WordList,_):-
    \+ alphabetic(H),
    append([Word], WordList, NWordList),
    splitintowords_(T, [], NWordList,_).

Test case, using swipl:

string_codes('hello there, how are you?',T), splitintowords(T,  W).

Here is an alternative implementation that I wrote first; it does not work either:

splitintowords(CharList, WordList):-
    splitintowords_(CharList, [[]], WordList).
splitintowords_([], WordList, WordList).
splitintowords_([H|T], [Word|Words], _):-
    alphabetic(H),
    WordN = [H|Word],
    splitintowords_(T, [WordN|Words], _).
splitintowords_([H|T], Words):-
    \+ alphabetic(H),
    WordsN = [[]|Words],
    splitintowords(T, WordsN).

This one is simpler, and judging by the trace it seems to behave the same way.

Answer 1

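In your first solution the problem is that, although you build the correct list, at every step you pass an anonymous variable '_', which unifies with anything, instead of the named variable that should end up holding the resulting list.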
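To make the difference concrete, here is a minimal sketch on a much smaller problem (counting list elements with an accumulator). The predicates len_bad/3 and len_ok/3 are hypothetical names used only for illustration; they are not part of the question's code.

```prolog
% len_bad/3 threads its accumulator correctly, but passes '_' as the
% result argument, so the caller's variable is never connected to the
% value that the base case produces.
len_bad([], N, N).
len_bad([_|T], N0, _) :-
    N1 is N0 + 1,
    len_bad(T, N1, _).

% len_ok/3 passes the same variable N down to the base case, where it
% is finally unified with the accumulator.
len_ok([], N, N).
len_ok([_|T], N0, N) :-
    N1 is N0 + 1,
    len_ok(T, N1, N).
```

The query len_bad([a,b,c], 0, N) succeeds but leaves N unbound, while len_ok([a,b,c], 0, N) binds N to 3. The same pattern applies to the last argument of splitintowords_/4.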

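I tried your solution with the built-in predicate is_alpha/1 in place of alphabetic/1, since you did not give the code for the latter, but it should behave the same. Here is the code with a few changes: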

splitintowords(CharList, WordList):- splitintowords_(CharList, [], [], WordList).

splitintowords_([],[], Res, Res).
splitintowords_([],Word, WordList,L):-
    \+ Word = [],
    append([Word], WordList, NWordList),
    splitintowords_([],[], NWordList,L).
splitintowords_([H|T],Word, WordList,L):-
    is_alpha(H),
    append([H], Word, NWord),
    splitintowords_(T,NWord, WordList,L).
splitintowords_([H|T],Word, WordList,L):-
    \+ is_alpha(H),
    append([Word], WordList, NWordList),
    splitintowords_(T, [], NWordList,L).

Example:

?- string_codes('hello there, how are you?',T), splitintowords(T,  W).
T = [104, 101, 108, 108, 111, 32, 116, 104, 101|...],
W = [[117, 111, 121], [101, 114, 97], [119, 111, 104], [], [101, 114, 101, 104|...], [111, 108, 108|...]] ;
false.

Note that the answer is long, so it is abbreviated with '...'. In SWI-Prolog you can press w to see the complete answer:

?- string_codes('hello there, how are you?',T), splitintowords(T,  W).
T = [104, 101, 108, 108, 111, 32, 116, 104, 101|...],
W = [[117, 111, 121], [101, 114, 97], [119, 111, 104], [], [101, 114, 101, 104|...], [111, 108, 108|...]] [write]
T = [104, 101, 108, 108, 111, 32, 116, 104, 101, 114, 101, 44, 32, 104, 111, 119, 32, 97, 114, 101, 32, 121, 111, 117, 63],
W = [[117, 111, 121], [101, 114, 97], [119, 111, 104], [], [101, 114, 101, 104, 116], [111, 108, 108, 101, 104]] ;
false.

(The second T, W pair is the first one written out in full.)

Your second solution needs similar changes:

splitintowords(CharList, WordList):-
    splitintowords_(CharList, [[]], WordList).

splitintowords_([], WordList, WordList).

splitintowords_([H|T], [Word|Words], L):-
    is_alpha(H),
    WordN = [H|Word],
    splitintowords_(T, [WordN|Words], L).

splitintowords_([H|T], Words,L):-
    \+ is_alpha(H),
    WordsN = [[]|Words],
    splitintowords_(T, WordsN,L).

Example:

?- string_codes('hello there, how are you?',T), splitintowords(T,  W).
T = [104, 101, 108, 108, 111, 32, 116, 104, 101, 114, 101, 44, 32, 104, 111, 119, 32, 97, 114, 101, 32, 121, 111, 117, 63],
W = [[], [117, 111, 121], [101, 114, 97], [119, 111, 104], [], [101, 114, 101, 104, 116], [111, 108, 108, 101, 104]] ;
false.

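Note that this result contains one more empty list than the previous solution: the leading [], which comes from the trailing '?'. If you want to remove the empty lists, that is easy to do with one more predicate: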

delete([],[]).
delete([H|T],[H|T1]):-dif(H,[]),delete(T,T1).
delete([[]|T],T1):-delete(T,T1).

Example:

?- delete([[], [117, 111, 121], [101, 114, 97], [119, 111, 104], [], [101, 114, 101, 104, 116], [111, 108, 108, 101, 104]],L).
L = [[117, 111, 121], [101, 114, 97], [119, 111, 104], [101, 114, 101, 104, 116], [111, 108, 108, 101, 104]] ;
false.

(The empty lists have been removed.)
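In the outputs shown here, each word comes out with its codes in reverse order, and the words themselves are listed last to first, because both accumulators are built by adding to the front. A minimal sketch of one way to tidy such a result is given below. The name tidy_words/2 is hypothetical, and it uses exclude/3 from library(apply) to drop the empty lists rather than the hand-written predicate.

```prolog
:- use_module(library(lists)).   % reverse/2
:- use_module(library(apply)).   % maplist/3, exclude/3

% tidy_words(+Raw, -Words)
% Raw is the accumulator-style result: reversed words, in reversed
% order, possibly containing empty lists.
tidy_words(Raw, Words) :-
    exclude(==([]), Raw, NonEmpty),       % drop empty words
    maplist(reverse, NonEmpty, Forward),  % restore the order inside each word
    reverse(Forward, Words).              % restore the order of the words
```

Applied to the W from the second solution, this should give the code lists for hello, there, how, are and you, in that order.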

Edit

I also tested this with your alphabetic/1 predicate, and it produces exactly the same output as is_alpha/1.

Answer 2

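Here is my implementation, which uses difference lists for better speed. It works in a straightforward way: it walks through Chars and checks whether the head H unifies with any of the split characters. If it does, it closes the current word and adds it to Bcu; if not, it appends H to the end of Acu.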

%split_words(+Chars, +Splits, -Words).
% it needs list of Chars, list of Split characters, and returns list of
% words.
split_words(Chars, Splits, Words) :-
  split_words(Chars, Splits, X - X, Y - Y, Words).
split_words([], _, [], Bcu, Words) :- listify(Bcu, Words).
split_words([], _, Acu, Bcu, Words) :-
  listify(Acu, W), add(Bcu, W, New), listify(New, Words).
split_words([H|C], Splits, Acu, Bcu, Words) :-
  member(H, Splits), !, listify(Acu, W), add(Bcu, W, New),
  split_words(C, Splits, X - X, New, Words).
split_words([H|C], Splits, Acu, Bcu, Words) :-
 add(Acu, H, New), split_words(C, Splits, New, Bcu, Words).

add(X - [I|Y], I, X - Y).

listify(X - [], X).
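The add/3 and listify/2 helpers rely on the usual difference-list idiom: a list is carried around as Front-Hole, where Hole is the still-unbound tail of Front, so an element can be added at the end without walking the list. The sketch below isolates that idea; dl_empty/1, dl_add/3, dl_close/2 and demo/1 are hypothetical names chosen for illustration and mirror, but are not, the answer's own helpers.

```prolog
% An empty difference list: front and hole are the same unbound variable.
dl_empty(X-X).

% Add Item at the end by binding the current hole to [Item|NewHole].
dl_add(Front-[Item|NewHole], Item, Front-NewHole).

% Turn the difference list into a proper list by binding the hole to [].
dl_close(List-[], List).

% Build [a,b,c] by adding at the end three times, then closing.
demo(List) :-
    dl_empty(D0),
    dl_add(D0, a, D1),
    dl_add(D1, b, D2),
    dl_add(D2, c, D3),
    dl_close(D3, List).
```

Because every dl_add/3 step is a single unification, the elements stay in their original order and no final reverse is needed, which is where the speed advantage over repeated append/3 calls comes from.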

Answer 3

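I must be misunderstanding the question. I will use chars instead of codes so that it is easier to see what is happening at the top level. If I understand the wording and try to write it down in Prolog, this is what you are after: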

?- string_chars("hello there, how are you?", Chars),
   chars_words(Chars, Words).
Chars = [h, e, l, l, o, ' ', t, h, e|...],
Words = [[h, e, l, l, o], [t, h, e, r, e], [h, o, w], [a, r, e], [y, o, u]].

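Here you take consecutive runs of alpha characters and put each run together as a word, and you discard all non-alpha characters between them.

With this assumption, here is how you could go about it:

1. Strip the leading non-alpha characters.
2. The run of alpha characters at the front is a word.
3. Discard the non-alpha characters that follow it and go to 2.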

In Prolog:

chars_words([], []).
chars_words([C|Cs], Words) :-
        strip_non_alpha([C|Cs], Rest),
        chars_words_aux(Rest, Words).

chars_words_aux([], []).
chars_words_aux([C|Cs], [W|Ws]) :-
        word_rest([C|Cs], W, Rest0),
        strip_non_alpha(Rest0, Rest),
        chars_words_aux(Rest, Ws).

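At no point do you need append/3, because the characters keep the same order on their way from the original list into the list of words.

I used a predicate called partition_sorted/4 to define word_rest/3 and strip_non_alpha/2. It takes a list and splits it in two: a front for which the predicate succeeds, and a rest (the first element of the rest is the first element of the original list for which the predicate fails).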

strip_non_alpha(List, Rest) :-
        partition_sorted(not_alpha, List, _, Rest).

word_rest(List, Word, Rest) :-
        partition_sorted(alpha, List, Word, Rest).

not_alpha(X) :- \+ char_type(X, alpha).
alpha(X) :- char_type(X, alpha).

Finally, a naive definition of partition_sorted/4:

partition_sorted(Goal, List, True, False) :-
        partition_sorted_aux(List, Goal, True, False).

partition_sorted_aux([], _, [], []).
partition_sorted_aux([X|Xs], Goal, True, False) :-
        (       call(Goal, X)
        ->      True = [X|True0],
                partition_sorted_aux(Xs, Goal, True0, False)
        ;       True = [], False = [X|Xs]
        ).

This definition is naive because it only works properly if Goal succeeds or fails once without leaving choice points, and the input list is ground.

Splitting a string into words in Prolog