Question

After running some operations, I end up with the following list:

FreqItemset(items=[u'A_String_0'], freq=303)
FreqItemset(items=[u'A_String_0', u'Another_String_1'], freq=302)
FreqItemset(items=[u'B_String_1', u'A_String_0', u'A_OtherString_1'], freq=301)

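I want to remove from the list every entry whose items start with A_String_0, but keep the other entries (it does not matter if A_String_0 appears in the middle or at the end of the items).

So in the example above, lines 1 and 2 should be removed and line 3 kept.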

I tried

 filter(lambda a: a != 'A_String_0', result)

and

result.remove('A_String_0')

Neither of these worked for me.

Answer 1

How about result = result if result[0] != 'A_String_0' else result[1:]?

Answer 2

It is as simple as this:

from pyspark.mllib.fpm import FPGrowth

sets = [
    FPGrowth.FreqItemset(
        items=[u'A_String_0'], freq=303),
    FPGrowth.FreqItemset(
        items=[u'A_String_0', u'Another_String_1'], freq=302),
    FPGrowth.FreqItemset(
        items=[u'B_String_1', u'A_String_0', u'A_OtherString_1'], freq=301)
]

[x for x in sets if x.items[0] != 'A_String_0']
## [FreqItemset(items=['B_String_1', 'A_String_0', 'A_OtherString_1'], freq=301)]
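For comparison, the two attempts from the question fail because they compare each whole list element to a string, not its first item. The sketch below illustrates this without Spark; it assumes a stand-in namedtuple with the same `items` and `freq` fields as the objects printed in the question, and the variable names `unchanged` and `removed` are only for illustration.

```python
from collections import namedtuple

# Stand-in for pyspark's FreqItemset (assumption: only the
# items and freq fields matter for this illustration)
FreqItemset = namedtuple("FreqItemset", ["items", "freq"])

result = [
    FreqItemset(items=[u'A_String_0'], freq=303),
    FreqItemset(items=[u'B_String_1', u'A_String_0'], freq=301),
]

# Each element is a FreqItemset, never a string, so the
# predicate is True for every element and nothing is dropped
unchanged = list(filter(lambda a: a != 'A_String_0', result))

# remove() searches for an element equal to the string,
# finds none, and raises ValueError
try:
    result.remove('A_String_0')
    removed = True
except ValueError:
    removed = False
```

The comparison has to look inside each element, at `x.items[0]`, as the list comprehension does.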

In practice, it is better to filter before collect:

filtered_sets = (model
    .freqItemsets()
    .filter(lambda x: x.items[0] != 'A_String_0')
    .collect())
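If some itemsets could have an empty `items` list, indexing `x.items[0]` would raise an IndexError. A minimal sketch of a reusable predicate that guards against this, again assuming a stand-in namedtuple in place of the Spark class; `starts_with` is a hypothetical helper name, not part of pyspark.

```python
from collections import namedtuple

# Stand-in for pyspark's FreqItemset (assumption for illustration)
FreqItemset = namedtuple("FreqItemset", ["items", "freq"])

def starts_with(itemset, first):
    """Return True if the itemset is non-empty and its first item equals `first`."""
    return bool(itemset.items) and itemset.items[0] == first

sets = [
    FreqItemset(items=[u'A_String_0'], freq=303),
    FreqItemset(items=[u'A_String_0', u'Another_String_1'], freq=302),
    FreqItemset(items=[u'B_String_1', u'A_String_0', u'A_OtherString_1'], freq=301),
    FreqItemset(items=[], freq=0),  # edge case: empty itemset is kept
]

# Keep everything that does not start with A_String_0
kept = [x for x in sets if not starts_with(x, u'A_String_0')]
```

The same function could presumably be passed to an RDD filter as `lambda x: not starts_with(x, 'A_String_0')`.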

Answer 3

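You seem to be using a list called FreqItemset. However, the name suggests that you should be using a set rather than a list.

That way, you can have a searchable collection of (string, frequency) pairs. For example: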

>>> d = { "the": 2, "a": 3 }
>>> d[ "the" ]
2
>>> d[ "the" ] = 4
>>> d[ "a" ]
3
>>> del d[ "a" ]
>>> d
{'the': 4}

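You can easily access each word (the dictionary key), change its value (the frequency), or delete it. None of these operations has to visit every element, because it is a dictionary, so it performs well (better than a list, in any case).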
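Applied to the data in the question, the idea might look like the sketch below. It assumes each itemset is stored as a tuple key (lists are not hashable, and a tuple keeps the order needed to test the first item) mapped to its frequency; the name `freq_by_items` is hypothetical.

```python
# Itemsets as tuple keys, frequencies as values (assumed layout)
freq_by_items = {
    (u'A_String_0',): 303,
    (u'A_String_0', u'Another_String_1'): 302,
    (u'B_String_1', u'A_String_0', u'A_OtherString_1'): 301,
}

# Looking up one exact itemset does not scan the other entries
single = freq_by_items[(u'A_String_0',)]

# Dropping entries by their first item still has to look at every key
kept = {k: v for k, v in freq_by_items.items() if k[0] != u'A_String_0'}
```

Note that the dictionary helps with exact lookups, while removing by first item remains a pass over all keys, just as with the list.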

Just my two cents.

Python 2.7: Remove items from a list by value

3 answers: