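I have a few words:
wordlist = ['change', 'my', 'diaper', 'please']
I also have a list of tuples to check against:
mylist = [('verb', 'change'), ('prep', 'my')]
What I want to do is create a list of all the words that are not in the list of tuples.
So the result for this example would be ['diaper', 'please']
What I tried seems to create duplicates:
[word for tuple in mylist for word in wordlist if word not in tuple]
How can I generate the list of words that are not in the list of tuples, as efficiently as possible?
Without using sets.
Edit: I selected the answer based on the "no sets" restriction.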
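To see where the duplicates come from, the attempted comprehension can be run on the sample data. This is only an illustration: `attempt` is a name introduced here, and the loop variable is called `tup` rather than `tuple` to avoid shadowing the built-in.

```python
mylist = [('verb', 'change'), ('prep', 'my')]
wordlist = ['change', 'my', 'diaper', 'please']

# The outer loop runs once per tuple, so every word is tested against each
# tuple separately and is emitted once for every tuple that does not contain it.
attempt = [word for tup in mylist for word in wordlist if word not in tup]
print(attempt)
# ['my', 'diaper', 'please', 'change', 'diaper', 'please']
```

A word that appears in no tuple at all is emitted once per tuple, which is why 'diaper' and 'please' show up twice.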
Answer 0 (score: 2)
Here is a one-liner using a list comprehension:
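[word for word in wordlist if word not in [w[1] for w in mylist]]
The inner list, [w[1] for w in mylist], extracts the second element of each tuple in the list.
The outer list comprehension goes through the words and filters out those that appear in the list just extracted.
P.S. I assumed you only want to filter on the second element of each tuple.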
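In the one-liner, the inner list is rebuilt for every word in wordlist. A minimal sketch that builds it once, still without sets, could look like this; `known` and `result` are names introduced for illustration and are not part of the answer.

```python
mylist = [('verb', 'change'), ('prep', 'my')]
wordlist = ['change', 'my', 'diaper', 'please']

# Build the list of known words a single time instead of once per word.
known = [w[1] for w in mylist]

# Each membership test still scans `known` linearly, so the overall cost
# remains proportional to len(wordlist) * len(known).
result = [word for word in wordlist if word not in known]
print(result)
# ['diaper', 'please']
```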
Answer 1 (score: 2)
Build a set of known words from your list of tuples:
myList = [('verb', 'change'), ('prep', 'my')]
known_words = set(tup[1] for tup in myList)
Then use it as before:
wordlist = ['change', 'my', 'diaper', 'please']
out = [word for word in wordlist if word not in known_words]
print(out)
# ['diaper', 'please']
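Checking whether an item is in a set is O(1) on average, whereas checking whether it is in a list or tuple is O(length of the list), so using a set really is worth it here.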
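The difference can be measured with a rough timing sketch. The data below is synthetic and made up for this comparison only, the function names are hypothetical, and the actual timings will vary by machine.

```python
import timeit

# Synthetic data, invented purely for this comparison.
known_list = [f"word{i}" for i in range(2000)]
known_set = set(known_list)
wordlist = [f"word{i}" for i in range(1000, 3000)]

def filter_with_list():
    # Each `not in` scans the list from the start.
    return [w for w in wordlist if w not in known_list]

def filter_with_set():
    # Each `not in` is a hash lookup.
    return [w for w in wordlist if w not in known_set]

# Both variants give the same answer; only the lookup cost differs.
assert filter_with_list() == filter_with_set()

list_time = timeit.timeit(filter_with_list, number=5)
set_time = timeit.timeit(filter_with_set, number=5)
print(f"list lookups: {list_time:.4f}s, set lookups: {set_time:.4f}s")
```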
Also, if you don't care about the order of the words and want to remove duplicates, you can do this:
unique_new_words = set(wordlist) - known_words
print(unique_new_words)
# {'diaper', 'please'}
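If duplicates should go but the original order does matter, one possible variation (not part of the answer above) relies on dict.fromkeys, which preserves insertion order in Python 3.7 and later. The repeated words in the sample list are added here for illustration, and `unique_in_order` is a hypothetical name.

```python
known_words = {'change', 'my'}
wordlist = ['please', 'change', 'diaper', 'please', 'my', 'diaper']

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
unique_in_order = list(dict.fromkeys(w for w in wordlist if w not in known_words))
print(unique_in_order)
# ['please', 'diaper']
```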
Answer 2 (score: 1)
Here is my version, which flattens your tuples (using itertools.chain) and compares against a set (using set speeds up lookups with the in operator).
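The code for this answer is not included here. A minimal sketch of the approach it describes, assuming the sample data from the question, might look like the following; `known` and `result` are names introduced here, not the answerer's.

```python
from itertools import chain

mylist = [('verb', 'change'), ('prep', 'my')]
wordlist = ['change', 'my', 'diaper', 'please']

# Flatten every tuple into one stream of strings and collect them in a set.
known = set(chain.from_iterable(mylist))

# Membership tests against the set are hash lookups.
result = [word for word in wordlist if word not in known]
print(result)
# ['diaper', 'please']
```

Note that flattening also puts the tags ('verb', 'prep') into the set, so a word that happens to equal a tag would be filtered out as well.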
Answer 3 (score: 1)
I have assumed that tup[1] holds only a single element; if it does not, a small change is needed.
[word for word in wordlist if word not in [tup[1] for tup in mylist]]