How do I count the words in a list?

Time: 2016-04-26 05:55:36

Tags: python

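Please let me know how to count the number of repeated words in the list I created. I am also quite confused by my code. If anyone could explain my mistake, I would really appreciate it.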

5 answers:

Answer 0 (score: 2)

Use the Counter class from the collections module:

from bs4 import BeautifulSoup
from collections import Counter
from urllib.request import urlopen

# Download the page and parse it.
response = urlopen('http://www.nytimes.com').read()
soup = BeautifulSoup(response, "lxml")

# Words to skip when they start a headline.
stop_words = ['a', 'A', 'an', 'An', 'the', 'The', 'from', 'From', 'to', 'To',
              'when', 'When', 'what', 'What', 'on', 'On', 'for', 'For']

# Collect the first word of each headline, excluding the stop words.
host = []
for story_heading in soup.find_all(class_="story-heading"):
    story_title = story_heading.text.replace("\n", " ").strip()
    words = story_title.split()
    # Skip empty headings so that words[0] cannot raise an IndexError.
    if words and words[0] not in stop_words:
        host.append(words[0])

# Count how many times each word occurs.
print(Counter(host))

Output:

>>> ================================ RESTART ================================
>>> 
Counter({'North': 2, 'Trump': 1, 'U.S.': 1, 'Kasich-Cruz': 1, '8': 1, 'Court': 1, 'Where': 1, 'Your': 1, 'Forget': 1})
>>> 
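
The same idea can be tried without a network connection. The following is a minimal sketch, assuming a small hand-written `host` list (hypothetical sample data, not real scraped headlines). It shows how `Counter.most_common()` ranks the words and how a comprehension can keep only the words that actually repeat.

```python
from collections import Counter

# Hypothetical sample data standing in for the scraped first words.
host = ["North", "Trump", "North", "U.S.", "Court", "Trump", "North"]

# Counter maps each distinct word to the number of times it occurs.
counts = Counter(host)

# Words ordered from most frequent to least frequent.
ranked = counts.most_common()

# Keep only the words that occur more than once.
repeated = {word: n for word, n in counts.items() if n > 1}

print(ranked)
print(repeated)
```

Here `repeated` holds only the duplicated words, which is closer to what the question asks for than the full tally.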

Answer 1 (score: 1)

You can do this with the list's count method:

d = {i: host.count(i) for i in set(host)}
print(d)

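Answer 2 (score: 1)

Use a dictionary comprehension that iterates over the set of elements:

  • Case-sensitive version ("What" != "what"):

    occurrences = { item: host.count(item) for item in set(host) }
    
  • Case-insensitive version ("What" == "what"):

    lowered = [item.lower() for item in host]
    occurrences = { item: lowered.count(item) for item in set(lowered) }
    

    In this case the dictionary keys will also be the lowercased elements. The counts are taken from the lowercased list, because host.count("what") alone would miss "What".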
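
As an alternative to the comprehension, the case-insensitive count can also be sketched with `collections.Counter`, which tallies the list in one pass. The `host` list below is hypothetical sample data chosen to include mixed-case duplicates.

```python
from collections import Counter

# Hypothetical sample data with mixed-case duplicates.
host = ["What", "what", "The", "the", "THE", "North"]

# Lowercase every word before counting so that "What" and "what" are merged.
occurrences = Counter(word.lower() for word in host)

print(occurrences)
```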

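Answer 3 (score: 0)

Here is a snippet that does not use a list comprehension. I think it should be easy to understand.

host = ['Hello','foo','bar','World','foo','Hello']
dict1 = {}
host_unique = list(set(host))
for i in host_unique:
    # Store the count in dict1, not in the built-in dict type.
    dict1[i] = host.count(i)
print(dict1)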
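
Calling `host.count(i)` rescans the whole list once per unique word. A plain loop can build the same dictionary in a single pass; the sketch below is one possible variant that uses `dict.get` with a default of 0, not the answerer's own method.

```python
host = ['Hello', 'foo', 'bar', 'World', 'foo', 'Hello']

counts = {}
for word in host:
    # get() returns 0 the first time a word is seen, then the running total.
    counts[word] = counts.get(word, 0) + 1

print(counts)
```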

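Answer 4 (score: 0)

To count the distinct words regardless of case, use:

lst = ['hi', 'Hio', 'Hi', 'hello', 'there' ]
# Build the set directly: in Python 3 a bare map() call is lazy and would add nothing.
s = set(map(str.lower, lst))
print(len(s))

Or: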

lst = ['hi', 'Hio', 'Hi', 'hello', 'there' ]
s = set()
for item in lst:
    s.add(item.lower())
print(len(s))