How to split the strings in a list and compare each word against a set of keywords

Asked: 2016-12-13 09:26:41

Tags: python string python-2.7

I have a list containing the following:

for 30 days
for 40 working days
for 20 weeks
for 2 months

I want to split each sentence and compare its words against a set of keywords:

day
week
month
year

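If the keyword 'days' is present in a string, I want to multiply the number in that string by 1. If the keyword 'month' is present, I want to multiply the number in that string by 30, and so on. I am new to Python, so please help!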

My code

with open("test_term.csv", "rb") as file1:
    reader = csv.reader(file1)
    extractedlist = list(reader)
    #print extractedlist
def split_line(text):
    # split the text
    words = text[0].split(' ')
    # for each word in the line:
    new_list = []
    for word in words:
        #print word
        #print w2n.word_to_num(word)
        conversion = w2n.word_to_num(word)
        if isinstance(conversion, (int,long)):
            #print conversion
            new_list.append(conversion)            

        else:
            new_list.append(word)


    return new_list

for extraRow in extractedlist:
    worn = split_line(extraRow)
    keywords = {"day":1,"days":1,"year":365,"years":365,"week":7,"weeks":7,"month":30,"months":30}
    #for s in worn:
     #   splitted_string = s.split(' ')
    interesting_words = worn[2:]
    mult = 1
    for k,v in keywords.iteritems():
        for word in interesting_words :
            mult = v
            break
        result = mult*worn[1]
        print result

Right now I have only one input string, 'for thirty working days'. 'thirty' is converted to 30, so we effectively have 'for 30 working days'. The output is:

210  
900  
10950
900  
210  
10950
30   
30   

But my expected output is 30 * 1, i.e. 30.

3 Answers:

Answer 0: (score: 0)

You can first create a dictionary: dictionnary = {"day":1, "month":30 ... }

Split the string, like this:

splitted_string = ["for", 30, "working", "days"]
interesting_words = splitted_string[2:] # ["working", "days"]
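From there, you can take the element "days" and look up the matching entry in dictionnary. Once a match is found, we simply take its value and break out of the inner loop.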

mult = 1
for k,v in dictionnary.iteritems():
    for word in interesting_words :
        if k in word :
            mult = v
            break

Finally, you can perform the multiplication:

result = mult*splitted_string[1] #30
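
As a minimal sketch of the steps above put together: the `if k in word` test is the part the loop in the question lacks, and without it mult is overwritten for every key, which gives one printed line per dictionary entry. The names UNIT_DAYS and to_days are hypothetical, and returning the bare number when no unit is found is an assumption, not something the question specifies.

```python
# Hypothetical names: UNIT_DAYS and to_days do not appear in the original post.
UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def to_days(splitted_string):
    """Multiply the number at index 1 by the factor of the first unit word found."""
    number = int(splitted_string[1])
    # Only the words after the number can carry the unit.
    for word in splitted_string[2:]:
        for unit, factor in UNIT_DAYS.items():
            # Substring test, so "day" also matches "days".
            if unit in word:
                return number * factor
    # Assumption: with no recognised unit, treat the number as days already.
    return number


print(to_days(["for", 30, "working", "days"]))  # 30
print(to_days(["for", "2", "months"]))          # 60
```

Returning from inside the loops stops the search at the first match, so there is no need to track a separate mult variable.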

Answer 1: (score: 0)

import csv     # imports the csv module

f = open('file.csv', 'rb') # opens the csv file
results = []
try:
    reader = csv.reader(f)  # creates the reader object
    for row in reader:   # iterates over the rows of the file in order
        l = row[0].split(' ')
        if 'day' in l[2]:
            l[1] = int(l[1]) * 1
        elif 'working' in l[2]:
            if len(l) > 3  and 'day' in l[3]:
                l[1] = int(l[1]) * 1
        elif 'week' in l[2]:
            l[1] = int(l[1]) * 7
        elif 'month' in l[2]:
            l[1] = int(l[1]) * 30
        elif 'year' in l[2]:
            l[1] = int(l[1]) * 365
        results.append(l)

finally:
    print results
    f.close()      # closing
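
The if/elif chain above depends on the unit sitting at a fixed position in the split row. One alternative, swapped in here purely as an illustration, is a regular expression that picks out the number and the unit wherever they appear. The list named lines is a hypothetical stand-in for the contents of file.csv, and recording None for rows that do not match is an assumption.

```python
import csv
import re

# Hypothetical stand-in for the rows of file.csv;
# csv.reader accepts any iterable of strings, not only file objects.
lines = ["for 30 days", "for 40 working days", "for 20 weeks", "for 2 months"]

FACTORS = {"day": 1, "week": 7, "month": 30, "year": 365}

# A whole number, any text in between, then a unit word (optionally plural).
PATTERN = re.compile(r"(\d+)\b.*?\b(day|week|month|year)s?\b")

results = []
for row in csv.reader(lines):
    match = PATTERN.search(row[0])
    if match:
        results.append(int(match.group(1)) * FACTORS[match.group(2)])
    else:
        # Assumption: rows without a number and a unit are recorded as None.
        results.append(None)

print(results)  # [30, 40, 140, 60]
```

Because the pattern allows arbitrary text between the number and the unit, "for 40 working days" needs no special case.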

Answer 2: (score: 0)

If your data is in a list, you can iterate over it. Then split each string and search for the keyword in the tail of the split list ('day' in ' '.join(data_split[2:])):

data = ['for 30 days',
    'for 40 working days',
    'for 20 weeks',
    'for 2 months']

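for d in data:
    data_split = d.split(' ')
    if 'day' in ' '.join(data_split[2:]):
        print(int(data_split[1]))
    elif 'week' in ' '.join(data_split[2:]):
        print(int(data_split[1]) * 7)
    elif 'month' in ' '.join(data_split[2:]):
        print(int(data_split[1]) * 30)
    elif 'year' in ' '.join(data_split[2:]):
        print(int(data_split[1]) * 365)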