Creating a Lexicon and Scanner in Python

Time: 2013-03-15 04:42:22

Tags: python lexicon

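I'm new to the world of coding, and I haven't had a very warm welcome. I've been trying to learn Python through the online tutorial http://learnpythonthehardway.org/book/. I've managed to struggle through the book up to exercise 48, and that's where he turns the students loose and says, "You figure it out." But I simply can't. I understand that I need to create a Lexicon of possible words, and that I need to scan the user input to see whether it matches anything in the Lexicon, but that's about it! From what I can tell, I need to create a list called lexicon: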

lexicon = [
    ('directions', 'north'),
    ('directions', 'south'),
    ('directions', 'east'),
    ('directions', 'west'),
    ('verbs', 'go'),
    ('verbs', 'stop'),
    ('verbs', 'look'),
    ('verbs', 'give'),
    ('stops', 'the'),
    ('stops', 'in'),
    ('stops', 'of'),
    ('stops', 'from'),
    ('stops', 'at')
]

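Is that right? I don't know what to do next. I know each item in the list is called a tuple, but that doesn't really mean anything to me. How do I take raw input and assign it to a tuple? Do you know what I mean? In exercise 49 he imports the lexicon and, in the Python shell, prints lexicon.scan("input"), which returns a list of tuples, for example: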

>>> from ex48 import lexicon
>>> print lexicon.scan("go north")
[('verb', 'go'), ('direction', 'north')]

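Is 'scan()' a predefined function, or did he create the function inside the lexicon module? I know that 'split()' creates a list of all the words in the input, but how does 'go' then get assigned to the tuple ('verb', 'go')?

Am I way off? I know I'm asking a lot, but I've searched everywhere for hours and I can't figure this out on my own. Please help! I'll love you forever!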

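7 answers:

Answer 0: (score: 2)

I wouldn't use a list to build the lexicon. You're mapping words to their types, so make a dictionary.

Here's the biggest hint I can give without writing the whole thing: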

lexicon = {
    'north': 'directions',
    'south': 'directions',
    'east': 'directions',
    'west': 'directions',
    'go': 'verbs',
    'stop': 'verbs',
    'look': 'verbs',
    'give': 'verbs',
    'the': 'stops',
    'in': 'stops',
    'of': 'stops',
    'from': 'stops',
    'at': 'stops'
}

def scan(sentence):
    words = sentence.lower().split()
    pairs = []

    # Iterate over `words`,
    # pull each word and its corresponding type
    # out of the `lexicon` dictionary and append the tuple
    # to the `pairs` list

    return pairs
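
For anyone who wants to check their own attempt against something, here is a minimal sketch of the loop those comments describe. It uses a shortened copy of the dictionary, puts the type first in each tuple to match the output shown in the question, and the 'error' label for unknown words is my own assumption rather than part of the hint.

```python
# Shortened copy of the word -> type dictionary from the hint above.
lexicon = {
    'north': 'directions',
    'south': 'directions',
    'go': 'verbs',
    'stop': 'verbs',
    'the': 'stops',
    'in': 'stops',
}

def scan(sentence):
    words = sentence.lower().split()
    pairs = []
    for word in words:
        # dict.get returns the fallback ('error', an assumed label)
        # when the word is not a key in the lexicon.
        word_type = lexicon.get(word, 'error')
        pairs.append((word_type, word))
    return pairs
```

With this sketch, scan("go north") gives [('verbs', 'go'), ('directions', 'north')].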

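Answer 1: (score: 2)

Following the ex48 instructions, you can create a few lists, one for each kind of word. Here is an example for the first test case. The return value is a list of tuples, so you can append to that list for each word you are given.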

direction = ['north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back']

class Lexicon:
    def scan(self, sentence):
        self.sentence = sentence
        self.words = sentence.split()
        stuff = []
        for word in self.words:
            if word in direction:
                stuff.append(('direction', word))
        return stuff

lexicon = Lexicon()

He points out that numbers and exceptions are handled differently.
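
A rough sketch of what that different handling could look like, written as a plain function rather than a class to keep it short. The convert_number helper and the 'number' / 'error' labels follow what other answers in this thread use, so treat them as assumptions and check them against your copy of the exercise.

```python
direction = ['north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back']

def convert_number(text):
    # int() raises ValueError when the text is not a whole number.
    try:
        return int(text)
    except ValueError:
        return None

def scan(sentence):
    stuff = []
    for word in sentence.split():
        number = convert_number(word)
        if word in direction:
            stuff.append(('direction', word))
        elif number is not None:
            stuff.append(('number', number))
        else:
            # Anything that is neither a known word nor a number.
            stuff.append(('error', word))
    return stuff
```

The number check uses "is not None" rather than plain truthiness so that an input of "0" is still reported as a number.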

Answer 2: (score: 1)

I finally did it!

lexicon = {
    ('directions', 'north'),
    ('directions', 'south'),
    ('directions', 'east'),
    ('directions', 'west'),
    ('verbs', 'go'),
    ('verbs', 'stop'),
    ('verbs', 'look'),
    ('verbs', 'give'),
    ('stops', 'the'),
    ('stops', 'in'),
    ('stops', 'of'),
    ('stops', 'from'),
    ('stops', 'at')
    }

def scan(sentence):

    words = sentence.lower().split()
    pairs = []

    for word in words:
        word_type = lexicon[word]
        tupes = (word, word_type) 
        pairs.append(tupes)

    return pairs
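
One caveat for anyone copying this: with curly braces around plain tuples, lexicon is a set rather than a dictionary, so lexicon[word] raises a TypeError. A possible way to keep the (type, word) tuples and still look words up by name is to build a dictionary from them first. This is only a sketch with a shortened lexicon; the word_types name is made up for it, and it returns (type, word) tuples to match the output shown in the question.

```python
lexicon = {
    ('directions', 'north'),
    ('directions', 'south'),
    ('verbs', 'go'),
    ('verbs', 'stop'),
    ('stops', 'the'),
}

# Flip each (type, word) tuple into a word -> type mapping.
word_types = {word: word_type for word_type, word in lexicon}

def scan(sentence):
    pairs = []
    for word in sentence.lower().split():
        # Raises KeyError for words that are not in the lexicon.
        pairs.append((word_types[word], word))
    return pairs
```

Unknown words are not handled here on purpose; swapping word_types[word] for word_types.get(word, 'error') would be one way to add that.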

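Answer 3: (score: 1)

This is a really cool exercise. I had to study it for several days before I finally got it working. The other answers here don't show how to actually use a list with tuples inside it, as the e-book suggests, so this one does. The owner's answer doesn't quite work: lexicon[word] asks for an integer, not a str.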

lexicon = [('direction', 'north', 'south', 'east', 'west'),
           ('verb', 'go', 'kill', 'eat'),
           ('nouns', 'princess', 'bear')]
def scan():
    stuff = raw_input('> ')
    words = stuff.split()
    pairs = []

    for word in words:

        if word in lexicon[0]:
            pairs.append(('direction', word))
        elif word in lexicon[1]:
            pairs.append(('verb', word))
        elif word in lexicon[2]:
            pairs.append(('nouns', word))
        else: 
            pairs.append(('error', word))

    print pairs

Cheers!
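
If the lexicon grows, the if/elif chain can be replaced by a loop over the rows, treating the first item of each tuple as the type and the rest as its words. This is only a sketch of that idea: it takes the sentence as an argument instead of calling raw_input and returns the list instead of printing it, which are my changes rather than part of the answer above.

```python
lexicon = [('direction', 'north', 'south', 'east', 'west'),
           ('verb', 'go', 'kill', 'eat'),
           ('nouns', 'princess', 'bear')]

def scan(sentence):
    pairs = []
    for word in sentence.split():
        for row in lexicon:
            # row[0] is the type label, row[1:] are the words of that type.
            if word in row[1:]:
                pairs.append((row[0], word))
                break
        else:
            # The inner loop finished without a break: no row matched.
            pairs.append(('error', word))
    return pairs
```

Checking against row[1:] rather than the whole row also means that typing the label itself, such as "verb", is not mistaken for a verb.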

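Answer 4: (score: 1)

Like most people here, I'm new to the world of coding, but I've attached my solution below in case it helps other students.

I've seen that there are more efficient ways to implement this. However, this code handles every use case of the exercise, and since I wrote it myself as a beginner, it doesn't rely on complicated shortcuts and should be easy for other beginners to follow.

So I think it might be useful for other people's learning. Let me know what you think. Cheers!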

class Lexicon(object):

    def __init__(self):
        self.sentence = []
        self.dictionary = {
            'north' : ('direction','north'),
            'south' : ('direction','south'),
            'east' : ('direction','east'),
            'west' : ('direction','west'),
            'down' : ('direction','down'),
            'up' : ('direction','up'),
            'left' : ('direction','left'),
            'right' : ('direction','right'),
            'back' : ('direction','back'),
            'go' : ('verb','go'),
            'stop' : ('verb','stop'),
            'kill' : ('verb','kill'),
            'eat' : ('verb', 'eat'),
            'the' : ('stop','the'),
            'in' : ('stop','in'),
            'of' : ('stop','of'),
            'from' : ('stop','from'),
            'at' : ('stop','at'),
            'it' : ('stop','it'),
            'door' : ('noun','door'),
            'bear' : ('noun','bear'),
            'princess' : ('noun','princess'),
            'cabinet' : ('noun','cabinet'),
        }

    def scan(self, input):
        loaded_imput = input.split()
        self.sentence.clear()

        for item in loaded_imput:
            try:
                int(item)
                number = ('number', int(item))
                self.sentence.append(number)
            except ValueError:
                word = self.dictionary.get(item.lower(), ('error', item))
                self.sentence.append(word)

        return self.sentence

lexicon = Lexicon()
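
One thing to watch with this design: scan clears and returns the same self.sentence list every time, so a result kept from an earlier call is overwritten by the next call. Below is a sketch of a variant that builds a new list per call. The dictionary is shortened, and tag is a helper name I made up for the example.

```python
class Lexicon(object):

    def __init__(self):
        # Shortened dictionary, same shape as in the answer above.
        self.dictionary = {
            'north': ('direction', 'north'),
            'go': ('verb', 'go'),
            'the': ('stop', 'the'),
            'bear': ('noun', 'bear'),
        }

    def tag(self, item):
        # Numbers first, then known words, then the 'error' fallback.
        try:
            return ('number', int(item))
        except ValueError:
            return self.dictionary.get(item.lower(), ('error', item))

    def scan(self, text):
        # A new list is created on every call, so earlier results stay intact.
        return [self.tag(item) for item in text.split()]

lexicon = Lexicon()
```

With this version, a list saved from lexicon.scan("go north") still holds the same two tuples after a later call to scan.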

Answer 5: (score: 0)

Clearly, lexicon is another Python file in the ex48 folder.

like: ex48
      ----lexicon.py

So you import lexicon.py from the ex48 folder.

scan is a function in lexicon.py.
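
To make that concrete: scan is not built into Python, it is whatever you define in that file. Here is a minimal sketch of what ex48/lexicon.py could contain, assuming a tiny word list. The WORD_TYPES name is hypothetical, and in Python 2 the ex48 folder also needs an __init__.py file (it can be empty) for "from ex48 import lexicon" to work.

```python
# Possible contents of ex48/lexicon.py (hypothetical, shortened word list).
WORD_TYPES = {
    'go': 'verb',
    'north': 'direction',
    'the': 'stop',
}

def scan(sentence):
    # Returns a list of (type, word) tuples; unknown words get 'error'.
    return [(WORD_TYPES.get(word, 'error'), word)
            for word in sentence.split()]
```

After the import, lexicon.scan("go north") simply calls this module-level function.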

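Answer 6: (score: 0)

Here is my version of scanning the lexicon for ex48. I'm also a beginner at programming, and Python is my first language. So this program may not be ideal for its purpose, but after a lot of testing the results look good. Feel free to improve the code.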

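Warning

If you haven't tried doing the exercise yourself yet, I encourage you not to look at any of the examples.

Warning

One thing I love about programming is that every time I hit a problem, I spend a lot of time trying different approaches to solve it. I spent weeks trying to build the structure, and as a beginner I learned far more from that than from copying someone else, which was really rewarding.

Below are my lexicon and my search, in a single file.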

direction = [('direction', 'north'),
            ('direction', 'south'),
            ('direction', 'east'),
            ('direction', 'west'),
            ('direction', 'up'),
            ('direction', 'down'),
            ('direction', 'left'),
            ('direction', 'right'),
            ('direction', 'back')
]

verbs = [('verb', 'go'),
        ('verb', 'stop'),
        ('verb', 'kill'),
        ('verb', 'eat')
]

stop_words = [('stop', 'the'),
            ('stop', 'in'),
            ('stop', 'of'),
            ('stop', 'from'),
            ('stop', 'at'),
            ('stop', 'it')
]

nouns = [('noun', 'door'),
        ('noun', 'bear'),
        ('noun', 'princess'),
        ('noun', 'cabinet')
]   

library = tuple(nouns + stop_words + verbs + direction)

#below is the search method with explanation.

def convert_number(x):
    try:
        return int(x)
    except ValueError:
        return None


def scan(input):
    #include uppercase input for searching. (Study Drills no.3)
    lowercase = input.lower()
    #element is what i want to search.
    element = lowercase.split()
    #orielement is the original input, which may have uppercase, for the 'error' type
    orielement = input.split()
    #library is the tuple of word types from above. You can replace it with your data source.
    data = library
    #i is used to track the position in element
    i = 0
    #z is used to indicate the position in output, which is the data that matches what i search.
    z = 0
    #create a place to store my output.
    output = []
    #temp is just an on/off switch. Turn off the switch when i get any match for that particular input.
    temp = True
    #loop until every word of the input has been searched, moving on by +1 each time.
    while not(i == len(element)):
        try:
            #j is used to position the word in the library, eg 'door', 'bear', 'go', etc, excluding the word type.
            j = 0
            while not (j == len(data)):
                #data[j][1] is each single word in library
                matching = data[j][1]
                #when the word matches, save the match into the output.
                if (matching == element[i]):
                    output.append(data[j])
                    #print output[z]
                    j += 1
                    z += 1
                    #switch off the search for the else: below and go to the next input. Otherwise it would be considered an 'error'
                    temp = False
                #else is everything that is not in the library.
                else:
                    while (data[j][1] == data[-1][1]) and (temp == True):
                        #refer to convert_number, to test if the input is a number; here i use orielement, which keeps uppercase
                        convert = convert_number(orielement[i])
                        #a is used to save a number only.
                        a = tuple(['number', convert])
                        #b is used to save everything else
                        b = tuple(['error', orielement[i]])
                        #convert is the number and a[1] accesses the number inside; if convert_number returned None it won't be appended as a number.
                        if convert == a[1] and not(convert == None):
                            output.append(a)
                            temp = False
                        else:
                            output.append(b)
                            #keep the switch off to escape the while loop!
                            temp = False
                    #searching in next data
                    j += 1
            #next word of input
            i += 1
            temp = True
        except ValueError:
            return output
    else:
        pass
    return output
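
For comparison, the same search can be written with a dictionary lookup instead of the nested while loops. This is a sketch, not a replacement for working through the loops yourself. It uses a shortened library, and by_word is a name I made up.

```python
# Shortened version of the combined (type, word) tuple from above.
library = (
    ('noun', 'door'), ('noun', 'bear'),
    ('stop', 'the'), ('verb', 'go'),
    ('direction', 'north'),
)

# Index the (type, word) tuples by their word for direct lookup.
by_word = {pair[1]: pair for pair in library}

def convert_number(x):
    try:
        return int(x)
    except ValueError:
        return None

def scan(text):
    output = []
    for original in text.split():
        # Look words up in lowercase so uppercase input still matches.
        match = by_word.get(original.lower())
        number = convert_number(original)
        if match is not None:
            output.append(match)
        elif number is not None:
            output.append(('number', number))
        else:
            # Keep the original capitalisation for unknown words.
            output.append(('error', original))
    return output
```

The order of checks mirrors the loops above: library words first, then numbers, then 'error' with the original spelling.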