How do I get the numbers from the beginning of a line of text, split them, and print them out?

Asked: 2011-07-12 12:58:24

Tags: python file text

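Here is my dilemma: I'm writing an application in Python that lets me search a flat file (KJV bible.txt) for a particular string and return the line number, the book, and the string that was found. However, I'd also like to return the chapter and verse where the string was found. That requires going to the beginning of the line and getting the chapter and verse numbers. I'm new to Python; at the moment I'm still reading through Guido van Rossum's Python tutorial. This is something I'm trying to put together for a Bible study group: something portable that can run almost anywhere, in a cmd module. I appreciate any help... thanks. Below is an excerpt showing what the Bible text looks like: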

 Daniel


 1:1 In the third year of the reign of Jehoiakim king of Judah came
 Nebuchadnezzar king of Babylon unto Jerusalem, and besieged it.

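Say I searched for 'Jehoiakim' and one of the search results was the first line above. I want to go to the numbers at the front of that line (1:1 in this case), get the chapter (1) and the verse (1), and print them to the screen.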

 1:2 And the Lord gave Jehoiakim king of Judah into his hand, with part
 of the vessels of the house of God: which he carried into the land of
 Shinar to the house of his god; and he brought the vessels into the
 treasure house of his god.

Code:

 import os
 import sys
 import re

 word_search = raw_input(r'Enter a word to search: ')
 book = open("KJV.txt", "r")
 first_lines = {36: 'Genesis', 4812: 'Exodus', 8867: 'Leviticus', 11749: 'Numbers', 15718: 'Deuteronomy',
           18909: 'Joshua', 21070: 'Judges', 23340: 'Ruth', 23651: 'I Samuel', 26641: 'II Samuel',
           29094: 'I Kings', 31990: 'II Kings', 34706: 'I Chronicles', 37378: 'II Chronicles',
           40502: 'Ezra', 41418: 'Nehemiah', 42710: 'Esther', 43352: 'Job', 45937: 'Psalms', 53537: 'Proverbs',
           56015: 'Ecclesiastes', 56711: 'The Song of Solomon', 57076: 'Isaiah', 61550: 'Jeremiah',
           66480: 'Lamentations', 66961: 'Ezekiel', 71548: 'Daniel' }


 for ln, line in enumerate(book):
     if word_search in line:
         first_line = max(l for l in first_lines if l < ln)
         bibook = first_lines[first_line]

         template = "\nLine: {0}\nString: {1}\nBook: {2}\n"
         output = template.format(ln, line, bibook)
         print output

2 Answers:

Answer 0 (score: 5)

Split once on whitespace, then split the first piece on the colon.

passage, text = line.split(None, 1)
chapter, verse = passage.split(':')
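
One thing to watch for: only the line that starts a verse carries a reference, while continuation lines (such as "Nebuchadnezzar king of Babylon unto Jerusalem...") do not, so the two-step split would fail or give nonsense on them. The following is a minimal sketch of the same split with that check added. The helper name `split_reference` is hypothetical, not part of the question's code, and the sketch assumes references always look like digits, a colon, then digits.

```python
def split_reference(line):
    """Return (chapter, verse, text) if the line starts with 'C:V', else None."""
    parts = line.split(None, 1)              # split once on whitespace
    if not parts or ':' not in parts[0]:
        return None                          # blank line or no reference
    chapter, _, verse = parts[0].partition(':')
    if not (chapter.isdigit() and verse.isdigit()):
        return None                          # e.g. a word that merely contains ':'
    text = parts[1].rstrip() if len(parts) > 1 else ''
    return int(chapter), int(verse), text


result = split_reference(" 1:1 In the third year of the reign of Jehoiakim king of Judah came")
if result is not None:
    print("chapter %d, verse %d" % (result[0], result[1]))
```

Returning None for continuation lines lets the caller keep whatever chapter and verse it saw last.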

Answer 1 (score: 1)

Use a regular expression such as r'\s*(\d+):(\d+)'.

Once you have a match (match = re.match(r'\s*(\d+):(\d+)', line)), the chapter is in group 1 (chapter = match.group(1)) and the verse is in group 2.

Use this code:

 for ln, line in enumerate(book):
      # Remember the most recent chapter:verse; continuation lines carry none.
      match = re.match(r'\s*(\d+):(\d+)', line)
      if match:
           chapter, verse = match.group(1), match.group(2)

      if word_search in line:
           ...
           print 'Book %s %s:%s ...%s...' % (bibook, chapter, verse, line)
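
To show how the pieces might fit together, here is a rough, self-contained sketch that combines the book lookup from the question with the chapter and verse tracking above. It is an illustration under stated assumptions rather than a finished program: the function name `search`, the `SAMPLE` list, and the one-entry `first_lines` mapping are made up for the demo, and the lookup uses `<=` on the assumption that each key is the line number of the book's title line. It runs on an in-memory list instead of KJV.txt so it can be tried without the file.

```python
import re

# Optional indentation, then chapter:verse at the start of the line.
VERSE_RE = re.compile(r'\s*(\d+):(\d+)')

# Hypothetical stand-in for the file: the excerpt from the question.
SAMPLE = [
    " Daniel",
    "",
    "",
    " 1:1 In the third year of the reign of Jehoiakim king of Judah came",
    " Nebuchadnezzar king of Babylon unto Jerusalem, and besieged it.",
    "",
    " 1:2 And the Lord gave Jehoiakim king of Judah into his hand, with part",
    " of the vessels of the house of God: which he carried into the land of",
    " Shinar to the house of his god; and he brought the vessels into the",
    " treasure house of his god.",
]


def search(lines, word, first_lines):
    """Yield (line_no, book, chapter, verse, text) for each line containing word."""
    chapter = verse = None
    for ln, line in enumerate(lines):
        if ln in first_lines:
            chapter = verse = None           # new book: forget the old reference
        match = VERSE_RE.match(line)
        if match:
            chapter, verse = int(match.group(1)), int(match.group(2))
        if word in line:
            starts = [l for l in first_lines if l <= ln]
            book = first_lines[max(starts)] if starts else None
            yield ln, book, chapter, verse, line.strip()


for ln, book, chapter, verse, text in search(SAMPLE, 'Babylon', {0: 'Daniel'}):
    print("Line %d: %s %s:%s %s" % (ln, book, chapter, verse, text))
```

Because the chapter and verse are carried over from the last line that had a reference, a hit on a continuation line (like the 'Babylon' search here) is still reported with the verse it belongs to.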