Regex: how do I use a regular expression to get a certain number of lines of text?

Time: 2016-04-19 18:49:57

Tags: python regex parsing

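I am trying to pull out each class and its class number along with its prerequisites. But I am having trouble capturing the course (e.g. ACCT 203) together with its prerequisite (e.g. ACCT 301) and nothing else. I am doing this in Python and hope to insert the data into a database later. Can anyone help? I am relatively new to regular expressions.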

ACCT 203 
Financial Accounting Three Credits 
Development of basic accounting concepts. Emphasis is on the classifying, 
recording, and reporting of business transactions for all forms of business 
organizations. Offered every semester.
ACCT 204 
Managerial Accounting 
Three Credits
Emphasis is on generating, analyzing, and using accounting information in the 
planning and control processes. Topics include budgets, standards, cost systems, 
incremental analysis, and financial statement analysis. Offered every semester. 
Prerequisite: 
ACCT 203
ACCT 301
Intermediate Accounting I 
Three Credits
This is the first course in a two-course sequence that is intended to provide a 
comprehensive understanding of the concepts, principles, assumptions, and 
conventions that are used for classifying, recording, and reporting economic 
transactions for a business entity. Offered every fall. 
Prerequisite: 
ACCT 204 or permission of instructor
ACCT 302 
Intermediate Accounting II 
Three Credits
This is the second course in a two-course sequence that is intended to provide 
a comprehensive understanding of the concepts, principles, assumptions, and 
conventions that are used for classifying, recording, and reporting economic 
transactions for a business entity. Offered every spring. 
Prerequisite: 
ACCT 301 or permission of instructor
ACCT 303 
Accounting Theory and Practice 
Three Credits
This course is intended to provide an understanding of items that present 
measurement and reporting problems for the accountant. It will also discuss 
current issues that the accounting profession is attempting to establish and 
guidelines for their measurement and reporting. 
Prerequisite: 
ACCT 302
ACCT 310 

1 answer:

Answer 0 (score: 1)

I am not sure whether this is what you want, but here is my solution:

>>> import re
>>> # text holds the catalog excerpt from the question
>>> classes = re.findall(r"[A-Z]{4} [0-9]{3}", text)
>>> for i in classes:             # print each match in order of appearance
...     print(i)
...
ACCT 203
ACCT 204
ACCT 203
ACCT 301
ACCT 204
ACCT 302
ACCT 301
ACCT 303
ACCT 302
ACCT 310
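
The flat list above mixes course headings with the codes that appear as prerequisites. A minimal sketch of one way to pair each course with its prerequisites follows. It is not part of the original answer, and it rests on assumptions read off the sample: a course heading is a line containing only a code such as `ACCT 203`, and prerequisites are listed either on the line right after `Prerequisite:` or on the same line. The function name `parse_prerequisites` and the abridged `sample` text are hypothetical, chosen for illustration.

```python
import re

# Four capital letters, a space, three digits (same pattern as the answer above).
COURSE_CODE = re.compile(r"[A-Z]{4} [0-9]{3}")


def parse_prerequisites(text):
    """Map each course code to the course codes listed as its prerequisites.

    Assumes (hypothetically) that a heading line holds only a course code and
    that prerequisites follow a line starting with "Prerequisite:".
    """
    courses = {}
    current = None          # course whose block we are currently reading
    expect_prereq = False   # True when the next line should list prerequisites
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if expect_prereq:
            # Line right after "Prerequisite:"; keep only the course codes in it.
            if current is not None:
                courses[current].extend(COURSE_CODE.findall(line))
            expect_prereq = False
        elif line.startswith("Prerequisite:"):
            rest = line[len("Prerequisite:"):].strip()
            if rest and current is not None:
                # Prerequisites written on the same line as the label.
                courses[current].extend(COURSE_CODE.findall(rest))
            elif not rest:
                expect_prereq = True
        elif COURSE_CODE.fullmatch(line):
            # A line that is nothing but a course code starts a new course.
            current = line
            courses[current] = []
    return courses


# Abridged excerpt of the catalog text from the question.
sample = """\
ACCT 203
Financial Accounting Three Credits
Development of basic accounting concepts. Offered every semester.
ACCT 204
Managerial Accounting
Three Credits
Emphasis is on generating, analyzing, and using accounting information.
Prerequisite:
ACCT 203
ACCT 301
Intermediate Accounting I
Three Credits
Offered every fall.
Prerequisite:
ACCT 204 or permission of instructor
"""

result = parse_prerequisites(sample)
for course, prereqs in result.items():
    print(course, "->", prereqs)
```

Tracking state line by line, rather than using one large multi-line regex, keeps the prerequisite line `ACCT 203` from being mistaken for a new course heading. The resulting dictionary can then be turned into rows for a database table; text such as "or permission of instructor" is dropped here and would need separate handling if it matters.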