I want to group strings into lists in Python, for example

Time: 2015-05-20 09:33:14

Tags: python string list

I want to group each header string with the strings that follow it.

STUDENT
a john
a anny
SUBJECT
b math
b physical
CLASS
a one
a two
a three
STUDENT
a pone
b julia
b sopia
CLASS
a four
a five
PROFESSOR
b uno
b sonovon
PROFESSOR
b jone

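My goal is to remove the duplicate SUBJECT headers and join their contents.

A SUBJECT can be any arbitrary uppercase string.

But the content lines must start with a or b.

How can I do this?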

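2 answers:

Answer 0: (score: 1)

Since the only thing we know about the SUBJECTs is that they are uppercase strings, you can use the isupper() predicate to tell headers from content lines, like this: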

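def split_string(file_name):
    # Read the whole file into a list of lines (line endings removed).
    with open(file_name) as f:
        lines = f.read().splitlines()
    for i, line in enumerate(lines):
        # Guard the look-ahead so the last line does not raise IndexError.
        next_is_upper = i + 1 < len(lines) and lines[i + 1].isupper()
        # Skip a header that is immediately followed by another header.
        if not (line.isupper() and next_is_upper):
            print(line)

split_string("match.txt")  # replace with the path to your own file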

Note: I am assuming here that your strings are stored in a file.
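
The isupper() check can also drive the grouping itself. Here is a minimal sketch, assuming the text is already in memory as a single string; group_by_header and sample are hypothetical names used only for illustration, not part of the answer above.

```python
def group_by_header(text):
    """Collect content lines under the most recent all-uppercase header."""
    groups = {}  # plain dicts keep insertion order in Python 3.7+
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.isupper():
            # An all-uppercase line starts a group, or resumes an earlier one.
            current = line
            groups.setdefault(current, [])
        elif current is not None:
            groups[current].append(line)
    return groups


# Hypothetical sample data, shortened from the question.
sample = "STUDENT\na john\nSUBJECT\nb math\nSTUDENT\nb julia\n"
print(group_by_header(sample))
# {'STUDENT': ['a john', 'b julia'], 'SUBJECT': ['b math']}
```

Because a repeated header reuses the existing key, its content lines are appended to the first occurrence, which is what the question asks for.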

Answer 1: (score: 1)

Just group the elements in a dict, using the subject as the key:

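from collections import OrderedDict
od = OrderedDict()
with open("match.txt") as f:
    # The first line of the file is the first header.
    key = next(f)
    for line in f:
        if line.startswith(("a","b")):
            # Content line: append it to the current header's list.
            od.setdefault(key,[]).append(line)
        else:
            # Any other line is treated as a new (or repeated) header.
            key = line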

Output:

for sub,cont in od.items():
    print("{}, {}".format(sub, cont))

STUDENT
, ['a john\n', 'a anny\n', 'a pone\n', 'b julia\n', 'b sopia\n']
SUBJECT
, ['b math\n', 'b physical\n']
CLASS
, ['a one\n', 'a two\n', 'a three\n', 'a four\n', 'a five\n']
PROFESSOR
, ['b uno\n', 'b sonovon\n', 'b jone']

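This groups the data correctly, which matches your stated goal of removing the duplicate SUBJECT headers and joining their contents.

OrderedDict preserves the insertion order. If you want to write the updated lines back to the file, just reopen it and write while iterating over .items():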

with open("match.txt", "w") as f:
    for sub, cont in od.items():
        f.write(sub)
        f.writelines(cont)

New output, joined by subject:

STUDENT
a john
a anny
a pone
b julia
b sopia
SUBJECT
b math
b physical
CLASS
a one
a two
a three
a four
a five
PROFESSOR
b uno
b sonovon
b jone
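
One caveat worth noting: the lines above keep their trailing "\n", and the last line of the file ('b jone') has none. That is harmless here because PROFESSOR happens to be written last, but if the final line belonged to an earlier group, it would be glued to the next header when written back. A minimal sketch of a variant that strips line endings and adds them again on output; merge_sections and demo are hypothetical names, and checking for "a " / "b " with a trailing space is my assumption about the content format.

```python
from collections import OrderedDict


def merge_sections(lines):
    """Merge repeated headers and return the regrouped text."""
    od = OrderedDict()
    key = None
    for raw in lines:
        line = raw.rstrip("\n")  # drop the line ending, if there is one
        if not line:
            continue  # ignore blank lines
        if line.startswith(("a ", "b ")):
            if key is not None:
                od.setdefault(key, []).append(line)
        else:
            # Anything that is not a content line is treated as a header.
            key = line
            od.setdefault(key, [])
    out = []
    for sub, cont in od.items():
        out.append(sub)
        out.extend(cont)
    # Every line, including the last one, gets exactly one newline.
    return "\n".join(out) + "\n"


# Hypothetical input whose last line has no newline and belongs to the first group.
demo = ["PROFESSOR\n", "b uno\n", "CLASS\n", "a four\n", "PROFESSOR\n", "b jone"]
print(merge_sections(demo), end="")
```

With this input, 'b jone' ends up under the first PROFESSOR header on its own line, followed by CLASS and 'a four'.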