Question

Tuples in the file:

 ('Wanna', 'O')
 ('be', 'O')
 ('like', 'O')
 ('Alexander', 'B')
 ('Coughan', 'I')
 ('?', 'O')

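My question is: how do I join two strings that come from different tuples, taken from the same index of each, when a condition holds?

For example, in my case I want to join the strings at [0] when [1] equals 'B' and the next tuple's [1] equals 'I'.

So the output would look like: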

  Alexander Coughan

Here is my code, but the output is not what I want; it just prints "None":

   readF = read_file ("a.txt")
   def jointuples(sentt, i):
        word= sentt[i][0]
        wordj = sentt[i-1][0]
        nameq = sentt[i][1]

        if nameq =='I':
           temp= ' '.join (word + wordj)
           return temp

   def join2features(sentt):
        return [jointuples(sentt, i) for i in range(len(sentt))]

   c_joint = [join2features(s) for s in readF]

   c_joint

Answer 1

Here's how I would write this:

from ast import literal_eval
from itertools import tee

def pairwise(iterable): # from itertools recipes
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

with open("a.txt") as f:
    for p0, p1 in pairwise(map(literal_eval, f)):
        if p0[1] == 'B' and p1[1] == 'I':
            # str.join takes a single iterable, so pass the two words as a tuple
            print(' '.join((p0[0], p1[0])))
            break

Here's why:

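Your file contains the Python repr of tuples of two strings. That is a really bad format, and if you can change the way you store your data, you should. But if it's too late and you have to parse it, literal_eval is the best answer.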

So, we turn each line of the file into a tuple by mapping literal_eval over the file.
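As a small illustration of that step, here is a sketch that parses a couple of lines the way the file would supply them. The `lines` list is only a stand-in for the contents of a.txt, which I am assuming has one tuple repr per line.

```python
from ast import literal_eval

# Stand-in for the lines of a.txt (an assumption for illustration).
lines = ["('Alexander', 'B')\n", "('Coughan', 'I')\n"]

# literal_eval evaluates each line as a Python literal only,
# so every string becomes a real (word, tag) tuple.
parsed = list(map(literal_eval, lines))
print(parsed)
```

Unlike eval, literal_eval refuses anything that is not a plain literal, which is why it is the safer choice for data read from a file.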

Then we use pairwise from the itertools recipes to convert the iterable of tuples into an iterable of adjacent pairs of tuples.
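To see what pairwise produces, here is a minimal sketch run on a short list of letters. The sample input is made up for illustration.

```python
from itertools import tee

def pairwise(iterable):
    # Same helper as in the code above (from the itertools recipes):
    # tee makes two independent iterators, and the second is advanced by one.
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

pairs = list(pairwise(['w', 'x', 'y', 'z']))
print(pairs)  # [('w', 'x'), ('x', 'y'), ('y', 'z')]
```

Each element appears once as the first item of a pair and once as the second, except for the two ends of the sequence.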

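So now, inside the loop, p0 and p1 are the tuples from adjacent lines, and you can write exactly what you described: if p0[1] is 'B' and it is followed by 'I' (that is, p1[1] is 'I'), join the two [0]s.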

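I'm not sure what you want to do with the joined string, so I just print it out. I'm also not sure whether you want to handle multiple values or only the first one, so I put in a break.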
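If you do want every match rather than only the first, one possible variation is to drop the break and collect the results. This is just a sketch: `joined_names` is a hypothetical helper name, and `sample` is made-up data standing in for the parsed file.

```python
from itertools import tee

def pairwise(iterable):
    # From the itertools recipes: yields adjacent pairs.
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def joined_names(tagged):
    # Hypothetical helper: yield "word word" for each 'B' tag
    # that is directly followed by an 'I' tag.
    for p0, p1 in pairwise(tagged):
        if p0[1] == 'B' and p1[1] == 'I':
            yield ' '.join((p0[0], p1[0]))

# Made-up sample data in the same (word, tag) shape as the file.
sample = [('like', 'O'), ('Alexander', 'B'), ('Coughan', 'I'),
          ('?', 'O'), ('One', 'B'), ('Two', 'I')]
names = list(joined_names(sample))
print(names)  # ['Alexander Coughan', 'One Two']
```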

Answer 2

I'll extend the input data to include more 'B' + 'I' examples.

phrases = [('Wanna', 'O'),
    ('be', 'O'),
    ('like', 'O'),
    ('Alexander', 'B'),
    ('Coughan', 'I'),
    ('One', 'B'),
    ('Two', 'I'),
    ('Three', 'B')]

length = len(phrases)
res = ['%s %s' % (phrases[i][0], phrases[i + 1][0])
    for i in range(length)
    if i < length - 1 and phrases[i][1] == 'B' and phrases[i + 1][1] == 'I']
print(res)

The result is:

['Alexander Coughan', 'One Two']

Answer 3

Here is a one-line solution:

>>> t = [ ('wanna', 'o'),
... ('be', 'o'),
... ('like', 'o'),
... ('Alexander', 'B'),
... ('Coughan', 'I'),
... ('?', 'o')]
>>> x = [B[0] for B in t if B[1]=='B'][0] + ' ' + [I[0] for I in t if I[1]=='I'][0]
>>> print(x)
Alexander Coughan
>>>

Answer 4

I hadn't seen @MykhayloKopytonenko's solution when I went to write mine, so mine is similar:

tuples = [('Wanna', 'O'),
          ('be', 'O'),
          ('like', 'O'),
          ('Alexander', 'B'),
          ('Coughan', 'I'),
          ('?', 'O'),
          ('foo', 'B'),
          ('bar', 'I'),
          ('baz', 'B'),]
results = [(t0[0], t1[0]) for t0, t1 in zip(tuples[:-1], tuples[1:])
                          if t0[1] == 'B' and t1[1] == 'I']
for r in results:
    print("%s %s" % r)

Output:

Alexander Coughan
foo bar
>>>

If you absolutely must have the results returned as strings, change the list comprehension to:

 results = ["%s %s" % (t0[0], t1[0]) for t0, t1 in zip(tuples[:-1], tuples[1:])
                                      if t0[1] == 'B' and t1[1] == 'I']

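This takes advantage of the fact that, given your criteria, the last element of your list of tuples will never be returned as the first element of a result. So zip effectively steps you through (tuples[n], tuples[n + 1]), which makes the values easy to check.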
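To make the pairing concrete, here is a short sketch of what zip over the two slices yields. The `demo` list is shortened, made-up data in the same shape as above.

```python
# Shortened sample in the same (word, tag) shape (an assumption for illustration).
demo = [('like', 'O'), ('Alexander', 'B'), ('Coughan', 'I'), ('?', 'O')]

# demo[:-1] drops the last element and demo[1:] drops the first,
# so zip lines up every element with its successor.
pairs = list(zip(demo[:-1], demo[1:]))
for t0, t1 in pairs:
    print(t0, t1)
```

A list of n tuples gives n - 1 pairs, and the last tuple only ever shows up in the second position.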

How do I use Python to join two strings from different tuples, taken from the same index?

4 answers: