Concatenate ListA elements with partially matching ListB elements

Asked: 2018-07-28 13:47:11

Tags: python string list concatenation string-matching

Say I have two Python lists:

ListA = ['Jan 2018', 'Feb 2018', 'Mar 2018']
ListB = ['Sales Jan 2018','Units sold Jan 2018','Sales Feb 2018','Units sold Feb 2018','Sales Mar 2018','Units sold Mar 2018']

I need to get the following output:

List_op = ['Jan 2018 Sales Jan 2018 Units sold Jan 2018','Feb 2018 Sales Feb 2018 Units sold Feb 2018','Mar 2018 Sales Mar 2018 Units sold Mar 2018']

My approach so far:

res=set()
for i in ListB:
    for j in ListA:
        if j in i:
            res.add(f'{i} {j}')

print (res)

This gives me:

{'Units sold Jan 2018 Jan 2018', 'Sales Feb 2018 Feb 2018', 'Units sold Mar 2018 Mar 2018', 'Units sold Feb 2018 Feb 2018', 'Sales Jan 2018 Jan 2018', 'Sales Mar 2018 Mar 2018'}

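This is definitely not the result I am looking for.

I think a regular expression might be useful here, but I am not sure how to approach it. Any help would be highly appreciated.

Thanks.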

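Edit:

The values in ListA and ListB are not necessarily in order. So for a given month/year value in ListA, the entries in ListB with the same month/year value, for both the "Sales" and the "Units sold" parts, have to be matched, selected and concatenated.

My main goal here is to get a list that I can later use to generate the statements for writing a Hive query.

Added more explanation as suggested by @andrew_reece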

3 Answers:

Answer 0: (score: 1)

Assuming ListA and ListB are sorted so that their entries line up by position:

ListA = ['Jan 2018', 'Feb 2018', 'Mar 2018']
ListB = ['Sales Jan 2018','Units sold Jan 2018','Sales Feb 2018','Units sold Feb 2018','Sales Mar 2018','Units sold Mar 2018']

print([v1 + " " + v2 for v1, v2 in zip(ListA, [v1 + " " + v2 for v1, v2 in zip(ListB[::2], ListB[1::2])])])

This prints:

['Jan 2018 Sales Jan 2018 Units sold Jan 2018', 'Feb 2018 Sales Feb 2018 Units sold Feb 2018', 'Mar 2018 Sales Mar 2018 Units sold Mar 2018']

In my example, I first join the ListB elements together in pairs, then join each ListA element with the corresponding entry of this new list.
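The same two steps can be written out separately, which may be easier to follow than the nested one-liner. This is only a sketch under the same ordering assumption; the intermediate name `pairs` is introduced here for illustration and is not part of the original answer.

```python
ListA = ['Jan 2018', 'Feb 2018', 'Mar 2018']
ListB = ['Sales Jan 2018', 'Units sold Jan 2018',
         'Sales Feb 2018', 'Units sold Feb 2018',
         'Sales Mar 2018', 'Units sold Mar 2018']

# Step 1: join each even-indexed ListB entry with the odd-indexed entry after it.
# `pairs` is a hypothetical helper name used only in this sketch.
pairs = [sales + " " + units for sales, units in zip(ListB[::2], ListB[1::2])]

# Step 2: prefix each pair with the ListA entry at the same position.
List_op = [month + " " + pair for month, pair in zip(ListA, pairs)]

print(List_op)
```

Because both steps pair items purely by position, this only gives the desired result when ListB lists each month's two entries together and in the same month order as ListA.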

Answer 1: (score: 1)

Assuming there are no other edge cases to watch out for, your original code is not bad; it just needs a small update:

List_op = []
for a in ListA:
    combined = a
    for b in ListB:
        if a in b:
            combined += " " + b
    List_op.append(combined)

List_op
['Jan 2018 Sales Jan 2018 Units sold Jan 2018',
 'Feb 2018 Sales Feb 2018 Units sold Feb 2018',
 'Mar 2018 Sales Mar 2018 Units sold Mar 2018']
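Since the substring test does not depend on position, this approach should also cope with the unordered lists mentioned in the question's edit. Below is a minimal sketch on shuffled input. The function name `join_matches` and the shuffled sample lists are made up for illustration, and sorting the matches is an added assumption: it puts "Sales" before "Units sold" only because that happens to be their alphabetical order.

```python
def join_matches(list_a, list_b):
    """Prefix each list_a entry to the list_b entries that contain it."""
    result = []
    for a in list_a:
        # Collect every list_b entry containing `a` as a substring.
        # sorted() is an assumption of this sketch: it yields "Sales ..."
        # before "Units sold ..." purely by alphabetical order.
        matches = sorted(b for b in list_b if a in b)
        result.append(" ".join([a] + matches))
    return result

# Shuffled versions of the lists from the question (hypothetical ordering).
ListA = ['Feb 2018', 'Mar 2018', 'Jan 2018']
ListB = ['Sales Mar 2018', 'Sales Jan 2018', 'Units sold Feb 2018',
         'Sales Feb 2018', 'Units sold Jan 2018', 'Units sold Mar 2018']

print(join_matches(ListA, ListB))
```

The output follows the order of ListA, so here the 'Feb 2018' entry comes first. Using `" ".join` also avoids building the string up with repeated `+=`.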

Answer 2: (score: 1)

String concatenation can become expensive. In Python 3.6+, you can use the more efficient f-strings in a list comprehension:

res = [f'{i} {j} {k}' for i, j, k in zip(ListA, ListB[::2], ListB[1::2])]

print(res)

['Jan 2018 Sales Jan 2018 Units sold Jan 2018',
 'Feb 2018 Sales Feb 2018 Units sold Feb 2018',
 'Mar 2018 Sales Mar 2018 Units sold Mar 2018']

Using itertools.islice, you can avoid the overhead of creating new lists:

from itertools import islice

zipper = zip(ListA, islice(ListB, 0, None, 2), islice(ListB, 1, None, 2))
res = [f'{i} {j} {k}' for i, j, k in zipper]
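As a quick sanity check, the two variants can be run side by side on the lists from the question. This is just a sketch; the names `sliced` and `lazy` are introduced here for illustration.

```python
from itertools import islice

ListA = ['Jan 2018', 'Feb 2018', 'Mar 2018']
ListB = ['Sales Jan 2018', 'Units sold Jan 2018',
         'Sales Feb 2018', 'Units sold Feb 2018',
         'Sales Mar 2018', 'Units sold Mar 2018']

# Variant 1: slicing copies every other element of ListB into two new lists.
sliced = [f'{i} {j} {k}' for i, j, k in zip(ListA, ListB[::2], ListB[1::2])]

# Variant 2: islice walks ListB lazily, so no intermediate lists are built.
zipper = zip(ListA, islice(ListB, 0, None, 2), islice(ListB, 1, None, 2))
lazy = [f'{i} {j} {k}' for i, j, k in zipper]

print(sliced == lazy)  # True
```

Both variants pair items by position, so like the slicing version, the islice version relies on ListB being ordered consistently with ListA.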