Split data from one cell into two using itertools combinations

Time: 2019-05-23 13:43:18

Tags: python pandas

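I have a DataFrame with a column that contains lists of elements. I want to split this column into two columns made up of combinations of the original elements.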

For example, given a frame like the one printed in the answer below, where each cell holds a list such as ['x','y','z'], the result should have one pair of elements from a list per row, split across two columns.

I have been trying to use itertools.combinations, but I don't understand how to use it to create two separate columns.
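
As a minimal sketch of what `itertools.combinations` produces (the sample list and the names `items`, `pairs`, `first`, and `second` are assumptions for illustration): each result is a tuple, so a list of pairs can be unzipped into two sequences, one per target column.

```python
from itertools import combinations

items = ['x', 'y', 'z']  # assumed sample list, for illustration only

# combinations(items, 2) yields every unordered 2-element tuple
pairs = list(combinations(items, 2))

# zip(*pairs) transposes the list of tuples into two parallel tuples
first, second = zip(*pairs)

print(pairs)
print(first, second)
```

Because every pair is already a 2-tuple, a list of such pairs can also be handed directly to a DataFrame constructor with two column names.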

1 answer:

Answer 0 (score: 1)

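First use DataFrame.stack, then a list comprehension that flattens the combinations into a single list of tuples, and pass that list to the DataFrame constructor: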

import pandas as pd
from itertools import combinations

print (df)
          dataTo       dataFrom
0  ['x','y','z']  ['a','b','c']

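# stack() flattens both columns into one Series of lists;
# combinations(x, 2) yields every 2-element tuple from each list
a = [y for x in df[['dataTo','dataFrom']].stack() for y in combinations(x, 2)]
# each tuple becomes one row with two columns
df = pd.DataFrame(a, columns=['dataTo','DataFrom'])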
print (df)

  dataTo DataFrom
0      x        y
1      x        z
2      y        z
3      a        b
4      a        c
5      b        c
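
For reference, here is a self-contained sketch of the same approach. It assumes the cells hold real Python lists rather than their string representation; the frame construction and the names `pairs` and `result` are illustrative assumptions, not part of the original answer.

```python
from itertools import combinations

import pandas as pd

# Assumed input: one row whose cells are actual Python lists
df = pd.DataFrame({'dataTo': [['x', 'y', 'z']],
                   'dataFrom': [['a', 'b', 'c']]})

# stack() turns the two columns into a single Series of lists;
# the nested comprehension flattens all 2-element combinations
# of each list into one list of tuples
pairs = [pair
         for cell in df[['dataTo', 'dataFrom']].stack()
         for pair in combinations(cell, 2)]

# Each tuple becomes one row; the column names follow the answer above
result = pd.DataFrame(pairs, columns=['dataTo', 'DataFrom'])
print(result)
```

If the cells are strings such as "['x','y','z']" instead, they would need to be parsed into lists first, for example with `ast.literal_eval`.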