我有一个类似于以下内容的数据框:
Wave A B C
340 77 70 15
341 80 73 15
342 83 76 16
343 86 78 17
我想生成将包含现有列的所有可能组合的列。我在这里显示了3个列,但在我的实际数据中,我有7列,因此共有127种组合。所需的输出如下:
Wave A B C AB AC AD BC ... ABC
340 77 70 15 147 92 ...
341 80 73 15 153 95 ...
342 83 76 16 159 99 ...
我实现了一个效率很低的版本,用户输入组合(AB,AC等),然后使用行的总和创建一个新的col。对于127种组合(尤其是具有描述性的列名),这似乎几乎是不可能的。
答案 0 :(得分:2)
您需要首先获取所有combination
,然后才获取combination
,然后需要创建地图dict
或Series
l=df.columns[1:].tolist()
l1=[list(map(list, itertools.combinations(l, i))) for i in range(len(l) + 1)]
d=[dict.fromkeys(y,''.join(y))for x in l1 for y in x ]
maps=pd.Series(d).apply(pd.Series).stack()
df.set_index('Wave',inplace=True)
df=df.reindex(columns=maps.index.get_level_values(1))
#here using reindex , get the order of your new df to the maps keys
df.columns=maps.tolist()
# here assign the new value to the column , since the order is same that why here I am assign it back
df.sum(level=0,axis=1)
Out[303]:
A B C AB AC BC ABC
Wave
340 77 70 15 147 92 85 162
341 80 73 15 153 95 88 168
342 83 76 16 159 99 92 175
343 86 78 17 164 103 95 181
答案 1 :(得分:2)
使用itertools中的$csv = Import-Csv C:\test2.csv
$Daysback = "-1"
$destination = "C:\Users\mark\Documents\Test2"
$loglocation = "C:\Users\mark\Documents\filetransferlog.csv"
$s = "SUCCESS"
$f = "FAIL"
$CurrentDate = Get-Date
$DatetoDelete = $CurrentDate.Date.AddDays($DaysBack)
$Log = foreach ($line in $csv) {
$objects = Get-ChildItem $line.FullName -Rec |
Where-Object LastWriteTime -lt $DatetoDelete
foreach ($object in $objects) {
$Result = $s
$sourceRoot = $object.FullName
try {
Copy-Item -Path $sourceRoot -Recurse -Destination $destination
Remove-Item -Path $sourceRoot -Recurse -Force
} catch {
$Result = $f
}
[PSCustomObject]@{
'Success/Fail' = $Result
Source = $sourceRoot
Destination = $destination
}
}
}
$Log | Export-Csv $loglocation -NoTypeInformation
+ chain
创建所有组合的列表,然后对相应的列求和:
combinations