我有一个如下所示的数据框:
from to cc extra columns
-------------------------------------------
1 2 3 sth
1|1 4 sth
3 1|2 4|5 sth
我想要一个新的数据框,为每个管道分隔值创建一个新行,如下所示:
from to cc extra columns
--------------------------------------------
1 2 3 sth
1 4 sth
1 4 sth
3 1 4 sth
3 2 4 sth
3 1 5 sth
3 2 5 sth
有人可以帮帮我吗?
谢谢!
答案 0 :(得分:2)
一个不优雅但有效的解决方案:
.data
enter: .asciiz "Please enter your integer:\n"
binaryI: .asciiz "\nHere is the input in binary: "
hexI: .asciiz "\n\nHere is the input in hexadecimal: "
binaryO: .asciiz "\n\nHere is the output in binary: "
hexO: .asciiz "\n\nHere is the output in hexadecimal: "
.text
prompt:
li $v0, 4
la $a0, enter
syscall
li $v0, 5
syscall
add $s2, $0, $v0
li $v0, 4
la $a0, binaryI
syscall
li $v0, 35
move $a0, $s2
syscall
li $v0, 4
la $a0, hexI
syscall
li $v0, 34
move $a0, $s2
syscall
addi $t0, $0, 7
srl $s0, $s2, 12
and $s0, $s0, $t0
li $v0, 4
la $a0, hexO
syscall
li $v0, 35
move $a0, $s0
syscall
li $v0, 4
la $a0, binaryO
syscall
li $v0, 34
move $a0, $s0
syscall
li $v0, 1
add $a0, $0, $s0
syscall
li $v0, 10
syscall
答案 1 :(得分:1)
numpy
split
。concatenate
目标列,而repeat
位于阵列的其余部分def explode(v, i, sep='|'):
v = v.astype(str)
n, m = v.shape
a = v[:, i]
bslc = np.r_[0:i, i+1:m]
asrt = np.append(i, bslc).argsort()
b = v[:, bslc]
a = np.core.defchararray.split(a, sep)
A = np.concatenate(a)[:, None]
counts = [len(x) for x in a.tolist()]
rpt = np.arange(n).repeat(counts)
return np.concatenate([A, b[rpt]], axis=1)[:, asrt]
pd.DataFrame(
explode(explode(explode(df.values, 0), 1), 2),
columns=df.columns
)
from to cc extra_columns
0 1 2 3 sth
1 1 4 sth
2 1 4 sth
3 3 1 4 sth
4 3 1 5 sth
5 3 2 4 sth
6 3 2 5 sth
答案 2 :(得分:0)
最简单的方法是移出pandas进行列表操作,然后将新数据移回pandas数据帧。在这里,我们使用itertools中的product
来创建所有组合,然后将每一行拆分成一堆列表。
请注意,我使用字母而不是数字作为数据。
import pandas as pd
from itertools import product
df = pd.DataFrame([['a', 'b', 'c', 'xtra'],
['a|a','d', '', 'xtra'],
['c', 'a|b','d|e','xtra']],
columns=['From','To','CC','extra_cols'])
split_data = []
for row in df.values.tolist():
split_data.extend(list(product(*[item.split('|') for item in row])))
new_df = pd.DataFrame(split_data, columns=['From','To','CC','extra_cols'])
> new_df
# From To CC extra_cols
# 0 a b c xtra
# 1 a d xtra
# 2 a d xtra
# 3 c a d xtra
# 4 c a e xtra
# 5 c b d xtra
# 6 c b e xtra