My Google-fu has failed me!
I have a pandas DataFrame.
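It basically contains the nodes of a graph, where the levels describe outgoing edges from a lower level to a higher level. I want to convert this DataFrame, or create a new one from it, in the form of an adjacency matrix. The input looks like this: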
Level 1 Level 2 Level 3 Level 4
-------------------------------------
A B C NaN
A B D E
A B D F
G H NaN NaN
G I J K
The desired output is:
    A B C D E F G H I J K
---------------------------------------------
A | 0 1 0 0 0 0 0 0 0 0 0
B | 0 0 1 1 0 0 0 0 0 0 0
C | 0 0 0 0 0 0 0 0 0 0 0
D | 0 0 0 0 1 1 0 0 0 0 0
E | 0 0 0 0 0 0 0 0 0 0 0
F | 0 0 0 0 0 0 0 0 0 0 0
G | 0 0 0 0 0 0 0 1 1 0 0
H | 0 0 0 0 0 0 0 0 0 0 0
I | 0 0 0 0 0 0 0 0 0 1 0
J | 0 0 0 0 0 0 0 0 0 0 1
K | 0 0 0 0 0 0 0 0 0 0 0
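A cell containing 1 depicts an outgoing edge from the corresponding row to the corresponding column. Is there a Pythonic way to achieve this in pandas without loops and conditionals?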
Answer 0: (score: 2)
Try this code:
import numpy as np
import pandas as pd

df = pd.DataFrame({'level_1': ['A', 'A', 'A', 'G', 'G'],
                   'level_2': ['B', 'B', 'B', 'H', 'I'],
                   'level_3': ['C', 'D', 'D', np.nan, 'J'],
                   'level_4': [np.nan, 'E', 'F', np.nan, 'K']})
Your input dataframe is:
level_1 level_2 level_3 level_4
0 A B C NaN
1 A B D E
2 A B D F
3 G H NaN NaN
4 G I J K
The solution is:
# Collect the unique node labels from every column, skipping NaN (NaN != NaN)
list_nodes = []
for i_col in df.columns.tolist():
    list_nodes.extend(filter(lambda v: v == v, df[i_col].unique().tolist()))
# A label may appear in more than one column, so deduplicate before sorting
nodes = sorted(set(list_nodes))
# Initialize the result dataframe with zeros
df_res = pd.DataFrame(0, index=nodes, columns=nodes)
# Get (row, column) pairs from adjacent level columns (rows with NaN are excluded)
list_indexes = []
for i_col in range(df.shape[1] - 1):
    pairs = df.iloc[:, i_col:i_col + 2].dropna(axis=0).values.tolist()
    list_indexes.extend(set(tuple(pair) for pair in pairs))
# Use the (row, column) pairs to fill the result dataframe
for row_label, col_label in list_indexes:
    # DataFrame.set_value was removed in pandas 1.0; .at is its replacement
    df_res.at[row_label, col_label] = 1
The final result is:
A B C D E F G H I J K
A 0 1 0 0 0 0 0 0 0 0 0
B 0 0 1 1 0 0 0 0 0 0 0
C 0 0 0 0 0 0 0 0 0 0 0
D 0 0 0 0 1 1 0 0 0 0 0
E 0 0 0 0 0 0 0 0 0 0 0
F 0 0 0 0 0 0 0 0 0 0 0
G 0 0 0 0 0 0 0 1 1 0 0
H 0 0 0 0 0 0 0 0 0 0 0
I 0 0 0 0 0 0 0 0 0 1 0
J 0 0 0 0 0 0 0 0 0 0 1
K 0 0 0 0 0 0 0 0 0 0 0
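Since the question asks for a way without explicit loops, here is a minimal sketch of one alternative, not part of the original answer. It assumes that every pair of horizontally adjacent, non-NaN cells is an edge, and it swaps the fill loop for pd.crosstab followed by reindex. The names values, edges, nodes and adjacency are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'level_1': ['A', 'A', 'A', 'G', 'G'],
    'level_2': ['B', 'B', 'B', 'H', 'I'],
    'level_3': ['C', 'D', 'D', np.nan, 'J'],
    'level_4': [np.nan, 'E', 'F', np.nan, 'K'],
})

values = df.to_numpy()

# Pair each cell with its right-hand neighbour to get (source, target) edges.
# Pairs touching NaN are dropped; repeated edges are collapsed to one.
edges = pd.DataFrame({
    'source': values[:, :-1].ravel(),
    'target': values[:, 1:].ravel(),
}).dropna().drop_duplicates()

# Every non-NaN label is a node, including leaves with no outgoing edge.
flat = values.ravel()
nodes = sorted(set(flat[pd.notna(flat)]))

# crosstab counts each (source, target) pair; reindex adds the nodes that
# never occur as a source or as a target and fills their cells with 0.
adjacency = (
    pd.crosstab(edges['source'], edges['target'])
    .reindex(index=nodes, columns=nodes, fill_value=0)
)

print(adjacency)
```

The reindex step matters because crosstab only produces rows for labels that occur as a source and columns for labels that occur as a target, so leaf nodes such as C or K would otherwise be missing from the rows.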