Matrix headers

Time: 2018-10-17 12:57:54

Tags: python pandas numpy

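In the following dataset I need to add column and row headers, so that I can see, for example, how employee '12' moved from employer 'a' to employer 'b'. Here is my dataset: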

import numpy as np
import pandas as pd

employerEmployeeEdges = [(12, 'a'), (15, 'a'), (17, 'a'), (12, 'a'), (15, 'a'), (23, 'b'), (12, 'b'), (18, 'b'), (12, 'b'), (12, 'b'), (15, 'a'), (12, 'a'), (15, 'a'), (15, 'a'), (24, 'c'), (12, 'c')]

employerEmployeeEdges=np.array(employerEmployeeEdges)
#print(employerEmployeeEdges)

# column 1 holds the employer labels, so this array contains 'a', 'b', 'c'
unique_employee = np.unique(employerEmployeeEdges[:,1])
n_unique = len(unique_employee)
#print(unique_employee)


Q = np.zeros([n_unique,n_unique])

for n, employer_employee in enumerate(employerEmployeeEdges):
    #print(employer_employee)
    # copy the array so the original stays intact
    eee = np.copy(employerEmployeeEdges)
    # overwrite the current edge with a placeholder to avoid comparing it with itself
    eee[n] = (None,None)
    # index of the current edge's employer (the row of Q)
    employee_index = np.where(employer_employee[1] == unique_employee)
    # indexes of the other edges that share the current edge's employee ID
    eq_index = np.where(eee[:,0] == employer_employee[0])[0]
    eq_employee = eee[eq_index,1]
    # increment Q at (current employer, matching employer)
    for emp in eq_employee:
        #print(np.unique(emp))
        emp_index = np.where(unique_employee == emp)
        #print(emp)
        Q[employee_index,emp_index]+= 1
        #df = pd.DataFrame(Q, columns=emp, index=emp)

print(Q) 

[[26.  9.  3.]
 [ 9.  6.  3.]
 [ 3.  3.  0.]]
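As a cross-check on the loop above, the same matrix can be derived from a contingency table. This is only a sketch, not the method used in the question: the names `edges_df`, `counts` and `Q_alt` are made up for illustration, and it assumes the intent is to count, for each pair of employers, the ordered pairs of distinct edges that share an employee.

```python
import numpy as np
import pandas as pd

edges = [(12, 'a'), (15, 'a'), (17, 'a'), (12, 'a'), (15, 'a'), (23, 'b'),
         (12, 'b'), (18, 'b'), (12, 'b'), (12, 'b'), (15, 'a'), (12, 'a'),
         (15, 'a'), (15, 'a'), (24, 'c'), (12, 'c')]

# Hypothetical column names, chosen for this sketch only.
edges_df = pd.DataFrame(edges, columns=['employee', 'employer'])

# counts.loc[e, x] = number of edges linking employee e to employer x
counts = pd.crosstab(edges_df['employee'], edges_df['employer'])

C = counts.to_numpy()
# C.T @ C pairs every edge with every edge of the same employee, itself included;
# subtracting the per-employer edge totals on the diagonal removes those self-pairs.
Q_alt = C.T @ C - np.diag(C.sum(axis=0))

# The employer labels come along for free as the crosstab's column index.
labeled = pd.DataFrame(Q_alt.astype(float), index=counts.columns, columns=counts.columns)
print(labeled)
```

Because `pd.crosstab` keeps the employer labels as its column index, the row and column headers never get separated from the data.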

I would like to add column and row headers to this matrix.

This is what I have done so far:

for index, row in enumerate(Q):
    if index < len(Q)-1:
        # end='' keeps the row label on the same line as the values (Python 3)
        print('{}\t'.format(index + 1), end='')
    else:
        print(' \t', end='')
    print('|'.join('{0:.2f}'.format(x) for x in row))

1   26.00|9.00|3.00
2   9.00|6.00|3.00
    3.00|3.00|0.00

For some reason I cannot add column or row headers to this array. What do I need to do? The array should look like this (my desired output):

       a    b    c
a   26.00|9.00|3.00
b   9.00|6.00|3.00
c   3.00|3.00|0.00

With Andrew's help, this is the solution:

df = pd.DataFrame(Q)
df.index = unique_employee
df.columns = unique_employee
print(df)
      a    b    c
a  26.0  9.0  3.0
b   9.0  6.0  3.0
c   3.0  3.0  0.0
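If the pipe-delimited, two-decimal layout from the desired output matters, it can also be produced without pandas. The following is a minimal sketch: `format_labeled` and `labels` are hypothetical names, the values of `Q` are copied from the printed matrix, and the column width of 6 is an arbitrary choice.

```python
import numpy as np

# Values copied from the printed Q; labels stand in for unique_employee.
Q = np.array([[26., 9., 3.],
              [9., 6., 3.],
              [3., 3., 0.]])
labels = ['a', 'b', 'c']

def format_labeled(matrix, labels, width=6):
    """Return the matrix as text with a header row and a label before each row."""
    # Four spaces leave room for the row-label column; headers are right-aligned
    # so they sit over the right-aligned numbers.
    header = ' ' * 4 + ' '.join(label.rjust(width) for label in labels)
    rows = [
        '{:<4}'.format(label) + '|'.join('{:{w}.2f}'.format(x, w=width) for x in row)
        for label, row in zip(labels, matrix)
    ]
    return '\n'.join([header] + rows)

print(format_labeled(Q, labels))
```

Building the text row by row from `zip(labels, matrix)` keeps each label attached to its own row, which avoids the off-by-one in the numbered attempt above.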

1 answer:

Answer 0 (score: 0)

You can use pandas and set index (the row labels) and columns (the column labels) to match your unique_employee array.

import pandas as pd 

print(Q) 
[[26.  9.  3.]
 [ 9.  6.  3.]
 [ 3.  3.  0.]]

df = pd.DataFrame(Q)
df.index = unique_employee
df.columns = unique_employee
print(df)
      a    b    c
a  26.0  9.0  3.0
b   9.0  6.0  3.0
c   3.0  3.0  0.0
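The labels can also be passed straight to the constructor, and the labeled frame then supports lookups by employer name. A short sketch, with `Q` and `unique_employee` rebuilt from the values printed above so that it runs on its own:

```python
import numpy as np
import pandas as pd

# Rebuilt from the printed output so the example is self-contained.
Q = np.array([[26., 9., 3.],
              [9., 6., 3.],
              [3., 3., 0.]])
unique_employee = np.array(['a', 'b', 'c'])

# index= and columns= set the row and column labels in one step.
df = pd.DataFrame(Q, index=unique_employee, columns=unique_employee)

# Label-based lookup: the cell in row 'a', column 'b'.
print(df.loc['a', 'b'])

# float_format takes a one-argument function; here it forces two decimals.
print(df.to_string(float_format='{:.2f}'.format))
```

Setting the labels in the constructor is equivalent to assigning `df.index` and `df.columns` afterwards; it just avoids the intermediate unlabeled frame.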