Renumber nodes in increasing order in a pandas DataFrame

Time: 2017-12-04 14:18:32

Tags: python pandas dataframe

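I want to renumber node1 and node2 in data1 in increasing order. The node values are 2, 3, 6, 7, 28, so they should become 1, 2, 3, 4, 5 respectively.

The data frame is built like this: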

import numpy as np
import pandas as pd

data1 = {'node1': [2, 2, 3, 6],
         'node2': [6, 7, 7, 28],
         'weight': [1, 2, 1, 1]}
df1 = pd.DataFrame(data1, columns=['node1', 'node2', 'weight'])

Before renumbering, the data looks like this:

(image not available)

After renumbering, it looks like this:

(image not available)

3 Answers:

Answer 0 (score: 7)

Factorize the flattened, sorted values, then reshape and assign back, i.e.

df1[['node1','node2']] = (pd.factorize(np.sort(df1[['node1','node2']].values.reshape(-1)))[0]+1).reshape(-1,len(df1)).T

   node1  node2  weight
0      1      3       1
1      1      4       2
2      2      4       1
3      3      5       1
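
Note that the one-liner above sorts the flattened values before factorizing, so the codes line up with the original cells only when the values, read down node1 and then down node2, are already in nondecreasing order, as they are here. A minimal sketch of a variant that does not rely on that ordering, assuming pd.factorize with sort=True is acceptable (the names cols, flat, codes and uniques are illustrative only):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'node1': [2, 2, 3, 6],
                    'node2': [6, 7, 7, 28],
                    'weight': [1, 2, 1, 1]})

cols = ['node1', 'node2']

# Flatten row by row; positions are kept, nothing is sorted in place.
flat = df1[cols].to_numpy().ravel()

# sort=True makes the codes follow the sorted order of the unique values.
codes, uniques = pd.factorize(flat, sort=True)

# Codes start at 0, so add 1 and restore the original (rows, 2) shape.
df1[cols] = (codes + 1).reshape(df1[cols].shape)
```

Because the values are never reordered, each code lands back in the cell its value came from, whatever the order of the rows.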

Another way, using melt and factorize, then renaming with a dict:

vals = pd.factorize(df1[['node1','node2']].melt().sort_values('value')['value'])

to_rename = dict(zip(vals[1],np.unique(vals[0]+1)))
# {2: 1, 3: 2, 6: 3, 7: 4, 28: 5}

df1[['node1','node2']] = df1[['node1','node2']].apply(lambda x : x.map(to_rename))
# Also df1[['node1','node2']] = df1[['node1','node2']].replace(to_rename) Thanks @jezrael

   node1  node2  weight
0      1      3       1
1      1      4       2
2      2      4       1
3      3      5       1
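
map and replace give the same result here because every node value is a key of the dict. They differ for values that are missing from the mapping, which the following sketch tries to show (the value 99 is a made-up node id used only for illustration):

```python
import pandas as pd

to_rename = {2: 1, 3: 2, 6: 3, 7: 4, 28: 5}

# 99 is a hypothetical node id that is absent from the mapping.
s = pd.Series([2, 6, 99])

mapped = s.map(to_rename)        # unmapped value becomes NaN (dtype turns float)
replaced = s.replace(to_rename)  # unmapped value is left unchanged
```

So replace is probably the safer choice if some values might not appear in the dict, while map makes such gaps visible as NaN.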

Answer 1 (score: 5)

Use stack, then rank, then unstack:

df2 = (df1.set_index('weight', append=True)
          .stack()
          .rank(method='dense')
          .astype(int)
          .unstack()
          .reset_index(level=1))
print (df2)
   weight  node1  node2
0       1      1      3
1       2      1      4
2       1      2      4
3       1      3      5
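
The method='dense' argument is what makes this work: tied values share a rank and the next distinct value gets the next integer, with no gaps. A small sketch on the stacked node values (the variable names are illustrative; method='min' is included only for contrast):

```python
import pandas as pd

# Node values in the order produced by stacking node1/node2 row by row.
s = pd.Series([2, 6, 2, 7, 3, 7, 6, 28])

# Dense ranking: ties share a rank, ranks increase by 1 per distinct value.
dense = s.rank(method='dense').astype(int)

# 'min' ranking, for comparison: ties share a rank, but gaps appear after them.
gapped = s.rank(method='min').astype(int)
```

Here dense yields the labels 1 to 5, whereas gapped skips numbers after each tie and so would not give consecutive node ids.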

Answer 2 (score: 1)

Or we can use replace :-)

ary=np.concatenate(df1.iloc[:,:2].values)
mapdf=pd.DataFrame({'data':pd.Series(ary).astype('category').cat.codes.add(1),'maper':ary}).set_index('maper')
df1[['node1','node2']]=df1[['node1','node2']].replace(mapdf.data.to_dict())

df1
Out[1631]: 
   node1  node2  weight
0      1      3       1
1      1      4       2
2      2      4       1
3      3      5       1
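
This relies on the fact that, when pandas infers the categories of sortable values, it stores them in sorted order, so the category codes already follow increasing node value. A minimal sketch of just that step (variable names are illustrative):

```python
import pandas as pd

# Flattened node values, row by row.
s = pd.Series([2, 6, 2, 7, 3, 7, 6, 28]).astype('category')

# Inferred categories are the sorted unique values.
categories = list(s.cat.categories)

# Codes are 0-based positions into the categories, so add 1.
codes = s.cat.codes.add(1)
```

The codes can then be paired with the original values to build the replacement dict, as the answer does with mapdf.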