How do I transform values in a pandas DataFrame?

Time: 2016-04-02 10:35:43

Tags: python pandas

I have data:

data = [
    (1, 'Shirt', 2), (1, 'Pants', 3),
    (2, 'Top', 2), (2, 'Shirt', 1), (2, 'T-Shirt', 4),
    (3, 'Shirt', 3), (3, 'T-Shirt', 2),
    (4, 'Top', 3), (4, 'Pants', 3), (4, 'T-Shirt', 3),
]

I convert it with pandas:

df = pd.DataFrame(data, columns=['unique_id', 'category_product', 'count'])

and df is:

   unique_id category_product  count
0         11            Shirt      2
1         11            Pants      3
2         24              Top      2
3         24            Shirt      1
4         24          T-Shirt      4
5         36            Shirt      3
6         36          T-Shirt      2
7         48              Top      3
8         48            Pants      3
9         48          T-Shirt      3

But I need to renumber unique_id so that it starts from 0 and increases in the order in which the values are seen.

How can I do that?

1 answer:

Answer 0: (score: 1)

There may be a simpler way, but here is one:

df.unique_id = (df.unique_id.diff() != 0).cumsum() - 1

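Basically, it compares each row's unique_id with the previous row's, and whenever the difference is != 0 the output value increases by 1. The first row has nothing to diff against, so its difference is NaN; NaN != 0 evaluates to True, which makes the cumulative sum start at 1. The final -1 compensates for that so the numbering starts at 0. Note that this relies on rows with the same unique_id being adjacent.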
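The one-liner above can be unpacked into a short, runnable sketch. It uses the sample data from the question; the intermediate names `changed`, `by_diff` and `codes` are only illustrative. As a possible alternative (an assumption, not part of the original answer), `pd.factorize` also numbers values in order of first appearance and does not require equal ids to sit next to each other.

```python
import pandas as pd

data = [
    (1, 'Shirt', 2), (1, 'Pants', 3),
    (2, 'Top', 2), (2, 'Shirt', 1), (2, 'T-Shirt', 4),
    (3, 'Shirt', 3), (3, 'T-Shirt', 2),
    (4, 'Top', 3), (4, 'Pants', 3), (4, 'T-Shirt', 3),
]
df = pd.DataFrame(data, columns=['unique_id', 'category_product', 'count'])

# Step 1: flag rows whose id differs from the previous row.
# The first row's diff is NaN, and NaN != 0 is True.
changed = df['unique_id'].diff() != 0

# Step 2: a running count of the flags gives 1, 1, 2, ...; subtract 1 to start at 0.
by_diff = changed.cumsum() - 1

# Alternative sketch: factorize assigns 0, 1, 2, ... in order of first appearance.
codes, _ = pd.factorize(df['unique_id'])

df['unique_id'] = by_diff
print(df)
```

For this data both approaches give the same numbering; they would differ only if the same id reappeared later after a different id.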