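Pandas: create a new column using values from other columns, selected based on a column value

Time: 2020-04-20 04:06:32

Tags: python pandas dataframe pivot

I have a data frame that looks like this example. For some reason, the raw data contains replicated values.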

   Node Node 1 Value Node 2 Value Node 3 Value
0     1            A            B            C
1     2            A            B            C
2     3            A            B            C

I want to convert it into the following form:

   Node Value
0     1     A
1     2     B
2     3     C

My iterative code works as expected, but it is very slow on my data (48 nodes, ~20,000 values).

I feel there must be a faster way, perhaps using apply, but I can't figure it out.

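1 answer:

Answer 0: (score: 2)

Use DataFrame.lookup, then DataFrame.assign. (DataFrame.lookup was deprecated in pandas 1.2.0 and removed in 2.0.0, so the lookup-based snippets in this answer need an older pandas.)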

a = df.lookup(df.index, "Node " + df.Node.astype(str) + " Value")

df = df[['Node']].assign(Value = a)
print (df)
   Node Value
0     1     A
1     2     B
2     3     C
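
On pandas versions where DataFrame.lookup is unavailable, the same per-row selection can be sketched with NumPy integer indexing. This is only an illustration: the frame is rebuilt from the question's example, the names labels and col_pos are made up for the sketch, and it assumes every built label exists as a column (get_indexer returns -1 for a missing label, which would silently pick the last column).

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's example.
df = pd.DataFrame({
    "Node": [1, 2, 3],
    "Node 1 Value": ["A", "A", "A"],
    "Node 2 Value": ["B", "B", "B"],
    "Node 3 Value": ["C", "C", "C"],
})

# Column label each row should read from, e.g. "Node 2 Value".
labels = "Node " + df["Node"].astype(str) + " Value"

# Translate the labels into integer column positions.
# Assumption: every label is present in df.columns (no -1 entries).
col_pos = df.columns.get_indexer(labels)

# Pick exactly one cell per row: (row 0, col_pos[0]), (row 1, col_pos[1]), ...
values = df.to_numpy()[np.arange(len(df)), col_pos]

result = df[["Node"]].assign(Value=values)
print(result)
```

Because the selection is a single vectorized indexing step, it avoids a Python-level loop over rows.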

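EDIT: If some of the target columns are missing, you can use numpy.setdiff1d to find the missing column names, build a dictionary that maps them to a default value such as np.nan, and add those columns to the DataFrame before calling lookup: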

print (df)
   Node Node 1 Value Node 2 Value Node 3 Value
0     1            A            B            C
1     2            A            B            C
3     5            A            B            C

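import numpy as np

# Column label each row should read from, e.g. "Node 5 Value"
s = "Node " + df.Node.astype(str) + " Value"
# Labels that have no matching column are mapped to a default of NaN
new = dict.fromkeys(np.setdiff1d(s, df.columns), np.nan)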
print (new)
{'Node 5 Value': nan}

print (df.assign(**new))
   Node Node 1 Value Node 2 Value Node 3 Value  Node 5 Value
0     1            A            B            C           NaN
1     2            A            B            C           NaN
3     5            A            B            C           NaN

a = df.assign(**new).lookup(df.index, s)
print (a)
['A' 'B' nan]

df = df[['Node']].assign(Value = a)
print (df)
   Node Value
0     1     A
1     2     B
3     5   NaN
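
The missing-column case can also be handled without lookup. The sketch below assumes a factorize-plus-reindex pattern, where reindex creates an all-NaN column for any label that does not exist. The frame is rebuilt from the example above, and the names codes, uniques and wide are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the example: Node 5 has no "Node 5 Value" column.
df = pd.DataFrame(
    {
        "Node": [1, 2, 5],
        "Node 1 Value": ["A", "A", "A"],
        "Node 2 Value": ["B", "B", "B"],
        "Node 3 Value": ["C", "C", "C"],
    },
    index=[0, 1, 3],
)

labels = "Node " + df["Node"].astype(str) + " Value"

# factorize assigns each row an integer code and returns the unique labels.
codes, uniques = pd.factorize(labels)

# reindex keeps only the requested columns, in the order of `uniques`,
# and fills columns that do not exist in df with NaN.
wide = df.reindex(columns=uniques).to_numpy()

# Pick one cell per row using the codes as column positions.
values = wide[np.arange(len(df)), codes]

result = df[["Node"]].assign(Value=values)
print(result)
```

Since the codes index into the reindexed columns rather than into df.columns, no separate step is needed to add placeholder columns first.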

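Another idea, based on the definition of lookup:

def f(row, col):
    # Return the cell at (row, col), or NaN when the column does not exist
    try:
        return df.at[row, col]
    except KeyError:
        return np.nan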

s = "Node " + df.Node.astype(str) + " Value"
a = [f(row, col) for row, col in zip(df.index, s)]

df = df[['Node']].assign(Value = a)
print (df)
   Node Value
0     1     A
1     2     B
3     5   NaN

And a solution with DataFrame.melt:

s = "Node " + df.Node.astype(str) + " Value"
b = (df.assign(Node = s)
        .reset_index()
        .melt(['index','Node'], value_name='Value')
        .query('Node == variable').set_index('index')['Value'])


df = df[['Node']].join(b)
print (df)
   Node Value
0     1     A
1     2     B
3     5   NaN
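
To make the melt approach easier to follow, here is a self-contained sketch that splits the chain into steps. The frame is rebuilt from the example above, and the names long_df and matched are made up for the illustration.

```python
import pandas as pd

# Sample frame rebuilt from the example: Node 5 has no "Node 5 Value" column.
df = pd.DataFrame(
    {
        "Node": [1, 2, 5],
        "Node 1 Value": ["A", "A", "A"],
        "Node 2 Value": ["B", "B", "B"],
        "Node 3 Value": ["C", "C", "C"],
    },
    index=[0, 1, 3],
)

labels = "Node " + df["Node"].astype(str) + " Value"

# Long format: one row per (original row, value column) pair.
# `variable` holds the source column name, `Node` the label we want.
long_df = (
    df.assign(Node=labels)
      .reset_index()
      .melt(id_vars=["index", "Node"], value_name="Value")
)

# Keep only the rows where the source column is the wanted one.
matched = long_df.loc[long_df["Node"] == long_df["variable"]]
matched = matched.set_index("index")["Value"]

# Left join: rows without a match (here index 3) end up with NaN.
result = df[["Node"]].join(matched)
print(result)
```

The long frame has rows times value-columns entries, so for wide inputs this route uses more memory than positional indexing.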