我在pandas数据框中有以下表格:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>AquaticSettingName</th>
<th>Definition</th>
<th>d_level_id</th>
<th>Concat_Code</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>Marine</td>
<td>The Marine System is defined by salinity, which is typically about 35, although salinity can mea...</td>
<td>1</td>
<td>M</td>
</tr>
<tr>
<th>1</th>
<td>Estuarine</td>
<td>The Estuarine System is defined by salinity and geomorphology. This System includes tidally infl...</td>
<td>1</td>
<td>E</td>
</tr>
<tr>
<th>2</th>
<td>Lacustrine</td>
<td>The CMECS Lacustrine System includes (a) all deepwater areas of the Great Lakes and (b) shorelin...</td>
<td>1</td>
<td>L</td>
</tr>
<tr>
<th>3</th>
<td>Marine Nearshore</td>
<td>The Marine Nearshore Subsystem extends from the landward limit of the Marine System to the 30 me...</td>
<td>2</td>
<td>M.1</td>
</tr>
<tr>
<th>4</th>
<td>Marine Offshore</td>
<td>The Marine Offshore Subsystem extends from the 30 meter depth contour to the continental shelf b...</td>
<td>2</td>
<td>M.2</td>
</tr>
<tr>
<th>5</th>
<td>Marine Oceanic</td>
<td>The Marine Oceanic Subsystem represents the open ocean, extending from the continental shelf bre...</td>
<td>2</td>
<td>M.3</td>
</tr>
<tr>
<th>6</th>
<td>Estuarine Coastal</td>
<td>The Estuarine Coastal Subsystem extends from the supratidal zone at the land margin up to the 4 ...</td>
<td>2</td>
<td>E.1</td>
</tr>
<tr>
<th>7</th>
<td>Estuarine Open Water</td>
<td>The Estuarine Open Water Subsystem includes all waters of the Estuarine System with a total dept...</td>
<td>2</td>
<td>E.2</td>
</tr>
<tr>
<th>8</th>
<td>Estuarine Tidal Riverine Coastal</td>
<td>The Estuarine Tidal Riverine Coastal Subsystem includes the most upstream region of the estuary,...</td>
<td>2</td>
<td>E.3</td>
</tr>
<tr>
<th>9</th>
<td>Estuarine Tidal Riverine Open Water</td>
<td>The Estuarine Tidal Riverine Open Water Subsystem includes tidal freshwater areas with a salinit...</td>
<td>2</td>
<td>E.4</td>
</tr>
<tr>
<th>10</th>
<td>Lacustrine Limnetic</td>
<td>The Limnetic Subsystem includes all deepwater habitats within the Lacustrine System. "Deepwater ...</td>
<td>2</td>
<td>L.1</td>
</tr>
<tr>
<th>11</th>
<td>Lacustrine Littoral</td>
<td>The Littoral Subsystem includes shallow habitats in the Lacustrine System. The shoreward boundar...</td>
<td>2</td>
<td>L.2</td>
</tr>
<tr>
<th>12</th>
<td>Marine Nearshore Subtidal</td>
<td>The substrate is generally continuously submerged in this zone and includes those areas below MLLW.</td>
<td>3</td>
<td>M.1.a</td>
</tr>
<tr>
<th>13</th>
<td>Marine Nearshore Intertidal</td>
<td>The substrate is regularly and periodically exposed and flooded by tidal action. This zone exten...</td>
<td>3</td>
<td>M.1.b</td>
</tr>
<tr>
<th>14</th>
<td>Marine Nearshore Supratidal</td>
<td>This zone includes areas above MHHW that are affected by wave splash and overwash but does not i...</td>
<td>3</td>
<td>M.1.c</td>
</tr>
<tr>
<th>15</th>
<td>Marine Offshore Subtidal</td>
<td>The substrate is subtidal and continuously submerged in this zone and includes those areas below...</td>
<td>3</td>
<td>M.2.a</td>
</tr>
<tr>
<th>16</th>
<td>Marine Oceanic Subtidal</td>
<td>The substrate is subtidal and continuously submerged in this zone.</td>
<td>3</td>
<td>M.3.a</td>
</tr>
<tr>
<th>17</th>
<td>Estuarine Coastal Subtidal</td>
<td>The substrate is generally continuously submerged in this zone and includes those areas below MLLW.</td>
<td>3</td>
<td>E.1.a</td>
</tr>
<tr>
<th>18</th>
<td>Estuarine Coastal Intertidal</td>
<td>The substrate in this zone is regularly and periodically exposed and flooded by tides. This zone...</td>
<td>3</td>
<td>E.1.b</td>
</tr>
<tr>
<th>19</th>
<td>Estuarine Coastal Supratidal</td>
<td>This zone includes areas above MHHW; areas in this zone are affected by wave splash and overwash...</td>
<td>3</td>
<td>E.1.c</td>
</tr>
<tr>
<th>20</th>
<td>Estuarine Open Water Subtidal</td>
<td>The substrate is generally continuously submerged in this zone and includes those areas below MLLW.</td>
<td>3</td>
<td>E.2.a</td>
</tr>
<tr>
<th>21</th>
<td>Estuarine Tidal Riverine Coastal Subtidal</td>
<td>The substrate is generally continuously submerged in this zone and includes those areas below MLLW.</td>
<td>3</td>
<td>E.3.a</td>
</tr>
<tr>
<th>22</th>
<td>Estuarine Tidal Riverine Coastal Intertidal</td>
<td>The substrate in this zone is regularly and periodically exposed and flooded by tides. This zone...</td>
<td>3</td>
<td>E.3.b</td>
</tr>
<tr>
<th>23</th>
<td>Estuarine Tidal Riverine Open Water Subtidal</td>
<td>The substrate is generally continuously submerged in this zone and includes those areas below MLLW.</td>
<td>3</td>
<td>E.4.a</td>
</tr>
</tbody>
</table>
&#13;
我需要使用最后一列来重新格式化表格:
<table>
<tr>
<th>System</th>
<th>SubSystem</th>
<th>Tidal Zone</th>
</tr>
<tr>
<td>Marine</td>
<td>Marine Nearshore</td>
<td>Marine Nearshore Subtidal</td>
</tr>
<tr>
<td> </td>
<td> </td>
<td>Marine Nearshore Intertidal</td>
</tr>
<tr>
<td> </td>
<td> </td>
<td>Marine Nearshore Supratidal</td>
</tr>
<tr>
<td>Estuarine</td>
<td>Estuarine Coastal</td>
<td>Estuarine Coastal Subtidal</td>
</tr>
<tr>
<td> </td>
<td> </td>
<td>Estuarine Coastal Intertidal</td>
</tr>
<tr>
<td> </td>
<td> </td>
<td>Estuarine Coastal Supratidal</td>
</tr>
<tr>
<td> </td>
<td>Estuarine Open Water</td>
<td>Estuarine Open Water Subtidal</td>
</tr>
<tr>
<td> </td>
<td>Estuarine Tidal Riverine Coastal</td>
<td>Estuarine Tidal Riverine Coastal Subtidal</td>
</tr>
<tr>
<td> </td>
<td> </td>
<td>Estuarine Tidal Riverine Coastal Intertidal</td>
</tr>
<tr>
<td> </td>
<td>Estuarine Tidal Riverine Open Water</td>
<td>Estuarine Tidal Riverine Open Water Subtidal</td>
</tr>
<tr>
<td>Lacustrine</td>
<td>Lacustrine Littoral</td>
<td></td>
</tr>
<tr>
<td> </td>
<td>Lacustrine Limnetic</td>
<td>
</td>
</tr>
</table>
&#13;
可以使用Concat_Code列中包含的信息生成所需的输出。
我的方法是使用this notebook中描述的Split
和Groupby
(注意:在那个笔记本中,我使用了一个hackish解决方案解决了我遇到的问题here 1}}我将其发布在单独的问题{{3}}
从这种方法,我得到一个新的字典,看起来像:
groupby
我使用上面的字典构建我的pandas数据框,其中第一个键引用第一个组,第二个键引用第二个组,依此类推......在嵌套for循环中我{'E': {'1': ['a', 'b', 'c'],
'2': ['a'],
'3': ['a', 'b'],
'4': ['a']},
'L': {'1': ['x'],
'2': ['x']},
'M': {'1': ['a', 'b', 'c'],
'2': ['a'],
'3': ['a']}}
{{ 1}}我用来从第一个表中提取记录。 recreate
每个第一级组获得一个增量。:
Concat_Column
返回desidered输出。
是否有更好,更简单,更健壮的方法来实现相同的结果?