Question

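After using groupby on one column of a 9-column dataset, how do I access individual cells in a print statement? All the examples I have found only show IPython output. I need to format the data for other software. The code I've included does not work.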

800200.2986 3859676.9971 WELL01 IZHA 107.000 10100.0000 6483.3506 6483.0552 -6376.0552 NIN
  800200.2986 3859676.9971 WELL01 KEY B 107.000 10100.0000 6664.8892 6664.5864 -6557.5864 NIN
  800200.2986 3859676.9971 WELL01 SIMS 107.000 10100.0000 2120.7332 2120.7112 -2013.7112 NIN2
  800200.2986 3859676.9971 WELL01 BOT0 107.000 10100.0000 8426.7568 8426.3613 -8319.3613 NIN
  800200.2986 3859676.9971 WELL01 BOT0-2A 107.000 10100.0000 8476.9834 8476.5830 -8369.5830 NIN

wls= pd.read_fwf(cmdl.datafilename,skiprows=10,colspecs= colwidths,names=colnames)
tpgrp = wls.groupby('Top')
tpgrpdict= dict(list(tpgrp))
tp0= tpgrp.get_group('BOT0')
#print tp0,tp0.count()
print tp0[['X','Y','TVDSS','Well']]
for t in tp0:
    for i in range(len(t)):
        print t[i],tp0[tp0[]['X']

Answer 1

tpgrp = wls.groupby('Top')
tpgrpdict= dict(list(tpgrp))
tp0= tpgrp.get_group('BOT0')

The dataframe you posted has no column named 'Top' and no row named 'BOT0'.

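groupby() returns a DataFrameGroupBy object. It has a groups attribute, which is a dictionary: each key is a group name, and each value is an array holding the row labels that belong to that group. You can step through the keys and turn each group name into a dataframe with get_group().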
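As an alternative to stepping through the keys, a groupby object can also be iterated directly, which yields (group name, dataframe) pairs. A minimal sketch, assuming a small made-up dataframe; the names df and sizes are only illustrative:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    'A': ['x', 'y', 'x'],
    'C': [1.0, 2.0, 3.0],
})

sizes = {}
for name, group in df.groupby('A'):
    # name is the group key; group is a dataframe holding that group's rows
    sizes[name] = len(group)
    print(name, list(group.index))
```

Each group keeps the row labels of the original dataframe, which matters when choosing between label-based and position-based access.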

You can then use .loc[] to access an individual cell, for example:

df.loc['row_name', 'column_name']

Alternatively, you can use .iloc[], for example:

df.iloc[0, -1]    #row 0, last cell in row
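The difference between the two shows up on a group, because a group keeps its original row labels rather than being renumbered from 0. A small sketch under that assumption, using made-up data; .at is the scalar-only counterpart of .loc:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    'A': ['x', 'y', 'x', 'y'],
    'C': [10.0, 20.0, 30.0, 40.0],
})

group = df.groupby('A').get_group('y')  # rows with index labels 1 and 3

by_label = group.loc[1, 'C']      # row labeled 1, column 'C'
by_position = group.iloc[0, -1]   # first row of the group, last column
scalar = group.at[3, 'C']         # .at fetches a single cell by label

print(by_label, by_position, scalar)
```

Here group.loc[0, 'C'] would raise a KeyError, since no row in this group carries the label 0.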

Here is an example:

import pandas as pd
import numpy as np

df = pd.DataFrame(
{
    'A' : ['x', 'y', 'x', 'y', 'x', 'y', 'x', 'x'],
    'B' : ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C' : np.random.randn(8), 
    'D' : np.random.randn(8)
}
)

print(df)

       A      B         C         D
0  x    one -0.495361  0.349394
1  y    one -0.650411 -0.015773
2  x    two  1.249771  0.688563
3  y  three  0.538023 -1.171365
4  x    two -0.251789 -1.919394
5  y    two -1.308169  1.140029
6  x    one -0.404656  0.196439
7  x  three  1.318467 -0.294299

group_by_obj = df.groupby('A')
group_dict = group_by_obj.groups

print(group_dict)

{'y': [1L, 3L, 5L], 'x': [0L, 2L, 4L, 6L, 7L]}  # each list holds the row labels of that group

for group_name in group_dict.keys():
    new_df = group_by_obj.get_group(group_name)
    print(new_df)
    print('.' * 20)

    print(new_df.loc[:, 'A':'C'])  # all rows (:), columns A through C
    print(new_df.iloc[0, -1])  # first row of the group, last column
    print('*' * 20)

   A      B         C         D
1  y    one -0.650411 -0.015773
3  y  three  0.538023 -1.171365
5  y    two -1.308169  1.140029
....................
   A      B         C
1  y    one -0.650411
3  y  three  0.538023
5  y    two -1.308169
-0.0157729776716
********************
   A      B         C         D
0  x    one -0.495361  0.349394
2  x    two  1.249771  0.688563
4  x    two -0.251789 -1.919394
6  x    one -0.404656  0.196439
7  x  three  1.318467 -0.294299
....................
   A      B         C
0  x    one -0.495361
2  x    two  1.249771
4  x    two -0.251789
6  x    one -0.404656
7  x  three  1.318467
0.349393627206
********************
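To get from there to formatted lines for other software, one option is to walk the rows of a group with itertuples() and format each cell yourself. This is only a sketch: the values come from the sample in the question, but which value belongs to which column is an assumption, and the field widths in the format string are made up.

```python
import pandas as pd

# Two rows based on the question's sample; the column-to-value mapping is assumed
wls = pd.DataFrame({
    'X': [800200.2986, 800200.2986],
    'Y': [3859676.9971, 3859676.9971],
    'Well': ['WELL01', 'WELL01'],
    'Top': ['BOT0', 'IZHA'],
    'TVDSS': [-8319.3613, -6376.0552],
})

tp0 = wls.groupby('Top').get_group('BOT0')

lines = []
for row in tp0.itertuples(index=False):
    # each cell is available as an attribute named after its column
    lines.append('%-8s %12.4f %13.4f %11.4f' % (row.Well, row.X, row.Y, row.TVDSS))

for line in lines:
    print(line)
```

Collecting the strings in a list first makes it easy to write them to a file instead of printing them.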

Accessing individual cells of a pandas dataframe after groupby using print