Use the row number as the index when reading a CSV

Posted: 2019-04-05 15:08:45

Tags: python python-3.x pandas

I have a CSV file that looks like this:

30,60,14.3,53.6,0.71,403,0
30,60,15.3,54.9,0.72,403,0
30,60,16.5,56.2,0.73,403,0
30,60,17.9,57.5,0.74,403,0

There is no header row, only data. The columns are:

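import io

import numpy as np
import pandas as pd

# Column names mapped to the dtype each column should be parsed as
colNames = {
        'doa_in1': np.float64, 'doa_in2': np.float64,
        'doa_est1': np.float64, 'doa_est2': np.float64,
        'rho': np.float64,
        'seed': np.int32, 'matl_chan':np.int32
        }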

I read the CSV as follows:

tmp_df = pd.read_csv(
                    io.BytesIO(tmp_csv), encoding='utf8',
                    header=None,
                    names=colNames.keys(), dtype=colNames,
                    converters={
                                'matl_chan': lambda x: bool(int(x))
                               }
                    )

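This raises a warning, because I am specifying two possible conversions for matl_chan (a dtype and a converter). It is only a warning, though, and it states that pandas will use only what is given in converters (i.e. the lambda function).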
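One way to avoid the warning, sketched here on the assumption that it comes from matl_chan appearing in both dtype and converters, is to leave that column out of the dtype mapping and let the converter handle it alone. The names sample_csv, col_names and dtypes are illustrative and not from the original code; sample_csv stands in for tmp_csv.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for tmp_csv: the four sample rows from the question, as bytes.
sample_csv = (
    b"30,60,14.3,53.6,0.71,403,0\n"
    b"30,60,15.3,54.9,0.72,403,0\n"
    b"30,60,16.5,56.2,0.73,403,0\n"
    b"30,60,17.9,57.5,0.74,403,0\n"
)

col_names = {
    'doa_in1': np.float64, 'doa_in2': np.float64,
    'doa_est1': np.float64, 'doa_est2': np.float64,
    'rho': np.float64,
    'seed': np.int32, 'matl_chan': np.int32,
}

# Pass a dtype only for the columns that do not have a converter.
dtypes = {name: dtype for name, dtype in col_names.items()
          if name != 'matl_chan'}

df = pd.read_csv(
    io.BytesIO(sample_csv), encoding='utf8',
    header=None,
    names=list(col_names),   # all seven names, in order
    dtype=dtypes,            # six columns typed explicitly
    converters={'matl_chan': lambda x: bool(int(x))},
)
```

With seven names and seven fields per line, as in the sample shown, pandas assigns its default 0-based row index here.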

I would like every row to have a number, that is, a unique index.

The reason is that I then process tmp_df with this function:

def remove_lines(df):
    THRES = 50
    THRES_angle = 10  # degrees
    is_converging = True
    for idx, row in df.iterrows():
        if idx == 0:
            is_converging = False
        # check if MUSIC started converging
        if abs(row['doa_est1']-row['doa_in1']) < THRES_angle:
            if abs(row['doa_est2']-row['doa_in2']) < THRES_angle:
                is_converging = True
        # calc error
        err = abs(row['doa_est1']- row['doa_in1'])+abs(row['doa_est2']-row['doa_in2'])
        if err > THRES and is_converging:
            df=df.drop(idx) 
    return df

However, every row ends up with index 30, so the function does not remove anything and instead fails with this error:

KeyError: '[30] not found in axis'

The full stack trace is:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-143-b61c0402f9d7> in <module>
----> 1 df=get_dataframe()

<ipython-input-121-b76aab8b17ee> in get_dataframe()
     24                 continue
     25 
---> 26             tmp_df_sanitized = remove_lines(tmp_df)
     27             all_dataframes.append(tmp_df_sanitized)
     28 

<ipython-input-142-31019390251a> in remove_lines(df)
     62         err = abs(row['doa_est1']-row['doa_in1'])+abs(row['doa_est2']-row['doa_in2'])
     63         if err > THRES and is_converging:
---> 64             df=df.drop(idx)
     65             print("dropped {}".format(idx))
     66     return df

/usr/lib/python3.7/site-packages/pandas/core/frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3938                                            index=index, columns=columns,
   3939                                            level=level, inplace=inplace,
-> 3940                                            errors=errors)
   3941 
   3942     @rewrite_axis_style_signature('mapper', [('copy', True),

/usr/lib/python3.7/site-packages/pandas/core/generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3778         for axis, labels in axes.items():
   3779             if labels is not None:
-> 3780                 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
   3781 
   3782         if inplace:

/usr/lib/python3.7/site-packages/pandas/core/generic.py in _drop_axis(self, labels, axis, level, errors)
   3810                 new_axis = axis.drop(labels, level=level, errors=errors)
   3811             else:
-> 3812                 new_axis = axis.drop(labels, errors=errors)
   3813             result = self.reindex(**{axis_name: new_axis})
   3814 

/usr/lib/python3.7/site-packages/pandas/core/indexes/base.py in drop(self, labels, errors)
   4962         if mask.any():
   4963             if errors != 'ignore':
-> 4964                 raise KeyError(
   4965                     '{} not found in axis'.format(labels[mask]))
   4966             indexer = indexer[~mask]

KeyError: '[30] not found in axis'
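The error is consistent with how DataFrame.drop handles duplicate labels: dropping a label removes every row that carries it, so a second attempt to drop the same label finds nothing left. In remove_lines, iterrows keeps walking the original frame while df is rebound to the shrinking copy, which would produce exactly this on the second matching row. A minimal sketch with a hypothetical toy frame (toy, after_first and second_drop_failed are illustrative names):

```python
import pandas as pd

# Three rows that all share the index label 30, as in the question.
toy = pd.DataFrame({'err': [70, 80, 90]}, index=[30, 30, 30])

# Dropping the label once removes every row labelled 30.
after_first = toy.drop(30)

# Dropping it again raises KeyError, because the label is gone.
try:
    after_first.drop(30)
    second_drop_failed = False
except KeyError:
    second_drop_failed = True
```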

Can anyone help me solve this?

Edit: to be clearer, I want the row index of the four rows shown above to be [0, 1, 2, 3].
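A possible way to get that index, sketched under an assumption: pandas uses the first column as the row index when a line contains more fields than there are names, so an index of 30 on every row suggests the real file has an extra field per line (for example a trailing delimiter). The input raw below is hypothetical and adds a trailing comma to the sample rows to reproduce that situation; the other variable names are illustrative as well.

```python
import io

import pandas as pd

names = ['doa_in1', 'doa_in2', 'doa_est1', 'doa_est2',
         'rho', 'seed', 'matl_chan']

# Hypothetical input: the sample rows with a trailing delimiter,
# which gives eight fields per line against seven names.
raw = (
    "30,60,14.3,53.6,0.71,403,0,\n"
    "30,60,15.3,54.9,0.72,403,0,\n"
    "30,60,16.5,56.2,0.73,403,0,\n"
    "30,60,17.9,57.5,0.74,403,0,\n"
)

# Default behaviour: the first column is taken as the index,
# and the remaining columns shift left under the given names.
implicit = pd.read_csv(io.StringIO(raw), header=None, names=names)

# Option 1: index_col=False tells pandas not to use the first
# column as the index, so the rows get the default 0..n-1 index.
fixed = pd.read_csv(io.StringIO(raw), header=None, names=names,
                    index_col=False)

# Option 2: renumber an existing frame. drop=True discards the old
# index values, so this does not undo the column shift above.
renumbered = implicit.reset_index(drop=True)
```

If the columns are already aligned and only the labels are wrong, reset_index(drop=True) is enough. If the first column was absorbed into the index, index_col=False at read time looks like the safer choice, since it also keeps doa_in1 in place.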

0 answers:

No answers yet