Pandas Dataframe-根据条件获取索引值

时间:2018-08-03 12:21:21

标签: python python-3.x pandas numpy dataframe

我有一个名为data.txt的文本文件,其中包含表格数据,如下所示:

                        PERIOD
CHANNELS    1      2      3      4       5 
0         1.51   1.61   1.94   2.13   1.95 
5         1.76   1.91   2.29   2.54   2.38 
6         2.02   2.22   2.64   2.96   2.81 
7         2.27   2.52   2.99   3.37   3.24 
8         2.53   2.83   3.35   3.79   3.67 
9         2.78   3.13   3.70   4.21   4.09 
10        3.04   3.44   4.05   4.63   4.53

CHANNELS列中的是仪器的通道号,其他5列中是该特定通道分别在周期1、2、3、4和5中可以检测到的最大能量。

我想编写一个python代码,获取用户输入:周期,较低能量和较高能量,然后给出给定时间段对应于较低能量和较高能量的通道号。

例如:

Enter the period:
>>1
Enter the Lower energy:
>1.0
Enter the Higher energy:
>2.0
#Output
The lower energy channel is 0
The higher energy channel is 6

这是我到目前为止写的:

import numpy as np
import pandas as pd

period = int(input('Enter the period: '))
lower_energy = float(input('Enter the lower energy value: '))
higher_energy = float(input('Enter the higher energy value: '))
row_names = [0, 5, 6, 7, 8, 9, 10]
column_names = [1, 2, 3, 4, 5] 
data_list = []
with open('data.txt') as f:
lines = f.readlines()[2:]
for line in lines:
    arr = [float(num) for num in line.split()[1:]]
    data_list.append(arr)
df = pd.DataFrame(data_list, columns=column_names, index=row_names)
print (df, '\n')
print (df[period])

帮我添加到此。

2 个答案:

答案 0 :(得分:3)

您可以添加以下代码:

根据条件检索索引。假设不断增加通道。

lower_channel_energy = df[df[period]>lower_energy].index[0]
high_channel_energy =  df[(df[period]<higher_energy).shift(-1)==False].index[0]

打印我们计算出的通道:

print("The lower energy channel is {}".format(lower_channel_energy))
print("The higher energy channel is {}".format(high_channel_energy))

此解决方案假定在下降的通道上能量正在增加。

答案 1 :(得分:1)

实际上,您可以直接使用Pandas读取文件以简化程序。 我可以通过以下方式重现您期望的输出:

import pandas as pd

df = pd.read_csv('data.txt', engine='python' header=1,sep=r'\s{2,}')

period = input('Enter the period: ')
lower_energy = float(input('Enter the lower energy value: '))
higher_energy = float(input('Enter the higher energy value: '))

# select the channels within the ranges provided
lo_e_range = (df[period] > lower_energy)
hi_e_range = (df[period] > higher_energy)

# Indices of the lower and higher energy channels
lec = df[period][lo_e_range].index[0]
hec = df[period][hi_e_range].index[0]

print('The lower energy channel is {}'.format(df['CHANNELS'][lec]))
print('The higher energy channel is {}'.format(df['CHANNELS'][hec]))

我已修改代码以考虑您的评论。