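I have a file with columns describing particular parameters:
    magnitude luminosity
I only need specific data from this file (particular rows and columns). So far I have Python code in which I have collected the line numbers I need. I just need to know how to match them against the text file to get the correct lines, along with the values in the magnitude and luminosity columns. Any suggestions on how to approach this?
Here is a sample of my code (the # comments describe what I have done and what I want to do):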
    import re

    temp_ListMatch = (point[5]).strip()
    if temp_ListMatch:
        ListMatchaddress = (point[5]).strip()
        ListMatchaddress = re.sub(r'\s', '_', ListMatchaddress)
        ListMatch_dirname = '/projects/XRB_Web/apmanuel/499/Lists/' + ListMatchaddress
        #print ListMatch_dirname+"\n"
        try:
            file5 = open(ListMatch_dirname, 'r')
        except IOError:
            print('Cannot open: ' + ListMatch_dirname)
        Optparline = []
        for line in file5:
            point5 = line.split()
            j = int(point5[1])
            Optparline.append(j)
        # Basically file5 contains the line numbers I need,
        # and I have appended these numbers to the list Optparline.
    temp_others = (point[4]).strip()
    if temp_others:
        othersaddress = (point[4]).strip()
        othersaddress = re.sub(r'\s', '_', othersaddress)
        othersbase_dirname = '/projects/XRB_Web/apmanuel/499/Lists/' + othersaddress
        try:
            file6 = open(othersbase_dirname, 'r')
        except IOError:
            print('Cannot open: ' + othersbase_dirname)
        gmag = []
        z = []
        rh = []
        gz = []
        for line in file6:
            point6 = line.split()
            f = float(point6[2])
            g = float(point6[4])
            h = float(point6[6])
            i = float(point6[9])
        # So now I have opened file6, where this list of data is, and have
        # identified the columns of elements that I need.
        # I only need the particular rows (given by line number)
        # with these elements chosen. That is where I'm stuck!
Answer 0 (score: 0)
Load the whole data file into a pandas DataFrame (this assumes the data file has a header row from which the column names can be taken):
    import pandas as pd
    df = pd.read_csv('/path/to/file')
Load the row-number file into a pandas Series (this assumes one number per line and no header row):
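    # header=None keeps the first number from being read as a column name;
    # squeeze('columns') turns the single-column DataFrame into a Series
    # (the squeeze=True keyword of read_csv was removed in pandas 2.0)
    row_numbers = pd.read_csv('/path/to/rows_file', header=None).squeeze('columns')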
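Return only the rows listed in the row-number file, and only the magnitude and luminosity columns (this assumes the first data row is numbered 0):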
    # .ix was removed in pandas 1.0; with the default integer index, .loc selects the same rows
    relevant_rows = df.loc[row_numbers, ['magnitude', 'luminosity']]
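If the data file is whitespace-delimited and has no header row, as the `line.split()` calls in the question suggest, the same selection can be sketched without pandas. The sketch below is only an illustration under stated assumptions: the helper names `read_line_numbers` and `select_rows` are hypothetical, the line numbers are assumed to count from 1, and the sample lines are made-up stand-ins for the two files. The column positions 2, 4, 6 and 9 are taken from the question's code.

```python
def read_line_numbers(lines, field=1):
    """Collect the integer found in column `field` of each line."""
    wanted = set()
    for line in lines:
        parts = line.split()
        if len(parts) > field:  # skip blank or short lines
            wanted.add(int(parts[field]))
    return wanted


def select_rows(lines, wanted, columns, first_line=1):
    """Return the chosen columns, as floats, for the wanted line numbers."""
    selected = []
    for number, line in enumerate(lines, start=first_line):
        if number not in wanted:
            continue
        parts = line.split()
        selected.append([float(parts[c]) for c in columns])
    return selected


# Made-up in-memory stand-ins for the list file and the data file.
list_lines = ["a 1", "b 3"]
data_lines = [
    "id 0 10.5 x 1.0 x 2.0 x x 3.0",
    "id 0 11.5 x 1.5 x 2.5 x x 3.5",
    "id 0 12.5 x 1.7 x 2.7 x x 3.7",
]

wanted = read_line_numbers(list_lines)
rows = select_rows(data_lines, wanted, columns=(2, 4, 6, 9))
print(rows)
```

Both helpers accept any iterable of lines, so an open file object can be passed in place of the sample lists. If the line numbers in the list file count from 0 rather than 1, pass `first_line=0` to `select_rows`.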