Question

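I have a Python script that captures log data and converts it into a 2D array.

The next part of the script is meant to loop through a .csv file, evaluate the first column of each row, and determine whether that value is equal to or falls between the values in the 2D array. If it is, the last column should be marked TRUE. If not, it should be marked FALSE.

For example, if my 2D array looks like this: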

[[1542053213, 1542053300], [1542055000, 1542060105]]

and my csv file looks like this:

1542053220, Foo, Foo, Foo
1542060110, Foo, Foo, Foo

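then the last column of the first row should read TRUE (or 1), while the last column of the second row should read FALSE (or 0).

My current code is as follows: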

from os.path import expanduser
import re
import csv
import codecs

#Setting variables
#Specifically, set the file path to the reveal log
filepath = expanduser('~/LogAutomation/programlog.txt')
csv_filepath = expanduser('~/LogAutomation/values.csv')
tempStart = ''
tempEnd = ''

print("Starting Script")

#open the log
with open(filepath) as myFile:
    #read the log
    all_logs = myFile.read()
myFile.close()

#Create regular expressions
starting_regex = re.compile(r'\[(\d+)\s+s\]\s+Starting\s+Program')
ending_regex = re.compile(r'\[(\d+)\s+s\]\s+Ending\s+Program\.\s+Stopping')

#Create arrays of start and end times
start_times = list(map(int, starting_regex.findall(all_logs)))
end_times = list(map(int, ending_regex.findall(all_logs)))

#Create 2d Array
timeArray = list(map(list, zip(start_times, end_times)))

#Print 2d Array
print(timeArray)

print("Completed timeArray construction")

#prints the csv file
with open(csv_filepath, 'rb') as csvfile:
    reader = csv.reader(codecs.iterdecode(csvfile, 'utf-8'))

    for row in reader:
        currVal = row[0]
            #if currVal is equal to or in one of the units in timeArray, mark last column as true
            #else, mark last column as false

csvfile.close()

print("Script completed")

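I have been able to iterate through the .csv and get the first-column value of each row, but I don't know how to do the comparison. Unfortunately, I'm not familiar with the 2D array data structure and can't work out how to check whether a value falls between two others. In addition, the number of columns in my .csv file may fluctuate, so does anyone know a non-static way to determine the "last column", so that I can write to the column after it?

Can anyone give me some help?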

Answer 1

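You just need to iterate over the list of lists and check whether the value falls inside any of the intervals. Here is a simple way to do it: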

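with open(csv_filepath, 'rb') as csvfile:
    reader = csv.reader(codecs.iterdecode(csvfile, 'utf-8'))
    # read every row into memory before the file is reopened for writing
    input_rows = [row for row in reader]

with open(csv_filepath, 'w') as outputfile:
    writer = csv.writer(outputfile)
    for row in input_rows:
        currVal = int(row[0])
        ok = 'FALSE'
        for interval in timeArray:
            # inclusive check against [start, end]
            if interval[0] <= currVal <= interval[1]:
                ok = 'TRUE'
                break
        # append the flag after whatever the last column is
        writer.writerow(row + [ok])

The code above writes the results back to the same file, so be careful. I also removed the csvfile.close() call, because when you use the with statement the file is closed for you automatically.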
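The interval check can also be tried on its own, without touching any files. This is a minimal sketch using the sample values from the question; the function name in_any_interval and the variable time_array are made up for illustration.

```python
# Sample intervals from the question: each inner list is [start, end].
time_array = [[1542053213, 1542053300], [1542055000, 1542060105]]


def in_any_interval(value, intervals):
    """Return True if value lies inside any [start, end] pair (inclusive)."""
    return any(start <= value <= end for start, end in intervals)


# The two first-column values from the sample csv rows.
first = in_any_interval(1542053220, time_array)
second = in_any_interval(1542060110, time_array)
print(first, second)
```

With the question's data, the first value falls inside the first interval and the second value is just past the end of the second interval.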

Answer 2

I would go for something more Pythonic.

compare = lambda x, y, t: (x <= int(t) <= y)
with open('output.csv', 'w') as outputfile:
    writer = csv.writer(outputfile)
    with open(csv_filepath, 'rb') as csvfile:
        reader = csv.reader(codecs.iterdecode(csvfile, 'utf-8'))

        # the loop must stay inside the with block so the input file is still open
        for row in reader:
            currVal = row[0]
            #if currVal is equal to or in one of the units in timeArray, mark last column as true
            #else, mark last column as false
            match = any(compare(x, y, currVal) for x, y in timeArray)
            writer.writerow(row + ['TRUE' if match else 'FALSE'])
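On the "last column" part of the question: appending one element to the row list puts the flag after whatever the last column happens to be, so the number of columns does not need to be known in advance. A small sketch with in-memory data, assuming the sample values from the question; flag_rows, time_array, and the shortened second row are made up for illustration.

```python
import csv
import io

# Sample intervals from the question: each inner list is [start, end].
time_array = [[1542053213, 1542053300], [1542055000, 1542060105]]


def flag_rows(rows, intervals):
    """Yield each row with 'TRUE' or 'FALSE' appended after its last column."""
    for row in rows:
        value = int(row[0])
        match = any(start <= value <= end for start, end in intervals)
        yield row + ['TRUE' if match else 'FALSE']


# Two rows with different column counts (the second is shortened on purpose).
sample = "1542053220, Foo, Foo, Foo\n1542060110, Foo\n"
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
flagged = list(flag_rows(reader, time_array))

for row in flagged:
    print(row)
```

Because the flag is added with row + [...], each output row is exactly one column longer than its input row, whatever that length was.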

Find whether a value is equal to or between the values in a 2D array
