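How do I make sure each unique row is printed only once in Python:
I am connecting to an Oracle database and retrieving some records. The same record, with exactly the same timestamp, value, and so on, can come back from the database two or more times. I come from R, where I could call unique on a data frame to get this result.
How can I make sure each unique record is printed only once in Python? Here is my code: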
import pyodbc
import re
sql = "select DateTime, Server, Server_Type, Metric, Value from oracle_table"
cnxn = pyodbc.connect("DSN=dsn1;UID=userid;PWD=passwd123")
cursor = cnxn.cursor()
cursor.execute(sql)
row = cursor.fetchall()
for line in row:
    if line[4]:
        if float(line[4]) >= 0:
            print line[1]+"."+re.sub(r'\W+', '', re.sub(r'\%', 'Percent', line[3])),line[0].strftime('%s'), ("%.6f" % float(line[4])), "host="+re.sub(r"\..*$","",line[1]), "type="+line[2],"source=Oracle","dc=DC1"
The output is:
server1.CRITICAL_INCIDENTS 1418223897 0.000000 host=server1 type=oracle_database source=Oracle dc=DC1
server1.ResponseTimepertransaction 1418223577 2.467900 host=server1 type=oracle_database source=Oracle dc=DC1
server1.DataDictionaryHitPercent 1418223577 100.000000 host=server1 type=oracle_database source=Oracle dc=DC1
server1.FullIndexScanspersecond 1418223577 0.000000 host=server1 type=oracle_database source=Oracle dc=DC1
server1.ExecutesPerformedwithoutParsesPercent 1418223577 66.666667 host=server1 type=oracle_database source=Oracle dc=DC1
server1.SortsinMemoryPercent 1418223577 100.000000 host=server1 type=oracle_database source=Oracle dc=DC1
server1.BufferCacheHitPercent 1418223577 100.000000 host=server1 type=oracle_database source=Oracle dc=DC1
server1.DatabaseCPUTimePercent 1418223577 81.048665 host=server1 type=oracle_database source=Oracle dc=DC1
server1.CRITICAL_INCIDENTS 1418223897 0.000000 host=server1 type=oracle_database source=Oracle dc=DC1
server1.CRITICAL_INCIDENTS 1418223897 0.000000 host=server1 type=oracle_database source=Oracle dc=DC1
server1.ResponseTimepertransaction 1418223577 2.467900 host=server1 type=oracle_database source=Oracle dc=DC1
Answer 0: (score: 2)
# Python 2.7
seenAlready = set()
for line in row:
    if line[4]:
        if float(line[4]) >= 0:
            outputLine = ... # Whatever you do to construct the output line
            if outputLine not in seenAlready:
                print outputLine
                seenAlready.add(outputLine)
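As a self-contained illustration of the same idea, here is a minimal sketch written for Python 3 (so `print` is a function rather than a statement). The helper name `unique_lines` is made up for this example, and the sample lines are copied from the question's output instead of coming from a database.

```python
def unique_lines(lines):
    """Yield each line the first time it appears, preserving input order."""
    seen = set()
    for text in lines:
        if text not in seen:
            seen.add(text)
            yield text


# Sample lines copied from the question's output (the last two are repeats).
sample = [
    "server1.CRITICAL_INCIDENTS 1418223897 0.000000 host=server1 type=oracle_database source=Oracle dc=DC1",
    "server1.ResponseTimepertransaction 1418223577 2.467900 host=server1 type=oracle_database source=Oracle dc=DC1",
    "server1.CRITICAL_INCIDENTS 1418223897 0.000000 host=server1 type=oracle_database source=Oracle dc=DC1",
    "server1.ResponseTimepertransaction 1418223577 2.467900 host=server1 type=oracle_database source=Oracle dc=DC1",
]

deduplicated = list(unique_lines(sample))
for text in deduplicated:
    print(text)
```

Because membership tests on a set take roughly constant time, this stays fast even when the query returns many rows.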
Answer 1: (score: 1)
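ll = []
for line in row:
    if line[4]:
        if float(line[4]) >= 0:
            # list.append takes exactly one argument, so join the fields into a single string first
            ll.append(" ".join([line[1]+"."+re.sub(r'\W+', '', re.sub(r'\%', 'Percent', line[3])), line[0].strftime('%s'), "%.6f" % float(line[4]), "host="+re.sub(r"\..*$","",line[1]), "type="+line[2], "source=Oracle", "dc=DC1"]))
# Print each distinct line once
for outputLine in set(ll):
    print outputLine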
You can use a set here.
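One caveat with a set is that it does not remember insertion order, so the lines may come out in a different order than the query returned them. If order matters, a dict can stand in for the set, since dict keys are unique and keep insertion order in Python 3.7 and later. A minimal sketch, assuming Python 3 and using made-up sample strings in place of the formatted lines collected in `ll`:

```python
# Hypothetical sample standing in for the formatted lines collected in ll.
ll = [
    "server1.CRITICAL_INCIDENTS 1418223897 0.000000",
    "server1.ResponseTimepertransaction 1418223577 2.467900",
    "server1.CRITICAL_INCIDENTS 1418223897 0.000000",
]

# dict keys are unique and keep insertion order (Python 3.7+), so this
# drops repeats while keeping the first occurrence of each line.
unique_ll = list(dict.fromkeys(ll))

for text in unique_ll:
    print(text)
```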
Answer 2: (score: 0)
Use select distinct in the SQL query.
import pyodbc
import re
sql = """select distinct DateTime, Server, Server_Type, Metric, Value
from oracle_table where Value >= 0"""
cnxn = pyodbc.connect("DSN=dsn1;UID=userid;PWD=passwd123")
cursor = cnxn.cursor()
cursor.execute(sql)
row = cursor.fetchall()
for line in row:
    # Backslashes continue the Python 2 print statement across lines
    print line[1]+"."+re.sub(r'\W+', '', re.sub(r'\%', 'Percent', line[3])), \
        line[0].strftime('%s'), ("%.6f" % float(line[4])), \
        "host="+re.sub(r"\..*$","",line[1]), \
        "type="+line[2],"source=Oracle","dc=DC1"
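To see what select distinct does without an Oracle connection, here is a rough sketch that uses Python 3's built-in sqlite3 module as a stand-in. The in-memory database, the column types, and the sample rows (including the metric names) are assumptions made for this illustration; only the table and column names follow the query above.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the Oracle connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table oracle_table "
    "(DateTime integer, Server text, Server_Type text, Metric text, Value real)"
)
conn.executemany(
    "insert into oracle_table values (?, ?, ?, ?, ?)",
    [
        (1418223897, "server1", "oracle_database", "CRITICAL_INCIDENTS", 0.0),
        (1418223577, "server1", "oracle_database", "Response Time per transaction", 2.4679),
        # Exact duplicate of the first row: select distinct collapses it.
        (1418223897, "server1", "oracle_database", "CRITICAL_INCIDENTS", 0.0),
        # Negative value: removed by the where clause.
        (1418223577, "server1", "oracle_database", "Bad Metric", -1.0),
    ],
)

sql = """select distinct DateTime, Server, Server_Type, Metric, Value
from oracle_table where Value >= 0"""
rows = conn.execute(sql).fetchall()

for r in rows:
    print(r)
conn.close()
```

Letting the database drop duplicates means fewer rows travel over the connection, and the Python loop no longer needs its own filtering. Note that distinct compares the selected columns as stored, so two rows that differ in the database but format to the same output line would still both be printed.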