I have some data stored in a database like this:
Table name: faults
+------------+-------+
| fault_type | total |
+------------+-------+
| 1          | 1     |
| 2          | 3     |
| 3          | 8     |
| 4          | 2     |
.............................
How can I get a histogram starting from this table?
Answer 0: (score: 2)
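The solution below assumes you have MySQL, Python, and GNUPlot. Tweak the specifics as needed. I am posting it so that it can serve as a baseline for others.
Step 1: Decide on the type of graph.
If it is some kind of frequency plot, a simple SQL query should do the trick: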
select total, count(total) from faults GROUP BY total;
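If you need to specify bin sizes, proceed to the next step.
Step 2: Make sure you can connect to MySQL from Python. You can use the MySQLdb module for this.
After that, the Python code that generates the data for the histogram is as follows (it was written in literally 5 minutes, so it is very crude):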
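To make the frequency query concrete, here is a minimal sketch that runs it against an in-memory SQLite database standing in for MySQL. The helper name `frequency_of_totals` and the extra sample row `(5, 3)` are assumptions added for illustration; the first four rows are the ones shown in the question.

```python
import sqlite3


def frequency_of_totals(rows):
    """Return (total, count) pairs: how many fault types share each total."""
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
    conn.execute("CREATE TABLE faults (fault_type INTEGER, total INTEGER)")
    conn.executemany("INSERT INTO faults VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT total, COUNT(total) FROM faults GROUP BY total ORDER BY total"
    ).fetchall()
    conn.close()
    return result


# Rows from the question plus one hypothetical row (5, 3) so that a total repeats.
sample = [(1, 1), (2, 3), (3, 8), (4, 2), (5, 3)]
print(frequency_of_totals(sample))  # [(1, 1), (2, 1), (3, 2), (8, 1)]
```

Each output pair is one bar of the frequency plot: the `total` value and the number of rows that have it.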
import MySQLdb


def DumpHistogramData(databaseHost, databaseName, databaseUsername, databasePassword, dataTableName, binsTableName, binSize, histogramDataFilename):
    # Connect to the database
    db = MySQLdb.connect(databaseHost, databaseUsername, databasePassword, databaseName)
    cursor = db.cursor()
    # Form the query. Table names cannot be bound as query parameters, so they
    # are concatenated in; only pass trusted names here.
    # count(a.total) is used instead of count(*) so that empty bins report 0, not 1.
    sql = """select b.min, b.max, count(a.total) as total
    FROM """ + binsTableName + """ b
    LEFT OUTER JOIN """ + dataTableName + """ a
    ON a.total between b.min AND b.max
    group by b.min, b.max
    order by b.min;"""
    cursor.execute(sql)
    # Get the result and write it into a file for further processing
    with open("./" + histogramDataFilename, "w") as output:
        while True:
            results = cursor.fetchmany(10000)
            if not results:
                break
            for result in results:
                # One line per bin: "min-max<TAB>count"
                output.write(str(result[0]) + "-" + str(result[1]) + "\t" + str(result[2]) + "\n")
    db.close()


def PrepareHistogramBins(databaseHost, databaseName, databaseUsername, databasePassword, binsTableName, maxValue, totalBins):
    # Connect to the database
    db = MySQLdb.connect(databaseHost, databaseUsername, databasePassword, databaseName)
    cursor = db.cursor()
    # Drop the bins table if it was already created
    cursor.execute("DROP TABLE IF EXISTS " + binsTableName)
    # Create the table
    cursor.execute("CREATE TABLE " + binsTableName + "(min int(11), max int(11));")
    # Calculate the bin size (integer division, so that range() gets an int step)
    binSize = maxValue // totalBins
    # Generate the bins: the first is [0, binSize], later ones are [i+1, i+binSize]
    for i in range(0, maxValue, binSize):
        if i == 0:
            binMin = i
        else:
            binMin = i + 1
        binMax = i + binSize
        cursor.execute("INSERT INTO " + binsTableName + "(min, max) VALUES(%s, %s);", (binMin, binMax))
    # MySQLdb does not autocommit by default; commit so the inserted rows persist
    db.commit()
    db.close()
    return binSize


binSize = PrepareHistogramBins("localhost", "testing", "root", "", "bins", 5000, 100)
DumpHistogramData("localhost", "testing", "root", "", "faults", "bins", binSize, "histogram")
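The binning idea can be tried without a MySQL server. The sketch below is an illustration under stated assumptions, not the code above: it uses an in-memory SQLite database, hypothetical helper names (`make_bins`, `histogram`), and columns named `bin_min`/`bin_max` instead of `min`/`max`. It builds inclusive bin ranges the same way (first bin starts at 0, later bins start one past the previous upper bound) and counts rows per bin with a left outer join, so empty bins still appear with a count of 0.

```python
import sqlite3


def make_bins(max_value, total_bins):
    """Return inclusive (low, high) ranges: [0, size], [size+1, 2*size], ..."""
    size = max_value // total_bins
    return [(0 if start == 0 else start + 1, start + size)
            for start in range(0, max_value, size)]


def histogram(totals, max_value, total_bins):
    """Count how many values of `totals` fall into each bin."""
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
    conn.execute("CREATE TABLE faults (total INTEGER)")
    conn.execute("CREATE TABLE bins (bin_min INTEGER, bin_max INTEGER)")
    conn.executemany("INSERT INTO faults VALUES (?)", [(t,) for t in totals])
    conn.executemany("INSERT INTO bins VALUES (?, ?)",
                     make_bins(max_value, total_bins))
    rows = conn.execute(
        """SELECT b.bin_min, b.bin_max, COUNT(a.total)
           FROM bins b
           LEFT OUTER JOIN faults a
             ON a.total BETWEEN b.bin_min AND b.bin_max
           GROUP BY b.bin_min, b.bin_max
           ORDER BY b.bin_min"""
    ).fetchall()
    conn.close()
    return rows


# The four totals from the question, split into 3 bins over 0..12.
print(histogram([1, 3, 8, 2], max_value=12, total_bins=3))
# [(0, 4, 3), (5, 8, 1), (9, 12, 0)]
```

Counting `a.total` rather than `*` matters here: a bin with no matching rows still yields one joined row (with NULLs), which `COUNT(*)` would report as 1.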
Step 3: Use GNUPlot to generate the histogram. You can use the following script as a starting point (it generates an eps image file):
set terminal postscript eps color lw 2 "Helvetica" 20
set output "output.eps"
set xlabel "XLABEL"
set ylabel "YLABEL"
set title "TITLE"
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set boxwidth 0.9
set xtics rotate by -45
# "histogram" is the data file written by DumpHistogramData ("min-max<TAB>count" per line, no header row)
plot "histogram" using 2:xtic(1) notitle
Save the above script to some arbitrary file, say sample.script. Proceed to the next step.
Step 4: Run gnuplot with the above script as input to generate the eps file:
gnuplot sample.script
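If you would rather trigger this step from the same Python program, a small wrapper along these lines may help. It is only a sketch: the function name `run_gnuplot` and its `executable` parameter are made up for illustration, and it assumes `gnuplot` is reachable on your PATH.

```python
import shutil
import subprocess


def run_gnuplot(script_path, executable="gnuplot"):
    """Run gnuplot on script_path; return True if it exited successfully."""
    if shutil.which(executable) is None:
        return False  # gnuplot is not installed or not on PATH
    completed = subprocess.run([executable, script_path])
    return completed.returncode == 0


if __name__ == "__main__":
    if run_gnuplot("sample.script"):
        print("gnuplot finished; check the file named by 'set output'")
    else:
        print("gnuplot failed or is not installed")
```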
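Nothing complicated, but I figure a few bits of this code can be reused. Again, like I said, it is not perfect, but it gets the job done :)
Credits:
Ofri Raviv (for helping me with the MySQL query in this post: Getting data for histogram plot)
Myself (for writing the Python and gnuplot scripts :D)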
Answer 1: (score: 0)
This blog article might help you! It discusses computing statistics with gnuplot and plotting the results as a histogram.