I have a test results file that looks like this:
2464 2480 2481
Test results for policy NSS-Tuned Test results for policy NSS-Tuned Test results for policy NSS-Tuned
BPS Profile Throughput BPS Profile Throughput BPS Profile Throughput
SigTestHTTP21kBin 216.966666667 SigTestHTTP21kBin 219.1 BPSHTTP21KBINARY 219.16
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml 355.6 BPS-HTTP21K-HTML 364.0
SigTestHTTP21kText 379.95 SigTestHTTP21kText 377.9 BPS-HTTP21K-TEXT 376.25
NSS-HTTP21Kdelay 378.15 NSS-HTTP21Kdelay 381.15 BPS-HTTP21K-DELAY 380.2
NSS-HTTPCPS 18920 NSS-HTTPCPS 6599 BPS-HTTPCPS 74.6522222222
SIggTestPerimeter 270.233333333 SIggTestPerimeter 243.433333333 BPS-PERIMETER 222.8
SIgTestDatacenter 370.825 SIgTestDatacenter 380.24 BPS-DATACENTER 373.275
NSS-Financial 5 NSS-Financial BPS-FINANCIAL 56.345
NSS-Education 971.125 NSS-Education 950.4 BPS-EDUCATION 1010.2
NSS-EuroMobile 920.68 NSS-EuroMobile 1001.075 BPS-EUROMOBILE 932.525
NSS-USMobile 528.2 NSS-USMobile 570.6 BPS-USMOBILE 541.9
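As you can see, the first header line is separated by 4 tabs (\t\t\t\t\t),
the second header line by 2 tabs (\t\t),
and the result rows that follow by 2 tabs (\t\t).
Now I need to manipulate the Throughput columns and generate new columns to calculate percentages and so on.
The code I wrote is: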
#!/usr/bin/python
import time
import os,sys
from os import path
import re
import sys, ast
import subprocess
import numpy as np
#from StringIO import StringIO
import pandas as pd
location = "/root/madhu_test/bpstest/results/finalnss.txt"
#print location
f = pd.read_csv(location,delimiter='\t\t',header=True)
print f
cols = f.columns.tolist()
print cols
f = f.drop('BPS Profile.2', 1)
f = f.drop('BPS Profile.1', 1)
np.radians(f['Throughput'])
np.radians(f['Throughput.1'])
f['percentage'] = ((f['Throughput.1']-f['Throughput'])/f['Throughput.1'])*100.0
f['percentage.1'] = ((f['Throughput.2']-f['Throughput'])/f['Throughput.2'])*100.0
cols = f.columns.tolist()
#print cols
cols = ['BPS Profile', 'Throughput', 'Throughput.1', 'percentage', 'Throughput.2','percentage.1']
f = f[cols]
f.to_html('/root/madhu_test/bpstest/results/outnss.html')
When I run the code, I get the following output:
Test results for policy NSS-Tuned \
BPS Profile Throughput BPS Profile Throughput
SigTestHTTP21kBin 216.966666667 SigTestHTTP21kBin 219.1
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml 355.6
SigTestHTTP21kText 379.95 SigTestHTTP21kText 377.9
NSS-HTTP21Kdelay 378.15 NSS-HTTP21Kdelay 381.15
NSS-HTTPCPS 18920 NSS-HTTPCPS 6599
SIggTestPerimeter 270.233333333 SIggTestPerimeter 243.433333333
SIgTestDatacenter 370.825 SIgTestDatacenter 380.24
NSS-Financial 5 NSS-Financial BPS-FINANCIAL\t56.345
NSS-Education 971.125 NSS-Education 950.4
NSS-EuroMobile 920.68 NSS-EuroMobile 1001.075
NSS-USMobile 528.2 NSS-USMobile 570.6
Test results for policy NSS-Tuned.1 \
BPS Profile Throughput BPS Profile BPS Profile
SigTestHTTP21kBin 216.966666667 SigTestHTTP21kBin BPSHTTP21KBINARY\t219.16
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml BPS-HTTP21K-HTML\t364.0
SigTestHTTP21kText 379.95 SigTestHTTP21kText BPS-HTTP21K-TEXT\t376.25
NSS-HTTP21Kdelay 378.15 NSS-HTTP21Kdelay BPS-HTTP21K-DELAY\t380.2
NSS-HTTPCPS 18920 NSS-HTTPCPS BPS-HTTPCPS\t74.6522222222
SIggTestPerimeter 270.233333333 SIggTestPerimeter BPS-PERIMETER\t222.8
SIgTestDatacenter 370.825 SIgTestDatacenter BPS-DATACENTER\t373.275
NSS-Financial 5 NSS-Financial None
NSS-Education 971.125 NSS-Education BPS-EDUCATION\t1010.2
NSS-EuroMobile 920.68 NSS-EuroMobile BPS-EUROMOBILE\t932.525
NSS-USMobile 528.2 NSS-USMobile BPS-USMOBILE\t541.9
Test results for policy NSS-Tuned.2
BPS Profile Throughput BPS Profile Throughput
SigTestHTTP21kBin 216.966666667 SigTestHTTP21kBin None
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml None
SigTestHTTP21kText 379.95 SigTestHTTP21kText None
NSS-HTTP21Kdelay 378.15 NSS-HTTP21Kdelay None
NSS-HTTPCPS 18920 NSS-HTTPCPS None
SIggTestPerimeter 270.233333333 SIggTestPerimeter None
SIgTestDatacenter 370.825 SIgTestDatacenter None
NSS-Financial 5 NSS-Financial None
NSS-Education 971.125 NSS-Education None
NSS-EuroMobile 920.68 NSS-EuroMobile None
NSS-USMobile 528.2 NSS-USMobile None
['Test results for policy NSS-Tuned', 'Test results for policy NSS-Tuned.1', 'Test results for policy NSS-Tuned.2']
How can I split this into 6 columns, e.g. ['BPS Profile','Throughput','Throughput.1','percentage','Throughput.2','percentage.1']?
If I delete the following lines from the text file
2464 2480 2481
Test results for policy NSS-Tuned Test results for policy NSS-Tuned Test results for policy NSS-Tuned
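then the pandas DataFrame is split into 6 columns correctly.
I understand that skipping rows (e.g. with skiprows) would ignore these lines, but I also need this data in the final generated HTML file: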
2464 2480 2481
Test results for policy NSS-Tuned Test results for policy NSS-Tuned Test results for policy NSS-Tuned
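Answer 0 (score: 0)
If I understand your question correctly, I would skip the header lines, read in the data, and then set the column headers by hand...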
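import pandas as pd
# Skip the header lines (adjust skiprows to match your file); '\t+' is a regex, so use the python engine.
df = pd.read_csv('data.csv', skiprows=7, header=None, delimiter=r'\t+', engine='python')
df.columns = ['BPS Profile', 'Throughput', 'BPS Profile.1', 'Throughput.1',
              'BPS Profile.2', 'Throughput.2']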
From here it is easy to manipulate the table...
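To also keep the two banner lines in the HTML output, one option is to read them separately and prepend them to the table. The following is only a minimal sketch in Python 3, not a tested solution for the real file: `SAMPLE` is a made-up two-row excerpt, the exact tab layout is an assumption, and `build_report` is a hypothetical helper name. Note that a `\t+` separator collapses consecutive tabs, so a row with an empty field (as the second NSS-Financial entry appears to be) would have its remaining values shifted left; that case is not handled here.

```python
import html
import io
import re

import pandas as pd

# Made-up excerpt imitating the file layout (fields separated by runs of tabs).
SAMPLE = (
    "2464\t\t\t\t2480\t\t\t\t2481\n"
    "Test results for policy NSS-Tuned\t\tTest results for policy NSS-Tuned"
    "\t\tTest results for policy NSS-Tuned\n"
    "BPS Profile\tThroughput\t\tBPS Profile\tThroughput\t\tBPS Profile\tThroughput\n"
    "SigTestHTTP21kBin\t216.966666667\t\tSigTestHTTP21kBin\t219.1\t\tBPSHTTP21KBINARY\t219.16\n"
    "NSS-Education\t971.125\t\tNSS-Education\t950.4\t\tBPS-EDUCATION\t1010.2\n"
)

NAMES = ["BPS Profile", "Throughput", "BPS Profile.1", "Throughput.1",
         "BPS Profile.2", "Throughput.2"]


def build_report(text):
    """Return (DataFrame, HTML string) with the banner lines kept above the table."""
    lines = text.splitlines()
    # The first two lines are the banner (run IDs and titles); keep them for the HTML.
    banner = [re.split(r"\t+", ln.strip()) for ln in lines[:2]]

    # Skip the two banner lines plus the column-name line, then name columns by hand.
    df = pd.read_csv(io.StringIO(text), sep=r"\t+", engine="python",
                     skiprows=3, header=None, names=NAMES)
    df = df.drop(columns=["BPS Profile.1", "BPS Profile.2"])

    # Make sure the throughput columns are numeric before doing arithmetic.
    for col in ["Throughput", "Throughput.1", "Throughput.2"]:
        df[col] = pd.to_numeric(df[col], errors="coerce")

    # Same percentage formulas as in the question.
    df["percentage"] = (df["Throughput.1"] - df["Throughput"]) / df["Throughput.1"] * 100.0
    df["percentage.1"] = (df["Throughput.2"] - df["Throughput"]) / df["Throughput.2"] * 100.0
    df = df[["BPS Profile", "Throughput", "Throughput.1", "percentage",
             "Throughput.2", "percentage.1"]]

    # Render each banner line as a paragraph placed before the table.
    banner_html = "".join(
        "<p>{}</p>\n".format(html.escape(" | ".join(parts))) for parts in banner
    )
    return df, banner_html + df.to_html(index=False)


df, report_html = build_report(SAMPLE)
print(df)
```

For the real file you would pass its contents, e.g. `build_report(open(location).read())`, and write the returned HTML string to the output path yourself instead of calling `to_html` with a filename.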