Question

I have a test file that looks like this:

                2464                                            2480                                            2481

Test results for policy NSS-Tuned               Test results for policy NSS-Tuned               Test results for policy NSS-Tuned

BPS Profile             Throughput              BPS Profile             Throughput              BPS Profile             Throughput

SigTestHTTP21kBin               216.966666667           SigTestHTTP21kBin               219.1           BPSHTTP21KBINARY        219.16
SigTestHTTP21kHtml              359.433333333           SigTestHTTP21kHtml              355.6           BPS-HTTP21K-HTML        364.0
SigTestHTTP21kText              379.95          SigTestHTTP21kText              377.9           BPS-HTTP21K-TEXT        376.25
NSS-HTTP21Kdelay                378.15          NSS-HTTP21Kdelay                381.15          BPS-HTTP21K-DELAY       380.2
NSS-HTTPCPS             18920           NSS-HTTPCPS             6599            BPS-HTTPCPS     74.6522222222
SIggTestPerimeter               270.233333333           SIggTestPerimeter               243.433333333           BPS-PERIMETER   222.8
SIgTestDatacenter               370.825         SIgTestDatacenter               380.24          BPS-DATACENTER  373.275
NSS-Financial           5               NSS-Financial           BPS-FINANCIAL   56.345
NSS-Education           971.125         NSS-Education           950.4           BPS-EDUCATION   1010.2
NSS-EuroMobile          920.68          NSS-EuroMobile          1001.075                BPS-EUROMOBILE  932.525
NSS-USMobile            528.2           NSS-USMobile            570.6           BPS-USMOBILE    541.9

You can see that the first heading is separated by 4 tabs (\t\t\t\t\t).

The second heading is separated by 2 tabs (\t\t).

The results that follow are separated by 2 tabs (\t\t).

Now I need to manipulate the Throughput columns and generate new columns to calculate percentages and so on.

The code I wrote is:

#!/usr/bin/python

import time
import os,sys
from os import path
import re
import sys, ast
import subprocess
import numpy as np
#from StringIO import StringIO
import pandas as pd


location = "/root/madhu_test/bpstest/results/finalnss.txt"
#print location
f = pd.read_csv(location,delimiter='\t\t',header=True)
print f
cols = f.columns.tolist()
print cols
f = f.drop('BPS Profile.2', 1)
f = f.drop('BPS Profile.1', 1)
np.radians(f['Throughput'])
np.radians(f['Throughput.1'])

f['percentage'] = ((f['Throughput.1']-f['Throughput'])/f['Throughput.1'])*100.0
f['percentage.1'] = ((f['Throughput.2']-f['Throughput'])/f['Throughput.2'])*100.0
cols = f.columns.tolist()
#print cols
cols = ['BPS Profile', 'Throughput', 'Throughput.1', 'percentage', 'Throughput.2','percentage.1']
f = f[cols]
f.to_html('/root/madhu_test/bpstest/results/outnss.html')

When I run the code, I get output like this:

                                                    Test results for policy NSS-Tuned  \
BPS Profile        Throughput    BPS Profile                               Throughput
SigTestHTTP21kBin  216.966666667 SigTestHTTP21kBin                              219.1
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml                             355.6
SigTestHTTP21kText 379.95        SigTestHTTP21kText                             377.9
NSS-HTTP21Kdelay   378.15        NSS-HTTP21Kdelay                              381.15
NSS-HTTPCPS        18920         NSS-HTTPCPS                                     6599
SIggTestPerimeter  270.233333333 SIggTestPerimeter                      243.433333333
SIgTestDatacenter  370.825       SIgTestDatacenter                             380.24
NSS-Financial      5             NSS-Financial                  BPS-FINANCIAL\t56.345
NSS-Education      971.125       NSS-Education                                  950.4
NSS-EuroMobile     920.68        NSS-EuroMobile                              1001.075
NSS-USMobile       528.2         NSS-USMobile                                   570.6

                                                    Test results for policy NSS-Tuned.1  \
BPS Profile        Throughput    BPS Profile                                BPS Profile
SigTestHTTP21kBin  216.966666667 SigTestHTTP21kBin             BPSHTTP21KBINARY\t219.16
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml             BPS-HTTP21K-HTML\t364.0
SigTestHTTP21kText 379.95        SigTestHTTP21kText            BPS-HTTP21K-TEXT\t376.25
NSS-HTTP21Kdelay   378.15        NSS-HTTP21Kdelay              BPS-HTTP21K-DELAY\t380.2
NSS-HTTPCPS        18920         NSS-HTTPCPS                 BPS-HTTPCPS\t74.6522222222
SIggTestPerimeter  270.233333333 SIggTestPerimeter                 BPS-PERIMETER\t222.8
SIgTestDatacenter  370.825       SIgTestDatacenter              BPS-DATACENTER\t373.275
NSS-Financial      5             NSS-Financial                                     None
NSS-Education      971.125       NSS-Education                    BPS-EDUCATION\t1010.2
NSS-EuroMobile     920.68        NSS-EuroMobile                 BPS-EUROMOBILE\t932.525
NSS-USMobile       528.2         NSS-USMobile                       BPS-USMOBILE\t541.9

                                                    Test results for policy NSS-Tuned.2
BPS Profile        Throughput    BPS Profile                                 Throughput
SigTestHTTP21kBin  216.966666667 SigTestHTTP21kBin                                 None
SigTestHTTP21kHtml 359.433333333 SigTestHTTP21kHtml                                None
SigTestHTTP21kText 379.95        SigTestHTTP21kText                                None
NSS-HTTP21Kdelay   378.15        NSS-HTTP21Kdelay                                  None
NSS-HTTPCPS        18920         NSS-HTTPCPS                                       None
SIggTestPerimeter  270.233333333 SIggTestPerimeter                                 None
SIgTestDatacenter  370.825       SIgTestDatacenter                                 None
NSS-Financial      5             NSS-Financial                                     None
NSS-Education      971.125       NSS-Education                                     None
NSS-EuroMobile     920.68        NSS-EuroMobile                                    None
NSS-USMobile       528.2         NSS-USMobile                                      None
['Test results for policy NSS-Tuned', 'Test results for policy NSS-Tuned.1', 'Test results for policy NSS-Tuned.2']

How can I split this into 6 columns, e.g. ['BPS Profile', 'Throughput', 'Throughput.1', 'percentage', 'Throughput.2', 'percentage.1']?

If I delete the following from the text file

                2464                                            2480                                            2481

Test results for policy NSS-Tuned               Test results for policy NSS-Tuned               Test results for policy NSS-Tuned

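then the pandas DataFrame splits it correctly into 6 columns.

I understand that skipping rows would ignore those lines, but I also need this data in the final generated HTML file: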

              2464                                            2480                                            2481

Test results for policy NSS-Tuned               Test results for policy NSS-Tuned               Test results for policy NSS-Tuned

Answer 1

If I understand your question, I would skip the headers, read in the data, and then set the column headers by hand...

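import pandas as pd

# A regex separator needs the Python parser engine; set skiprows to the number of header lines in your file
df = pd.read_csv('data.csv', skiprows=7, header=None, delimiter=r'\t+', engine='python')
df.columns = ['BPS Profile', 'Throughput', 'BPS Profile.1', 'Throughput.1', 
    'BPS Profile.2', 'Throughput.2']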
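To see why a `\t+` pattern behaves differently from the fixed `'\t\t'` delimiter used in the question, here is a small standard-library check. It is only a sketch: the two sample lines are modelled on the question's file, the exact tab counts are assumed, and `split_on_tabs` is a hypothetical helper that imitates the regex separator rather than calling pandas itself.

```python
import re

# Sample lines modelled on the question's file; the exact tab counts are assumed.
header = "BPS Profile\t\tThroughput\t\tBPS Profile\t\tThroughput\t\tBPS Profile\t\tThroughput"
row = "NSS-USMobile\t\t528.2\t\tNSS-USMobile\t\t570.6\t\tBPS-USMOBILE\t541.9"


def split_on_tabs(line):
    """Split a line on runs of one or more tabs, like a '\\t+' separator."""
    return re.split(r"\t+", line.strip())


header_fields = split_on_tabs(header)
row_fields = split_on_tabs(row)

# Splitting on exactly two tabs leaves the single-tab pair glued together,
# which matches the 'BPS-USMOBILE\t541.9' cell in the question's output.
fixed_fields = row.split("\t\t")

print(header_fields)
print(row_fields)
print(fixed_fields)
```

One caveat: a run of tabs also swallows empty cells, so a row with a missing value (the NSS-Financial line in the question appears to have one) will come out with fewer fields and needs separate handling.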

From here it is easy to manipulate the table...
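As a rough sketch of that manipulation, the following reads the data rows, computes the two percentage columns from the question, and puts the skipped run IDs and policy title back at the top of the HTML. The inline `raw` string is a stand-in for the real file with assumed tab counts and no blank lines, so `skiprows=3` here is specific to this sample. The `banner` and `report` names and the `<p>` wrapping are illustrative choices, not part of the original answer.

```python
import html
import io

import pandas as pd

# Inline stand-in for finalnss.txt; values copied from the question, tab counts assumed.
raw = (
    "\t\t2464\t\t\t\t2480\t\t\t\t2481\n"
    "Test results for policy NSS-Tuned\t\tTest results for policy NSS-Tuned"
    "\t\tTest results for policy NSS-Tuned\n"
    "BPS Profile\t\tThroughput\t\tBPS Profile\t\tThroughput\t\tBPS Profile\t\tThroughput\n"
    "NSS-Education\t\t971.125\t\tNSS-Education\t\t950.4\t\tBPS-EDUCATION\t1010.2\n"
    "NSS-USMobile\t\t528.2\t\tNSS-USMobile\t\t570.6\t\tBPS-USMOBILE\t541.9\n"
)

# Keep the two lines that read_csv will skip so they can be shown in the report.
banner = [
    " | ".join(part for part in line.split("\t") if part)
    for line in raw.splitlines()[:2]
]

# Skip the three header lines of this sample and name the columns by hand.
df = pd.read_csv(io.StringIO(raw), skiprows=3, header=None,
                 sep=r"\t+", engine="python")
df.columns = ["BPS Profile", "Throughput", "BPS Profile.1", "Throughput.1",
              "BPS Profile.2", "Throughput.2"]
df = df.drop(columns=["BPS Profile.1", "BPS Profile.2"])

# Percentage difference of each later run relative to the first run.
df["percentage"] = (df["Throughput.1"] - df["Throughput"]) / df["Throughput.1"] * 100.0
df["percentage.1"] = (df["Throughput.2"] - df["Throughput"]) / df["Throughput.2"] * 100.0
df = df[["BPS Profile", "Throughput", "Throughput.1", "percentage",
         "Throughput.2", "percentage.1"]]

# Prepend the saved header lines as plain paragraphs above the table.
report = "".join("<p>{}</p>\n".format(html.escape(b)) for b in banner)
report += df.to_html(index=False)
print(report)
```

Writing `report` to a file with an ordinary `open(..., "w")` call would then take the place of the `f.to_html(...)` line in the question.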

Separate columns in a pandas DataFrame with different tab separation on different lines
