Python Pandas: replacing part of a CSV file with a regular expression

Time: 2019-04-04 13:58:11

Tags: python regex pandas csv

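I'm still learning Python and need help with a task. I need to use pandas and a regex to remove part of a CSV file, but I can't get it to work. The part to remove always starts with a line beginning with crypto * and is followed by 27 lines that start with +, as shown below. File (copied from a Unix machine):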

crypto pki certificate chain TP-self-signed-3688302998
+certificate self-signed 01
  +30820330 30820218 A0030201 02020101 300D0609 2A864886 F70D0101 05050030
  +31312F30 2D060355 04031326 494F532D 53656C66 2D536967 6E65642D 43657274
  +69666963 6174652D 33363838 33303239 3938301E 170D3139 30313234 31353435
  +35305A17 0D323030 31303130 30303030 305A3031 312F302D 06035504 03132649
  +4F532D53 656C662D 5369676E 65642D43 65727469 66696361 74652D33 36383833
  +30323939 38308201 22300D06 092A8648 86F70D01 01010500 0382010F 00308201
  +0A028201 0100C2D5 12E88676 89FAC5B8 B70775B4 1FDB724A E44B7D02 C1E37E01
  +1CBE6B58 2D92E563 1180BBBB 09F9023D C55FA388 74E6A7A6 94707006 B30F31F0
  +6C90B41A E6F219FA 87FF27D9 0BD418C8 31B4AD01 C1ED8989 98F19DC9 13332457
  +B45EFA8D B56B8686 5BAA884D 26FEAAA8 DCAFF620 2164C13B E0064DA1 2C41F4F8
  +8E377A91 E60E74E1 AA157A16 F3725B6C 3A9D5335 3D6899BB D3E51B95 F06CD52A
  +CF258C2E AAA1D458 819DCBEA BEB4FB87 7AE70DCC 82F10CAF 5631AE57 9D87D75F
  +DD5A1772 963F9D60 462D5C77 24958B0E 0E05500F 54CF67C7 67C3BC64 1AD79A72
  +27DA5D03 A30BBA45 17D923CE 95D1CAAF 2645D9A3 B5E2FCFA 26440BEF 5688BB20
  +EAA8288B 52350203 010001A3 53305130 0F060355 1D130101 FF040530 030101FF
  +301F0603 551D2304 18301680 14E85A71 CEEF0D74 91237468 81C15A6D D0882175
  +2A301D06 03551D0E 04160414 E85A71CE EF0D7491 23746881 C15A6DD0 8821752A
  +300D0609 2A864886 F70D0101 05050003 82010100 B9819E1C 9F208C50 26436397
  +5E18636D 77DF8290 DA858715 E49EF743 935DB071 E205613F 2DFC9D54 E5D201C1
  +B756F592 51E8B189 5FC0C97D 68D06128 015E31EE B21443A1 3DF989EC 24465504
  +A9657194 17C1E9DC 5BE7E7E5 7935BE07 CF291574 542FC7D1 2E40AD71 1D451EFF
  +92E41209 6AD8FEF7 2AE1F925 66F8D4D8 90B7F914 2E6E1E30 93E8329E 1A146948
  +BDB7A070 69C8251C 7B956DBF 0E3A6CC8 28E2720B 01A79D64 1B9FEC84 2EC6A14F
  +9F4B51E6 439A3A42 95950E00 BEE19870 12398461 63DC29AB 68D04DAC 28F1578E
  +530C4551 48AE8D64 39D6BC01 4C830E35 5D0A2E00 B272B548 B2F355DE E94FCB29
  +017967B9 21C04DDB 2E2C354A BD9EE96D E2907E94
crypto pki certificate chain TP-self-signed-3688302998
 -certificate self-signed 01 nvram:IOS-Self-Sig#1.cer

This is the part of the code I'm working on and need help with:

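import glob
import os
import time

import pandas as pd

files = glob.glob('*.csv')
for file in files:
    if os.stat(file).st_size > 0:
        try:
            # on_bad_lines='skip' (pandas >= 1.3) replaces the removed error_bad_lines=False
            df = pd.read_csv(file, skiprows=1, on_bad_lines='skip').dropna()
            if len(df) <= 3:
                os.remove(file)
            else:
                df = pd.read_csv(file, skiprows=4, on_bad_lines='skip').dropna()
                # regex=True is needed for the pattern to be treated as a regex;
                # it is still applied cell by cell, so the multi-line part does not match
                df = df.replace(r'^crypto pki certificate chain TP-self-signed-\d+(?:\n\s*\+[^\n]*){27}', '', regex=True)
                df.to_csv(file, index=False)
        except ValueError:
            print('Modified file')
        time.sleep(1)
    else:
        print('Empty file..passing')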
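
For reference, here is a minimal sketch of one way the removal could be done, not a confirmed solution. It assumes the unwanted block appears in the raw file as a header line followed by exactly 27 lines whose first non-blank character is +, as in the sample above. `DataFrame.replace` compares or substitutes values one cell at a time, so a pattern that spans several lines has nothing to match against there. The sketch applies the pattern to the file's text with the standard `re` module instead, before any pandas parsing. The names `CERT_BLOCK`, `strip_cert_block` and `clean_file` are hypothetical and only for illustration.

```python
import re

# Assumption: header line, then exactly 27 lines starting with optional
# whitespace and a "+". Each line ends with a newline or the end of the text.
CERT_BLOCK = re.compile(
    r'^crypto pki certificate chain TP-self-signed-\d+[ \t]*\r?\n'
    r'(?:[ \t]*\+[^\r\n]*(?:\r?\n|\Z)){27}',
    re.MULTILINE,  # let ^ match at the start of every line
)


def strip_cert_block(text):
    """Return text with every matching certificate block removed."""
    return CERT_BLOCK.sub('', text)


def clean_file(path):
    """Rewrite the file at path without the block (hypothetical helper)."""
    # newline='' keeps the original line endings untouched
    with open(path, newline='') as fh:
        original = fh.read()
    cleaned = strip_cert_block(original)
    if cleaned != original:
        with open(path, 'w', newline='') as fh:
            fh.write(cleaned)
```

Under these assumptions, `clean_file(file)` could be called for each file before `pd.read_csv`, so pandas only ever sees the remaining lines. A header followed by fewer than 27 + lines is left untouched.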

Thanks a lot for your help!

0 answers:

No answers yet