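How do I return the whole line in Python and extract a column from it?

Time: 2019-04-20 13:38:25

Tags: python regex

For example, from this output I need the line that contains "test1.txt", and then the third column of that line, which is the file size. This is similar to the "cut" command in Linux.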

5636335  -rw-        1922  Apr 20 2019 09:22:47 +00:00  private-config.cfg
5636332  -rw-        1136  Apr 20 2019 09:22:47 +00:00  NETMAP
5636336  -rw-        0     Apr 20 2019 13:14:51 +00:00  test1.txt
5636325  -rw-        1691  Apr 20 2019 09:22:47 +00:00  startup-config.cfg
5636333  -rw-       16384  Apr 20 2019 09:22:47 +00:00  nvram_00001
5636330  -rw-         341  Apr 20 2019 09:22:47 +00:00  ubridge.log

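Using the Netmiko module:

import re
from netmiko import ConnectHandler

# cisco is a dict of connection parameters defined elsewhere
net_connect = ConnectHandler(**cisco)
output = net_connect.send_command('dir')
x = re.search('test1.txt', output)
print(x)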

<re.Match object; span=(215, 224), match='test1.txt'>

2 answers:

Answer 0 (score: 1)

You can use:

tr -s ' ' <test1.txt | cut -d ' ' -f3

1922
1136
0
1691
16384
341

tr -s | squeeze-repeats
cut -d | delimiter
cut -f | field

  

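I know how to do it in Linux; I need help doing it in Python.

import re

# Take the third whitespace-separated field (the size) from every line of the file
with open("test1.txt") as f:
    sizes = [re.split(r"\s+", line)[2] for line in f]
# ['1922', '1136', '0', '1691', '16384', '341']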
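The list above holds the size of every file, while the question asks only for the row that names test1.txt. A minimal sketch of that lookup follows. The helper name `file_size` is made up for illustration, and it assumes the file name is always the last column and the size always the third.

```python
def file_size(listing, filename):
    """Return the size column (as an int) of the row whose last field is
    `filename`, or None if no such row exists. Hypothetical helper."""
    for line in listing.splitlines():
        fields = line.split()  # no argument: split on any run of whitespace
        if len(fields) >= 3 and fields[-1] == filename:
            return int(fields[2])
    return None


# Sample listing copied from the question
listing = """5636335  -rw-        1922  Apr 20 2019 09:22:47 +00:00  private-config.cfg
5636332  -rw-        1136  Apr 20 2019 09:22:47 +00:00  NETMAP
5636336  -rw-        0     Apr 20 2019 13:14:51 +00:00  test1.txt
5636325  -rw-        1691  Apr 20 2019 09:22:47 +00:00  startup-config.cfg
5636333  -rw-       16384  Apr 20 2019 09:22:47 +00:00  nvram_00001
5636330  -rw-         341  Apr 20 2019 09:22:47 +00:00  ubridge.log"""

print(file_size(listing, "test1.txt"))  # 0
```

Comparing the last field instead of using `'test1.txt' in line` avoids matching a different file whose name merely contains that text.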

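Answer 1 (score: 0)

You can slice each row with [13:25] and then call strip(). This relies on the size always sitting in the same character positions.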

output = '''5636335  -rw-        1922  Apr 20 2019 09:22:47 +00:00  private-config.cfg
5636332  -rw-        1136  Apr 20 2019 09:22:47 +00:00  NETMAP
5636336  -rw-        0     Apr 20 2019 13:14:51 +00:00  test1.txt
5636325  -rw-        1691  Apr 20 2019 09:22:47 +00:00  startup-config.cfg
5636333  -rw-       16384  Apr 20 2019 09:22:47 +00:00  nvram_00001
5636330  -rw-         341  Apr 20 2019 09:22:47 +00:00  ubridge.log'''

for row in output.split('\n'):
    if 'test1.txt' in row:
        print(row[13:25].strip())

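There are many spaces between the columns, so a plain split(' ') produces a lot of empty strings and does not give usable column indexes, but re.split(r"\s+", row) handles it.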
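To make the difference concrete, here is a small comparison on one row from the question. It is only a sketch; it also shows `str.split()` with no argument, which splits on runs of whitespace much like `re.split(r"\s+", ...)` does.

```python
import re

row = "5636336  -rw-        0     Apr 20 2019 13:14:51 +00:00  test1.txt"

# split(' ') cuts at every single space, so runs of spaces yield empty strings
naive = row.split(' ')
print(naive[:4])   # ['5636336', '', '-rw-', '']

# Both of these treat a run of whitespace as one separator
by_regex = re.split(r"\s+", row)
by_default = row.split()
print(by_regex[2])    # 0
print(by_default[2])  # 0
```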

output = '''5636335  -rw-        1922  Apr 20 2019 09:22:47 +00:00  private-config.cfg
5636332  -rw-        1136  Apr 20 2019 09:22:47 +00:00  NETMAP
5636336  -rw-        0     Apr 20 2019 13:14:51 +00:00  test1.txt
5636325  -rw-        1691  Apr 20 2019 09:22:47 +00:00  startup-config.cfg
5636333  -rw-       16384  Apr 20 2019 09:22:47 +00:00  nvram_00001
5636330  -rw-         341  Apr 20 2019 09:22:47 +00:00  ubridge.log'''

import re

for row in output.split('\n'):
    if 'test1.txt' in row:
        print(re.split(r'\s+', row)[2])
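If you would rather stay with `re.search` as in the question, one option is to make the pattern match the whole line instead of just the file name. The following is a sketch under the assumption that the file name is the last thing on its line. The variable names `filename` and `pattern` are illustrative only.

```python
import re

output = '''5636335  -rw-        1922  Apr 20 2019 09:22:47 +00:00  private-config.cfg
5636332  -rw-        1136  Apr 20 2019 09:22:47 +00:00  NETMAP
5636336  -rw-        0     Apr 20 2019 13:14:51 +00:00  test1.txt
5636325  -rw-        1691  Apr 20 2019 09:22:47 +00:00  startup-config.cfg
5636333  -rw-       16384  Apr 20 2019 09:22:47 +00:00  nvram_00001
5636330  -rw-         341  Apr 20 2019 09:22:47 +00:00  ubridge.log'''

filename = "test1.txt"

# With re.MULTILINE, ^ and $ anchor at each line, and .* never crosses a newline,
# so the match covers exactly one line. re.escape makes the dot literal.
pattern = rf"^.*\s{re.escape(filename)}[ \t]*$"
match = re.search(pattern, output, flags=re.MULTILINE)

if match:
    line = match.group(0)    # the whole matching line
    size = line.split()[2]   # third column, still a string
    print(line)
    print(size)              # 0
```

The original `re.search('test1.txt', output)` returns a match for the file name only, which is why the printed span covers just nine characters.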