Question

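I'm trying to parse a string and fill an array with specific values contained in it, but I'm running into some unexpected behaviour.

I've written a script that does this successfully for some use cases, but not for all possible cases.

Consider the string: 'BEST POSITION:P(0) = 1.124 P(1) = 2.345 P(2) = 3.145 P(3) = 4.354'

The following code should create the list: [1.124, 2.345, 3.145, 4.354]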

import re
import numpy as np

inputs_best = np.zeros(4)
string_in = 'BEST POSITION:P(0) = 1.124 P(1) = 2.345 P(2) = 3.145 P(3) = 4.354'

best_sols_clean = ''
for item in string_in:
    best_sols_clean += item

best_sols_clean = re.sub('[ \t]', '', best_sols_clean)

count = 0
while best_sols_clean.find('P(') is not -1:
    line_index = best_sols_clean.find('P(')
    try:
        inputs_best[count] = float(best_sols_clean[line_index+5:line_index+10])
        best_sols_clean = best_sols_clean[line_index+10:-1]
        count += 1
    except ValueError:
        inputs_best[count] = float(best_sols_clean[line_index+5:line_index+6])
        best_sols_clean = best_sols_clean[line_index+6:-1]
        count += 1

print(inputs_best)

The output of this script is:

[1.124 2.345 3.145 4. ]

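For this string the script mostly works, but the last entry in the list has too few digits.

The except clause is there to catch the exception raised when one or more of the values is an integer, for example: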

string_in = 'BEST POSITION:P(0) = 1 P(1) = 2.345 P(2) = 3.145 P(3) = 4'

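This results in an error.

I think the problem is in the line best_sols_clean = best_sols_clean[line_index+10:-1]. Although I intend to slice up to the last element of the string, for some reason it drops the trailing digits of the string.

For the string string_in = 'BEST POSITION:P(0) = 1 P(1) = 2.345 P(2) = 3.145 P(3) = 4', the program exits with the error message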

Traceback (most recent call last):
  File "test.py", line 17, in <module>
    inputs_best[count] = float(best_sols_clean[line_index+5:line_index+10])
ValueError: could not convert string to float: 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 21, in <module>
    inputs_best[count] = float(best_sols_clean[line_index+5:line_index+6])
ValueError: could not convert string to float:

I'm also open to solutions that are more elegant than my attempt.

Answer 1

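You're hard-coding tiny details, which makes the code inefficient, brittle, and hard to debug. There may well be something wrong with your indexing, but it's probably not worth digging into. Why not just split the string on spaces and collect every piece that looks like a number into a list? Like this: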

string_in = 'BEST POSITION:P(0) = 1.124 P(1) = 2.345 P(2) = 3.145 P(3) = 4.354'
numbers = []
for x in string_in.split(' '):
    try:
        # Append float-able strings to the list
        numbers.append(float(x))
    except ValueError:
        # Skip only ValueErrors; do not use a bare except, so any other error still breaks the code by design
        pass
# Produces: [1.124, 2.345, 3.145, 4.354]

With the input string_in = 'BEST POSITION:P(0) = 1 P(1) = 2.345 P(2) = 3.145 P(3) = 4', it returns [1.0, 2.345, 3.145, 4.0]. Does that work for you?
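For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name parse_floats is made up here, and it uses split() with no argument (rather than split(' ')) so that tabs and repeated spaces also count as separators. Keep in mind that float() also accepts tokens such as 'inf' and 'nan', so this assumes the input contains no such words.

```python
def parse_floats(text):
    """Hypothetical helper: collect every whitespace-separated token that float() accepts."""
    numbers = []
    for token in text.split():  # split on any run of whitespace
        try:
            numbers.append(float(token))
        except ValueError:
            # Not a number (e.g. 'BEST', 'P(1)', '='); skip it
            continue
    return numbers

print(parse_floats('BEST POSITION:P(0) = 1 P(1) = 2.345 P(2) = 3.145 P(3) = 4'))
# [1.0, 2.345, 3.145, 4.0]
```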

Answer 2

Your problem appears to be in this line:

 best_sols_clean = best_sols_clean[line_index+10:-1]

The -1 end index cuts one character off the end of the string on every pass through the loop. Try changing it to this:

 best_sols_clean = best_sols_clean[line_index+10:]
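To see why the -1 matters: the end index of a slice is exclusive, so s[i:-1] stops before the last character, while s[i:] runs to the end. The sketch below is one possible reworking of the question's loop with both slices changed (the one in the except branch has the same issue). It assumes every value follows a 'P(n)=' label with a single-digit index, and the function name extract_values is made up for illustration. It also uses != -1 instead of is not -1, since is tests identity rather than equality.

```python
import re

def extract_values(string_in):
    """Hypothetical helper: the question's loop with open-ended slices."""
    # Strip spaces and tabs, as in the question
    remaining = re.sub(r'[ \t]', '', string_in)
    values = []
    # Compare with != rather than `is not`: -1 is a value, not an identity
    while remaining.find('P(') != -1:
        start = remaining.find('P(') + 5  # skip over 'P(n)=' (single-digit n assumed)
        try:
            # Five-character values such as '1.124'
            values.append(float(remaining[start:start + 5]))
            remaining = remaining[start + 5:]  # no -1: keep the tail intact
        except ValueError:
            # Fall back to a single-digit integer such as '1'
            values.append(float(remaining[start:start + 1]))
            remaining = remaining[start + 1:]
    return values

print('4.354'[0:-1])  # 4.35  (end index -1 excludes the last character)
print('4.354'[0:])    # 4.354
print(extract_values('BEST POSITION:P(0) = 1 P(1) = 2.345 P(2) = 3.145 P(3) = 4'))
# [1.0, 2.345, 3.145, 4.0]
```

The fixed widths (5 characters, then 1) are kept from the question, so values of any other length would still need a different approach.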

Answer 3

This outputs all numbers in the string that are not inside parentheses:

import re
re.findall(r'[^(]([\d.]+)', string_in)

Example:

import re

string_in = 'BEST POSITION:P(0) = 1.124 P(1) = 2.345 P(2) = 3.145 P(3) = 4.354'
print(re.findall(r'[^(]([\d.]+)', string_in))
# ['1.124', '2.345', '3.145', '4.354']

string_in = 'BEST POSITION:P(0) = 1 P(1) = 2.345 P(2) = 3.145 P(3) = 4'
print(re.findall(r'[^(]([\d.]+)', string_in))
# ['1', '2.345', '3.145', '4']
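Note that findall returns strings, so they still have to be converted before they go into a numeric array. As a sketch of a stricter variant (the pattern and the name parse_positions are assumptions for illustration, not part of the answer above), the regex can be tied to the 'P(n) =' labels. That also handles multi-digit indices such as P(10), where the pattern above would pick up part of the index.

```python
import re

# Assumed pattern: a 'P(n) =' label followed by an optionally signed decimal number
VALUE_PATTERN = re.compile(r'P\(\d+\)\s*=\s*([-+]?\d+(?:\.\d+)?)')

def parse_positions(text):
    """Hypothetical helper: return the value after each 'P(n) =' label as a float."""
    return [float(match) for match in VALUE_PATTERN.findall(text)]

print(parse_positions('BEST POSITION:P(0) = 1 P(1) = 2.345 P(2) = 3.145 P(3) = 4'))
# [1.0, 2.345, 3.145, 4.0]
print(parse_positions('P(10) = -0.5'))
# [-0.5]
```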

Unexpected behaviour in Python string parsing
