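I have a text string that contains numbers, and I also have a list of numbers. I want to replace the numbers in the string with the numbers from the list, following the order in which they appear in the string and in the list.
Using a regular expression, I extracted the existing numbers from the string into a list as well, so I now have a one-to-one correspondence between the original numbers and the replacement numbers. However, it is still unclear to me how to locate each number in the string and perform the replacements in order.
With this line, I extract the numbers from the given string: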
list_of_numbers_in_string = [int(x) for x in re.findall(r'\d+', data)]
Now I would like to know how to use it, or another way to get the desired result from this input:
data = 'readingOrder {index:24;} person {offset:0; length:7;} textStyle {offset:0; length:7; underlined:true;} place {offset:52; length:8;} textStyle {offset:52; length:8; underlined:true;}'
new_numbers = [24, 0, 12, 0, 12, 58, 14, 58, 14]
to get this output:
corrected_data = 'readingOrder {index:24;} person {offset:0; length:12;} textStyle {offset:0; length:12; underlined:true;} place {offset:58; length:14;} textStyle {offset:58; length:14; underlined:true;}'
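Answer 0 (score: 1)
The accepted answer (now deleted) was actually incorrect. data.replace() replaces the first occurrence of a number in the string, which is not always the right one. For example, when you try to replace 8 with 14, it actually turns the 58 that was inserted earlier into 514.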
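To make the failure concrete, here is a minimal sketch of the first-occurrence approach on a shortened version of the input. The shortened string and the names `old_numbers` and `broken` are only for illustration and are not taken from the deleted answer.

```python
# Shortened, illustrative input: two numbers to replace.
data = 'place {offset:52; length:8;}'
old_numbers = ['52', '8']
new_numbers = [58, 14]

broken = data
for old, new in zip(old_numbers, new_numbers):
    # Replaces the first occurrence of `old` anywhere in the string,
    # including inside a number that was already replaced.
    broken = broken.replace(old, str(new), 1)

print(broken)
# place {offset:514; length:8;}
```

After the first step the string contains 58, so the search for 8 hits the 8 inside 58 instead of the standalone length value.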
Here is my solution:
import re
data = 'readingOrder {index:24;} person {offset:0; length:7;} textStyle {offset:0; length:7; underlined:true;} place {offset:52; length:8;} textStyle {offset:52; length:8; underlined:true;}'
new_numbers = [24, 0, 12, 0, 12, 58, 14, 58, 14]
offset = 0
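# finditer scans the original string, so every match position has to be
# shifted by the length difference accumulated from earlier replacements.
for index, match in enumerate(re.finditer(r'\d+', data)):
    replacement = str(new_numbers[index])
    data = data[:match.start() + offset] + replacement + data[match.end() + offset:]
    offset += len(replacement) - (match.end() - match.start())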
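As a more compact variant of the same idea (a different technique from the offset bookkeeping above, not part of this answer), `re.sub` accepts a function as the replacement, so each match can pull the next value from an iterator. This is a sketch that assumes `new_numbers` has at least as many entries as there are numbers in the string; the shortened input and the name `replacements` are illustrative.

```python
import re

data = 'person {offset:0; length:7;} place {offset:52; length:8;}'
new_numbers = [0, 12, 58, 14]

# One shared iterator: each match consumes the next replacement value.
replacements = iter(new_numbers)
result = re.sub(r'\d+', lambda match: str(next(replacements)), data)

print(result)
# person {offset:0; length:12;} place {offset:58; length:14;}
```

Because `re.sub` walks the original string once and builds the result itself, no manual offset tracking is needed. If the list runs out before the matches do, `next` raises `StopIteration`.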
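Answer 1 (score: 0)
If you operate on data directly as a string, one new number at a time (i.e. in a for loop you do data = operate on data), the time complexity will most likely be O(len(new_numbers) * len(data)), because every step builds a new string.
An efficient way to do this in O(len(data)) time is to operate on a list of characters: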
def replace_numbers(data, new_numbers):
    new_numbers_idx = 0
    data_as_char_list = []
    skip = False
    for data_ch in data:
        if data_ch.isdigit():
            if not skip:
                # When we encounter the first digit of any number in data, we add the new number,
                # and in the following iterations we skip the rest of that number's digits.
                # e.g. data = 'hi123hi', new_numbers = [444]: when we encounter `1` we add
                # ['4', '4', '4'] and skip the remaining digits '2' and '3' by setting skip = True.
                new_number_as_char_list = list(str(new_numbers[new_numbers_idx]))
                data_as_char_list.extend(new_number_as_char_list)
                new_numbers_idx += 1
                skip = True
        else:
            data_as_char_list.append(data_ch)
            skip = False
    return ''.join(data_as_char_list)
data = 'readingOrder {index:24;} person {offset:0; length:7;} textStyle {offset:0; length:7; underlined:true;} place {offset:52; length:8;} textStyle {offset:52; length:8; underlined:true;}'
new_numbers = [24, 0, 12, 0, 12, 58, 14, 58, 14]
data = replace_numbers(data, new_numbers)
corrected_data = 'readingOrder {index:24;} person {offset:0; length:12;} textStyle {offset:0; length:12; underlined:true;} place {offset:58; length:14;} textStyle {offset:58; length:14; underlined:true;}'
assert data == corrected_data
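The same build-pieces-then-join-once idea can also be sketched with `re.split`, which returns the text between the numbers so that the new numbers can be interleaved with it. The function name `replace_numbers_split` and the count check are assumptions added for illustration, not part of the answer above.

```python
import re
from itertools import zip_longest

def replace_numbers_split(data, new_numbers):
    # Text fragments between the digit runs; n numbers give n + 1 fragments.
    parts = re.split(r'\d+', data)
    if len(parts) - 1 != len(new_numbers):
        raise ValueError('number of replacements does not match the string')
    pieces = []
    # The last fragment has no number after it, so it is paired with ''.
    for text, number in zip_longest(parts, new_numbers, fillvalue=''):
        pieces.append(text)
        pieces.append(str(number))
    return ''.join(pieces)

print(replace_numbers_split('hi123hi', [444]))
# hi444hi
```

The explicit length check turns a mismatch between the string and the list into an immediate error instead of a partially replaced result.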
Answer 2 (score: 0)
An alternative:
import re
data = 'readingOrder {index:24;} person {offset:0; length:7;} textStyle {offset:0; length:7; underlined:true;} place {offset:52; length:8;} textStyle {offset:52; length:8; underlined:true;}'
new_numbers = [24, 0, 12, 0, 12, 58, 14, 58, 14]
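x = re.findall(r"\d+", data)
# Escape the literal braces so that str.format() leaves them alone.
data = data.replace("{", "{{").replace("}", "}}")
for n in x:
    # Turn the first remaining occurrence of each number into a placeholder.
    data = data.replace(n, "{}", 1)
data = data.format(*new_numbers)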
print(data)
[out]:
readingOrder {index:24;} person {offset:0; length:12;} textStyle {offset:0; length:12; underlined:true;} place {offset:58; length:14;} textStyle {offset:58; length:14; underlined:true;}