How to add to the number near the end of each line

Time: 2011-03-15 04:31:41

Tags: python regex perl sed awk

Suppose a file contains the following text:

(bookmarks
("Chapter 1 Introduction 1" "#1"
("1.1 Problem Statement and Basic Definitions 2" "#2")
("1.2 Illustrative Examples 4" "#4")
("1.3 Guidelines for Model Construction 26" "#26")
("Exercises 30" "#30")
("Notes and References 34" "#34"))
)

How can I add 11 to the last number on each line, for every line that has one, i.e.

(bookmarks
("Chapter 1 Introduction 1" "#12"
("1.1 Problem Statement and Basic Definitions 2" "#13")
("1.2 Illustrative Examples 4" "#15")
("1.3 Guidelines for Model Construction 26" "#37")
("Exercises 30" "#41")
("Notes and References 34" "#45"))
)

using sed, awk, python, perl, regex, ...?

Thanks and regards!

9 answers:

Answer 0 (score: 5)

awk -F'#' 'NF>1{split($2,a,"[0-9]+");print $1 FS $2+11 a[2];next}1' infile

Proof of concept

$ awk -F'#' 'NF>1{split($2,a,"[0-9]+");print $1 FS $2+11 a[2];next}1' infile
(bookmarks
("Chapter 1 Introduction 1" "#12"
("1.1 Problem Statement and Basic Definitions 2" "#13")
("1.2 Illustrative Examples 4" "#15")
("1.3 Guidelines for Model Construction 26" "#37")
("Exercises 30" "#41")
("Notes and References 34" "#45"))
)

Answer 1 (score: 4)

use strict;
use warnings;
while(my $line = <DATA>){
  # replace the number after '#' with that number plus 11
  $line =~ s/#(\d+)/'#'.($1 + 11)/e;
  print $line;
}
__DATA__
(bookmarks
("Chapter 1 Introduction 1" "#1"
("1.1 Problem Statement and Basic Definitions 2" "#2")
("1.2 Illustrative Examples 4" "#4")
("1.3 Guidelines for Model Construction 26" "#26")
("Exercises 30" "#30")
("Notes and References 34" "#34"))
)

Output:

(bookmarks
("Chapter 1 Introduction 1" "#12"
("1.1 Problem Statement and Basic Definitions 2" "#13")
("1.2 Illustrative Examples 4" "#15")
("1.3 Guidelines for Model Construction 26" "#37")
("Exercises 30" "#41")
("Notes and References 34" "#45"))
)

Answer 2 (score: 2)

In Python, try:

import re
m = re.search(r'(?<=#)([0-9]+)',txt)

to find the next number. Then set:

txt = txt[:m.start()] + str(int(m.group())+11) + txt[m.end():]

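Repeat this (for example in a while loop) until search finds no further match. Each search has to start after the previous replacement (for instance via the pos argument of a compiled pattern's search method); otherwise the number that was just rewritten is matched again.

Note: the regexp (?<=#)([0-9]+) matches any sequence of digits that follows a # character. start() gives the start position of the match, end() gives its end position, and group() gives the matched text. The expression str(int(m.group())+11) converts the matched number to an int, adds 11, and converts the result back to a string.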
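The loop described above could be sketched as follows. This is only a sketch under stated assumptions: the helper name add_to_page_numbers and the sample string are made up for illustration, and each search resumes at the end of the previous replacement so that no number is incremented twice.

```python
import re

# A number that directly follows a '#' character.
PAGE = re.compile(r'(?<=#)([0-9]+)')

def add_to_page_numbers(txt, delta=11):
    """Add delta to every number that follows '#' (hypothetical helper)."""
    pos = 0
    while True:
        m = PAGE.search(txt, pos)  # resume after the previous replacement
        if m is None:
            break
        new = str(int(m.group()) + delta)
        txt = txt[:m.start()] + new + txt[m.end():]
        pos = m.start() + len(new)  # skip past the number just written
    return txt

sample = '("Exercises 30" "#30")\n("Notes and References 34" "#34"))'
print(add_to_page_numbers(sample))
```

A single call such as PAGE.sub(lambda m: str(int(m.group()) + 11), txt) should give the same result without an explicit loop; the loop form is shown only because it mirrors the steps in the answer.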

Answer 3 (score: 1)

If you can use Ruby (1.9+):

$ ruby -ne 'puts $_=/#/?$_.gsub(/(.*#)(\d+)(.*)/){"#{$1}"+($2.to_i+11).to_s+"#{$3}"}:$_' file
(bookmarks
("Chapter 1 Introduction 1" "#12"
("1.1 Problem Statement and Basic Definitions 2" "#13")
("1.2 Illustrative Examples 4" "#15")
("1.3 Guidelines for Model Construction 26" "#37")
("Exercises 30" "#41")
("Notes and References 34" "#45"))
)

Answer 4 (score: 1)

In Python:

import re

dh = '''"Chapter 1 Introduction 1" "#1"
"1.1 Problem Statement and Basic Definitions 2" "#2"
"1.2 Illustrative Examples 4" "#4"
"1.3 Guidelines for Model Construction 26" "#26"
"Exercises 30" "#30"
"Notes and References 34" "#34"'''

# group 1: everything up to and including "#; group 2: the last number in the
# title, which \2 requires to appear again after the '#'
pat = re.compile(r'^(".+?(\d+)" *"#)\2" *$', re.M)

def zoo(mat):
    return '%s%s"' % (mat.group(1), str(int(mat.group(2)) + 11))

print(dh)
print()
print(pat.sub(zoo, dh))

Result:

"Chapter 1 Introduction 1" "#1"
"1.1 Problem Statement and Basic Definitions 2" "#2"
"1.2 Illustrative Examples 4" "#4"
"1.3 Guidelines for Model Construction 26" "#26"
"Exercises 30" "#30"
"Notes and References 34" "#34"

"Chapter 1 Introduction 1" "#12"
"1.1 Problem Statement and Basic Definitions 2" "#13"
"1.2 Illustrative Examples 4" "#15"
"1.3 Guidelines for Model Construction 26" "#37"
"Exercises 30" "#41"
"Notes and References 34" "#45"

But starting from the earlier string that you posted in your other question:

eh = '''Chapter 3 Convex Functions 97 
3.1 Definitions 98  
3.2 Basic Properties 103'''

# group 1: the whole line without trailing spaces; group 2: its last number
pat = re.compile(r'^(.+?(\d+)) *$', re.M)

def zaa(mat):
    return '"%s" "%s"' % (mat.group(1), str(int(mat.group(2)) + 11))

print(eh)
print()
print(pat.sub(zaa, eh))

Result:

Chapter 3 Convex Functions 97 
3.1 Definitions 98  
3.2 Basic Properties 103

"Chapter 3 Convex Functions 97" "108"
"3.1 Definitions 98" "109"
"3.2 Basic Properties 103" "114"

Is this all homework?

Edit:

I corrected the first code above:

dh = '''(bookmarks
("Chapter 1 Introduction 1" "#1")
("1.1 Problem Statement and Basic Definitions 2" "#2")
("1.2 Illustrative Examples 4" "#4")
("1.3 Guidelines for Model Construction 26" "#26")
("Exercises 30" "#30")
("Notes and References 34" "#34"))
)'''

# group 3 keeps the closing parenthesis (or two) at the end of the line
pat = re.compile(r'^(\(".+?(\d+)" *"#)\2" *(\)\)?)$', re.M)

def zoo(mat):
    return '%s%s"%s' % (mat.group(1), str(int(mat.group(2)) + 11), mat.group(3))

print(dh)
print()
print(pat.sub(zoo, dh))

Result:

(bookmarks
("Chapter 1 Introduction 1" "#1")
("1.1 Problem Statement and Basic Definitions 2" "#2")
("1.2 Illustrative Examples 4" "#4")
("1.3 Guidelines for Model Construction 26" "#26")
("Exercises 30" "#30")
("Notes and References 34" "#34"))
)

(bookmarks
("Chapter 1 Introduction 1" "#12")
("1.1 Problem Statement and Basic Definitions 2" "#13")
("1.2 Illustrative Examples 4" "#15")
("1.3 Guidelines for Model Construction 26" "#37")
("Exercises 30" "#41")
("Notes and References 34" "#45"))
)

Answer 5 (score: 1)

From my answer to your previous question:

awk '{n = $NF + 11; print "(\"" $0 "\" \"#" n "\")"}' inputfile

awk 'BEGIN {q="\x22"} {n = $NF + 11; print "(" q $0 q " " q "#" n q ")"}' inputfile

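This works for the data you gave in your previous question. I can't tell how you got from that question to the example you posted in this one, since the parentheses are nested differently. You also haven't said whether the (bookmarks ) wrapper is present in the original input, or whether some code we haven't seen adds it along with the other things.

What you're doing is starting to look a bit like XML. Maybe you should use the real thing and the appropriate tools to manipulate it.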

Answer 6 (score: 1)

Python:

import re
file_name = "bin/SO/bookmarks.txt"

print("unmodified file:")
with open(file_name) as f:
    for line in f:
        print(line.rstrip())

print()

print("modified file:")
i = 11  # amount to add to each page number
with open(file_name) as f:
    for line in f:
        # group 1: text up to and including "#; group 2: the number; group 3: the rest
        m = re.match(r'(^.*"#)(\d+)(.*$)', line)
        if m:
            new_line = m.group(1) + str(int(m.group(2)) + i) + m.group(3)
            print(new_line)
        else:
            print(line.rstrip())

Output:

unmodified file:
(bookmarks
("Chapter 1 Introduction 1" "#1"
("1.1 Problem Statement and Basic Definitions 2" "#2")
("1.2 Illustrative Examples 4" "#4")
("1.3 Guidelines for Model Construction 26" "#26")
("Exercises 30" "#30")
("Notes and References 34" "#34"))
)

modified file:
(bookmarks
("Chapter 1 Introduction 1" "#12"
("1.1 Problem Statement and Basic Definitions 2" "#13")
("1.2 Illustrative Examples 4" "#15")
("1.3 Guidelines for Model Construction 26" "#37")
("Exercises 30" "#41")
("Notes and References 34" "#45"))
)

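Answer 7 (score: 1)

This syntax is s-expressions (sexps for short), which are easiest to manipulate in Lisp and related languages such as Scheme. Easiest for complex tasks, that is; if you can assume that your input is tame enough (e.g. no "# in chapter titles, line breaks where you show them, and so on), then text-processing tools (as shown in the other answers) are preferable for this task.

In Lisp or Scheme, reading and writing data as structured data is as simple as (read) and (write data). Other things are not so easy; for example, there is no standard way to read command-line arguments in Lisp or Scheme.

Here is a Lisp program that performs the required transformation. It treats the data as structured data, so you don't have to worry about the presentation. The first line, which gets the first command-line argument, is specific to CLisp; the rest is portable Common Lisp.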

(setq delta (parse-integer (car ext:*args*)))
(defun shift-page (page)
  (format nil "#~D" (+ delta (parse-integer page :start 1))))
(defun shift-pages (entry)
  (let ((title (car entry))
        (page (cadr entry))
        (subentries (cddr entry)))
    (cons title (cons (shift-page page) (mapcar #'shift-pages subentries)))))
(let ((toc (read)))
  (write (cons 'bookmarks (mapcar #'shift-pages (cdr toc)))))
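For readers who do not use Lisp, the same structured idea could be sketched in Python. This is a rough sketch and not a translation of the program above: the tiny reader (parse), the writer (dump) and the sample input are made up for illustration, and the reader assumes well-formed input in which every entry is a title, a page string and optional subentries.

```python
import re

# Tokens: parentheses, double-quoted strings, or bare symbols such as bookmarks.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')

def parse(text):
    """Parse one s-expression into nested lists; strings keep their quotes."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != '(':
            return tok
        items = []
        while tokens[pos] != ')':
            items.append(read())
        pos += 1  # skip the closing ')'
        return items

    return read()

def shift_pages(entry, delta):
    """entry is [title, page, *subentries]; page looks like '"#12"'."""
    title, page, *subentries = entry
    new_page = '"#%d"' % (int(page[2:-1]) + delta)
    return [title, new_page] + [shift_pages(e, delta) for e in subentries]

def dump(node):
    """Write nested lists back out as a one-line s-expression."""
    if isinstance(node, list):
        return '(' + ' '.join(dump(n) for n in node) + ')'
    return node

toc = parse('(bookmarks ("Chapter 1 Introduction 1" "#1" ("1.1 Problem Statement 2" "#2")))')
shifted = [toc[0]] + [shift_pages(e, 11) for e in toc[1:]]
print(dump(shifted))
```

As with (write data) in the Lisp version, the output does not preserve the line breaks of the input, which is the price of treating the bookmarks as data rather than as text.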

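Answer 8 (score: 1)

Emacs Lisp

Preliminaries

Here we will use functions from the third-party libraries dash and s, which you can install in Emacs from MELPA using Emacs's package system (see "How to install packages in Emacs"). dash is a list and tree manipulation library; it also contains various functions (functionals) that make code more concise. s is a string manipulation library. If you write Elisp code often, I strongly recommend installing these packages to simplify your coding.

-map is the same as mapcar: it walks a list, calls a function on each element, and returns a list of the results. E.g. (-map '1+ '(1 2 3)) ; returns (2 3 4). However, -map also has an anaphoric macro version, which lets you write concise code instead of passing lambdas. Anaphoric versions start with 2 dashes. E.g. (--map (+ 10 it) '(1 2 3)) is equivalent to (-map (lambda (x) (+ 10 x)) '(1 2 3)).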

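->> is a threading macro from dash; it resembles function composition, but in reverse order. E.g. (->> '(1 2 3) (-map '1+) -sum number-to-string) is equivalent to (number-to-string (-sum (-map '1+ '(1 2 3)))), which returns "9".

String approach

Suppose you have the whole structure stored in a string s. Then you have to use a regular expression to find every character sequence of the form #[some_number] and replace it with the number increased by 11.

(let* ((old (-map 'car (s-match-strings-all "#[0-9]\\{1,3\\}" s)))
       (new (--map (->> it
                        (s-chop-prefix "#")
                        string-to-number
                        (+ 11)
                        number-to-string
                        (s-prepend "#"))
                   old)))
  (s-replace-all (-zip old new) s))

Tree approach

But wait, your structure is recursively enclosed in parentheses: it is an s-expression! We can traverse it as a tree and replace every string that starts with # and contains a numeric value with a new value increased by 11. -tree-map-nodes is like -map for trees. It applies the function only where the predicate returns true. IOW, it leaves unchanged the elements that do not satisfy the predicate.

-tree-map-nodes recurses both in breadth and in depth. That means it treats lists as ordinary elements too, the first element being the whole list. E.g. (-tree-map-nodes 'zerop '1+ '(0 (1 (0) 1) 0)) is incorrect and throws this error: *** Eval error *** Wrong type argument: numberp, (0 (1 (0) 1) 0). Instead, you should first check whether the element is a number. E.g. (--tree-map-nodes (and (numberp it) (zerop it)) (1+ it) '(0 (1 (0) 1) 0)) returns (1 (1 (1) 1) 1). Suppose your tree is in the variable q. The solution then returns a new, modified s-expression.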
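The tree idea can also be illustrated outside Emacs. The following Python sketch is not the answer's Elisp code and does not use dash; tree_map_nodes, is_page, shift and the sample tree q are hypothetical names chosen for illustration, with nested Python lists standing in for the s-expression. As in the note above, the predicate is also called on whole lists, so it checks the node type first.

```python
import re

def tree_map_nodes(pred, fun, tree):
    """Apply fun to each node that satisfies pred; otherwise descend into lists."""
    if pred(tree):
        return fun(tree)
    if isinstance(tree, list):
        return [tree_map_nodes(pred, fun, node) for node in tree]
    return tree  # leaf that does not satisfy pred: leave unchanged

def is_page(node):
    # Check the type first: the predicate is also called on whole lists.
    return isinstance(node, str) and re.fullmatch(r'#[0-9]+', node) is not None

def shift(node, delta=11):
    return '#%d' % (int(node[1:]) + delta)

# Assumed sample tree, modelled on the bookmarks in the question.
q = ['bookmarks',
     ['Chapter 1 Introduction 1', '#1',
      ['1.1 Problem Statement and Basic Definitions 2', '#2'],
      ['Exercises 30', '#30']]]

print(tree_map_nodes(is_page, shift, q))
```

The function builds a new tree and leaves q itself untouched.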