My files contain single-line notes that include links to other notes in the form
>filename_without_extension:line_nr
For example:
m01.txt:
Line 1. >m02:2
Line 2. >m02:3
Line 3.
m02.txt:
Line 1.
Line 2. >m01:3
Line 3. >m01:1 >m01:3
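To make the link format concrete, here is a minimal sketch that lists every link in a note file together with the line it sits on. The function name list_links is hypothetical (it is not part of my script), and the sketch assumes GNU grep and note names without spaces or colons.

```shell
#!/bin/bash
# list_links FILE: print "source_line:link" for every link found in FILE.
# Hypothetical helper; assumes links look like >name:number.
list_links() {
    # -n: prefix each match with its line number
    # -o: print only the matching part, one match per output line
    # -E: extended regex, so + needs no backslash
    grep -noE '>[^ :>]+:[0-9]+' "$1"
}
```

For m02.txt above, list_links m02.txt prints 2:>m01:3, 3:>m01:1 and 3:>m01:3, one per line.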
I want to automatically add wiki-style "backlinks" to every linked line that does not have them yet. So the desired output would look like this:
m01.txt:
Line 1. >m02:2 >m02:3
Line 2. >m02:3
Line 3. >m02:3 >m02:2
m02.txt:
Line 1.
Line 2. >m01:3 >m01:1
Line 3. >m01:1 >m01:3 >m01:2
I came up with something pretty ugly using sed that does not work. It is supposed to iterate over all files in my notes directory:
link_regex=$(sed -e '/(\>m[0-9]+\:[0-9]+?)+?/p')
linenr_from_link_regex=$(sed -e '/\>m[0-9]+?\:/d')
fname_from_cur_link=$(sed -e '/\:[0-9]+?\b/d;/\.txt/a')
link_from_f=$(sed -e '/^/\>/g;/\.txt$/d;/\:=/a' < "$f")
new_link_to_cur_f=$(sed -i "${linenr_fom_cur_link}a\ ${link_from_f}" ${fname_from_cur_link})
function create-cross-references () {
while read line; do
echo "$link_regex" | \ # look up links
echo "$linenr_from_link_regex" # pipe to get line number from current link
echo "$fname_from_cur_link" # turn current link to new file name
echo "$link_from_f" # turn current file name name to new link
echo "$new_link_to_cur_f" # add new link to current fname
done
}
for f in *.txt; do
create-cross-references
done
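Where did I go wrong? Also, what would be a more sensible solution (preferably still using sed) that avoids stepping through every line of my notes folder (including the lines without links) each time? Thanks for your help!
Answer 0 (score: 1)
You could try something like this: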
#!/bin/bash
function getlinks() {
    # $1 must be something like >m01:1
    # -F/-w match the link literally and as a whole word, so >m01:1 does not also match >m01:10
    # -H prints the file name even if there is only one .txt file
    grep -HFw "$1" *.txt | sed -e 's/\(.*\)\.\(.*\):Line \([0-9]\+\)..*/>\1:\3 /' |
    # all matches in one single line
    tr -d '\n'
}
for fileName in *.txt; do
    echo "$fileName:"
    while IFS= read -r line; do
        # Line 1. whatever ==> 1
        lineNumber=$(echo "$line" | grep -Po '(?<=Line )[0-9]+(?=\.)')
        # m01.txt ==> >m01
        fileNameFormatted=$(echo "$fileName" | sed -e 's/\(.*\)\..*/>\1/')
        links=$(getlinks "$fileNameFormatted:$lineNumber")
        echo "$line $links"
    done < "$fileName"
done
Output:
m01.txt:
Line 1. >m02:2 >m02:3
Line 2. >m02:3
Line 3. >m02:2 >m02:3
m02.txt:
Line 1.
Line 2. >m01:3 >m01:1
Line 3. >m01:1 >m01:3 >m01:2
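One pitfall worth checking when looking up links with grep: a plain pattern such as >m01:1 is also a prefix of >m01:10, so a note with ten or more lines can pick up wrong backlinks. The sketch below, assuming GNU grep and space-separated links, shows how -F (fixed string) and -w (whole-word match) avoid that. The function name find_backlink_sources is hypothetical.

```shell
#!/bin/bash
# find_backlink_sources LINK FILE...: print "file:line" for every line
# that contains LINK as a whole word (hypothetical helper).
find_backlink_sources() {
    link=$1
    shift
    # -F: treat the link as a fixed string, not a regex
    # -w: the match must not be followed by another word character,
    #     so >m01:1 is not found inside >m01:10
    # -H/-n: always print the file name and the line number
    # cut keeps only "file:line" (assumes file names contain no colon)
    grep -HnFw -- "$link" "$@" | cut -d: -f1,2
}
```

For instance, find_backlink_sources '>m01:1' *.txt reports only the lines that link to line 1 of m01, not those that link to line 10.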
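Edit: in response to @martt's comment,
[...] could you remove the "Line 1" prefix from the regex? The lines actually just contain random text + links (like Blablalbla. >m01:1; that was a misleading example). Also, how can I write the changes back to the real files?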
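I made a few changes to the original script:
Line numbers are no longer present in the text files, so a variable ($lineNumber) is needed to keep count.
If the script is run more than once, the cross-links would be duplicated, so that has to be avoided.
The result must be stored back in the same file.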
#!/bin/bash
for fileName in *.txt; do
    # The "Line N." prefix is gone, so we have to keep count of the lines processed
    lineNumber=1
    # transform m01.txt into >m01
    fileNameFormatted=$(echo "$fileName" | sed -E 's/(.*)\..*/>\1/')
    # start from an empty temporary file
    : > "$fileName.tmp"
    while IFS= read -r line; do
        links=$(
            # search for occurrences of >filename:lineNumber; grep -Hn returns something like
            #   m02.txt:3:whatever. >m01:1 >m01:3
            # in this example we take the file name (m02) and the line number (3),
            # adding '>' and ':'. Result: >m02:3
            # -F/-w match the link literally and as a whole word, so >m01:1 does not match >m01:10
            grep -HnFw "$fileNameFormatted:$lineNumber" *.txt |
            sed -E 's/^([^:]*)\.[^.:]*:([0-9]+):.*/>\1:\2/' |
            # replace newlines with spaces
            tr '\n' ' ')
        # skipping duplicates:
        links=$(
            # merge the existing line with the links found
            printf '%s\n' "$line $links" |
            # strip everything up to the last dot (assumes the note text ends with a dot)
            sed -E 's/.*\.//' |
            # put each link on its own line and drop empty lines
            tr -s ' ' '\n' | sed '/^$/d' |
            # remove duplicates: >m02:2 >m02:2 >m03:3 ==> >m02:2 >m03:3
            sort -u |
            # join the links again, each one preceded by a space
            sed 's/^/ /' | tr -d '\n')
        # remove everything after the last dot:
        # Blabla. >m02:2 >m03:3 ==> Blabla
        line=$(printf '%s\n' "$line" | sed 's/\(.*\)\..*/\1/')
        # merge both strings and append them to the temporary file
        printf '%s\n' "$line.$links" >> "$fileName.tmp"
        lineNumber=$((lineNumber + 1))
    done < "$fileName"
    # replace the original file
    mv "$fileName.tmp" "$fileName"
done
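As for the request to avoid stepping through lines that have no links: one possible alternative, sketched here under assumptions rather than as a tested drop-in, is to let a single grep call collect all links first and then touch only the target lines with sed. The function name add_backlinks is hypothetical. The sketch assumes GNU grep and GNU sed (for -i), note names without spaces, colons, '|' or '&', and the .txt extension.

```shell
#!/bin/bash
# add_backlinks DIR: add missing backlinks to every *.txt note in DIR.
# Hypothetical helper; a sketch under the assumptions stated above.
add_backlinks() {
    (
        cd "$1" || exit 1
        # One grep pass lists every link as "file.txt:line:>target:targetline";
        # lines without links never reach the shell loop.
        links=$(grep -HnoE '>[^ :>]+:[0-9]+' -- *.txt)
        [ -n "$links" ] || exit 0
        printf '%s\n' "$links" |
        while IFS=: read -r src srcLine target targetLine; do
            # m01.txt, line 1 ==> >m01:1
            backlink=">${src%.txt}:$srcLine"
            # >m02 ==> m02.txt
            targetFile="${target#>}.txt"
            # skip links that point to a missing file or to line 0
            [ -f "$targetFile" ] && [ "$targetLine" -ge 1 ] || continue
            # append the backlink only if the target line does not carry it yet
            if ! sed -n "${targetLine}p" "$targetFile" | grep -qFw -- "$backlink"; then
                sed -i "${targetLine}s|\$| $backlink|" "$targetFile"
            fi
        done
    )
}
```

Calling add_backlinks . in the notes directory edits the files in place. Because existing backlinks are checked before appending, running it a second time should leave the files unchanged. The links are collected into a variable before any file is edited so that grep never reads a half-updated file.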