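I have a file that consists of one long line. I only need to save the @names, but I can't get them with grep. For example, in the text below, I need to save the names mrvortex and MurphTWN in an array.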
\"\/\u003e\n @// \u003cstrong class=\"fullname js-action-profile-name\"\u003eMartin Belanger\u003c\/strong\u003e\n @234/ \u003cspan class=\"username js-action-profile-name\"\u003e@mrvortex\u003c\/span\u003e\n \n \u003c\/a\u003e\n \u003c\/div\u003e\n \u003cp class=\"bio \"\u003e\n Meteorologist and Sr. Manager, TV & Cross-Platform Technologies at Pelmorex Media\n \u003c\/p\u003e\n\n \n\n\n \u003c\/div\u003e\n\u003c\/div\u003e\n\n\n\u003c\/li\u003e\n\n \u003cli class=\"js-stream-item stream-item stream-item\n\" data-item-id=\"151623861\" id=\"stream-item-user-151623861\" data-item-type=\"user\"\u003e\n \n\u003cdiv class=\"account js-actionable-user js-profile-popup-actionable \" data-screen-name=\"MurphTWN\" data-user-id=\"151623861\" data-feedback-token=\"\" data-impression-id=\"\" \u003e\n \n\n \u003cdiv class=\"user-actions btn-group not-following \" data-user-id=\"151623861\"\n data-screen-name=\"MurphTWN\" data-name=\"Chris Murphy TWN\" data-protected=\"false\"\u003e\n\n\n\n \n\n\n \u003cbutton class=\"user-actions-follow-button js-follow-btn follow-button btn\" type=\"button\"\u003e\n \u003cspan class=\"button-text follow-text\"\u003e\n \u003cspan class=\"Icon Icon--follow\"\u003e\u003c\/span\u003e Seguir \n \n \u003c\/span\u003e\n \u003cspan class=\"button-text following-text\"\u003e\n Siguiendo\n \n \u003c\/span\u003e\n \u003cspan class=\"button-text unfollow-text\"\u003e\n Dejar de seguir\n \n \u003c\/span\u003e\n \u003cspan class=\"button-text blocked-text\"\u003eBloqueado\u003c\/span\u003e\n \u003cspan class=\"button-text unblock-text\"\u003eDesbloquear\u003c\/span\u003e\n \u003cspan class=\"button-text pending-text\"\u003ePendiente\u003c\/span\u003e\n \u003cspan class=\"button-text cancel-text\"\u003eCancelar\u003c\/span\u003e\n\u003c\/button\u003e\n\n\n\n\u003c\/div\u003e\n\n\n\n \u003cdiv class=\"content\"\u003e\n \u003cdiv class=\"stream-item-header\"\u003e\n \u003ca class=\"account-group js-user-profile-link\" href=\"\/MurphTWN\" \u003e\n \u003cimg 
class=\"avatar js-action-profile-avatar \" src=\"https:\/\/pbs.twimg.com\/profile_images\/512972504411828224\/sM3noxz7_normal.jpeg\" alt=\"\" data-user-id=\"151623861\"\/\u003e\n \u003cstrong class=\"fullname js-action-profile-name\"\u003eChris Murphy TWN\u003c\/strong\u003e\u003cspan class=\"Icon Icon--verified Icon--small\"\u003e\u003cspan class=\"u-hiddenVisually\"\u003eCuenta verificada\u003c\/span\u003e\u003c\/span\u003e\n\n \u003cspan class=\"username js-action-profile-name\"\u003e@MurphTWN\u003c\/span\u003e\n \n \u003c\/a\u003e\n \u003c\/div\u003e\n \u003cp class=\"bio \"\u003e\n
Answer 0 (score: 1)
This may do it:
awk -v RS="@" 'NR>1{$1=$1;n=split($1,a,"[^a-zA-Z]");if (a[1]) print a[1]}' file
mrvortex
MurphTWN
Or with GNU awk (note RS="u003e@"; GNU awk supports multiple characters in RS):
awk -v RS="u003e@" 'NR>1{$1=$1;split($1,a,"[^a-zA-Z]");print a[1]}' file
mrvortex
MurphTWN
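Since the question asks for the names in an array, the output of either awk command still has to be captured. The following is a minimal sketch of one way to do that in bash, using the RS="@" command from above. The sample string is a shortened stand-in for the long line, and the variable names are illustrative assumptions rather than anything from the original post.

```shell
#!/bin/bash
# Sketch only: the sample text and variable names are illustrative.
# Shortened stand-in for the long line; single quotes keep the backslashes literal.
sample='\n @// \u003cstrong\u003eMartin Belanger\u003c\/strong\u003e\n @234/ \u003cspan\u003e@mrvortex\u003c\/span\u003e\n \u003cspan\u003e@MurphTWN\u003c\/span\u003e\n'

# Run the RS="@" awk command on the sample and read one name per line
names=()
while IFS= read -r name; do
    names+=( "$name" )
done < <(printf '%s\n' "$sample" |
         awk -v RS="@" 'NR>1{$1=$1;n=split($1,a,"[^a-zA-Z]");if (a[1]) print a[1]}')

# Show how many names were stored and what they are
echo "${#names[@]} names: ${names[*]}"
```

To use it on the real data, replace the printf pipeline with the awk command reading from the file. The while/read loop stores each output line as one array element without word splitting or globbing.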
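Answer 1 (score: 0)
It isn't 100% clear what you need, but I believe you want to parse that long line and capture the names that follow the @ symbol (the actual names, not 234, etc.). You can do this with grep -o and pattern matching in BASH. Note that this is not the most elegant solution. In the script, I read the long line from a file. (You can modify that slightly to cat the line instead.) Let me know if you have questions: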
#!/bin/bash

## get the long line from a file
ifn=${1:-dat/longline.txt}

[[ -f $ifn && -r $ifn ]] || { echo "Error: file not found '$ifn'"; exit 1; }

declare -a names

## grep the long line, remove leading '@', store in array
for i in $(grep -o '@[[:alpha:]]\+' -- "$ifn"); do
    names+=( "${i#@}" )
done

## print array contents
for i in "${names[@]}"; do
    echo "names: $i"
done
Output:
$ bash atnames.sh
names: mrvortex
names: MurphTWN
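One limitation of the '@[[:alpha:]]\+' pattern is that it stops at the first character that is not a letter. The sketch below compares it with a wider pattern, under the assumption that a name may also contain digits and underscores after a leading letter or underscore. The sample names are made up for illustration and do not come from the original data.

```shell
#!/bin/bash
# Sketch only: the sample names below are made up for illustration.
# Assumption: a name starts with a letter or underscore and may then
# contain letters, digits and underscores.
sample='\n @234/ \u003e@mr_vortex99\u003c\/span\u003e \u003e@MurphTWN\u003c'

# Letters only, as in the script above: stops at the first non-letter
alpha_only=$(printf '%s\n' "$sample" | grep -o '@[[:alpha:]]\+')

# Wider pattern (extended regex, so no backslash-escaped operators are needed)
wider=$(printf '%s\n' "$sample" | grep -oE '@[[:alpha:]_][[:alnum:]_]*')

printf 'alpha only:\n%s\n' "$alpha_only"
printf 'wider:\n%s\n' "$wider"
```

Both patterns still skip @234/, because they require a letter (or, for the wider one, an underscore) right after the @. If the real names never contain digits or underscores, the original pattern is sufficient.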