Awk to create a directory, then a subdirectory with a zip

Time: 2019-03-07 11:37:55

Tags: awk

The awk below creates sub-directories inside a directory (the directory is always the last line of each block in file1, and blocks are separated by an empty line) if the number on line 2 of a file2 record (always in the format xx-xxxx) is found in $2 of file1. This is the current awk output.

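If there is a match and the sub-directory has been created in the directory, then the corresponding line 1 (the https line) in file2 is always the link to the zip file to download. I can't seem to get that link downloaded into the sub-folder and the .zip extracted there. The download code does execute and download the zip, but it has to be added to the terminal manually. I apologize for the long post; I wanted to provide all the details needed to solve this.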

file1

xxx_006 19-0000_xxx-yyy-aaa
xxx_007 19-0001_zzz-bbb-ccc
FolderName_001_001

yyyy_0287 19-0v02-xxx
yyyy_0289 19-0v31-xxxx
yyyy_0293 19-0v05-xxxx
FolderName_002_002

file2

https://xx.yy.zz/path/to/file.zip
19-0v05-xxx_000_001
 cc112233
https://xx.yy.zz/path/to/download/file.zip
19-0v31-xxx-001-000
bb4456784
https://xx.yy.zz/path/to/file.zip
19-0v02-xxx_000_001
aaa331232

awk edit

cmd_fmt='mkdir -p "%s/%s"
# run the awk command
awk -v cmd_fmt="$cmd_fmt" '
# create an associative array (key/value pairs) based on the file1
NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next } 

# retrieve the first 7-char of each line in file2 as the key to test against the above hash
{ k = substr($0, 1, 7) }

# if find k, then print
k in a { print a[k] "\t" $0 "\t" l }
# save prev line to 'l' which is supposed to be the URL
{ l = $0  } 
' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link; 
do
echo "download [$link] to '$base_dir/$sub_dir'"
# bash command lines to make sub-folders and download files
# create the format text used in sprintf() to run the desired shell commands
cd "%s/%s" && curl -O -v -k -X GET %s -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" && { filename="%s"; unzip "${filename##*/}"; }'
done

1 answer:

Answer 0 (score: 1)

I believe your question is related to this one: Bash loop to make directory, if numerical id found in file

You can run all the commands in a single awk system() call; you just need to organize them properly, for example:

# create the format text used in sprintf() to run the desired shell commands
cmd_fmt='mkdir -p "%s/%s" && cd "%s/%s" && curl -O -v -k -X GET %s -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxx" && { filename="%s"; unzip "${filename##*/}" && rm -f "${filename##*/}"; }'

# run the awk command
awk -v cmd_fmt="$cmd_fmt" '
    # create an associative array (key/value pairs) based on the file1
    NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next } 

    # retrieve the first 7-char of each line in file2 as the key to test against the above hash
    { k = substr($0, 1, 7) }

    # if find k, then run the system command    
    k in a { cmd = sprintf(cmd_fmt, a[k], $0, a[k], $0, l, l); print(cmd) }

    # save prev line to 'l' which is supposed to be the URL
    { l = $0  } 
' RS= file1 RS='\n' file2

Change print to system to execute the commands.
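Before switching to system(), it can help to check what lookup table the first rule builds from file1. The following is a minimal sketch, not part of the original answer: print_keys is a hypothetical helper name, and the sample data is copied from the question's file1. It only prints each 7-character key together with the folder name it maps to.

```shell
# print_keys FILE: list "key -> folder" pairs the way the NR==FNR rule stores them.
# (print_keys is a hypothetical helper used only for this illustration.)
print_keys() {
    awk '
        # RS= (empty) puts awk in paragraph mode: each block separated by a
        # blank line is one record, and newlines also separate fields.
        # Even-numbered fields hold the ids; the last field is the folder name.
        { for (i = 2; i < NF; i += 2) print substr($i, 1, 7) " -> " $NF }
    ' RS= "$1"
}

# Sample data taken from the question's file1.
sample=$(mktemp)
cat > "$sample" <<'EOF'
xxx_006 19-0000_xxx-yyy-aaa
xxx_007 19-0001_zzz-bbb-ccc
FolderName_001_001

yyyy_0287 19-0v02-xxx
yyyy_0289 19-0v31-xxxx
yyyy_0293 19-0v05-xxxx
FolderName_002_002
EOF

print_keys "$sample"
rm -f "$sample"
# prints:
# 19-0000 -> FolderName_001_001
# 19-0001 -> FolderName_001_001
# 19-0v02 -> FolderName_002_002
# 19-0v31 -> FolderName_002_002
# 19-0v05 -> FolderName_002_002
```

Any line of file2 whose first 7 characters equal one of these keys is treated as a match, and the line before it is taken as the URL.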

Note: the unzip and rm commands above may not work if the filename contains URL-encoded characters.
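One way to sidestep the filename problem is to choose the local filename yourself with curl's -o option instead of letting -O derive it from the URL. This is a sketch under assumptions, not the answer's method: the function name fetch_and_unzip and the local name download.zip are hypothetical, and the target directory is assumed to exist already (for example via mkdir -p).

```shell
# fetch_and_unzip DIR URL [extra curl options...]
# Downloads URL into DIR under a fixed name, extracts it, then removes the archive.
# "download.zip" is an arbitrary name chosen for this sketch.
fetch_and_unzip() {
    dir=$1
    url=$2
    shift 2
    (
        # The subshell keeps the caller's working directory unchanged,
        # so no "cd back" step is needed afterwards.
        cd "$dir" || exit 1
        # -o writes to the given name regardless of what the URL looks like;
        # -k (skip certificate checks) is kept only because the original uses it.
        curl -sS -k -o download.zip "$@" "$url" || exit 1
        # unzip -o overwrites existing files without prompting.
        unzip -o download.zip && rm -f download.zip
    )
}
```

A call could look like `fetch_and_unzip FolderName_002_002/19-0v05-xxx_000_001 https://xx.yy.zz/path/to/file.zip -H "Authorization:xxxx"`, where the extra options are passed through to curl unchanged.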

Update based on your awk edit:

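You can also just print the needed information from the awk line and then process it in bash, rather than doing everything in awk (you can also remove the line that defines cmd_fmt in your awk edit section):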

awk '
    # create an associative array (key/value pairs) based on the file1
    NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next } 

    # retrieve the first 7-char of each line in file2 as the key to test against the above hash
    { k = substr($0, 1, 7) }

    # if find k, then print
    k in a { print a[k] "\t" $0 "\t" l }

    # save prev line to 'l' which is supposed to be the URL
    { l = $0  } 

' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link; do
    echo "download [$link] to '$base_dir/$sub_dir'"
    # bash command lines to make sub-folders and download files
    mkdir -p "$base_dir/$sub_dir" 
    cd "$base_dir/$sub_dir"

    if curl -O -v -k -X GET "$link" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" >/dev/null 2>&1; then
        echo "  + processing $link"
        # remove the query string from the link, since it might contain '/'
        filename="${link%\?*}"
        # remove path from filename and run `unzip`
        unzip "${filename##*/}" 
    else
        echo "  + error downloading: $link"
    fi

    # return to the base directory if it's a relative path
    # if all are absolute paths, then just comment out the following line
    cd ../..
done

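Note: I have not tested the curl line, and I don't know what the filenames behind the different links might be. filename="${link%\?*}" removes a trailing query string from the link, and "${filename##*/}" then removes everything up to and including the last '/', which leaves the filename. In practice, the filename your curl command downloads may differ, so you will have to check on your end and adjust.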
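The filename derivation can be checked on its own, without downloading anything. The following is a sketch that relies only on POSIX parameter expansion: zip_name is a hypothetical helper, and the query string in the second call is made up for illustration. Unlike the loop above, it uses %% (longest match) so that everything from the first '?' onward is dropped.

```shell
# zip_name URL: print the last path component of URL without any query string.
# (zip_name is a hypothetical helper, not part of the original answer.)
zip_name() {
    # Strip the query string first, because it may itself contain '/'.
    no_query=${1%%\?*}
    # Then strip everything up to and including the last '/'.
    printf '%s\n' "${no_query##*/}"
}

zip_name 'https://xx.yy.zz/path/to/file.zip'                      # prints: file.zip
zip_name 'https://xx.yy.zz/path/to/download/file.zip?token=a/b'   # prints: file.zip
```

This only predicts the name from the URL. If the server stores the file under a different name, or the name contains URL-encoded characters, the result will not match the file on disk.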