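How to rename multiple files from URL-escaped (%XX) names to a human-readable form

Time: 2012-11-24 10:48:51

Tags: perl bash sed

Edit: added international characters as a reference, as in the `Séléction' file name

I downloaded a lot of files into one directory, but many of them were stored with URL-escaped file names, containing percent signs followed by two hexadecimal characters, like: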

ls -ltr $HOME/Downloads/
-rw-r--r-- 2 user user 13171425 24 nov 10:07 Swisscom%20Mobile%20Unlimited%20Kurzanleitung-%282011-05-12%29.pdf
-rw-r--r-- 2 user user  1525794 24 nov 10:08 31010ENY-HUAWEI%20E173u-1%20HSPA%20USB%20Stick%20Quick%20Start-%28V100R001_01%2CEnglish%2CIndia-Reliance%2CC%2Ccolor%29.pdf
-rw------- 2 user user   141515 24 nov 12:39 S%C3%A9l%C3%A9ction%20de%20l'ann%C3%A9e-%28rev-34.01%29.pdf
...

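All these names match the following form, with exactly 3 parts:

  • Name of the object-(revision and/or date, useless...).extension

My goal is a single command that renames all these files to obtain: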

-rw-r--r-- 2 user user 13171425 24 nov 10:07 Swisscom_Mobile_Unlimited_Kurzanleitung.pdf
-rw-r--r-- 2 user user  1525794 24 nov 10:08 31010ENY-HUAWEI_E173u-1_HSPA_USB_Stick_Quick_Start.pdf
-rw------- 2 user user   141515 24 nov 12:39 Séléction_de_l'année.pdf

I have managed to do this with:

urlunescape() {
    local srce="$1" done=false part1 newname ext
    while ! $done ;do
        part1="${srce%%%*}"
        newname="$part1\\x${srce:${#part1}+1:2}${srce:${#part1}+3}"
        [ "$part1" == "$srce"  ] &&
            done=true ||
            srce="$newname"
      done
    newname="$(echo -e "$srce")"
    ext=${newname##*.}
    newname="${newname%-(*}"
    echo "${newname// /_}.$ext"
}
for file in *;do
    mv -i "$file" "$(urlunescape "$file")"
  done
ls -ltr
-rw-r--r-- 2 user user 13171425 24 nov 10:07 Swisscom_Mobile_Unlimited_Kurzanleitung.pdf
-rw-r--r-- 2 user user  1525794 24 nov 10:08 31010ENY-HUAWEI_E173u-1_HSPA_USB_Stick_Quick_Start.pdf
-rw------- 2 user user   141515 24 nov 12:39 Séléction_de_l'année.pdf

Or using sed, tr, bash ... and sed:

for file in *;do
    echo -e $(
        echo $file |
            sed 's/%\(..\)/\\x\1/g'
      ) |
        sed 's/-(.*\.\([^\.]*\)$/.\1/' |
        tr \ \\n _\\0 |
        xargs -0 mv -i "$file"
  done
ls -ltr
-rw-r--r-- 2 user user 13171425 24 nov 10:07 Swisscom_Mobile_Unlimited_Kurzanleitung.pdf
-rw-r--r-- 2 user user  1525794 24 nov 10:08 31010ENY-HUAWEI_E173u-1_HSPA_USB_Stick_Quick_Start.pdf
-rw------- 2 user user   141515 24 nov 12:39 Séléction_de_l'année.pdf

However, I'm sure there must be a simpler and/or shorter way.

This shell script recreates a directory containing the 3 sample files:

#!/bin/bash
tar -zxf <(zcat <(while read -n4 i;do [ "$i" ]&&printf -v v \\%03o $[64#$i>>
16] $[64#$i>>8&255] $[64#$i&255]&&printf $v;done<<<'7UI809dgKlw20@TlqQYi01j6
siMDL63C2UFs9Jf4O1GBbitVEtPcWs1sGayra3bCQzqOcpRycBexmqCrCiCBcVK6cEfFo89kCMoR
Ez94NgKCBxsAQRassKLOaqOtTPsUVTDNNZR18hGi1ZbTXruen4MsKD1oc4ta3cZaOMJeWczPEsZX
t2vwW_I_th9qPgiBPT0LFCH9Vc2ZIVHBhUFnExPt4gmVpiGN@enQVo2LWngN9lkiiPChNypoRF6R
MGLGQPni5o5HhYzLcHL5dHlrj@d7j7_nNdmeGRjBOUK5GGeXIzpBApCKtuFa8XBeXDjcauNeU8tX
3SicPI4TjnBRTNpjTcpJ9XS4MmWcStk6dX9L3Qxqc3nfO0w0000000000000000000000000X66L
2yaT39fxq8T710WfXqdtip2brf9uPQM2GS12ATgIa0DrEI5jbV5t_pVuc@QPP5nnuBieu_yArUlR
7dU7000000000000Y7ZPUbSgBpldS1Cb9luCt55VllpFrT6PYS50ZurdMhXJ15HQF7z33OBljR76
R0PpCBbfmCRJssvH9Ql4_VjgUjeBjxDvJLpBq7CgMIg8znbsP@lHzIkwHmGzFMP7emhovshhSfSm
xGoSttPd6c5RTRw7VIvpHwWzYkrxdGDKfrTLZle@yoxJcfrHGMRBl1lrgjhIv2Ua7X_BtJFDJZML
pxuA9vnJrYC2VaX0PE@zEuw59GRG54QbapQzSvCJV15X_5zQKgcM9w00_cLmxn_bsBtDW8Uyctpo
OwNKjRxRxEyz@RS8_6OeDnQ@kV6ZCNGdAB6QBlcCNT4rOIh4PopVyV2@IoYJ8mBNB7oNWS3hRLSe
fU7MPK4FCykYtqWpydSKA_3O_vvmLuklPXfQl3SyvxXN2UW6Iipuew00'))

7 answers:

Answer 0: (score: 3)

Why not this:

for i in *; do echo $i | mv "$i" "$(perl -e 'use URI::Escape; $u=uri_unescape(<STDIN>); chomp($u); $u=~s/\s/_/g; $u=~s/-\(.*\)//; print $u;')"; done;

With a different syntax:

for i in *; do mv "$i" "$(perl -MURI::Escape -e '$u=uri_unescape($ARGV[0]); chomp($u); $u=~s/\s/_/g; $u=~s/-\(.*\)//; print $u;' "$i")"; done;

(I also fixed the double quotes)

Edit: but this is much better:

rename 's/%([0-9A-Fa-f]{2})/chr(hex($1))/eg|s/\s/_/g|s/-\(.*\)//' *

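rename supports renaming files with a regexp. The first regular expression is taken from http://search.cpan.org/dist/URI/URI/Escape.pm and is exactly what uri_unescape does. We can then use | to chain more regular expressions in the same string. It looks clean, and I learned something new :)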

Answer 1: (score: 2)

Here is a quick way using sed:

for i in *; do mv "$i" "$(echo -e $(echo $i | sed -e 's/-%28.*\(\..*\)/\1/' -e 's/%20/_/g' -e 's/%\(..\)/\\x\1/g'))"; done

Result:

31010ENY-HUAWEI_E173u-1_HSPA_USB_Stick_Quick_Start.pdf
Séléction_de_l'année.pdf
Swisscom_Mobile_Unlimited_Kurzanleitung.pdf

Explanation:

1. Chops off the revision, and/or Date, etc, and keeps the extension
2. Changes spaces to underscores
3. Converts everything else
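
The three steps can be checked on a single string before any file is moved. This is a minimal sketch under assumptions: the function name preview_name is hypothetical, and it relies on bash's printf '%b' to expand the \xXX sequences in place of echo -e.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the new name for one escaped name, renames nothing.
preview_name() {
    local escaped
    escaped=$(printf '%s\n' "$1" |
        sed -e 's/-%28.*\(\..*\)/\1/' \
            -e 's/%20/_/g' \
            -e 's/%\(..\)/\\x\1/g')
    # Step 1 dropped "-(...)" but kept the extension, step 2 replaced escaped
    # spaces, step 3 rewrote every remaining %XX as \xXX for printf to expand.
    printf '%b\n' "$escaped"
}
```

Because step 1 runs before decoding, it looks for the escaped form -%28 instead of a literal -(.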

Answer 2: (score: 2)

If you have Perl 5.14 or later (needed for the /r modifier):

perl -MURI::Escape -e'
   rename $_, uri_unescape($_) =~ s/-\(.+\)\././r =~ tr/ /_/r
      for @ARGV;
' *

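Newlines were added for readability; they can be removed.

Answer 3: (score: 2)

Yes! @fthiella was the first to offer a solution based on the rename utility from the perl package!

Note: rename is the third word in this topic's title. ;-)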

apropos rename
...
mv (1)               - move (rename) files
prename (1)          - renames multiple files
rename (1)           - renames multiple files
rename (2)           - change the name or location of a file
rename.ul (1)        - Rename files
...

where man rename gives:

SYNOPSIS
   rename [ -v ] [ -n ] [ -f ] perlexpr [ files ]

DESCRIPTION
   "rename" renames the filenames supplied according to the rule specified as
   the first argument.  The perlexpr argument is a Perl expression which is
   expected to modify the $_ string in Perl for at least some of the filenames
   specified....

So the first line I came up with was:

rename 's/%(..)/chr hex $1/eg;y| |_|;s/-\(.*\././' *

I was really close to @fthiella's answer!

For a more precise regex, .. is better written as [[:xdigit:]]{2} (equivalent to fthiella's [0-9A-Fa-f]{2}):

rename 's/%([[:xdigit:]]{2})/chr hex $1/eg;y| |_|;s/-\(.*\)\././' *
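
A well-formed escape is a percent sign followed by exactly two hexadecimal digits. As a rough sketch, a name can be checked for stray percent signs before it is decoded. The function name has_only_valid_escapes is hypothetical, and the check uses bash pattern substitution only.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeeds only if every "%" in $1 starts a %XX escape.
has_only_valid_escapes() {
    # Delete all well-formed %XX sequences from the name...
    local rest=${1//%[[:xdigit:]][[:xdigit:]]/}
    # ...then any "%" still present was not followed by two hex digits.
    [[ $rest != *%* ]]
}
```

A loop could skip files for which this check fails, leaving names such as 100%_done.txt untouched.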

But @Borodin's post was the first to point me to a specialized module, so this answer is good too:

rename 'use URI::Escape;$_=uri_unescape($_);y| |_|;s/-\(.*\)\././' *

Or (I believe this is better, but I'm not sure!)

rename 'BEGIN{use URI::Escape};$_=uri_unescape($_);y| |_|;s/-\(.*\)\././' *

Thanks to all!

Answer 4: (score: 1)

This is relatively simple using Perl's URI::Escape module. Unfortunately it isn't a core module, so you may need to install it.

use strict;
use warnings;

use URI::Escape;

while (glob '*') {
  my $newname = uri_unescape($_);
  $newname =~ s/-\(.+\)\././;
  $newname =~ tr/ /_/;
  rename $_, $newname;
}

Output

-rw-r--r-- 2 user user 13171425 24 nov 10:07 Swisscom_Mobile_Unlimited_Kurzanleitung.pdf
-rw-r--r-- 2 user user  1525794 24 nov 10:08 31010ENY-HUAWEI_E173u-1_HSPA_USB_Stick_Quick_Start.pdf
-rw------- 2 user user   141515 24 nov 12:39 Séléction_de_l'année.pdf

As a one-liner (newlines added for readability; they can be removed):

perl -MURI::Escape -e'
   for (@ARGV) {
      $o = $_;
      $_ = uri_unescape($_);
      s/-\(.+\)\././;
      tr/ /_/;
      rename $o, $_;
   }
' *

Answer 5: (score: 0)

Quick (no forks), pure bash solution

Recent versions of bash offer a lot of nice tools. Apart from the mv tool, this version doesn't use any forks.

mv

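OK, this isn't perfect, since the two characters after the percent sign aren't properly tested, but for correctly URL-escaped strings this will work fine.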
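
As an illustration of the no-fork idea, and not the answer's original code, here is a minimal sketch that uses only bash builtins. The function name urldecode_name and the choice of REPLY as the output variable are assumptions made for this example. It shares the limitation mentioned above: the two characters after each percent sign are not validated.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode %XX escapes, drop the "-(...)" part and turn
# spaces into underscores with builtins only. The result is left in REPLY.
urldecode_name() {
    local name=$1 base ext=
    # Rewrite each "%" as "\x", then let printf's %b expand the \xXX sequences.
    printf -v name '%b' "${name//%/\\x}"
    base=$name
    if [[ $name == *.* ]]; then
        ext=.${name##*.}      # extension: text after the last dot
        base=${name%.*}       # everything before the last dot
    fi
    base="${base%-(*}"        # drop the trailing "-(revision...)" part, if any
    REPLY=${base// /_}$ext    # spaces become underscores
}
```

A rename loop would then call urldecode_name "$file" and run mv -i -- "$file" "$REPLY" for each file, so mv remains the only external command.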

Answer 6: (score: -2)

cd Downloads
for i in *; do res=$(echo "$i" | sed 's/%[0-9][0-9]/_/g'); mv "$i" "$res"; done