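I need to unzip a bunch of student assignment (jar) files so that I can use a script to submit their contents to the Moss (Stanford) plagiarism detection server. I have done the same thing in Java, where it was trivial, but I am trying to re-implement it as a bash script.
I'm trying to do the following:
I need to format the listing of the temp directory as a string of the form
/tempDir/studentName1/*.languageExt /tempDir/studentName2/*.languageExt
The student directories have this basic structure: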
Student_Root_Directory:
    Student1
    Student2
Student1
    Sub-Directories: 1 2 3 4 5
        1: student1.jar
        2: student1.jar
        ...
Student2
    Sub-Directories: 1 2 3
        1: student2.jar
        ...
To accomplish the first 3 steps above:
#!/bin/bash
# Extract all jar files into a temp directory called /home/moss/tempJarFiles/studentName
# $1 is the command line argument that contains the path to the institution submission dir.
# $2 is the language extension: .c, .cpp, .java, .py
students=`ls $1`
student_dir=$1
languageExt=$2
mossDir="/home/moss"
tempDir="/home/moss/tempJarStorage"
for student in $students
do
    latestSubmissionDir=`ls -t $student_dir/$student | head -1`
    for jarDir in $latestSubmissionDir
    do
        mkdir $tempDir/$student
        cp $student_dir/$student/$jarDir/*.jar $tempDir/$student
        unzip -d $tempDir/$student/ -o -j $tempDir/$student/$student.jar *.$languageExt
        rm $tempDir/$student/$student.jar
    done
done
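...which results in a number of student directories being created in the temp directory, each containing only the unzipped contents of that student's submission. I need the ls output of the new temp directory formatted as a string containing:
/tempDir/studentName1/*.languageExt /tempDir/studentName2/*.languageExt
I tried variants of
find "$tempDir" -iname "*.$languageExt" -printf "%p/*.$languageExt"
(including swapping -iname for other tests), but either my output contains extra directory information, such as $tempDir/*.languageExt (when I only need the subdirectories, $tempDir/$studentName/*.languageExt), or my output lists the path of every source file, like this:
$tempDir/$studentName/studentNameA.java $tempDir/$studentName/studentNameB.java
when all I need is
$tempDir/$studentName/*.java
I think this should be simple and I'm just overthinking it. Any tips on improving the script are also appreciated.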
Answer 0 (score: 4)
The following revised version of your script might work:
#!/bin/bash
# Extract all jar files into a temp directory called /home/moss/tempJarStorage/studentName
# $1 is the command line argument that contains the path to the institution submission dir.
# $2 is the language extension: c, cpp, java, py
students_dir=$1
languageExt=$2
studentPathsT=( "$students_dir"/*/ )
mossDir='/home/moss'
tempDir='/home/moss/tempJarStorage'
for studentPathT in "${studentPathsT[@]}"; do
    student=$(basename "$studentPathT")
    # -p: do not fail if the directory is left over from an earlier run.
    mkdir -p "$tempDir/$student"
    submissionDirsT=( "$studentPathT"*/ )
    # The last array element is the lexically greatest subdirectory name.
    latestSubmissionDirT=${submissionDirsT[${#submissionDirsT[@]}-1]}
    cp "$latestSubmissionDirT"*.jar "$tempDir/$student/"
    unzip -d "$tempDir/$student/" -o -j "$tempDir/$student/*.jar" "*.$languageExt"
    rm "$tempDir/$student"/*.jar
done
# Note that at this point `"$tempDir"/*/*.$languageExt` would expand
# to all extracted submission files, across all students.
# Finally, output each student's extracted files as an unexpanded glob à la
# /{tempDir}/{studentName1}/*.{languageExt}
for pT in "$tempDir"/*/; do
    echo "$pT*.$languageExt"
    # Note: If there is a chance that your filenames contain
    # embedded newlines (rare in practice) using `echo` won't work properly
    # as @Charles Duffy points out.
    # If that is a concern, use
    #     printf '%s\0' "$pT*.$languageExt"
    # and process the output with a utility that can process NUL characters
    # as separators, such as `xargs -0`.
done
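The final loop prints one glob per line, whereas the question asks for a single space-separated string. A minimal sketch of one way to build that string is shown here; the function name build_glob_string is hypothetical and not part of the original script, and the sketch assumes the student directory names contain no spaces, since the result is a flat string.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): collect one
# unexpanded glob per student directory and print them on a single line.
build_glob_string() {
    local tempDir=$1 languageExt=$2
    local globs=() pT
    for pT in "$tempDir"/*/; do
        [ -d "$pT" ] || continue          # skip the literal pattern if nothing matched
        globs+=( "$pT*.$languageExt" )    # quoted, so the * stays unexpanded
    done
    # "${globs[*]}" joins the elements with the first character of IFS,
    # which is a space by default.
    printf '%s\n' "${globs[*]}"
}
```

Called as build_glob_string "$tempDir" "$languageExt", this prints the globs exactly as the question lists them. Keeping the patterns in the array and passing "${globs[@]}" around is likely the more robust choice if a later step needs them as separate arguments.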
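The script avoids parsing ls output and uses only pathname expansion and array variables, so that paths containing embedded spaces and other shell metacharacters are handled correctly.
The suffix ...T on a variable name indicates that the path or array of paths is *T*erminated, i.e., it ends with /.
The script assumes that submission subdirectories are numbered no higher than 9, because it relies on the implicit lexical sorting of pathname expansion; if the numbers go higher, explicit numeric sorting must be applied.
The globs (pathname patterns) passed to unzip are deliberately double-quoted, because they should be interpreted by unzip, not by the shell.
$languageExt is assumed not to start with . (e.g., cpp rather than .cpp), despite what your comments say.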
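If submission subdirectories can be numbered 10 or higher, lexical glob order would place 10 before 9, so the last array element is no longer the latest submission. One possible way to apply explicit numeric sorting is sketched here; the function name latest_submission_dir is hypothetical, and the sketch assumes the relevant subdirectories have purely numeric names.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): print the
# subdirectory of $1 whose name is the highest number, with a trailing /.
# $1 must itself end in / (the ...T convention).
latest_submission_dir() {
    local studentPathT=$1
    local d name max=-1 latestT=''
    for d in "$studentPathT"*/; do
        [ -d "$d" ] || continue            # skip the literal pattern if nothing matched
        name=${d%/}                        # strip the trailing slash
        name=${name##*/}                   # keep only the last path component
        case $name in
            ''|*[!0-9]*) continue ;;       # ignore non-numeric names
        esac
        # 10# forces base 10, so a name like 08 is not read as octal.
        if (( 10#$name > max )); then
            max=$(( 10#$name ))
            latestT=$d
        fi
    done
    [ -n "$latestT" ] && printf '%s\n' "$latestT"
}
```

In the revised script, the line that sets latestSubmissionDirT could then become latestSubmissionDirT=$(latest_submission_dir "$studentPathT"). The function returns a non-zero status when no numeric subdirectory exists, which the caller can test for.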