I'm on Windows and have been given some CSV files in which only part of the data in the third column interests me. Here are a few sample lines of my raw data:
Column.1 Column.2 Column.3 Column.4 Column.5 Column.6
blah blah A/B/C/D/x/x/x blah blah blah
blah blah A/B/C/D/x/x/x blah blah blah
blah blah E/F/G/H/x/x/x blah blah blah
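What I want to do with it is:
1. Delete the other columns and keep only Column.3
2. Keep the Column.3 string up to the 4th forward slash and discard the rest of the string
3. Remove duplicate entries
So the output would look like this: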
A/B/C/D
E/F/G/H
Hopefully this is a better way of explaining what I'm after.
Cheers, Alan
Answer 0 (score: 1)
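Try reading HELP FOR in CMD.
By enabling delayed expansion with setlocal enabledelayedexpansion, we can build an array-like structure.
The outer loop iterates over the lines of the CSV file (d.csv in the script), storing each line in a temporary variable named LINE.
The inner loop then splits LINE into tokens 1,2,3,4,5 plus the remainder, using "\" as the delimiter (delims=\), and stores them in row[0] through row[5]. We can then echo them after the inner loop closes, as shown.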
@echo off
setlocal enableextensions enabledelayedexpansion
SET /A COUNT=0
REM Outer loop: read d.csv one line at a time
for /F "tokens=*" %%A in (d.csv) do (
    set LINE="%%A"
    set /A COUNT+=1
    REM Inner loop: split the line on backslashes, the asterisk token takes the remainder
    for /F "tokens=1,2,3,4,5,* delims=\" %%a in (!LINE!) do (
        set row[0]=%%a
        set row[1]=%%b
        set row[2]=%%c
        set row[3]=%%d
        set row[4]=%%e
        set row[5]=%%f
    )
    echo This is row: !COUNT!
    echo This is column A: !row[0]!
    echo This is column B: !row[1]!
    echo This is column C: !row[2]!
    echo This is column D: !row[3]!
    echo This is column E: !row[4]!
    echo This is column F: !row[5]!
    echo.
)
REM this is substring manipulation, applied to row[5] of the last line read
echo !row[5]:~1,2!
echo !row[5]:~0,2!
echo !row[5]:~3,5!
echo !row[5]:~-3!
endlocal
Contents of d.csv used for the test:
A1\anotherB\C\and a d\blah0\blah1\blah1
A2\stuff2\C\D\blah2\blah3\blah1
A3\B\the last C\D\blah4\pizza5\blah1
A4\B\C\D\blah6\blah7\blah1
Running the script produces:
C:\Users\UserBob\Desktop\RANDOM\32>3.bat
This is row: 1
This is column A: A1
This is column B: anotherB
This is column C: C
This is column D: and a d
This is column E: blah0
This is column F: blah1\blah1
This is row: 2
This is column A: A2
This is column B: stuff2
This is column C: C
This is column D: D
This is column E: blah2
This is column F: blah3\blah1
This is row: 3
This is column A: A3
This is column B: B
This is column C: the last C
This is column D: D
This is column E: blah4
This is column F: pizza5\blah1
This is row: 4
This is column A: A4
This is column B: B
This is column C: C
This is column D: D
This is column E: blah6
This is column F: blah7\blah1
The output continues with the results of the substring lines (echo !row[5]:~1,2! and the ones after it):
la
bl
h7\bl
ah1
So for the part you're interested in, you would use !row[3]:~num,num!
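As a minimal sketch of that substring syntax: the variable name field and its sample value are hypothetical stand-ins for a row[] entry, and the example assumes the wanted part always has the same length (7 characters for A/B/C/D).

```batch
@echo off
setlocal enabledelayedexpansion
REM Hypothetical sample value standing in for a row[] entry
set "field=A/B/C/D/x/x/x"
REM Syntax is !var:~start,length! with start counted from 0
echo !field:~0,7!
endlocal
```

This prints A/B/C/D. If the path components vary in length, a fixed offset will not work, and splitting on the slash with delims=/ is the safer route.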
Answer 1 (score: 0)
@ECHO OFF
SETLOCAL
:: remove variables starting $
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
FOR /f "tokens=1-4delims=/" %%a IN (q25716731.txt) DO SET "$%%a_%%b_%%c_%%d=%%a/%%b/%%c/%%d"
(
FOR /F "tokens=2delims==" %%a In ('set $ 2^>Nul') DO ECHO(%%a
)>newfile.txt
GOTO :EOF
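I used a file named q25716731.txt containing some data for my testing. The filename is not important.
Produces newfile.txt.
Note that you explicitly stated "backslash" and then supplied forward slashes in your data sample. This routine handles regular slashes; the change needed for backslashes should be obvious.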
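For completeness, a sketch of what the backslash variant might look like, assuming the data were separated by backslashes rather than forward slashes. Only the delimiter and the rejoined value change; the file names are the same test names used above.

```batch
@ECHO OFF
SETLOCAL
REM Sketch only: same approach, but splitting on and rejoining with backslashes
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
FOR /f "tokens=1-4delims=\" %%a IN (q25716731.txt) DO SET "$%%a_%%b_%%c_%%d=%%a\%%b\%%c\%%d"
(
FOR /F "tokens=2delims==" %%a In ('set $ 2^>Nul') DO ECHO(%%a
)>newfile.txt
GOTO :EOF
```

Duplicates disappear because each distinct value maps to a single variable name, and the final SET $ listing comes out sorted by that name.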
Revision following clarification of the data and output requirements
@ECHO OFF
SETLOCAL
:: remove variables starting $
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
FOR /f "skip=1tokens=3delims= " %%s IN (q25716731.txt) DO (
FOR /f "tokens=1-4delims=/" %%a IN ("%%s") DO SET "$%%a_%%b_%%c_%%d=%%a/%%b/%%c/%%d"
)
(
FOR /F "tokens=2delims==" %%a In ('set $ 2^>Nul') DO ECHO(%%a
)>newfile.txt
GOTO :EOF
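I used a file named q25716731.txt containing my test data.
Produces newfile.txt.
The "skip=1" skips the column-header line.
It is not clear whether the real data is true CSV or actually a fixed-column format. This assumes that blah does not contain spaces.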
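If the real data turns out to be comma-separated, the outer loop's delimiter could be changed accordingly. The following is a sketch under that assumption, not a tested solution for the actual files: it assumes no field is empty and no field is quoted or contains a comma, since FOR /F treats consecutive delimiters as one and would then pick the wrong column.

```batch
@ECHO OFF
SETLOCAL
REM Sketch only: assumes comma-separated fields with no empty or quoted fields
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
FOR /f "skip=1 tokens=3 delims=," %%s IN (q25716731.txt) DO (
FOR /f "tokens=1-4delims=/" %%a IN ("%%s") DO SET "$%%a_%%b_%%c_%%d=%%a/%%b/%%c/%%d"
)
(
FOR /F "tokens=2delims==" %%a In ('set $ 2^>Nul') DO ECHO(%%a
)>newfile.txt
GOTO :EOF
```

With comma as the only delimiter, spaces inside the other columns no longer matter, which removes the assumption about blah.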