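I'm trying to use VBScript to query a file containing a list of personnel data, using only an identification number. Currently, I have a dataset file containing all the personnel data and a query file containing the ID numbers to look up in the dataset. When a query matches, I want to write the matching row to a results file.
Here is a general example of the data in the dataset file and the query file.
Dataset: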
ID,Name,City,State,Zipcode,Phone
1885529946,Hall,Louisville,KY,40208,5026366683
1886910320,Brown,Sacramento,CA,95814,5302981550
1953250581,Rios,Sterling,OK,73567,5803658077
1604767393,Barner,Irvine,CA,92714,9494768597
1713746771,Herrera,Stotts City,MO,65756,4172852393
1022686106,Moore,Ocala,FL,34471,3526032811
1579121274,Beyer,Alexandria,MD,22304,3013838430
1288569655,Rondeau,Augusta,GA,30904,7066671404
1954615404,Angel,Los Angeles,CA,90014,5622961806
1408747874,Lagasse,Traverse City,MI,49686,2318182792
Query file:
1885529946
1713746771
1408747874
I'm able to read all the lines in the query file and display the ID numbers using WScript.Echo. No errors are produced, but the script never ends and no results file is generated. The results file should contain only the rows from the dataset that match the ID numbers. For example:
1885529946,Hall,Louisville,KY,40208,5026366683
1713746771,Herrera,Stotts City,MO,65756,4172852393
1408747874,Lagasse,Traverse City,MI,49686,2318182792
Here is the script I'm trying to use:
WScript.Echo
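Answer 0 (score: 3)
The problem in your code is that those files are opened as streams. Once the end of such a stream has been reached (i.e. .AtEndOfStream becomes True, for example after repeated calls to .ReadLine()), it does not magically rewind to the beginning of the file. Your "nested loop" approach would need to rewind the query file to work properly.
This can be done by closing and reopening the stream, but that is not very efficient. Comparing all the numbers against every line in the input file is not very efficient either. I suggest you use a Dictionary object to store the numbers from the query file. Dictionaries store key-value pairs and are optimized for fast key lookup (via .Exists(someKey)), so they are a good fit for this task.
This way, you can very quickly determine whether a line should be written to the output file: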
Const intForReading = 1
Const intForWriting = 2
Const intForAppending = 8
strQueryFile = "C:\numbers_test.txt"
strDataSetFile = "C:\data_test.csv"
strOutputFile = "C:\results_test.csv"
Set objFSO = CreateObject("Scripting.FileSystemObject")
' first import the query file into a dictionary for easy lookup
Set numbers = CreateObject("Scripting.Dictionary")
With objFSO.OpenTextFile(strQueryFile, intForReading)
    Do Until .AtEndOfStream
        ' we are only interested in the key for this task, the value is completely irrelevant.
        numbers.Add .ReadLine(), ""
    Loop
    .Close
End With
Set objFileToWrite = objFSO.OpenTextFile(strOutputFile, intForWriting, true)
With objFSO.OpenTextFile(strDataSetFile, intForReading)
    Do Until .AtEndOfStream
        line = .ReadLine()
        columns = Split(line, ",")
        currentNumber = columns(0)
        If numbers.Exists(currentNumber) Then objFileToWrite.WriteLine(line)
    Loop
    .Close
End With
objFileToWrite.Close
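For comparison, the nested-loop approach described at the start of this answer can be made to work by reopening the query file for every data line. The following is only a sketch of that idea, not the asker's original script; it reuses the file paths from the script above, and the variable names objData, objQuery and objOut are made up for illustration.

```vbscript
Const ForReading = 1
Const ForWriting = 2

' Paths reused from the script above
strQueryFile = "C:\numbers_test.txt"
strDataSetFile = "C:\data_test.csv"
strOutputFile = "C:\results_test.csv"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOut = objFSO.OpenTextFile(strOutputFile, ForWriting, True)
Set objData = objFSO.OpenTextFile(strDataSetFile, ForReading)

Do Until objData.AtEndOfStream
    line = objData.ReadLine()
    If Len(line) > 0 Then
        columns = Split(line, ",")
        ' Reopen the query file so the inner loop starts at its first line again;
        ' a stream that has reached its end does not rewind by itself
        Set objQuery = objFSO.OpenTextFile(strQueryFile, ForReading)
        Do Until objQuery.AtEndOfStream
            If Trim(objQuery.ReadLine()) = columns(0) Then
                objOut.WriteLine line
                Exit Do
            End If
        Loop
        objQuery.Close
    End If
Loop

objData.Close
objOut.Close
```

This opens the query file once per data line and compares each ID against up to every query number, which is why the Dictionary version above is preferable for anything but small files.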
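Answer 1 (score: 2)
I like to use ADODB for this kind of task and treat the input files as a database. The trick is usually to find the right connection string for your system and to use a Schema.ini file where necessary.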
option explicit
Const adClipString = 2
dim ado: set ado = CreateObject("ADODB.Connection")
' data files are in this folder
' using the old JET driver
ado.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=.\;Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
' or maybe use ACE if installed
' ado.ConnectionString = "Driver=Microsoft Access Text Driver (*.txt, *.csv);Dbq=.\;Extensions=asc,csv,tab,txt;"
ado.open
' query is in a CSV too, so we can access as a table
' the column names are given in Schema.ini
const QUERY = "SELECT * FROM [data_test.csv] WHERE ID IN (SELECT ID FROM [query_test.csv])"
' or literals
' const QUERY = "SELECT * FROM [data_test.csv] WHERE ID IN ('1885529946', '1713746771', '1408747874')"
dim rs: set rs = ado.Execute(QUERY)
' convenient GetString() method allows formatting the result
' this could be written to file instead of outputting to console
WScript.Echo rs.GetString(adClipString, , vbTab, vbNewLine, "[NULL]")
'or create a new table!
'delete results table if exists
' catch an error if the table does not exist
on error resume next
' for some reason you need to use #csv not .csv here
ado.Execute "DROP TABLE result#csv"
if err then
    WScript.Echo err.description
end if
on error goto 0
ado.Execute("SELECT * INTO [result.csv] FROM [data_test.csv] WHERE ID IN (SELECT ID FROM [query_test.csv])")
rs.close
ado.close
Schema.ini file
[data_test.csv]
Format=CSVDelimited
ColNameHeader=True
Col1=ID Text
Col2=Name Text
Col3=City Text
Col4=State Text
Col5=Zipcode Text
Col6=Phone Text
[query_test.csv]
Format=CSVDelimited
ColNameHeader=False
Col1=ID Text
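A comment in the script above notes that the GetString() output could be written to a file instead of the console. A minimal sketch of that variation follows, assuming the same Jet connection string and Schema.ini; the output name results_test.csv and the variable names fso and outFile are assumptions made for illustration.

```vbscript
Option Explicit
Const adClipString = 2

Dim ado: Set ado = CreateObject("ADODB.Connection")
' Same Jet text-driver connection string as in the script above
ado.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=.\;Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
ado.Open

Dim rs: Set rs = ado.Execute("SELECT * FROM [data_test.csv] WHERE ID IN (SELECT ID FROM [query_test.csv])")

Dim fso: Set fso = CreateObject("Scripting.FileSystemObject")
' results_test.csv is a hypothetical output name; True overwrites an existing file
Dim outFile: Set outFile = fso.CreateTextFile("results_test.csv", True)

' Only call GetString when the recordset actually contains rows
If Not rs.EOF Then
    ' Comma between columns, vbNewLine between rows, empty string for NULL values
    outFile.Write rs.GetString(adClipString, , ",", vbNewLine, "")
End If

outFile.Close
rs.Close
ado.Close
```

GetString() returns only the data rows, so the output has no header line; if one is needed, it would have to be written separately before the data.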