Selecting a varying number of columns from a CSV file

Time: 2016-01-28 10:02:14

Tags: sql csv

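The task is to extract data from multiple CSV files based on a condition. Each file contains a sampleId column (this is the condition) and other columns. At the end of each row, the measurement values sit under columns named 0 ... 100 (the numbers are the actual column names). To make it more interesting, the CSV files can differ depending on the customer's needs: the number of measurement columns can be 15, 25, 50 and so on, but never more than 100, and it does not vary within a single file. This data is always at the end of the row, so the numbered columns are preceded by a set of fixed columns.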

I would like to have a SQL statement that accepts parameters:

SELECT {0} FROM {1} WHERE sampleId = {2}

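{0} is the numbers (the measurement column names), {1} is the CSV file name, and {2} is the sampleId we are looking for. The other solution I thought of is to look at all columns after the last fixed column. I don't know whether that is possible; I'm just thinking out loud.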
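One way the {0} placeholder could be filled is by reading the header row and keeping only the column names that parse as numbers. The following is a minimal sketch, not a definitive implementation: `SelectBuilder` and `BuildSelect` are hypothetical names, and it assumes a comma-delimited file whose first line is the header, with no commas embedded inside header names. Whether a given SQL-over-CSV provider exposes the numeric headers under exactly these bracketed names is also an assumption.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module SelectBuilder
    ' Builds "SELECT [0], [1], ... FROM [file.csv] WHERE [sampleId] = n"
    ' from the header row of the given CSV file (hypothetical helper).
    Function BuildSelect(csvPath As String, sampleId As Integer) As String
        ' Only the first line (the header) is read.
        Dim headerLine As String = File.ReadLines(csvPath).First()

        Dim columns As New List(Of String)
        For Each columnName As String In headerLine.Split(","c)
            ' Strip surrounding whitespace and optional double quotes.
            Dim trimmed As String = columnName.Trim().Trim(""""c)
            Dim ignored As Integer
            ' Keep only headers that are numbers: these are the measurement columns.
            If Integer.TryParse(trimmed, ignored) Then
                columns.Add("[" & trimmed & "]")
            End If
        Next

        If columns.Count = 0 Then
            Throw New InvalidDataException("No numeric column names found in " & csvPath)
        End If

        ' Same placeholders as in the statement above: {0} columns, {1} file, {2} sampleId.
        Return String.Format("SELECT {0} FROM [{1}] WHERE [sampleId] = {2}",
                             String.Join(", ", columns),
                             Path.GetFileName(csvPath),
                             sampleId)
    End Function
End Module
```

Because the column list is derived from each file's own header, the same call works whether a file has 15, 25 or 50 measurement columns.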

Please be descriptive, as my SQL knowledge is basic. Any help is much appreciated.

1 answer:

Answer 0: (score: 0)

So I finally managed to solve it. The code is in VB.NET, but the logic should be clear.

 ' Requires: Imports System.IO, Imports System.Data.OleDb, Imports System.Windows.Forms
 Private Function GetDataFromCSV(sampleIds As Integer()) As List(Of KeyValuePair(Of String, List(Of Integer)))
     Dim results As New List(Of KeyValuePair(Of String, List(Of Integer)))
     Dim dataFiles() As String = Directory.GetFiles(OutputFolder(), "*.CSV")

     If dataFiles.Length = 0 OrElse sampleIds.Length = 0 Then
         MessageBox.Show("Missing csv files or invalid sample Id(s)", "Internal error", MessageBoxButtons.OK, MessageBoxIcon.Exclamation)
         Return results 'Return an empty list so that every code path returns a value
     End If

     For Each sampleId As Integer In sampleIds
         If sampleId <= 0 Then Continue For
         Dim currentId As String = sampleId.ToString()

         For Each csvFile As String In dataFiles
             If Not File.Exists(csvFile) Then Continue For
             Dim filename As String = Path.GetFileName(csvFile)
             Dim strPath As String = Path.GetDirectoryName(csvFile)

             'The Jet text driver treats the folder as the database and each CSV file as a table
             Using conn As New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" & strPath & "; Extended Properties='text; HDR=Yes; FMT=Delimited'")
                 Dim command As OleDbCommand = conn.CreateCommand()
                 'Select every row; the rows are filtered by sample ID in the loop below
                 command.CommandText = "SELECT * FROM [" & filename & "]"

                 conn.Open()
                 Using reader As OleDbDataReader = command.ExecuteReader()
                     While reader.Read()
                         If reader("Sample ID").ToString() = currentId Then 'Match found: collect the particle data
                             Dim particles As New List(Of Integer)
                             For field As Integer = 0 To reader.FieldCount - 1
                                 Dim ignored As Integer
                                 'Measurement columns are the ones whose header is a number; skip the fixed columns
                                 If Integer.TryParse(reader.GetName(field), ignored) AndAlso Not reader.IsDBNull(field) Then
                                     particles.Add(CInt(reader(field)))
                                 End If
                             Next field
                             results.Add(New KeyValuePair(Of String, List(Of Integer))(currentId, particles))
                         End If
                     End While
                 End Using
             End Using 'Disposes and closes the connection, even if an exception is thrown
         Next csvFile
     Next sampleId

     Return results
 End Function
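
For comparison, the idea from the question of looking at every column after the fixed ones can also be sketched without SQL. This swaps in a different technique from the answer above: it uses `TextFieldParser` from `Microsoft.VisualBasic.FileIO` instead of the Jet OLE DB provider. `CsvMeasurements` and `ReadMeasurements` are hypothetical names, and the sketch assumes a comma-delimited file with a header row and integer measurement values.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic.FileIO

Module CsvMeasurements
    ' Returns, for every row whose idColumn equals sampleId, the values found
    ' in the columns whose header is a number (hypothetical helper).
    Function ReadMeasurements(csvPath As String, idColumn As String, sampleId As Integer) As List(Of List(Of Integer))
        Dim rows As New List(Of List(Of Integer))

        Using parser As New TextFieldParser(csvPath)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            If parser.EndOfData Then Return rows

            ' The first row is the header; locate the ID column in it.
            Dim header As String() = parser.ReadFields()
            Dim idIndex As Integer = Array.IndexOf(header, idColumn)
            If idIndex < 0 Then Return rows

            ' Remember the positions of all numerically named columns.
            Dim measurementIndexes As New List(Of Integer)
            For i As Integer = 0 To header.Length - 1
                Dim ignored As Integer
                If Integer.TryParse(header(i), ignored) Then measurementIndexes.Add(i)
            Next

            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                ' Skip short rows and rows that belong to another sample.
                If fields.Length <= idIndex OrElse fields(idIndex) <> sampleId.ToString() Then Continue While

                Dim values As New List(Of Integer)
                For Each i As Integer In measurementIndexes
                    Dim value As Integer
                    ' Cells that are missing or not integers are skipped.
                    If i < fields.Length AndAlso Integer.TryParse(fields(i), value) Then values.Add(value)
                Next
                rows.Add(values)
            End While
        End Using

        Return rows
    End Function
End Module
```

A call such as `ReadMeasurements(path, "Sample ID", 42)` would then return one list of values per matching row, however many measurement columns the file happens to have.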