How do I reformat a CSV file into Google Calendar format?

Time: 2011-05-25 07:07:47

Tags: vb.net parsing csv google-calendar-api text-parsing

After some research, I found the format I need to convert my CSV file into:
Subject,Start Date,Start Time,End Date,End Time,All Day Event,Description,Location,Private

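The problem is that the CSV export I am working with is not in the right format or column order. What is the best way to pull that information out? Here is some of my source.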

Name,User Name,Row Type,Start Date,Start Time,End Time,End Date,Segment Start Date,Type

"Smith, John J",jjs,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Regular

"Smith, John J",jjs,Shift,5/30/2011,13:30,17:30,5/30/2011,5/30/2011,Regular

    Dim Name As String = ""
    Dim UserName As String = ""

    Dim Data As String = """Smith, John J"",jj802b,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Transfer"

    For r As Integer = 1 To 10
        Name = Data.Substring(0, Data.LastIndexOf(""""))
        Data = Data.Remove(0, Data.LastIndexOf(""""))
        UserName = Data.Substring(Data.LastIndexOf(""""), ",")
    Next

4 Answers:

Answer 0 (score: 3)

Here is the solution:

Dim Name As String = ""
Dim UserName As String = ""

Dim Data As String = """Smith, John J"",jj802b,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Transfer"

For r As Integer = 1 To 10
    Dim DataArr() As String = DecodeCSV(Data) 'Use DecodeCSV function to regex split the string 
    Name = DataArr(0) 'Get First item of array as Name
    UserName = DataArr(1)  'Get Second item of array as UserName 
Next

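DecodeCSV is excellent code by Tim. It uses Regex, Match, Group, and Capture, so the file needs Imports System.Text.RegularExpressions at the top: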

Public Shared Function DecodeCSV(ByVal strLine As String) As String()

    Dim strPattern As String
    Dim objMatch As Match

    ' build a pattern
    strPattern = "^" ' anchor to start of the string
    strPattern += "(?:""(?<value>(?:""""|[^""\f\r])*)""|(?<value>[^,\f\r""]*))"
    strPattern += "(?:,(?:[ \t]*""(?<value>(?:""""|[^""\f\r])*)""|(?<value>[^,\f\r""]*)))*"
    strPattern += "$" ' anchor to the end of the string

    ' get the match
    objMatch = Regex.Match(strLine, strPattern)

    ' if RegEx match was ok
    If objMatch.Success Then
        Dim objGroup As Group = objMatch.Groups("value")
        Dim intCount As Integer = objGroup.Captures.Count
        Dim arrOutput(intCount - 1) As String

        ' transfer data to array
        For i As Integer = 0 To intCount - 1
            Dim objCapture As Capture = objGroup.Captures.Item(i)
            arrOutput(i) = objCapture.Value

            ' replace double-escaped quotes
            arrOutput(i) = arrOutput(i).Replace("""""", """")
        Next

        ' return the array
        Return arrOutput
    Else
        Throw New ApplicationException("Bad CSV line: " & strLine)
    End If

End Function

Answer 1 (score: 2)

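Depending on the exact content and the formatting guarantees of the CSV file, splitting on , is sometimes the simplest and fastest way to parse it. Your Name column contains a , that is not a delimiter, which adds a little complexity, but if you can assume the name always contains exactly one , the case is still easy to handle.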
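The split approach described above could look roughly like the following. This is only a minimal sketch, assuming every row starts with a quoted name containing exactly one comma and that no other field contains a comma; the module name SplitSketch and the function name SplitRow are hypothetical and do not come from the question.

```vbnet
Imports System

Module SplitSketch
    ' Splits one input row on commas.
    ' Assumption: the quoted Name field always contains exactly one comma
    ' ("Last, First"), and no other field contains a comma.
    Function SplitRow(ByVal line As String) As String()
        Dim parts() As String = line.Split(","c)

        ' The first two pieces are the two halves of the quoted name;
        ' rejoin them and strip the surrounding quotes.
        Dim fullName As String = (parts(0) & "," & parts(1)).Trim(""""c)

        ' The result has one element fewer than the raw split,
        ' because two pieces were merged into one.
        Dim fields(parts.Length - 2) As String
        fields(0) = fullName
        Array.Copy(parts, 2, fields, 1, parts.Length - 2)
        Return fields
    End Function

    Sub Main()
        Dim row As String = """Smith, John J"",jjs,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Regular"
        Dim fields() As String = SplitRow(row)
        Console.WriteLine(fields(0)) ' the Name field
        Console.WriteLine(fields(1)) ' the User Name field
    End Sub
End Module
```

If a name ever arrives without a comma, or another field contains one, the indexes shift silently, so this shortcut is only safe while the export format stays as shown.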

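There are libraries for parsing CSV files, and they may be useful. Assuming you do not need to handle every file that conforms to the CSV specification, though, I think they are overkill. Instead, you can easily parse the CSV file with the following regular expression, which uses named groups for convenience: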

"(?<Name>[^"]+?)",(?<UserName>[^,]+?),(?<RowType>[^,]+?),(?<StartDate>[^,]+?),(?<StartTime>[^,]+?),(?<EndTime>[^,]+?),(?<EndDate>[^,]+?),(?<SegmentStartDate>[^,]+?),(?<Type>\w+)

This creates named capture groups, which you can then use to write the new CSV file, like so:

Dim ResultList As StringCollection = New StringCollection()
Try
    Dim RegexObj As New Regex("""(?<Name>[^""]+?)"",(?<UserName>[^,]+?),(?<RowType>[^,]+?),(?<StartDate>[^,]+?),(?<StartTime>[^,]+?),(?<EndTime>[^,]+?),(?<EndDate>[^,]+?),(?<SegmentStartDate>[^,]+?),(?<Type>\w+)", RegexOptions.IgnoreCase)
    Dim MatchResult As Match = RegexObj.Match(SubjectString)
    While MatchResult.Success
        'Append to new CSV file - MatchResult.Groups("groupname").Value

        'Name = MatchResult.Groups("Name").Value
        'Start Time = MatchResult.Groups("StartTime").Value         
        'End Time = MatchResult.Groups("EndTime").Value
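        'Etc...

        'Advance to the next row; without this the loop never terminates
        MatchResult = MatchResult.NextMatch()
    End While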
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try

For more information, see .NET Framework Regular Expressions on MSDN.

Answer 2 (score: 2)

This is going to be long, so bear with me.

Before I begin, I want to note a couple of things:

  • One is that I am using the TextFieldParser, which you can find under the FileIO namespace, to work with the input CSV. This makes reading delimited files much easier than trying to deal with regular expressions and your own parsing.
  • The other is that, to store the data set, I am using a List(Of Dictionary(Of String, String)), that is, a list of dictionaries that relate strings to other strings. This is essentially no different from the access pattern of a DataTable, and if you are more comfortable with that construct you are welcome to use it instead. The list of dictionaries behaves exactly the same way and requires very little setup, so I use it here.

I admit some of this is hard-coded, but if you need to generalize the procedure you can move certain aspects into application settings and/or break the function down further. The point here is to give you the general idea. The code is commented inline below:

' Create a text parser object
Dim theParser As New FileIO.TextFieldParser("C:\Path\To\theInput.csv")

' Specify that fields are delimited by commas
theParser.Delimiters = {","}

' Specify that strings containing the delimiter are wrapped by quotes
theParser.HasFieldsEnclosedInQuotes = True

' Dimension containers for the field names and the list of data rows
' Initialize the field names with the first row
Dim theInputFields As String() = theParser.ReadFields(), theInputRows As New List(Of Dictionary(Of String, String))()

' While there is data to parse
Do While Not theParser.EndOfData

    ' Dimension a counter and a row container
    Dim i As Integer = 0, theRow As New Dictionary(Of String, String)()

    ' For each field
    For Each value In theParser.ReadFields()

        ' Associate the value of that field for the row
        theRow(theInputFields(i)) = value

        ' Increment the count
        i += 1
    Next

    ' Add the row to the list
    theInputRows.Add(theRow)
Loop

' Close the input file for reading
theParser.Close()

' Dimension the list of output field names and a container for the list of formatted output rows
Dim theOutputFields As New List(Of String) From {"Subject", "Start Date", "Start Time", "End Date", "End Time", "All Day Event", "Description", "Location", "Private"}, theOutputRows As New List(Of Dictionary(Of String, String))()

' For each data row we've extracted from the CSV
For Each theRow In theInputRows

    ' Dimension a new formatted row for the output
    Dim thisRow As New Dictionary(Of String, String)()

    ' For each field name of the output rows
    For Each theField In theOutputFields

        ' Dimension a container for the value of this field
        Dim theValue As String = String.Empty

        ' Specify ways to get the value of the field based on its name
        ' These are just examples; choose your own method for formatting the output
        Select Case theField
            Case "Subject"
                ' Output a subject "[Row Type]: [Name]"
                theValue = theRow("Row Type") & ": " & theRow("Name")
            Case "Description"
                ' Output a description from the input field [Type]
                theValue = theRow("Type")
            Case "Start Date", "Start Time", "End Date", "End Time"
                ' Output the value of the field with a correlated name
                theValue = theRow(theField)
            Case "All Day Event", "Private"
                ' Output False by default (you might want to change the case for Private)
                theValue = "False"
            Case "Location"
                ' Can probably be safely left empty unless you'd like a default value
        End Select

        ' Relate the value we've created to the column in this row
        thisRow(theField) = theValue
    Next

    ' Add the formatted row to the output data
    theOutputRows.Add(thisRow)
Next

' Start building the first line by retrieving the name of the first output field
Dim theHeader As String = theOutputFields.First

' For each of the remaining output fields
For Each theField In (From s In theOutputFields Skip 1)

    ' Append a comma and then the field name
    theHeader = theHeader & "," & theField
Next

' Create a string builder to store the text for the output file, initialized with the header line and a line break
Dim theOutput As New System.Text.StringBuilder(theHeader & vbNewLine)

' For each row in the formatted output rows
For Each theRow In theOutputRows

    ' Dimension a container for this line of the file, beginning with the value of the column associated with the first output field
    Dim theLine As String = theRow(theOutputFields.First)

    ' Wrap the first value if necessary
    If theLine.Contains(",") Then theLine = """" & theLine & """"

    ' For each remaining output field
    For Each theField In (From s In theOutputFields Skip 1)

        ' Dereference and store the associated column value
        Dim theValue As String = theRow(theField)

        ' Add a comma and the value to the line, wrapped in quotations as needed
        theLine = theLine & "," & If(theValue.Contains(","), """" & theValue & """", theValue)
    Next

    ' Append the line to the output string
    theOutput.AppendLine(theLine)
Next

' Write the formatted output to file
IO.File.WriteAllText("C:\output.csv", theOutput.ToString)

For what it's worth, running this code on your sample data produced an output file that opened fine in OpenOffice.org Calc. The format of each output field is up to you, so modify the appropriate Case statement in the Select to get what you want. Happy coding!

Answer 3 (score: 1)

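This answer to a similar question suggests using VB's TextFieldParser class, which seems to me a better option than rolling your own CSV parser. At first glance you already have the data fields you need, namely the start and end dates and times; the rest, except possibly Subject and/or Description, can be left empty or filled with default/fixed values...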
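As a rough illustration of that idea, the sketch below parses a single input row with TextFieldParser and reorders it into the Google Calendar column order, filling the remaining columns with fixed values. It assumes the input column order shown in the question; the names CalendarSketch, Quote, and ToCalendarLine are hypothetical, and the choices for Subject, Description, and the "False" defaults are examples only.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module CalendarSketch
    ' Wraps a value in quotes when it contains a comma.
    Function Quote(ByVal value As String) As String
        If value.Contains(",") Then Return """" & value & """"
        Return value
    End Function

    ' Converts one input row into one Google Calendar row.
    ' Assumed input order: Name, User Name, Row Type, Start Date, Start Time,
    ' End Time, End Date, Segment Start Date, Type
    Function ToCalendarLine(ByVal line As String) As String
        Using parser As New TextFieldParser(New StringReader(line))
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            Dim f() As String = parser.ReadFields()

            ' Output order: Subject, Start Date, Start Time, End Date, End Time,
            ' All Day Event, Description, Location, Private
            Dim output() As String = {f(2) & ": " & f(0), f(3), f(4), f(6), f(5), "False", f(8), "", "False"}

            ' Quote any value that contains a comma before joining.
            For i As Integer = 0 To output.Length - 1
                output(i) = Quote(output(i))
            Next
            Return String.Join(",", output)
        End Using
    End Function

    Sub Main()
        Dim row As String = """Smith, John J"",jjs,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Regular"
        Console.WriteLine(ToCalendarLine(row))
    End Sub
End Module
```

Reading from a StringReader keeps the sketch self-contained; for a real export you would pass the file path to the TextFieldParser constructor and skip the header row before converting each line.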