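Looking for a VBA script to split the text in a single Excel cell into multiple cells

Asked: 2015-09-01 08:48:43

Tags: excel vba

I have a large number of publications, and all the data for each one sits in a single cell. I would like to split the data in each cell into separate columns: one each for the authors, title, year, journal, date, and so on.

For example, in one cell I have: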

Plant RA,Page JP,Bonham JH,Jones JP Stairway to Heaven(1971)Led Zepplin Dec 5; 1(39):14802-14084

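Here the names are the authors, "Stairway to Heaven" is the title, (1971) is the year, Led Zepplin is the journal name, Dec 5 is the date, 1(39) is the volume and issue, and 14802-14804 is the page range.

Every citation differs in length, number of authors, and so on. Is it still possible to automate this process? I tried formulas along the lines of LEFT(B1,SEARCH(" ",B1)-1), but the citations vary too much for that approach to work. I can't help feeling that this must be a problem people run into often. Is a simple solution possible?

Thanks in advance.

1 answer:

Answer 0 (score: 0)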

Well, it depends on how badly your citation formats vary, but one very useful tool for automating complicated string handling is regular expressions. Here's some code that demonstrates how you could structure one possible solution. The "Test()" subroutine will demo the process.

MSDN Regular Expression Quick Reference

Sub Test()
    Dim s(1 To 3) As String
    Dim test_string As String
    'Format A: year in parentheses
    s(1) = "Plant RA, Page JP, Bonham JH, Jones JP Stairway to Heaven (1971) Led Zepplin Dec 5;1(39):14802-14084"
    'Format B: year in square brackets
    s(2) = "Plant RA, Page JP, Bonham JH, Jones JP Stairway to Heaven [1971] Led Zepplin Dec 5;1(39):14802-14084"
    'Unknown Format: no semicolon after the date
    s(3) = "Plant RA, Page JP, Bonham JH, Jones JP Stairway to Heaven (1971) Led Zepplin Dec 5-1(39):14802-14084"

    'Change the index to try the other formats
    test_string = s(1)

    MsgBox GetFormat(test_string) & Chr(10) & GetYear(test_string)
End Sub
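Function GetYear(ByVal s As String) As String
    'Returns the year of a citation (s), or a "!" marker / error text on failure
    Dim YearPattern As Object, Result As Object, r As Object
    Dim F As String, sep As String
    Set YearPattern = CreateObject("Scripting.Dictionary")
    YearPattern.Add "FormatA", "\(\d{4}\)"
    YearPattern.Add "FormatB", "\[\d{4}\]"

    F = GetFormat(s)
    If F = "Unknown Format" Then
        GetYear = "Error: Format not recognized"
    Else
        Set Result = FindPattern(s, YearPattern(F))
        If Result.Count = 0 Then
            GetYear = "![No Result]"
        ElseIf Result.Count = 1 Then
            'Pass the literal "Year"; the original passed an undefined
            'variable (CType), so the delimiters were never removed
            GetYear = Clean(Result(0).Value, "Year")
        Else
            GetYear = "![Multiple results]:"
            sep = " "
            For Each r In Result
                GetYear = GetYear & sep & Clean(r.Value, "Year")
                sep = ", "
            Next
        End If
    End If
End Function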
Function GetFormat(ByVal s As String) As String
    'Returns the name of the first citation format whose pattern matches (s)
    Dim FormatPatterns As Object
    Set FormatPatterns = CreateObject("Scripting.Dictionary")
    'Comma-separated authors, a delimited 4-digit year, then ";" and "):"
    FormatPatterns.Add "FormatA", ",+.*\(\d{4}\).*;.*\):"
    FormatPatterns.Add "FormatB", ",+.*\[\d{4}\].*;.*\):"

    If FindPattern(s, FormatPatterns("FormatA")).Count > 0 Then
        GetFormat = "FormatA"
    ElseIf FindPattern(s, FormatPatterns("FormatB")).Count > 0 Then
        GetFormat = "FormatB"
    Else
        GetFormat = "Unknown Format"
    End If
End Function
Function FindPattern(ByVal s As String, ByVal p As String) As Object
    'Argument 1: The string to execute regular expressions on (s)
    'Argument 2: A pattern string to execute (p)
    'Return Value: A MatchCollection object holding the regular expression results
    Dim r As Object
    Set r = CreateObject("VBScript.RegExp")
    r.Global = True
    r.IgnoreCase = True
    r.MultiLine = True
    r.Pattern = p
    Set FindPattern = r.Execute(s)
End Function
Function Clean(ByVal s As String, Optional ByVal CType As String) As String
    'Removes unwanted characters from a string (s)
    'based on the specified type of string (CType)
    Select Case CType
    Case "Year"
        'Strip the year delimiters of both formats: (1971) and [1971]
        s = Replace(Replace(s, "(", ""), ")", "")
        s = Replace(Replace(s, "[", ""), "]", "")
        Clean = Replace(s, ": ,", ": ")
    Case Else
        Clean = Replace(s, ": ,", ": ")
    End Select
End Function
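
The code above only pulls out the year. To get all the way to separate columns, one option is to let a single pattern with capture groups split the whole citation at once. The following is a minimal sketch under stated assumptions, not a definitive solution: it only handles the "Format A" layout, it assumes every author is written as a surname followed by one to three upper-case initials, and it assumes the date looks like "Dec 5". The names SplitFormatA and SplitSelectedCitations are made up for this illustration.

```vba
'Hypothetical helper: splits one "Format A" citation into seven fields
'(authors, title, year, journal, date, volume(issue), pages).
'Returns Empty when the citation does not match the assumed layout.
Function SplitFormatA(ByVal citation As String) As Variant
    Dim re As Object, matches As Object, m As Object
    Dim parts(1 To 7) As String
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = False
    re.IgnoreCase = False   'upper-case initials mark the end of the author list
    re.Pattern = "^\s*((?:[A-Za-z'\-]+ [A-Z]{1,3},\s*)*[A-Za-z'\-]+ [A-Z]{1,3})\s+" & _
                 "(.+?)\s*\((\d{4})\)\s*" & _
                 "(.+?)\s+([A-Z][a-z]{2} \d{1,2});\s*" & _
                 "(\d+\(\d+\)):(\d+-\d+)\s*$"

    Set matches = re.Execute(citation)
    If matches.Count = 0 Then Exit Function   'function result stays Empty

    Set m = matches(0)
    For i = 1 To 7
        parts(i) = m.SubMatches(i - 1)        'SubMatches is zero-based
    Next i
    SplitFormatA = parts
End Function

'Hypothetical driver: writes the fields of every selected cell into the
'seven columns to its right (a single selected column is assumed).
Sub SplitSelectedCitations()
    Dim cell As Range
    Dim parts As Variant
    Dim i As Long

    For Each cell In Selection.Cells
        parts = SplitFormatA(CStr(cell.Value))
        If IsEmpty(parts) Then
            cell.Offset(0, 1).Value = "![Format not recognized]"
        Else
            For i = 1 To 7
                'Store as text so Excel does not turn "Dec 5" into a date
                cell.Offset(0, i).NumberFormat = "@"
                cell.Offset(0, i).Value = parts(i)
            Next i
        End If
    Next cell
End Sub
```

Select the cells holding the citations and run SplitSelectedCitations. Rows that do not fit the assumed layout are flagged rather than guessed at, so they can be handled by a second pattern or by hand.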