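Excel 2013 macro - table to column (for text analysis)

Asked: 2015-01-22 18:28:21

Tags: excel text-mining

I am trying to reshape some data to make basic text mining easier. I have a table with one row per sentence: the first column is an identifier, followed by N columns holding one word each. For example: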

Record  Word1   Word2   Word3   Word N
1       The     quick   brown   fox
2       jumps   over    the 
3       lazy    white       
4       dog         

I need to move that tabular data into a list with one word per row, together with the record that word belongs to.

Example:

Record  Word
1       the
1       quick
1       brown
1       fox
2       jumps
2       over
2       the
3       lazy
3       white
4       dog

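I found a macro that puts the whole table into a single column, but not in a way that identifies which record each word came from. (Excel Macros: From Table to Column)

I also found the following code here: http://community.spiceworks.com/scripts/show/1169-excel-table-to-single-column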

Option Explicit

Public Sub DoCopies()
Dim lRowIdx As Long
Dim lColIdx As Long
Dim lRowStart As Long
Dim lRowOut As Long

Dim s1 As Worksheet
Dim s2 As Worksheet

Dim oBook As Workbook

Dim r As Range
Dim lRows As Long
Dim lCols As Long

  On Error GoTo errorExit

  Application.DisplayAlerts = False
  Set oBook = ThisWorkbook
  Set s1 = Worksheets(1)

  ' remove other tabs
  While (oBook.Sheets.Count > 1)
    oBook.Sheets(oBook.Sheets.Count).Delete
  Wend

  ' create the new tab
  Set s2 = oBook.Worksheets.Add(After:=oBook.Worksheets(oBook.Worksheets.Count))
  s2.Name = "Result"

  Set r = s1.UsedRange
  lCols = r.Columns.Count
  lRows = r.Rows.Count

  'skip header
  lRowStart = 1
  While (Trim$(s1.Cells(lRowStart, 1)) = "")
    lRowStart = lRowStart + 1
  Wend

  lRowStart = lRowStart + 1

  ' Take each row, put on tab 2
  For lRowIdx = lRowStart To lRows

    If (Trim$(s1.Cells(lRowIdx, 1)) <> "") Then

      For lColIdx = 1 To lCols
        lRowOut = lRowOut + 1
        s2.Cells(lRowOut, 1) = s1.Cells(lRowIdx, lColIdx)
      Next lColIdx

    End If

  Next lRowIdx

  s2.Activate

  Application.DisplayAlerts = True
  Exit Sub

errorExit:
  Application.DisplayAlerts = True
  Call MsgBox(CStr(Err.Number) & ": " & Err.Description, vbCritical Or vbOKOnly, "Unexpected Error")

End Sub

But that macro returns the data like this:

1
The
quick
brown
fox
2
jumps
over
the
<null>
3
lazy
white
<null>
<null>
4
dog
<null>
<null>
<null>

I have tried working with the code, but I can't make sense of it.

Any help would be greatly appreciated. Thanks!

2 Answers:

Answer 0: (score: 0)

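Microsoft has effectively already written most of the code for you. All that is missing is filtering the Value column to select (Blanks) and deleting those rows, then changing the column labels and deleting a column. Details here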
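A minimal sketch of that cleanup step, assuming the result sheet is active and has the headers Row, Column and Value in A1:C1 (the layout produced by the script in the other answer). The procedure name `CleanNormalisedSheet` is hypothetical and not part of any linked code.

```vba
Option Explicit

' Hypothetical helper: tidy a Row / Column / Value sheet into Record / Word.
' Assumption: the active sheet has headers in A1:C1 and data from row 2 down.
Public Sub CleanNormalisedSheet()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim i As Long

    Set ws = ActiveSheet
    ' Last used row, measured in column A (the record identifier)
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row

    ' Walk bottom-up so deleting a row does not shift rows still to be checked
    For i = lastRow To 2 Step -1
        If Trim$(CStr(ws.Cells(i, 3).Value)) = "" Then
            ws.Rows(i).Delete
        End If
    Next i

    ' Drop the "Column" label column, then rename the two remaining headers
    ws.Columns(2).Delete
    ws.Range("A1:B1").Value = Array("Record", "Word")
End Sub
```

Deleting from the bottom up avoids skipping a row when two blank values sit on consecutive rows.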

Answer 1: (score: 0)

Thanks to pnuts for pointing me in the right direction. Your link has a comment from Pankaj Jaju that provides exactly the script I needed:

Sub NormaliseTable() 
' start with the cursor in the table 
  Dim rTab As Range, C As Range, rNext As Range 
  Set rTab = ActiveCell.CurrentRegion 
  If rTab.Rows.Count=1 Or rTab.Columns.Count = 1 Then 
    MsgBox "Not a well-formed table!" 
    Exit Sub 
  End If 
  Worksheets.Add  ' the sheet for the results 
  Range("A1:C1") = Array("Row","Column","Value") 
  Set rNext = Range("A2") 
  For Each C In rTab.Offset(1,1).Resize(rTab.Rows.Count-1, _ 
         rTab.Columns.Count-1).Cells 
    If Not IsEmpty(C.Value) Then 
      rNext.Value = rTab.Cells(C.Row-rTab.Row+1,1) 
      rNext.Offset(0,1).Value = rTab.Cells(1,C.Column-rTab.Column+1)     
      rNext.Offset(0,2).Value = C.Value 
      Set rNext = rNext.Offset(1,0) 
    End If 
  Next 
End Sub

Thanks again for the guidance!
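For the two-column layout shown in the question (Record, Word), the script above can be trimmed down. This is only a sketch under the same assumptions as the original: the cursor is inside the table, the first row is a header, and the first column holds the record identifier. The name `TableToRecordWordList` is hypothetical.

```vba
Option Explicit

' Hypothetical variant of NormaliseTable that writes only Record and Word.
' Start with the cursor in the table.
Public Sub TableToRecordWordList()
    Dim rTab As Range, C As Range, rNext As Range

    Set rTab = ActiveCell.CurrentRegion
    If rTab.Rows.Count = 1 Or rTab.Columns.Count = 1 Then
        MsgBox "Not a well-formed table!"
        Exit Sub
    End If

    Worksheets.Add  ' the new sheet becomes active and receives the results
    Range("A1:B1").Value = Array("Record", "Word")
    Set rNext = Range("A2")

    ' Visit every word cell (everything except the header row and the ID column)
    For Each C In rTab.Offset(1, 1).Resize(rTab.Rows.Count - 1, _
            rTab.Columns.Count - 1).Cells
        ' Skip empty cells and cells that hold only spaces
        If Trim$(CStr(C.Value)) <> "" Then
            rNext.Value = rTab.Cells(C.Row - rTab.Row + 1, 1).Value
            rNext.Offset(0, 1).Value = C.Value
            Set rNext = rNext.Offset(1, 0)
        End If
    Next C
End Sub
```

The cells are visited row by row, so the words of each record stay in their original order.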