扩展每个列单元格的列单元格

时间:2015-07-17 09:39:34

标签: arrays excel vba excel-vba loops

我有3组不同的数据(在不同的列中)

  1. A列中的动物(5种不同)
  2. B栏中的水果(1000种不同)
  3. C栏中的国家(10种不同)
  4. 通过这3个数据集合,我希望获得5×1000×10,总共50k对应的元素。 E F G(每种动物与每种水果和每个国家相对应)。

    可以通过手动复制和粘贴值来完成,但这需要很长时间。有没有办法通过VBA代码或

    自动化它

    是否存在上述无限数据集的通用公式?如果不清楚,请告诉我。

    以下是一个较小的数据示例以及结果的结果:

    Expanding data sets for each in other

8 个答案:

答案 0 :(得分:15)

我通过通用收集,您希望这可以容纳任意数量的列和每个列中的任意数量的条目。一些变体数组应该提供计算每个值的重复周期所必需的维度。

test4

将列标题标签放在第2行,从第A列开始,将数据直接放在第A行。

我添加了一些错误控件来警告超出工作表上的行数。这通常不是可能考虑的因素,但是将未确定数量的列中的值的数量相乘可以快速产生大量结果。你不会超过1,048,576行。

Variant Array expansion

答案 1 :(得分:14)

非连接选择SQL语句的经典示例,它返回所列表的所有组合结果的笛卡尔积。

SQL数据库解决方案

只需将Animals,Fruit,Country作为单独的表导入任何SQL数据库,如MS Access,SQLite,MySQL等,并列出没有连接的列表,包括隐式(WHERE)和显式(JOIN)加入:

SELECT Animals.Animal, Fruits.Fruit, Countries.Country
FROM Animals, Countries, Fruits;

Cartesian SQL

Excel解决方案

使用ODBC连接到包含动物,国家和水果范围的工作簿,在VBA中运行非连接SQL语句的相同概念。例如,每个数据分组都在自己的同名工作表中。

Sub CrossJoinQuery()

    Dim conn As Object
    Dim rst As Object
    Dim sConn As String, strSQL As String

    Set conn = CreateObject("ADODB.Connection")
    Set rst = CreateObject("ADODB.Recordset")

    sConn = "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
               & "DBQ=C:\Path To\Excel\Workbook.xlsx;"
    conn.Open sConn

    strSQL = "SELECT * FROM [Animals$A1:A3], [Fruits$A1:A3], [Countries$A1:A3] "
    rst.Open strSQL, conn

    Range("A1").CopyFromRecordset rst

    rst.Close
    conn.Close

    Set rst = Nothing
    Set conn = Nothing

End Sub

Cartesian SQL in VBA

答案 2 :(得分:11)

我对此问题的第一种方法类似于@Jeeped发布的方法:

  1. 将输入列加载到数组并计算每列中的行数
  2. 使用所有组合填充数组
  3. 将数组分配给输出范围
  4. 使用MicroTimer我计算了上述算法各部分的平均时间。对于更大的输入数据,第3部分占总执行时间的90%-93%。

    以下是我尝试提高将数据写入工作表的速度。我已经定义了一个常量iMinRSize=17。一旦可以使用相同的值填充超过iMinRSize个连续行,代码就会停止填充数组并直接写入工作表范围。

    Sub CrossJoin(rSrc As Range, rTrg As Range)
    
      Dim vSrc() As Variant, vTrgPart() As Variant
      Dim iLengths() As Long
      Dim iCCnt As Integer, iRTrgCnt As Long, iRSrcCnt As Long
      Dim i As Integer, j As Long, k As Long, l As Long
      Dim iStep As Long
    
      Const iMinRSize As Long = 17
      Dim iArrLastC As Integer
    
      On Error GoTo CleanUp
      Application.ScreenUpdating = False
      Application.EnableEvents = False
    
      vSrc = rSrc.Value2
      iCCnt = UBound(vSrc, 2)
      iRSrcCnt = UBound(vSrc, 1)
      iRTrgCnt = 1
      iArrLastC = 1
      ReDim iLengths(1 To iCCnt)
      For i = 1 To iCCnt
        j = iRSrcCnt
        While (j > 0) And IsEmpty(vSrc(j, i))
          j = j - 1
        Wend
        iLengths(i) = j
        iRTrgCnt = iRTrgCnt * iLengths(i)
        If (iRTrgCnt < iMinRSize) And (iArrLastC < iCCnt) Then iArrLastC = iArrLastC + 1
      Next i
    
      If (iRTrgCnt > 0) And (rTrg.row + iRTrgCnt - 1 <= rTrg.Parent.Rows.Count) Then
        ReDim vTrgPart(1 To iRTrgCnt, 1 To iArrLastC)
    
        iStep = 1
        For i = 1 To iArrLastC
          k = 0
          For j = 1 To iRTrgCnt Step iStep
            k = k + 1
            If k > iLengths(i) Then k = 1
            For l = j To j + iStep - 1
              vTrgPart(l, i) = vSrc(k, i)
            Next l
          Next j
          iStep = iStep * iLengths(i)
        Next i
    
        rTrg.Resize(iRTrgCnt, iArrLastC) = vTrgPart
    
        For i = iArrLastC + 1 To iCCnt
          k = 0
          For j = 1 To iRTrgCnt Step iStep
            k = k + 1
            If k > iLengths(i) Then k = 1
            rTrg.Resize(iStep).Offset(j - 1, i - 1).Value2 = vSrc(k, i)
          Next j
          iStep = iStep * iLengths(i)
        Next i
      End If
    
    CleanUp:
      Application.ScreenUpdating = True
      Application.EnableEvents = False
    End Sub
    
    Sub test()
      CrossJoin Range("a2:f10"), Range("k2")
    End Sub
    

    如果我们将iMinRSize设置为Rows.Count,则所有数据都会写入数组。以下是我的样本测试结果:

    enter image description here

    如果具有最高行数的输入列首先出现,则代码效果最佳,但修改代码以对列进行排序并按正确顺序处理不是一个大问题。

答案 3 :(得分:7)

您可以使用工作表公式执行此操作。 如果您有NAME'd范围 - 动物,水果和国家,那么“技巧”就是在该数组中生成索引以提供所有各种组合。

例如:

=CEILING(ROWS($1:1)/(ROWS(Fruits)*ROWS(Countries)),1)

将生成一系列基于1的数字,这些数字会在Fruits * Countries中的数字条目中重复出现 - 这样可以为每只动物提供所需的行数。

=MOD(CEILING(ROWS($1:1)/ROWS(Countries),1)-1,ROWS(Fruits))+1

将生成一个基于1的系列,该系列会针对多个国家/地区重复每个Fruit。

=MOD(ROWS($1:1)-1,ROWS(Countries))+1))

生成1..n的重复序列,其中n是国家/地区的数量。

将这些放入公式(带有一些错误检查)

D3:  =IFERROR(INDEX(Animals,CEILING(ROWS($1:1)/(ROWS(Fruits)*ROWS(Countries)),1)),"")
E3:  =IF(E3="","",INDEX(Fruits,MOD(CEILING(ROWS($1:1)/ROWS(Countries),1)-1,ROWS(Fruits))+1))
F3:  =IF(E3="","",INDEX(Countries,MOD(ROWS($1:1)-1,ROWS(Countries))+1))

enter image description here

答案 4 :(得分:3)

实际上,我想修改我的旧答案。但是,我的新答案与旧答案完全不同。因为,旧答案是针对特定列的,这个是针对通用列的。在回答了旧的答案之后,提问者说了他想要普遍做到的新要求。对于固定列,我们可以认为固定循环和无限列,我们需要从另一种方式思考。所以,我也这样做。 SO用户也可以看到代码差异,我认为这对初学者很有帮助。

这个新代码并不像旧代码那么简单。如果你想清楚地了解代码,我建议逐行调试代码。

不要担心代码。我已经逐步测试了它。它对我来说非常有用。如果不适合你,请告诉我。有一件事是这段代码可能导致空行(没有数据)的错误。因为,目前我没有添加检查。

以下是我解决问题的通用方法:

Public Sub matchingCell()

    Dim startRawColumn, endRawColumn, startResultColumn, endResultColumn, startRow As Integer
    Dim index, row, column, containerIndex, tempIndex As Integer
    Dim columnCount, totalCount, timesCount, matchingCount, tempCount As Integer
    Dim isExist As Boolean
    Dim arrayContainer() As Variant

    'Actually, even it is for universal, we need to know start column and end column of raw data.
    'And also start row. And start column for write result.
    'I set them for my test data.
    'You need to modify them(startRawColumn, endRawColumn, startRow, startResultColumn).

    'Set the start column and end column for raw data
    startRawColumn = 1
    endRawColumn = 3

    'Set the start row for read data and write data
    startRow = 2

    'Set the start column for result data
    startResultColumn = 4

    'Get no of raw data column
    columnCount = endRawColumn - startRawColumn

    'Set container index
    containerIndex = 0

    'Re-create array container for count of column
    ReDim arrayContainer(0 To columnCount)

    With Sheets("sheetname")

        'Getting data from sheet

        'Loop all column for getting data of each column
        For column = startRawColumn To endRawColumn Step 1

            'Create tempArray for column
            Dim tempArray() As Variant

            'Reset startRow
            row = startRow

            'Reset index
            index = 0

            'Here is one things. I looped until to blank. 
            'If you want anymore, you can modify the looping type. 
            'Don't do any changes to main body of looping.

            'Loop until the cell is blank
            Do While .Cells(row, column) <> ""

                'Reset isExist flag
                isExist = False

                'Remove checking for no data
                If index > 0 Then

                    'Loop previous data for duplicate checking
                    For tempIndex = 0 To index - 1 Step 1

                        'If found, set true to isExist and stop loop
                        If tempArray(tempIndex) = .Cells(row, column) Then

                            isExist = True

                            Exit For

                        End If

                    Next tempIndex

                End If

                'If there is no duplicate data, store data
                If Not isExist Then

                    'Reset tempArray
                    ReDim Preserve tempArray(index)

                    tempArray(index) = .Cells(row, column)

                    'Increase index
                    index = index + 1

                End If

                'Increase row
                row = row + 1

            Loop

            'Store column with data
            arrayContainer(containerIndex) = tempArray

            'Increase container index
            containerIndex = containerIndex + 1

        Next column

        'Now, we got all data column including data which has no duplicate
        'Show result data on sheet

        'Getting the result row count
        totalCount = 1

        'Get result row count
        For tempIndex = 0 To UBound(arrayContainer) Step 1

            totalCount = totalCount * (UBound(arrayContainer(tempIndex)) + 1)

        Next tempIndex

        'Reset timesCount
        timesCount = 1

        'Get the last column for result
        endResultColumn = startResultColumn + columnCount

        'Loop array container
        For containerIndex = UBound(arrayContainer) To 0 Step -1

            'Getting the counts for looping
            If containerIndex = UBound(arrayContainer) Then

                duplicateCount = 1

                timesCount = totalCount / (UBound(arrayContainer(containerIndex)) + 1)

            Else

                duplicateCount = duplicateCount * (UBound(arrayContainer(containerIndex + 1)) + 1)

                timesCount = timesCount / (UBound(arrayContainer(containerIndex)) + 1)

            End If

            'Reset the start row
            row = startRow

            'Loop timesCount
            For countIndex = 1 To timesCount Step 1

                'Loop data array
                For index = 0 To UBound(arrayContainer(containerIndex)) Step 1

                    'Loop duplicateCount
                    For tempIndex = 1 To duplicateCount Step 1

                        'Write data to cell
                        .Cells(row, endResultColumn) = arrayContainer(containerIndex)(index)

                        'Increase row
                        row = row + 1

                    Next tempIndex

                Next index

            Next countIndex

            'Increase result column index
            endResultColumn = endResultColumn - 1

        Next containerIndex

    End With

End Sub

答案 5 :(得分:2)

这是一个递归版本。它假定数据不包含任何内部选项卡,因为核心功能返回以制表符分隔的产品字符串。主子需要传递一个范围,该范围由数据和输出范围的左上角单元组成。这可能会稍微调整一下,但足以用于测试目的。

ColumnProducts Range("A:C"), Range("E1")

是解决OP问题的电话吗?这是代码:

'the following function takes a collection of arrays of strings
'and returns a variant array of tab-delimited strings which
'comprise the (tab-delimited) cartesian products of
'the arrays in the collection

Function CartesianProduct(ByVal Arrays As Collection) As Variant
    Dim i As Long, j As Long, k As Long, m As Long, n As Long
    Dim head As Variant
    Dim tail As Variant
    Dim product As Variant

    If Arrays.Count = 1 Then
        CartesianProduct = Arrays.Item(1)
        Exit Function
    Else
        head = Arrays.Item(1)
        Arrays.Remove 1
        tail = CartesianProduct(Arrays)
        m = UBound(head)
        n = UBound(tail)
        ReDim product(1 To m * n)
        k = 1
        For i = 1 To m
            For j = 1 To n
                product(k) = head(i) & vbTab & tail(j)
                k = k + 1
            Next j
        Next i
        CartesianProduct = product
    End If
End Function

Sub ColumnProducts(data As Range, output As Range)
    Dim Arrays As New Collection
    Dim strings As Variant, product As Variant
    Dim i As Long, j As Long, n As Long, numRows As Long
    Dim col As Range, cell As Range
    Dim outRange As Range

    numRows = Range("A:A").Rows.Count
    For Each col In data.Columns
        n = col.EntireColumn.Cells(numRows).End(xlUp).Row
        i = col.Cells(1).Row
        ReDim strings(1 To n - i + 1)
        For j = 1 To n - i + 1
            strings(j) = col.Cells(i + j - 1)
        Next j
        Arrays.Add strings
    Next col
    product = CartesianProduct(Arrays)
    n = UBound(product)
    Set outRange = Range(output, output.Offset(n - 1))
    outRange.Value = Application.WorksheetFunction.Transpose(product)
    outRange.TextToColumns Destination:=output, DataType:=xlDelimited, Tab:=True
End Sub

答案 6 :(得分:1)

好的,所以你只想要一份所有可能组合的清单。这就是我要做的事情:

  • 首先选择原始数据并逐列删除重复项。
  • 然后将这3列读入3个独立的数组。
  • 计算所有数组的总长度。
  • 然后使用循环粘贴国家数组的第一个值与动物和水果的组合一样多次,因此这些数组的长度成倍增加。
  • 在循环内创建另一个循环,发布所有水果选项。有许多重复行,等于最大动物数。
  • 然后将没有重复的动物粘贴在一起,直到最后一排。

答案 7 :(得分:1)

这是我解决问题的方法。

Public Sub matchingCell()

    Dim animalRow, fruitRow, countryRow, checkRow, resultRow As Long
    Dim isExist As Boolean

    'Set the start row
    animalRow = 2
    resultRow = 2

    'Work with data sheet
    With Sheets("sheetname")

        'Loop until animals column is blank
        Do While .Range("A" & animalRow) <> ""

            'Set the start row
            fruitRow = 2

            'Loop until fruits column is blank
            Do While .Range("B" & fruitRow) <> ""

                'Set the start row
                countryRow = 2

                'Loop until country column is blank
                Do While .Range("C" & countryRow) <> ""

                    'Set the start row
                    checkRow = 2

                    'Reset flag
                    isExist = False

                    'Checking for duplicate row
                    'Loop all result row until D is blank
                    Do While .Range("D" & checkRow) <> ""

                        'If duplicate row found
                        If .Range("D" & checkRow) = .Range("A" & animalRow) And _
                           .Range("E" & checkRow) = .Range("B" & fruitRow) And _
                           .Range("F" & checkRow) = .Range("C" & countryRow) Then

                           'Set true for exist flag
                           isExist = True

                        End If

                        checkRow = checkRow + 1

                    Loop

                    'If duplicate row not found
                    If Not isExist Then

                        .Range("D" & resultRow) = .Range("A" & animalRow)
                        .Range("E" & resultRow) = .Range("B" & fruitRow)
                        .Range("F" & resultRow) = .Range("C" & countryRow)

                        'Increase resultRow
                        resultRow = resultRow + 1

                    End If

                    'Increase countryRow
                    countryRow = countryRow + 1

                Loop

                'Increase fruitRow
                fruitRow = fruitRow + 1

            Loop

            'Increase fruitRow
            animalRow = animalRow + 1

        Loop

    End With

End Sub

我已经测试过了。它运作良好。祝你有愉快的一天。