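I have an Excel file with 4 columns and 20,000 rows of data. I need to count the duplicate rows, meaning rows that have the same information in all 4 columns.
For example: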
DOB Name Salary City
-----------------------------------------------
7/31/1975 John 45,000 Chicago
4/15/1963 Blaire 53,000 Los Angeles
7/31/1975 John 45,000 Chicago
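Counting the table above would return 2. Is there a formula in Excel that can do this for me? I have only found information on counting duplicates within a single column.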
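Answer 0 (score: 1)
In the VBA editor, add a reference to Microsoft ActiveX Data Objects (Tools -> References...); pick the latest version, usually 6.1.
Then you can run the following SQL statement against your data: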
SELECT DOB, Name, Salary, City, COUNT(*) AS CountOfDuplicates
FROM [Sheet1$]
GROUP BY DOB, Name, Salary, City
HAVING COUNT(*) > 1
using code like this:
Const filename = "C:\path\to\excel\file.xlsx"
Const sheetname = "Sheet1"
Dim connectionString As String
' HDR=Yes treats the first row as column headers, which the SQL below relies on
connectionString = _
"Provider=Microsoft.ACE.OLEDB.12.0;" & _
"Data Source=""" & filename & """;" & _
"Extended Properties=""Excel 12.0 Xml;HDR=Yes"""
' If your data is in a macro-enabled file (.xlsm), the previous line should
' look like this:
' "Extended Properties=""Excel 12.0 Macro;HDR=Yes"""
Dim sql As String
sql = _
"SELECT DOB, Name, Salary, City, COUNT(*) AS CountOfDuplicates " & _
"FROM [" & sheetname & "$] " & _
"GROUP BY DOB, Name, Salary, City " & _
"HAVING COUNT(*) > 1"
' If your data is only part of the worksheet, you can specify the range in the FROM clause as follows
' (the range goes after the $, and its first row supplies the column names):
' "FROM [" & sheetname & "$B10:E20000] " & _
Dim rs As New ADODB.Recordset
rs.Open sql, connectionString
Once you have a populated recordset, there are many things you can do with it.
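As one possibility, here is a minimal sketch of reading the results back. It assumes `rs` is the open recordset produced by the code above and that the ADO reference has been added; the procedure name `ReportDuplicates` and its variable names are made up for illustration and are not part of ADO.

' Hypothetical helper: walks the grouped results and totals them.
' Assumes rs was opened with the GROUP BY ... HAVING COUNT(*) > 1 query above.
Sub ReportDuplicates(rs As ADODB.Recordset)
    Dim duplicateGroups As Long
    Dim duplicateRows As Long

    Do Until rs.EOF
        ' Each record is one distinct row that occurs more than once
        Debug.Print rs.Fields("Name").Value, rs.Fields("CountOfDuplicates").Value
        duplicateGroups = duplicateGroups + 1
        duplicateRows = duplicateRows + rs.Fields("CountOfDuplicates").Value
        rs.MoveNext
    Loop
    rs.Close

    ' Results appear in the Immediate window (Ctrl+G in the VBA editor)
    Debug.Print "Distinct rows that repeat: " & duplicateGroups
    Debug.Print "Rows belonging to a repeated group: " & duplicateRows
End Sub

Call it as `ReportDuplicates rs` right after `rs.Open`. For the sample table in the question, the second total should come out as 2, since the John row appears twice. If you would rather see the groups on a worksheet than in the Immediate window, `Range.CopyFromRecordset` is another option.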
Note that queries against Excel files use Microsoft Access SQL.
Answer 1 (score: 0)
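You can run COUNTIFS over the whole range, passing each column as both the range and the criteria:
=COUNTIFS(A2:A6,A2:A6,B2:B6,B2:B6,...,ZZ2:ZZ6,ZZ2:ZZ6)
(entered as an array formula with CTRL + SHIFT + ENTER)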
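Basically, if you want to compare the exact data in columns A, B, C and D, use this:
=COUNTIFS(A2:A5,A2:A5,B2:B5,B2:B5,C2:C5,C2:C5,D2:D5,D2:D5)
(entered as an array formula)
Each element of the result is the number of rows that match that row in all four columns, so any value greater than 1 marks a duplicated row.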