时间:2010-07-23 18:09:41

标签: c# openxml openxml-sdk import-from-excel

8 个答案:

答案 0 :(得分:61)

我认为这应该是你所要求的。如果您有共享字符串,那么另一个函数就是处理,我假设您在列标题中执行此操作。不确定这是完美的,但我希望它有所帮助。

static void Main(string[] args)
{
    DataTable dt = new DataTable();

    using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(@"..\..\example.xlsx", false))
    {

        WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
        IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
        string relationshipId = sheets.First().Id.Value;
        WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
        Worksheet workSheet = worksheetPart.Worksheet;
        SheetData sheetData = workSheet.GetFirstChild<SheetData>();
        IEnumerable<Row> rows = sheetData.Descendants<Row>();

        foreach (Cell cell in rows.ElementAt(0))
        {
            dt.Columns.Add(GetCellValue(spreadSheetDocument, cell));
        }

        foreach (Row row in rows) //this will also include your header row...
        {
            DataRow tempRow = dt.NewRow();

            for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
            {
                tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));
            }

            dt.Rows.Add(tempRow);
        }

    }
    dt.Rows.RemoveAt(0); //...so i'm taking it out here.

}


public static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
    SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
    string value = cell.CellValue.InnerXml;

    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
    {
        return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
    }
    else
    {
        return value;
    }
}

答案 1 :(得分:14)

您好上面的代码工作正常,除了一个更改

替换下面的代码行

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));

如果您使用(i-1),它将抛出异常:

specified argument was out of the range of valid values. parameter name index.

答案 2 :(得分:2)

 Public Shared Function ExcelToDataTable(filename As String) As DataTable
        Try

            Dim dt As New DataTable()

            Using doc As SpreadsheetDocument = SpreadsheetDocument.Open(filename, False)

                Dim workbookPart As WorkbookPart = doc.WorkbookPart
                Dim sheets As IEnumerable(Of Sheet) = doc.WorkbookPart.Workbook.GetFirstChild(Of Sheets)().Elements(Of Sheet)()
                Dim relationshipId As String = sheets.First().Id.Value
                Dim worksheetPart As WorksheetPart = DirectCast(doc.WorkbookPart.GetPartById(relationshipId), WorksheetPart)
                Dim workSheet As Worksheet = worksheetPart.Worksheet
                Dim sheetData As SheetData = workSheet.GetFirstChild(Of SheetData)()
                Dim rows As IEnumerable(Of Row) = sheetData.Descendants(Of Row)()

                For Each cell As Cell In rows.ElementAt(0)
                    dt.Columns.Add(GetCellValue(doc, cell))
                Next

                For Each row As Row In rows
                    'this will also include your header row...
                    Dim tempRow As DataRow = dt.NewRow()

                    For i As Integer = 0 To row.Descendants(Of Cell)().Count() - 1
                        tempRow(i) = GetCellValue(doc, row.Descendants(Of Cell)().ElementAt(i))
                    Next

                    dt.Rows.Add(tempRow)
                Next
            End Using

            dt.Rows.RemoveAt(0)

            Return dt

        Catch ex As Exception
            Throw ex
        End Try
    End Function


    Public Shared Function GetCellValue(document As SpreadsheetDocument, cell As Cell) As String
        Try

            If IsNothing(cell.CellValue) Then
                Return ""
            End If

            Dim value As String = cell.CellValue.InnerXml

            If cell.DataType IsNot Nothing AndAlso cell.DataType.Value = CellValues.SharedString Then
                Dim stringTablePart As SharedStringTablePart = document.WorkbookPart.SharedStringTablePart
                Return stringTablePart.SharedStringTable.ChildElements(Int32.Parse(value)).InnerText
            Else
                Return value
            End If

        Catch ex As Exception
            Return ""
        End Try
    End Function

答案 3 :(得分:2)

此解决方案适用于没有空单元格的电子表格。

要处理空单元格,您需要替换此行:

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));

有这样的事情:

Cell cell = row.Descendants<Cell>().ElementAt(i);
int index = CellReferenceToIndex(cell);
tempRow[index] = GetCellValue(spreadSheetDocument, cell);

并添加此方法:

private static int CellReferenceToIndex(Cell cell)
{
    int index = 0;
    string reference = cell.CellReference.ToString().ToUpper();
    foreach (char ch in reference)
    {
        if (Char.IsLetter(ch))
        {
            int value = (int)ch - (int)'A';
            index = (index == 0) ? value : ((index + 1) * 26) + value;
        }
        else
            return index;
    }
    return index;
}

答案 4 :(得分:0)

这是我完整的解决方案,其中还考虑了空单元格。

public static class ExcelHelper
        {
            //To get the value of the cell, even it's empty. Unable to use loop by index
            private static string GetCellValue(WorkbookPart wbPart, List<Cell> theCells, string cellColumnReference)
            {
                Cell theCell = null;
                string value = "";
                foreach (Cell cell in theCells)
                {
                    if (cell.CellReference.Value.StartsWith(cellColumnReference))
                    {
                        theCell = cell;
                        break;
                    }
                }
                if (theCell != null)
                {
                    value = theCell.InnerText;
                    // If the cell represents an integer number, you are done. 
                    // For dates, this code returns the serialized value that represents the date. The code handles strings and 
                    // Booleans individually. For shared strings, the code looks up the corresponding value in the shared string table. For Booleans, the code converts the value into the words TRUE or FALSE.
                    if (theCell.DataType != null)
                    {
                        switch (theCell.DataType.Value)
                        {
                            case CellValues.SharedString:
                                // For shared strings, look up the value in the shared strings table.
                                var stringTable = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
                                // If the shared string table is missing, something is wrong. Return the index that is in the cell. Otherwise, look up the correct text in the table.
                                if (stringTable != null)
                                {
                                    value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
                                }
                                break;
                            case CellValues.Boolean:
                                switch (value)
                                {
                                    case "0":
                                        value = "FALSE";
                                        break;
                                    default:
                                        value = "TRUE";
                                        break;
                                }
                                break;
                        }
                    }
                }
                return value;
            }

            private static string GetCellValue(WorkbookPart wbPart, List<Cell> theCells, int index)
            {
                return GetCellValue(wbPart, theCells, GetExcelColumnName(index));
            }

            private static string GetExcelColumnName(int columnNumber)
            {
                int dividend = columnNumber;
                string columnName = String.Empty;
                int modulo;
                while (dividend > 0)
                {
                    modulo = (dividend - 1) % 26;
                    columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
                    dividend = (int)((dividend - modulo) / 26);
                }
                return columnName;
            }

            //Only xlsx files
            public static DataTable GetDataTableFromExcelFile(string filePath, string sheetName = "")
            {
                DataTable dt = new DataTable();
                try
                {
                    using (SpreadsheetDocument document = SpreadsheetDocument.Open(filePath, false))
                    {
                        WorkbookPart wbPart = document.WorkbookPart;
                        IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
                        string sheetId = sheetName != "" ? sheets.Where(q => q.Name == sheetName).First().Id.Value : sheets.First().Id.Value;
                        WorksheetPart wsPart = (WorksheetPart)wbPart.GetPartById(sheetId);
                        SheetData sheetdata = wsPart.Worksheet.Elements<SheetData>().FirstOrDefault();
                        int totalHeaderCount = sheetdata.Descendants<Row>().ElementAt(0).Descendants<Cell>().Count();
                        //Get the header                    
                        for (int i = 1; i <= totalHeaderCount; i++)
                        {
                            dt.Columns.Add(GetCellValue(wbPart, sheetdata.Descendants<Row>().ElementAt(0).Elements<Cell>().ToList(), i));
                        }
                        foreach (Row r in sheetdata.Descendants<Row>())
                        {
                            if (r.RowIndex > 1)
                            {
                                DataRow tempRow = dt.NewRow();

                                //Always get from the header count, because the index of the row changes where empty cell is not counted
                                for (int i = 1; i <= totalHeaderCount; i++)
                                {
                                    tempRow[i - 1] = GetCellValue(wbPart, r.Elements<Cell>().ToList(), i);
                                }
                                dt.Rows.Add(tempRow);
                            }
                        }                    
                    }
                }
                catch (Exception ex)
                {

                }
                return dt;
            }
        }

答案 5 :(得分:0)

首先将 ExcelUtility.cs 添加到您的项目中:

ExcelUtility.cs

using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

namespace Core_Excel.Utilities
{
    static class ExcelUtility
    {
        public static DataTable Read(string path)
        {
            var dt = new DataTable();

            using (var ssDoc = SpreadsheetDocument.Open(path, false))
            {
                var sheets = ssDoc.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
                var relationshipId = sheets.First().Id.Value;
                var worksheetPart = (WorksheetPart) ssDoc.WorkbookPart.GetPartById(relationshipId);
                var workSheet = worksheetPart.Worksheet;
                var sheetData = workSheet.GetFirstChild<SheetData>();
                var rows = sheetData.Descendants<Row>().ToList();

                foreach (var row in rows) //this will also include your header row...
                {
                    var tempRow = dt.NewRow();

                    var colCount = row.Descendants<Cell>().Count();
                    foreach (var cell in row.Descendants<Cell>())
                    {
                        var index = GetIndex(cell.CellReference);

                        // Add Columns
                        for (var i = dt.Columns.Count; i <= index; i++)
                            dt.Columns.Add();

                        tempRow[index] = GetCellValue(ssDoc, cell);
                    }

                    dt.Rows.Add(tempRow);
                }
            }

            var m = dt.Rows[0][9];

            return dt;
        }

        private static string GetCellValue(SpreadsheetDocument document, Cell cell)
        {
            var stringTablePart = document.WorkbookPart.SharedStringTablePart;
            var value = cell.CellValue.InnerXml;

            if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                return stringTablePart.SharedStringTable.ChildElements[int.Parse(value)].InnerText;

            return value;
        }

        public static int GetIndex(string name)
        {
            if (string.IsNullOrWhiteSpace(name))
                return -1;

            int index = 0;
            foreach (var ch in name)
            {
                if (char.IsLetter(ch))
                {
                    int value = ch - 'A' + 1;
                    index = value + index * 26;
                }
                else
                    break;
            }

            return index - 1;
        }
    }
}

用法:

var path = "D:\\Documents\\test.xlsx";
var dt = ExcelUtility.Read(path);

然后享受吧!

答案 6 :(得分:0)

我知道自从这个线程开始以来已经很久了。但是,上述解决方案都没有真正适合我。空单元格问题等。

我在 GitHub 上找到了一个带有“MIT”许可的非常好的解决方案: https://github.com/ExcelDataReader/ExcelDataReader 这对 C# 和 VBnet 应用程序都适用。 来自 VBNET 的示例调用(c# 的示例代码在 GitHub 上):

        Using stream As FileStream = New FileStream(DataPath & "\" & fName.Name, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)

            Using reader As IExcelDataReader = ExcelReaderFactory.CreateReader(stream)
                ds = reader.AsDataSet(New ExcelDataSetConfiguration() With {
                    .UseColumnDataType = False,
                    .ConfigureDataTable = Function(tableReader) New ExcelDataTableConfiguration() With {
                        .UseHeaderRow = True
                    }
                })
            End Using

        End Using

结果是一个数据集,工作簿中的每个工作表都有一个表格。

我真的很喜欢自己编译用 C# 制作的 dll,而不是使用现成的 dll。这样我就可以控制我向客户提供的内容。

答案 7 :(得分:-1)

如果rows值为null或为空,则得到错误的值。

所有列都填充了数据,如果它正常工作。但也许所有行都不是