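How do I iterate over the rows of an Excel table using EPPlus?

Time: 2014-02-12 23:18:43

Tags: c# excel epplus

I am new to EPPlus, and I am trying to read some values from an Excel table.

This is what I have so far: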

var fileInfo = new FileInfo(filename);
using(var excelPackage = new OfficeOpenXml.ExcelPackage(fileInfo))
{
    foreach (var sheet in excelPackage.Workbook.Worksheets)
    {
        foreach (ExcelTable table in sheet.Tables)
        {
             foreach(var row in table.Rows)  // <-- !!
             { ... }
        }
    }
}

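However, now I am stumped, because ExcelTable only has a Columns property and not the Rows property I expected. I cannot find a Rows property on any object in the library.

How do I iterate over a table, reading it row by row?

6 answers:

Answer 0: (score: 87)

While searching for help on the same problem, I stumbled across this link. It definitely worked for me, and it is certainly better than using Interop objects. :)

I adapted it slightly: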

using (var package = new ExcelPackage(new FileInfo("sample.xlsx")))
{
    // Whether the Worksheets collection is 0-based or 1-based depends on the EPPlus version and target framework.
    ExcelWorksheet workSheet = package.Workbook.Worksheets[0];
    var start = workSheet.Dimension.Start;
    var end = workSheet.Dimension.End;
    for (int row = start.Row; row <= end.Row; row++)
    { // Row by row...
        for (int col = start.Column; col <= end.Column; col++)
        { // ... Cell by cell...
            object cellValue = workSheet.Cells[row, col].Text; // This got me the actual value I needed.
        }
    }
}
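
A minimal sketch of the same row-by-row loop wrapped in a reusable helper. The names WorksheetReader and ReadAllText are made up for illustration, and the null check assumes that Dimension is null for a sheet without any cells:

```csharp
using System.Collections.Generic;
using OfficeOpenXml;

public static class WorksheetReader
{
    // Hypothetical helper: returns the formatted text of every cell
    // in the used range of the sheet, one string array per row.
    public static List<string[]> ReadAllText(ExcelWorksheet sheet)
    {
        var rows = new List<string[]>();

        // Assumption: Dimension is null when the sheet contains no cells.
        var dim = sheet.Dimension;
        if (dim == null)
            return rows;

        int width = dim.End.Column - dim.Start.Column + 1;
        for (int r = dim.Start.Row; r <= dim.End.Row; r++)
        {
            var values = new string[width];
            for (int c = dim.Start.Column; c <= dim.End.Column; c++)
            {
                // .Text is the formatted string; .Value would give the underlying object.
                values[c - dim.Start.Column] = sheet.Cells[r, c].Text;
            }
            rows.Add(values);
        }
        return rows;
    }
}
```

It could be called as `WorksheetReader.ReadAllText(workSheet)` inside the using block above.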

Answer 1: (score: 15)

Here is a way to get a complete row as an ExcelRange, which can then be iterated or used with LINQ:

for (var rowNum = 1; rowNum <= sheet.Dimension.End.Row; rowNum++)
{
    var row = sheet.Cells[string.Format("{0}:{0}", rowNum)];
    // just an example, you want to know if all cells of this row are empty
    bool allEmpty = row.All(c => string.IsNullOrWhiteSpace(c.Text));
    if (allEmpty) continue; // skip this row
    // ...
}

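Answer 2: (score: 10)

You can access the table's .WorkSheet property and index its cells. I wrote an extension method for this that produces a sequence of dictionaries mapping column names to cell values: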

public static IEnumerable<IDictionary<string, object>> GetRows(this ExcelTable table)
{
    var addr = table.Address;
    var cells = table.WorkSheet.Cells;

    var firstCol = addr.Start.Column;

    var firstRow = addr.Start.Row;
    if (table.ShowHeader)
        firstRow++;
    var lastRow = addr.End.Row;

    for (int r = firstRow; r <= lastRow; r++)
    {
        yield return Enumerable.Range(0, table.Columns.Count)
            .ToDictionary(x => table.Columns[x].Name, x => cells[r, firstCol + x].Value);
    }
}

Answer 3: (score: 2)

I am not sure about EPPlus, but I thought I would quickly suggest using LinqToExcel:

// filename: path to the Excel file
var excel = new ExcelQueryFactory(filename);

var info = excel.Worksheet("Sheet1")
                .Select(row => new
                     {
                      Name = row["Name"].Cast<string>(),
                      Age = row["Age"].Cast<int>(),
                     }).ToList();

You can get it from NuGet:

Install-Package LinqToExcel

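Answer 4: (score: 1)

I was also trying to figure out how to properly iterate over the objects and get the data I needed out of this API.

I collected information from various posts and from the author's getting-started page, and put it all together to help myself and others.

The main question is the entry point for your iteration. Most solutions I have seen go after the worksheet, whereas this question is specifically about the table. I was curious about both, so I am presenting my findings on both.

Worksheet example: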

using (var package = new ExcelPackage(new FileInfo(file)))
{
    //what i've seen used the most, entry point is the worksheet not the table w/i the worksheet(s)
    using (var worksheet = package.Workbook.Worksheets.FirstOrDefault())
    {
        if (worksheet != null)
        {
            for (int rowIndex = worksheet.Dimension.Start.Row; rowIndex <= worksheet.Dimension.End.Row; rowIndex++)
            {
                var row = worksheet.Row(rowIndex);
                //from comments here... https://github.com/JanKallman/EPPlus/wiki/Addressing-a-worksheet
                //#:# gets entire row, A:A gets entire column
                var rowCells = worksheet.Cells[$"{rowIndex}:{rowIndex}"];
                //returns System.Object[,]
                //type is string so it likely detects many cells and doesn't know how you want the many formatted together...
                var rowCellsText = rowCells.Text;
                var rowCellsTextMany = string.Join(", ", rowCells.Select(x => x.Text));
                var allEmptyColumnsInRow = rowCells.All(x => string.IsNullOrWhiteSpace(x.Text));
                var firstCellInRowWithText = rowCells.Where(x => !string.IsNullOrWhiteSpace(x.Text)).FirstOrDefault();
                var firstCellInRowWithTextText = firstCellInRowWithText?.Text;
                var firstCellFromRow = rowCells[rowIndex, worksheet.Dimension.Start.Column];
                var firstCellFromRowText = firstCellFromRow.Text;
                //throws exception...
                //var badRow = rowCells[worksheet.Dimension.Start.Row - 1, worksheet.Dimension.Start.Column - 1];

                //for me this happened on row1 + row2 being merged together for the column headers
                //not sure why the row.Merged property is false for both rows though
                if (allEmptyColumnsInRow)
                    continue;

                for (int columnIndex = worksheet.Dimension.Start.Column; columnIndex <= worksheet.Dimension.End.Column; columnIndex++)
                {
                    var column = worksheet.Column(columnIndex);
                    var currentRowColumn = worksheet.Cells[rowIndex, columnIndex];
                    var currentRowColumnText = currentRowColumn.Text;
                    var currentRowColumnAddress = currentRowColumn.Address;
                    //likely won't need to do this, but i wanted to show you can tangent off at any level w/ that info via another call
                    //similar to row, doing A:A or B:B here, address is A# so just get first char from address
                    var columnCells = worksheet.Cells[$"{currentRowColumnAddress[0]}:{currentRowColumnAddress[0]}"];
                    var columnCellsTextMany = string.Join(", ", columnCells.Select(x => x.Text));
                    var allEmptyRowsInColumn = columnCells.All(x => string.IsNullOrWhiteSpace(x.Text));
                    var firstCellInColumnWithText = columnCells.Where(x => !string.IsNullOrWhiteSpace(x.Text)).FirstOrDefault();
                    var firstCellInColumnWithTextText = firstCellInColumnWithText?.Text;
                }
            }
        }
    }
}

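Now here is where things can get a bit confusing, at least for me, since I had no tables to begin with. Under the same package using statement, if I first iterate over the worksheet cells and then touch anything through the Tables property, an exception is thrown. If I instantiate a new package and use the same or similar code, it does not blow up when checking whether there are any tables.

Table example: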

//for some reason, if i don't instantiate another package and i work with the 'Tables' property in any way, the API throws a...
//Object reference not set to an instance of an object.
//at OfficeOpenXml.ExcelWorksheet.get_Tables()
//exception... this is because i have data in my worksheet but not an actual 'table' (Excel => Insert => Table)
//a partial load of worksheet cell data + invoke to get non-existing tables must have a bug as below code does not
//throw an exception and detects null gracefully on FirstOrDefault
using (var package = new ExcelPackage(new FileInfo(file)))
{
    //however, question was about a table, so lets also look at that... should be the same?
    //no IDisposable? :(
    //adding a table manually to my worksheet allows the 'same-ish' (child.Parent, aka table.WorkSheet) code to iterate
    var table = package.Workbook.Worksheets.SelectMany(x => x.Tables).FirstOrDefault();

    if (table != null)
    {
        for (int rowIndex = table.Address.Start.Row; rowIndex <= table.Address.End.Row; rowIndex++)
        {
            var row = table.WorkSheet.Row(rowIndex);

            var rowCells = table.WorkSheet.Cells[$"{rowIndex}:{rowIndex}"];
            var rowCellsManyText = string.Join(", ", rowCells.Select(x => x.Text));

            for (int columnIndex = table.Address.Start.Column; columnIndex <= table.Address.End.Column; columnIndex++)
            {
                var currentRowColumn = table.WorkSheet.Cells[rowIndex, columnIndex];
                var currentRowColumnText = currentRowColumn.Text;
            }
        }
    }
}

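Basically, everything works and behaves the same way; you just need to go through child.Parent, AKA table.WorkSheet, to get at the same things. As others have mentioned, extension methods or even wrapper classes could give you more granularity depending on the specifics of your business needs, but that is not the purpose of this question.

Regarding the indexing comments and responses, I would advise sticking with the 'Row' and 'Column' properties, first, last, for, foreach, etc., instead of hard-coding indexed vs. non-indexed base properties. I had no issues here, at least with the newer version.

Answer 5: (score: 0)

I had the same problem. I use ExcelTable to get the table boundaries and ExcelWorksheet to retrieve the data, so your code would look something like this: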

var fileInfo = new FileInfo(filename);
using(var excelPackage = new OfficeOpenXml.ExcelPackage(fileInfo))
{
    foreach (var sheet in excelPackage.Workbook.Worksheets)
    {
        foreach (ExcelTable table in sheet.Tables)
        {
            ExcelCellAddress start = table.Address.Start;
            ExcelCellAddress end = table.Address.End;

            for (int row = start.Row; row <= end.Row; ++row)
            {
                ExcelRange range = sheet.Cells[row, start.Column, row, end.Column];
                ...
            }
        }
    }
}

You will need to check for the table header or other things, but this did the trick for me.
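
One possible way to handle the header check is sketched below. It assumes the table address includes the header row when ShowHeader is true and the totals row when ShowTotal is true. The names TableRowReader and DataRows are hypothetical and are not part of EPPlus:

```csharp
using System.Collections.Generic;
using OfficeOpenXml;
using OfficeOpenXml.Table;

public static class TableRowReader
{
    // Hypothetical helper: yields one ExcelRange per data row of the table,
    // skipping the header row and the totals row when they are shown.
    public static IEnumerable<ExcelRange> DataRows(ExcelTable table)
    {
        ExcelCellAddress start = table.Address.Start;
        ExcelCellAddress end = table.Address.End;

        // Assumption: the header and totals rows are part of table.Address.
        int firstRow = table.ShowHeader ? start.Row + 1 : start.Row;
        int lastRow = table.ShowTotal ? end.Row - 1 : end.Row;

        for (int row = firstRow; row <= lastRow; row++)
        {
            // Reading WorkSheet.Cells anew on each pass gives a separate range object per row.
            yield return table.WorkSheet.Cells[row, start.Column, row, end.Column];
        }
    }
}
```

Inside the loop over sheet.Tables above, this could be used as `foreach (ExcelRange range in TableRowReader.DataRows(table)) { ... }`.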