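How do I read a single column from an Excel spreadsheet?

Time: 2015-12-14 17:05:08

Tags: c# excel c#-4.0

I am trying to read a single column from an Excel document. I want to read the entire column, but obviously only store the cells that contain data. I also want to handle the case where a cell in the column is empty but there are values further down the column, so that the later cell values are still read. For example: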

| Column1 |
|---------|
|bob      |
|tom      |
|randy    |
|travis   |
|joe      |
|         |
|jennifer |
|sam      |
|debby    |

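Given that column, I don't mind getting an empty string "" for the row after joe, but I do want it to keep picking up values after the blank cell. However, I don't want it to carry on through the 35,000 rows after debby, assuming debby is the last value in the column.

It is also safe to assume that this will always be the first column.

So far, I have this: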

Excel.Application myApplication = new Excel.Application();
myApplication.Visible = true;
Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);

foreach (Excel.Range r in myRange)
{
    MessageBox.Show(r.Text);
}

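I have found many examples from older versions of .NET that do something similar, but not quite this, and I want to make sure I am doing it in a more modern way (assuming the approach has changed somewhat since then).

My current code reads the entire column, but it includes the blank cells after the last value.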

EDIT1

I like Isedlacek's answer below, but I ran into a problem, and I'm not sure whether it is specific to his code. If I use it this way:

Excel.Application myApplication = new Excel.Application();
myApplication.Visible = true;
Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);

var nonEmptyRanges = myRange.Cast<Excel.Range>()
.Where(r => !string.IsNullOrEmpty(r.Text));

foreach (var r in nonEmptyRanges)
{
    MessageBox.Show(r.Text);
}

MessageBox.Show("Finished!");

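The "Finished!" MessageBox is never shown. I'm not sure why, but the search never seems to actually complete. I tried adding a counter inside the loop to see whether it was simply still searching down the column, but that doesn't appear to be the case... it seems to just stop.

Where the "Finished!" MessageBox is, I tried closing the workbook and the spreadsheet, but that code never ran (as expected, since the MessageBox never ran).

If I close the Excel spreadsheet manually, I get a COMException: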

  

COMException was unhandled by user code
Additional information: Exception from HRESULT: 0x803A09A2

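Any ideas?

3 Answers:

Answer 0: (score: 3)

The answer depends on whether you want the bounding range of the used cells or the non-null values from the column.

Here is how to get the non-null values from a column efficiently. Note that reading the entire tempRange.Value2 property in one call is MUCH faster than reading cell by cell, but the trade-off is that the resulting array can use a lot of memory.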

private static IEnumerable<object> GetNonNullValuesInColumn(_Application application, _Worksheet worksheet, string columnName)
{
    // get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
    var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);

    // if there is no intersection, there are no values in the column
    if (tempRange == null)
        yield break;

    // get complete set of values from the temp range (potentially memory-intensive)
    var value = tempRange.Value2;

    // if value is NULL, it's a single cell with no value
    if (value == null)
        yield break;

    // if value is not an array, the temp range was a single cell with a value
    if (!(value is Array))
    {
        yield return value;
        yield break;
    }

    // otherwise, the value is a 2-D array
    var value2 = (object[,]) value;
    var rowCount = value2.GetLength(0);
    for (var row = 1; row <= rowCount; ++row)
    {
        var v = value2[row, 1];
        if (v != null)
            yield return v;
    }
}

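Here is an efficient way to get the smallest range that contains the non-empty cells in a column. Note that I am still reading the whole set of tempRange values at once, then using the resulting array (if it is a multi-cell range) to determine which cells hold the first and last values. I construct the bounding range only after working out which rows have data.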

private static Range GetNonEmptyRangeInColumn(_Application application, _Worksheet worksheet, string columnName)
{
    // get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
    var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);

    // if there is no intersection, there are no values in the column
    if (tempRange == null)
        return null;

    // get complete set of values from the temp range (potentially memory-intensive)
    var value = tempRange.Value2;

    // if value is NULL, it's a single cell with no value
    if (value == null)
        return null;

    // if value is not an array, the temp range was a single cell with a value
    if (!(value is Array))
        return tempRange;

    // otherwise, the temp range is a 2D array which may have leading or trailing empty cells
    var value2 = (object[,]) value;

    // get the first and last rows that contain values
    var rowCount = value2.GetLength(0);
    int firstRowIndex;
    for (firstRowIndex = 1; firstRowIndex <= rowCount; ++firstRowIndex)
    {
        if (value2[firstRowIndex, 1] != null)
            break;
    }
    int lastRowIndex;
    for (lastRowIndex = rowCount; lastRowIndex >= firstRowIndex; --lastRowIndex)
    {
        if (value2[lastRowIndex, 1] != null)
            break;
    }

    // if there are no first and last used row, there is no used range in the column
    if (firstRowIndex > lastRowIndex)
        return null;

    // return the range
    return worksheet.Range[tempRange[firstRowIndex, 1], tempRange[lastRowIndex, 1]];
}
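
The row-scanning step used by both methods above can be tried without Excel. The following is a minimal sketch, assuming the array returned by Value2 is 1-based in both dimensions (the code above relies on the same thing). `ColumnBounds`, `MakeColumn`, and `FindBounds` are hypothetical names introduced only for this illustration, and `MakeColumn` merely simulates what Excel would return.

```csharp
using System;

public static class ColumnBounds
{
    // Hypothetical demo helper: builds a 1-based, single-column object[,]
    // shaped like the array Range.Value2 returns for a multi-cell range.
    public static object[,] MakeColumn(params object[] cells)
    {
        var column = (object[,]) Array.CreateInstance(
            typeof(object), new[] { cells.Length, 1 }, new[] { 1, 1 });
        for (var i = 0; i < cells.Length; ++i)
            column[i + 1, 1] = cells[i];
        return column;
    }

    // Returns the 1-based first and last rows that hold a value,
    // or null when every cell is empty. Assumes a 1-based array.
    public static Tuple<int, int> FindBounds(object[,] column)
    {
        var rowCount = column.GetLength(0);

        var first = 1;
        while (first <= rowCount && column[first, 1] == null)
            ++first;
        if (first > rowCount)
            return null;

        // A non-null cell exists at or after 'first', so this loop terminates.
        var last = rowCount;
        while (column[last, 1] == null)
            --last;

        return Tuple.Create(first, last);
    }

    public static void Main()
    {
        var column = MakeColumn(null, "bob", "tom", null, "jennifer", null, null);
        var bounds = FindBounds(column);
        Console.WriteLine("{0}..{1}", bounds.Item1, bounds.Item2); // prints 2..5
    }
}
```

The gap at row 4 stays inside the bounds, while the leading and trailing empty rows are trimmed, which matches the behaviour the question asks for.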

Answer 1: (score: 1)

If you don't mind losing the empty rows entirely:

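
As an illustration of dropping empty cells entirely, a LINQ filter over the values read from the column might look like the sketch below. It is an illustrative assumption, not the answerer's original code. It works on an in-memory array standing in for what Value2 returns, and `NonEmptyCells` and `FromValues` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NonEmptyCells
{
    // Flattens a Value2-style 2-D array and drops null or empty-string cells.
    public static List<object> FromValues(object[,] values)
    {
        return values.Cast<object>()
            .Where(v => v != null && !string.IsNullOrEmpty(v.ToString()))
            .ToList();
    }

    public static void Main()
    {
        // Stand-in for the array Excel would return (zero-based here for brevity;
        // enumeration order does not depend on the lower bound).
        var values = new object[,] { { "joe" }, { null }, { "jennifer" }, { "" }, { 42.0 } };
        foreach (var v in FromValues(values))
            Console.WriteLine(v);
    }
}
```

Filtering an array that was read in one call avoids making a COM call per cell, which is the cost the first answer warns about.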

Answer 2: (score: 0)

    /// <summary>
    /// Generic method which reads a column from the <paramref name="workSheetToReadFrom"/> sheet provided.<para />
    /// The <paramref name="dumpVariable"/> is the variable upon which the column to be read is going to be dumped.<para />
    /// The <paramref name="workSheetToReadFrom"/> is the sheet from which the column is going to be read.<para />
    /// The <paramref name="initialCellRowIndex"/>, <paramref name="finalCellRowIndex"/> and <paramref name="columnIndex"/> specify the length of the list to be read and the concrete column of the file from which to perform the reading. <para />
    /// Note that the type of data which is going to be read needs to be specified as a generic type argument. The method constrains the generic type arguments which can be passed to it to the types which implement the IConvertible interface provided by the framework (e.g. int, double, string, etc.).
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="dumpVariable"></param>
    /// <param name="workSheetToReadFrom"></param>
    /// <param name="initialCellRowIndex"></param>
    /// <param name="finalCellRowIndex"></param>
    /// <param name="columnIndex"></param>
    static void ReadExcelColumn<T>(ref List<T> dumpVariable, Excel._Worksheet workSheetToReadFrom, int initialCellRowIndex, int finalCellRowIndex, int columnIndex) where T: IConvertible
    {
        dumpVariable = ((object[,])workSheetToReadFrom.Range[workSheetToReadFrom.Cells[initialCellRowIndex, columnIndex], workSheetToReadFrom.Cells[finalCellRowIndex, columnIndex]].Value2).Cast<object>().ToList().ConvertAll(e => (T)Convert.ChangeType(e, typeof(T)));
    }
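
One thing to watch with the method above: `Convert.ChangeType` throws an `InvalidCastException` when it is given null and the target is a value type such as int, so an empty cell inside the range would abort the read. Value2 also returns a scalar rather than an array for a single-cell range, so the `object[,]` cast would fail when the first and last row are the same. The sketch below shows one way to skip empty cells during conversion. It is an assumption about how one might harden the idea, not part of the original answer, and `ColumnConverter` and `ConvertColumn` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ColumnConverter
{
    // Converts each non-empty cell of a Value2-style array to T, skipping empty cells.
    public static List<T> ConvertColumn<T>(object[,] values) where T : IConvertible
    {
        var result = new List<T>();
        foreach (var cell in values)
        {
            if (cell == null)
                continue; // empty cell: nothing to convert

            result.Add((T) Convert.ChangeType(cell, typeof(T), CultureInfo.InvariantCulture));
        }
        return result;
    }

    public static void Main()
    {
        // Excel hands numeric cells back as double, hence 1.0 and 3.0 here.
        var values = new object[,] { { 1.0 }, { null }, { 3.0 } };
        Console.WriteLine(string.Join(", ", ConvertColumn<int>(values))); // prints 1, 3
    }
}
```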