I am using the Open XML SDK to open an Excel xlsx file, and I am trying to read the value of cell A1 in each sheet. I am using the following code:
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(openFileDialog1.FileName, false))
{
    var sheets = spreadsheetDocument.WorkbookPart.Workbook.Descendants<Sheet>();
    foreach (Sheet sheet in sheets)
    {
        WorksheetPart worksheetPart = (WorksheetPart)spreadsheetDocument.WorkbookPart.GetPartById(sheet.Id);
        Worksheet worksheet = worksheetPart.Worksheet;
        Cell cell = GetCell(worksheet, "A", 1);
        Console.WriteLine(cell.CellValue.Text);
    }
}

// Given a worksheet, a column name and a row index, return the cell.
private static Cell GetCell(Worksheet worksheet, string columnName, uint rowIndex)
{
    Row row = GetRow(worksheet, rowIndex);
    if (row == null)
        return null;
    return row.Elements<Cell>().Where(c => string.Compare
        (c.CellReference.Value, columnName +
        rowIndex, true) == 0).First();
}

// Given a worksheet and a row index, return the row.
private static Row GetRow(Worksheet worksheet, uint rowIndex)
{
    return worksheet.GetFirstChild<SheetData>().
        Elements<Row>().Where(r => r.RowIndex == rowIndex).First();
}
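The text in cell A1 of the first worksheet is simply 'test', but in my console I see the value '0' for cell.CellValue.Text.
Does anyone have an idea how to get the correct cell value?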
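Answer 0 (score: 62)
All strings in an Excel worksheet are stored in a structure called the SharedStringTable. The purpose of this table is to centralize all strings in an index-based array, so that a string used many times in the document is stored once and each use only references its index in that array. That being said, the 0 you received when reading the text value of cell A1 is an index into the SharedStringTable. To get the real value, you can use this helper function: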
public static SharedStringItem GetSharedStringItemById(WorkbookPart workbookPart, int id)
{
    return workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(id);
}
Then call it in your code to get the real value:
// workbookPart here is spreadsheetDocument.WorkbookPart from the question's code.
Cell cell = GetCell(worksheet, "A", 1);
string cellValue = string.Empty;
if (cell.DataType != null)
{
    if (cell.DataType == CellValues.SharedString)
    {
        // For a shared string, the cell's text is the index into the table.
        int id = -1;
        if (Int32.TryParse(cell.InnerText, out id))
        {
            SharedStringItem item = GetSharedStringItemById(workbookPart, id);
            if (item.Text != null)
            {
                cellValue = item.Text.Text;
            }
            else if (item.InnerText != null)
            {
                cellValue = item.InnerText;
            }
            else if (item.InnerXml != null)
            {
                cellValue = item.InnerXml;
            }
        }
    }
}
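To see where the 0 comes from, it can help to look at the raw XML inside the package. The sketch below is not Open XML SDK code: it parses two hand-written fragments (a stand-in for xl/sharedStrings.xml and one for a worksheet part) with System.Xml.Linq. The class name, method name and sample data are made up for illustration, and it assumes the cell is marked t="s", which is how SpreadsheetML flags a shared-string cell.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SharedStringDemo
{
    // Main SpreadsheetML namespace used by worksheet and shared-string parts.
    private static readonly XNamespace Ns =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Hand-written stand-in for xl/sharedStrings.xml (illustrative data).
    private const string SharedStringsXml =
        "<sst xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">" +
        "<si><t>test</t></si><si><t>other</t></si></sst>";

    // Hand-written stand-in for a worksheet part: A1 stores index 0, not text.
    private const string SheetXml =
        "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">" +
        "<sheetData><row r=\"1\"><c r=\"A1\" t=\"s\"><v>0</v></c></row></sheetData></worksheet>";

    public static string ResolveCell(string cellReference)
    {
        XElement cell = XDocument.Parse(SheetXml)
            .Descendants(Ns + "c")
            .First(c => (string)c.Attribute("r") == cellReference);

        // The <v> element holds the stored value, which is "0" for A1 here.
        string raw = (string)cell.Element(Ns + "v");
        if ((string)cell.Attribute("t") != "s")
            return raw; // not a shared string, so the stored value is the value

        // For shared strings, the stored value is a zero-based position
        // in the list of <si> items.
        return XDocument.Parse(SharedStringsXml)
            .Root.Elements(Ns + "si")
            .ElementAt(int.Parse(raw))
            .Value;
    }
}
```

With these fragments, SharedStringDemo.ResolveCell("A1") returns "test", while the stored value of the cell itself is "0", which matches what the question observed.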
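Answer 1 (score: 15)
Amurra's answer seems to get ninety percent of the way there, but it may need a few refinements.
1) The function "GetSharedStringItemById" returns a SharedStringItem, not a string, so the calling code example will not work. To get the actual value as a string, I believe you need to ask for the SharedStringItem's InnerText property, like this: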
public static string GetSharedStringItemById(WorkbookPart workbookPart, int id)
{
    return workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(id).InnerText;
}
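2) The function also (correctly) takes an int as part of its signature, but the sample calling code supplies a string, cell.CellValue.Text. Converting the string to an int is trivial, but it needs to be done, because the code as written may be confusing.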
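Putting both points together, a string-returning lookup might look like the sketch below. GetCellText and CellTextHelper are hypothetical names, not part of the SDK, and the sketch assumes the Open XML SDK (DocumentFormat.OpenXml) is referenced. It falls back to the raw cell text when the cell is not a shared string or when the index cannot be resolved.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class CellTextHelper
{
    // Hypothetical helper: returns the text for a cell, resolving
    // shared-string indexes through the workbook's SharedStringTable.
    public static string GetCellText(WorkbookPart workbookPart, Cell cell)
    {
        if (cell == null)
            return string.Empty;

        string raw = cell.CellValue?.Text ?? string.Empty;

        // Only shared-string cells store an index instead of the value itself.
        if (cell.DataType == null || cell.DataType.Value != CellValues.SharedString)
            return raw;

        // Point 2: the stored value is a string and must be parsed to an int.
        if (!int.TryParse(raw, out int id))
            return raw;

        // Point 1: return the item's InnerText rather than the item itself.
        SharedStringItem item = workbookPart.SharedStringTablePart?
            .SharedStringTable
            .Elements<SharedStringItem>()
            .ElementAtOrDefault(id);

        return item?.InnerText ?? raw;
    }
}
```

In the question's loop this would be called as CellTextHelper.GetCellText(spreadsheetDocument.WorkbookPart, cell) in place of cell.CellValue.Text.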
Answer 2 (score: 10)
I found this very useful snippet a long time ago, so I cannot credit the author.
private static string GetCellValue(string fileName, string sheetName, string addressName)
{
    string value = null;
    using (SpreadsheetDocument document = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart wbPart = document.WorkbookPart;
        // Find the sheet with the supplied name, and then use that Sheet
        // object to retrieve a reference to the appropriate worksheet.
        Sheet theSheet = wbPart.Workbook.Descendants<Sheet>().
            Where(s => s.Name == sheetName).FirstOrDefault();
        if (theSheet == null)
        {
            throw new ArgumentException("sheetName");
        }
        // Retrieve a reference to the worksheet part, and then use its
        // Worksheet property to get a reference to the cell whose
        // address matches the address you supplied:
        WorksheetPart wsPart = (WorksheetPart)(wbPart.GetPartById(theSheet.Id));
        Cell theCell = wsPart.Worksheet.Descendants<Cell>().
            Where(c => c.CellReference == addressName).FirstOrDefault();
        // If the cell does not exist, value stays null:
        if (theCell != null)
        {
            value = theCell.InnerText;
            // If the cell represents a numeric value, you are done.
            // For dates, this code returns the serialized value that
            // represents the date. The code handles strings and Booleans
            // individually. For shared strings, the code looks up the
            // corresponding value in the shared string table. For Booleans,
            // the code converts the value into the words TRUE or FALSE.
            if (theCell.DataType != null)
            {
                switch (theCell.DataType.Value)
                {
                    case CellValues.SharedString:
                        // For shared strings, look up the value in the shared
                        // strings table.
                        var stringTable = wbPart.
                            GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
                        // If the shared string table is missing, something is
                        // wrong. Return the index that you found in the cell.
                        // Otherwise, look up the correct text in the table.
                        if (stringTable != null)
                        {
                            value = stringTable.SharedStringTable.
                                ElementAt(int.Parse(value)).InnerText;
                        }
                        break;
                    case CellValues.Boolean:
                        switch (value)
                        {
                            case "0":
                                value = "FALSE";
                                break;
                            default:
                                value = "TRUE";
                                break;
                        }
                        break;
                }
            }
        }
    }
    return value;
}
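The comments in the snippet note that dates come back as a serialized number. A minimal sketch of turning such a value into a DateTime follows. It assumes the workbook uses the default 1900 date system and that the caller already knows the cell is date-formatted (GetCellValue itself does not report this). SerialToDate and ExcelDateDemo are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class ExcelDateDemo
{
    // Converts a serialized date value, as returned for a date cell,
    // into a DateTime. Assumes the 1900 date system.
    public static DateTime SerialToDate(string serialized)
    {
        // Stored values use '.' as the decimal separator regardless of the
        // machine's culture, so parse with the invariant culture.
        double serial = double.Parse(serialized, CultureInfo.InvariantCulture);

        // Excel's 1900-system serial numbers line up with OLE Automation
        // dates for dates from 1 March 1900 onward.
        return DateTime.FromOADate(serial);
    }
}
```

For example, SerialToDate("43831") yields 1 January 2020, and a fractional part such as "43831.5" adds the time of day (noon in this case).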
Answer 3 (score: 3)
I found this post on reading an entire Excel sheet into a DataTable very useful. It also uses the Open XML SDK.
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static DataTable ReadAsDataTable(string fileName)
{
    DataTable dataTable = new DataTable();
    using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
        IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
        // Only the first sheet in the workbook is read.
        string relationshipId = sheets.First().Id.Value;
        WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
        Worksheet workSheet = worksheetPart.Worksheet;
        SheetData sheetData = workSheet.GetFirstChild<SheetData>();
        IEnumerable<Row> rows = sheetData.Descendants<Row>();
        // The first row supplies the column names.
        foreach (Cell cell in rows.ElementAt(0))
        {
            dataTable.Columns.Add(GetCellValue(spreadSheetDocument, cell));
        }
        foreach (Row row in rows)
        {
            DataRow dataRow = dataTable.NewRow();
            for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
            {
                dataRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));
            }
            dataTable.Rows.Add(dataRow);
        }
    }
    // Remove the header row, which was also added as a data row above.
    dataTable.Rows.RemoveAt(0);
    return dataTable;
}

private static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
    SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
    // A cell can exist without a value element, so guard against null.
    string value = cell.CellValue?.InnerXml ?? string.Empty;
    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
    {
        return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
    }
    else
    {
        return value;
    }
}
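Note: there is an issue where empty cells in each row are skipped when reading the Excel file. This code is therefore best suited to cases where you are sure that every cell in every row contains some data. If you want to handle empty cells properly, you can do the following:
Change the code in the for loop from:
dataRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));
to:
Cell cell = row.Descendants<Cell>().ElementAt(i);
int actualCellIndex = CellReferenceToIndex(cell);
dataRow[actualCellIndex] = GetCellValue(spreadSheetDocument, cell);
And add the following method, which the modified snippet uses: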
private static int CellReferenceToIndex(Cell cell)
{
    // Treat the column letters as a base-26 number with digits A=1 .. Z=26,
    // then shift to a zero-based index (A -> 0, Z -> 25, AA -> 26).
    int index = 0;
    string reference = cell.CellReference.ToString().ToUpperInvariant();
    foreach (char ch in reference)
    {
        if (!Char.IsLetter(ch))
            break; // reached the row number
        index = (index * 26) + (ch - 'A' + 1);
    }
    return index - 1;
}
I took this fix from this answer.
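Column-letter arithmetic is easy to get subtly wrong for multi-letter columns such as AA, so it can be worth checking in isolation. The sketch below works on a plain reference string rather than a Cell so that it runs without the SDK. ColumnIndexFromReference and ColumnIndexDemo are hypothetical names, and the sketch assumes references in the usual A1 style.

```csharp
using System;

public static class ColumnIndexDemo
{
    // Converts an A1-style reference ("A1", "AA10") to a zero-based column index.
    public static int ColumnIndexFromReference(string reference)
    {
        int column = 0; // one-based while accumulating
        foreach (char ch in reference.ToUpperInvariant())
        {
            if (ch < 'A' || ch > 'Z')
                break; // reached the row digits
            column = (column * 26) + (ch - 'A' + 1);
        }
        return column - 1;
    }
}
```

With this arithmetic, "A1" maps to 0, "Z1" to 25, "AA1" to 26, "AB1" to 27 and "BA1" to 52, so a cell in column AA lands in the 27th DataTable column rather than the first.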
Answer 4 (score: 2)
Another option: export the data to an HTML table and use a stylesheet to mark cells as read-only. For more information, see this page: http://www.c-sharpcorner.com/UploadFile/kaushikborah28/79Nick08302007171404PM/79Nick.aspx