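I have two problems when converting an .xlsx file to CSV:
1. The date format changes. A date that is
19.08.2019
in the workbook looks like this in the CSV:
8/19/2019
2. After the conversion, extra rows that contain only commas are added to the CSV file. How can I avoid this?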
11,900011,S1,8/19/2019,11,6.90,9.90,,18.50,,8.80,,,,,,,,,,,,,,,,,,,,,,,,
12,900012,S1,8/19/2019,12,6.70,8.80,,14.50,,9.40,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
....
I am using this library:
using Excel = Microsoft.Office.Interop.Excel;
Here is my code:
public static void Convert()
{
    try
    {
        Excel.Application app = new Excel.Application();
        // Load the .xlsx file
        Excel.Workbook wb = app.Workbooks.Open(Program.filePaths[1]);
        // Save it as .csv
        wb.SaveAs(Program.filePaths[0], Excel.XlFileFormat.xlCSVWindows, Type.Missing, Type.Missing, false, false, Excel.XlSaveAsAccessMode.xlNoChange, Excel.XlSaveConflictResolution.xlLocalSessionChanges, false, Type.Missing, Type.Missing, Type.Missing);
        wb.Close(false);
        app.Quit();
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }
}
Thanks in advance for your help.
Answer 0 (score: 0)
I use NPOI to convert from Excel to CSV (sorry if any strings are missing; this is copied from my project).
public Dictionary<string, string> ExceltoCsv(IWorkbook input)
{
    var csvTrennzeichen = OutputSettings.ColumnSeparator.ToString();
    var result = new Dictionary<string, string>();
    for (var sheetIndex = 0; sheetIndex < input.NumberOfSheets; sheetIndex++)
    {
        var sheet = input.GetSheetAt(sheetIndex);
        var sheetresult = new List<string>();
        // LastRowNum is the zero-based index of the last row, so it has to be included.
        for (var row = sheet.FirstRowNum; row <= sheet.LastRowNum; row++)
        {
            var rowObj = sheet.GetRow(row);
            // GetRow returns null for rows that were never created; skip those and blank rows.
            if (rowObj == null || rowObj.Cells.All(x => string.IsNullOrEmpty(WertAuslesen(x))))
                continue;
            var line = string.Join(csvTrennzeichen, rowObj.Cells
                .Select(cell => WertAuslesen(cell).Replace("\r", " ").Replace("\n", " "))
                .Select(cell => OutputSettings.Writeinquotes ? string.Format("\"{0}\"", cell.Replace("\"", "\"\"")) : cell));
            sheetresult.Add(line);
        }
        result.Add(sheet.SheetName, string.Join("\r\n", sheetresult));
    }
    return result;
}
private string WertAuslesen(ICell oldCell)
{
    switch (oldCell.CellType)
    {
        case CellType.Boolean:
            return oldCell.BooleanCellValue.ToString();
        case CellType.Error:
            return oldCell.ErrorCellValue.ToString();
        case CellType.Formula:
            return oldCell.CellFormula;
        case CellType.Numeric:
            return !DateUtil.IsCellDateFormatted(oldCell)
                ? oldCell.NumericCellValue.ToString(OutputSettings.GetDecimalFormat(Digits(oldCell.CellStyle.GetDataFormatString())))
                : oldCell.DateCellValue.ToString(OutputSettings.DateFormat);
        case CellType.String:
            return oldCell.RichStringCellValue.ToString();
        case CellType.Unknown:
            return oldCell.StringCellValue;
        default:
            return "";
    }
}
private static int Digits(string format)
{
    var digits = format.ContainsAny(',', '.') ? format.Split(new[] { ',', '.' }).Last() : "";
    return digits.Length;
}
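The quoting rule used in ExceltoCsv (flatten line breaks, double any embedded quotes, then wrap the value in quotes) can be tried on its own. This is only a minimal sketch and not part of the project above: CsvQuoting, Quote and ToLine are hypothetical names, and it works on plain strings so NPOI is not needed.

```csharp
using System;
using System.Linq;

public static class CsvQuoting
{
    // Replaces line breaks with spaces, doubles embedded quotes,
    // and wraps the result in quotes.
    public static string Quote(string value)
    {
        var flat = value.Replace("\r", " ").Replace("\n", " ");
        return "\"" + flat.Replace("\"", "\"\"") + "\"";
    }

    // Joins already-extracted cell values into one CSV line.
    public static string ToLine(char separator, params string[] cells)
    {
        return string.Join(separator.ToString(), cells.Select(c => Quote(c)));
    }

    public static void Main()
    {
        // prints: "900011","say ""hi""","6,90"
        Console.WriteLine(ToLine(',', "900011", "say \"hi\"", "6,90"));
    }
}
```

Quoting every field is what allows a value such as 6,90 to contain the separator without breaking the column layout.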
I also felt it was necessary to add the OutputSettings class, because it makes everything clearer.
public class OutputSettings
{
    public static readonly OutputSettings Default = new OutputSettings(Encoding.UTF8, null, "yyyyMMdd", "hh:mm:ss", ".", "", "y", "n", ',', true, "", null);
    //I am immutable
    public OutputSettings(CultureInfo culture) : this(
        Encoding.UTF8,
        null,
        culture.DateTimeFormat.ShortDatePattern,
        culture.DateTimeFormat.LongTimePattern,
        culture.NumberFormat.NumberDecimalSeparator,
        culture.NumberFormat.NumberGroupSeparator,
        "y",
        "n",
        ',',
        true,
        "",
        null)
    {
    }
    public OutputSettings(
        Encoding encoding,
        Version ioVersion,
        string dateFormat,
        string timeFormat,
        string decimalSeperator,
        string thousandSeperator,
        string yesString,
        string noString,
        char columnseperator,
        bool writeinquotes,
        string outputFolder,
        IResourceHandler resourceHandler)
    {
        Encoding = encoding;
        IOVersion = ioVersion;
        DateFormat = dateFormat;
        TimeFormat = timeFormat;
        DecimalSeperator = decimalSeperator;
        ThousandSeperator = thousandSeperator;
        YesString = yesString;
        NoString = noString;
        ColumnSeparator = columnseperator;
        Writeinquotes = writeinquotes;
        OutputFolder = outputFolder;
        ResourceHandler = resourceHandler;
    }
    public IResourceHandler ResourceHandler { get; }
    public Encoding Encoding { get; }
    public Version IOVersion { get; }
    public string DateFormat { get; }
    public string TimeFormat { get; }
    public string DateTimeFormat => DateFormat + " " + TimeFormat;
    public string DecimalSeperator { get; }
    public string ThousandSeperator { get; }
    public string DecimalFormat => GetDecimalFormat(2);
    public string YesString { get; }
    public string NoString { get; }
    private char _columnseperator;
    public char ColumnSeparator
    {
        get
        {
            return _columnseperator;
        }
        private set
        {
            if (value != ',' && value != ';')
                throw new ArgumentException(Localization.Resources.StaticTranslationResource.IO_SEPARATOR_MUSS_COMMA_ODER_SEMICOLON_SEIN);
            _columnseperator = value;
        }
    }
    public bool Writeinquotes { get; }
    public string OutputFolder { get; set; }
    public string GetDecimalFormat(int precision)
    {
        if (precision < 0)
            throw new ArgumentException(Localization.Resources.StaticTranslationResource.OUTPUT_ANZAHL_DER_STELLEN_DARF_NICHT_NEGATIV_SEIN, nameof(precision));
        var sb = new StringBuilder($"#{ThousandSeperator}##0{DecimalSeperator}");
        if (precision == 0)
        {
            sb.Append('#');
        }
        else
        {
            for (int i = 0; i < precision; i++)
            {
                sb.Append('0');
            }
        }
        return sb.ToString();
    }
}
Edit: I use a lot of extension methods to keep my code readable; ContainsAny is one of them.
public static bool ContainsAll<T>(this IEnumerable<T> superset, params T[] subset) => !subset.Except(superset).Any();
public static bool ContainsAll<T>(this IEnumerable<T> superset, IEnumerable<T> subset) => !subset.Except(superset).Any();
public static bool ContainsAny<T>(this IEnumerable<T> superset, params T[] subset) => subset.Any(superset.Contains);
public static bool ContainsAny<T>(this IEnumerable<T> superset, IEnumerable<T> subset) => subset.Any(superset.Contains);
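Answer 1 (score: 0)
Regarding the dates: if your dates are formatted properly in Excel, Excel should honor that format when it exports to CSV. I don't think this will destroy your existing formatting, since the goal is just to export the data as it is, right?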
Excel.Worksheet sheet = wb.ActiveSheet;
sheet.Columns[4].NumberFormat = "yyyy.mm.dd";
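If the workbook's number format cannot be changed, or the export still writes 8/19/2019, the date column could also be rewritten in the CSV text afterwards. The following is only a sketch of that fallback, not something from the answer: CsvDates, ToDottedDate and FixDateColumn are hypothetical names, and it assumes unquoted fields with the date at zero-based index 3, as in the sample rows from the question.

```csharp
using System;
using System.Globalization;

public static class CsvDates
{
    // Converts a US-style date such as "8/19/2019" to "19.08.2019".
    // Returns the input unchanged when it is not a date in that format.
    public static string ToDottedDate(string field)
    {
        DateTime parsed;
        bool ok = DateTime.TryParseExact(field, "M/d/yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok ? parsed.ToString("dd.MM.yyyy", CultureInfo.InvariantCulture) : field;
    }

    // Rewrites one column of a simple CSV line (assumes no quoted fields).
    public static string FixDateColumn(string line, int columnIndex)
    {
        var fields = line.Split(',');
        if (columnIndex >= 0 && columnIndex < fields.Length)
            fields[columnIndex] = ToDottedDate(fields[columnIndex]);
        return string.Join(",", fields);
    }

    public static void Main()
    {
        // prints: 11,900011,S1,19.08.2019,11,6.90
        Console.WriteLine(FixDateColumn("11,900011,S1,8/19/2019,11,6.90", 3));
    }
}
```

Parsing with an explicit format and the invariant culture keeps the result independent of the regional settings of the machine that runs the conversion.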
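As for the extra columns/rows: this means that "something" is in those cells, even if it is only formatting. If you delete those columns/rows, they will not be exported when you save as CSV.
If you don't know what is in there, a simple way to do this is to find the last row that contains "real" data, which depends on how you define it; perhaps a row counts as empty when it has nothing in columns A, B and E. Start at the row after the last real row and delete everything down to the end of UsedRange.
Alternatively, you can use Excel's built-in CountA function, which should be fast. If the function returns 0 for a row, you can count on there being nothing in any cell of that row. Quick example: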
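Excel.Range last = sheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell,
    Type.Missing);
// Walk from the bottom up so that deleting a row does not shift rows that are still to be checked.
for (int row = last.Row; row > 0; row--)
{
    Excel.Range r = (Excel.Range)sheet.Cells[row, 1];
    // CountA returns the number of non-empty cells in the row.
    double o = sheet.Application.WorksheetFunction.CountA(r.EntireRow);
    if (o == 0)
        r.EntireRow.Delete();
}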
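Untested, but it should be 99% there. I think you need to go from the bottom up, though I'm not 100% sure; my thinking is that otherwise the loop would skip rows after a deletion.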
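The same "is this row empty" test can be applied to the exported CSV text instead of the worksheet, which may be handy when the workbook must not be modified. This is a rough sketch under that assumption, not part of the answer above; CsvCleanup, IsEmptyRow and DropEmptyRows are hypothetical names, and it assumes a comma as the separator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvCleanup
{
    // True when the line holds nothing but separators and whitespace.
    public static bool IsEmptyRow(string line)
    {
        return line.All(c => c == ',' || char.IsWhiteSpace(c));
    }

    // Keeps only the lines that contain at least one non-empty field.
    public static List<string> DropEmptyRows(IEnumerable<string> lines)
    {
        return lines.Where(l => !IsEmptyRow(l)).ToList();
    }

    public static void Main()
    {
        var lines = new[]
        {
            "12,900012,S1,8/19/2019,12,6.70,8.80,,14.50,,9.40,,,",
            ",,,,,,,,,,,,,",
            ",,,,,,,,,,,,,"
        };
        // Only the first line survives.
        foreach (var line in DropEmptyRows(lines))
            Console.WriteLine(line);
    }
}
```

With real files, the input could come from File.ReadLines and the result could be written back with File.WriteAllLines. Note that this removes empty rows only; trailing empty columns on data rows are left as they are.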