Convert xlsx to csv and date format

Time: 2020-01-23 10:20:49

Tags: c# .net excel save converters

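I ran into two problems when converting the file:

  1. I want the date format to look like this:
19.08.2019

but it looks like this:

8/19/2019

  2. After the conversion, extra rows containing only commas are added to the csv file. How can I get rid of them?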

11,900011,S1,8/19/2019,11,6.90,9.90,,18.50,,8.80,,,,,,,,,,,,,,,,,,,,,,,,
12,900012,S1,8/19/2019,12,6.70,8.80,,14.50,,9.40,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
....

I am using this library:

using Excel=Microsoft.Office.Interop.Excel;

Here is my code:

 public static void Convert()
        {
            try
            {
                Excel.Application app = new Excel.Application();
                //Load file . xlsx
                Excel.Workbook wb = app.Workbooks.Open(Program.filePaths[1]);
                //Save file .csv
                wb.SaveAs(Program.filePaths[0], Excel.XlFileFormat.xlCSVWindows, Type.Missing, Type.Missing, false, false, Excel.XlSaveAsAccessMode.xlNoChange, Excel.XlSaveConflictResolution.xlLocalSessionChanges, false, Type.Missing, Type.Missing, Type.Missing);
                wb.Close(false);
                app.Quit();


            }catch(Exception ex)
            {
                MessageBox.Show(ex.Message);
            }


        }

Thanks in advance for your help.

2 answers:

Answer 0 (score: 0)

I use NPOI to convert from Excel to CSV (sorry if some strings are missing; this is copied from my project).

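    public Dictionary<string, string> ExceltoCsv(IWorkbook input)
    {
        // OutputSettings here is presumably a property of the enclosing class (not shown in this answer).
        var csvTrennzeichen = OutputSettings.ColumnSeparator.ToString();
        var result = new Dictionary<string, string>();
        for (var sheetIndex = 0; sheetIndex < input.NumberOfSheets; sheetIndex++)
        {
            var sheet = input.GetSheetAt(sheetIndex);
            var sheetresult = new List<string>();
            // LastRowNum is the zero-based index of the last row, so it has to be included.
            for (var row = sheet.FirstRowNum; row <= sheet.LastRowNum; row++)
            {
                var rowObj = sheet.GetRow(row);
                // GetRow returns null for rows that were never created; skip those and fully empty rows.
                if (rowObj == null || rowObj.Cells.All(x => string.IsNullOrEmpty(WertAuslesen(x))))
                    continue;

                // Flatten line breaks inside cells, then optionally quote each value (doubling embedded quotes).
                var line = string.Join(csvTrennzeichen, rowObj.Cells
                                                        .Select(cell => WertAuslesen(cell).Replace("\r", " ").Replace("\n", " "))
                                                        .Select(cell => OutputSettings.Writeinquotes ? string.Format("\"{0}\"", cell.Replace("\"", "\"\"")) : cell));

                sheetresult.Add(line);
            }

            // One CSV string per sheet, keyed by sheet name.
            result.Add(sheet.SheetName, string.Join("\r\n", sheetresult));
        }
        return result;
    }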

    private string WertAuslesen(ICell oldCell)
    {
        switch (oldCell.CellType)
        {
            case CellType.Boolean:
                return oldCell.BooleanCellValue.ToString();
            case CellType.Error:
                return oldCell.ErrorCellValue.ToString();
            case CellType.Formula:
                return oldCell.CellFormula;
            case CellType.Numeric:
                return !DateUtil.IsCellDateFormatted(oldCell)
                    ? oldCell.NumericCellValue.ToString(OutputSettings.GetDecimalFormat(Digits(oldCell.CellStyle.GetDataFormatString())))
                    : oldCell.DateCellValue.ToString(OutputSettings.DateFormat);
            case CellType.String:
                return oldCell.RichStringCellValue.ToString();
            case CellType.Unknown:
                return oldCell.StringCellValue;
            default:
                return "";
        }
    }

    private static int Digits(string format)
    {
        var digits = format.ContainsAny(',', '.') ? format.Split(new[] { ',', '.' }).Last() : "";
        return digits.Length;
    }

I also think it is worth adding the OutputSettings class, because it clears things up, even though it is not strictly necessary:

public class OutputSettings
{
    public static readonly OutputSettings Default = new OutputSettings(Encoding.UTF8, null, "yyyyMMdd", "hh:mm:ss", ".", "", "y", "n", ',', true, "", null);
    //I am immutable
    public OutputSettings(CultureInfo culture) : this(
        Encoding.UTF8,
        null,
        culture.DateTimeFormat.ShortDatePattern,
        culture.DateTimeFormat.LongTimePattern,
        culture.NumberFormat.NumberDecimalSeparator,
        culture.NumberFormat.NumberGroupSeparator,
        "y",
        "n",
        ',',
        true,
        "",
        null)
    {
    }

    public OutputSettings(
        Encoding encoding,
        Version ioVersion,
        string dateFormat,
        string timeFormat,
        string decimalSeperator,
        string thousandSeperator,
        string yesString,
        string noString,
        char columnseperator,
        bool writeinquotes,
        string outputFolder,
        IResourceHandler resourceHandler)
    {
        Encoding = encoding;
        IOVersion = ioVersion;
        DateFormat = dateFormat;
        TimeFormat = timeFormat;
        DecimalSeperator = decimalSeperator;
        ThousandSeperator = thousandSeperator;
        YesString = yesString;
        NoString = noString;
        ColumnSeparator = columnseperator;
        Writeinquotes = writeinquotes;
        OutputFolder = outputFolder;
        ResourceHandler = resourceHandler;
    }

    public IResourceHandler ResourceHandler { get; }

    public Encoding Encoding { get; }

    public Version IOVersion { get; }

    public string DateFormat { get; }

    public string TimeFormat { get; }

    public string DateTimeFormat => DateFormat + " " + TimeFormat;

    public string DecimalSeperator { get; }

    public string ThousandSeperator { get; }

    public string DecimalFormat => GetDecimalFormat(2);

    public string YesString { get; }

    public string NoString { get; }

    private char _columnseperator;
    public char ColumnSeparator
    {
        get
        {
            return _columnseperator;
        }
        private set
        {
            if (value != ',' && value != ';')
                throw new ArgumentException(Localization.Resources.StaticTranslationResource.IO_SEPARATOR_MUSS_COMMA_ODER_SEMICOLON_SEIN);
            _columnseperator = value;
        }
    }

    public bool Writeinquotes { get; }
    public string OutputFolder { get; set; }

    public string GetDecimalFormat(int precision)
    {
        if (precision < 0)
            throw new ArgumentException(Localization.Resources.StaticTranslationResource.OUTPUT_ANZAHL_DER_STELLEN_DARF_NICHT_NEGATIV_SEIN, nameof(precision));

        var sb = new StringBuilder($"#{ThousandSeperator}##0{DecimalSeperator}");
        if (precision == 0)
        {
            sb.Append('#');
        }
        else
        {
            for (int i = 0; i < precision; i++)
            {
                sb.Append('0');
            }
        }
        return sb.ToString();
    }
}

Edit: I use a lot of extension methods to make my code readable; ContainsAny is one of them:

    public static bool ContainsAll<T>(this IEnumerable<T> superset, params T[] subset) => !subset.Except(superset).Any();

    public static bool ContainsAll<T>(this IEnumerable<T> superset, IEnumerable<T> subset) => !subset.Except(superset).Any();

    public static bool ContainsAny<T>(this IEnumerable<T> superset, params T[] subset) => subset.Any(superset.Contains);

    public static bool ContainsAny<T>(this IEnumerable<T> superset, IEnumerable<T> subset) => subset.Any(superset.Contains);
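
To show how ContainsAny feeds into the Digits helper from earlier in this answer, here is a minimal self-contained sketch. The class names EnumerableExtensions and FormatDigitsDemo are made up for the illustration, and the lambda form of ContainsAny is my rewrite, not the original one-liner. Note one thing the sketch makes visible: a format with a thousands separator but no decimals, such as "#,##0", is counted as three digits, so it may be worth handling that case separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container for the extension method shown above.
public static class EnumerableExtensions
{
    // True if the sequence contains at least one of the given items.
    public static bool ContainsAny<T>(this IEnumerable<T> superset, params T[] subset) =>
        subset.Any(item => superset.Contains(item));
}

// Hypothetical demo class; Digits mirrors the helper from the answer.
public static class FormatDigitsDemo
{
    // Length of whatever follows the last ',' or '.' in an Excel format string.
    public static int Digits(string format)
    {
        var digits = format.ContainsAny(',', '.')
            ? format.Split(new[] { ',', '.' }).Last()
            : "";
        return digits.Length;
    }
}
```

With this, FormatDigitsDemo.Digits("0.00") gives 2, "General" gives 0, and "#,##0" gives 3.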

Answer 1 (score: 0)

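Regarding the dates: if the date column is formatted properly in Excel, Excel should honor that format when it exports to CSV. I assume changing the format won't break anything in your existing workbook; the file is only being exported, right?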

Excel.Worksheet sheet = wb.ActiveSheet;
// Column 4 holds the dates; "dd.mm.yyyy" produces 19.08.2019 as asked in the question.
sheet.Columns[4].NumberFormat = "dd.mm.yyyy";
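
If the exported file still ends up with US-style dates, a fallback is to rewrite the date field after the export. This is a different technique from the Excel formatting above (plain string post-processing), and the class and method names below are made up for the sketch. It assumes the dates look exactly like the sample in the question, for example 8/19/2019.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of any library.
public static class CsvDateFixer
{
    // Converts a US-style date such as "8/19/2019" to "19.08.2019".
    // Anything that is not a date in that exact form is returned unchanged.
    public static string ToDottedDate(string field)
    {
        return DateTime.TryParseExact(field, "M/d/yyyy", CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out DateTime date)
            ? date.ToString("dd.MM.yyyy", CultureInfo.InvariantCulture)
            : field;
    }
}
```

You would call CsvDateFixer.ToDottedDate on the fourth field of each line; other fields such as "S1" pass through untouched.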

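As for the extra columns/rows: that means "something" is in those cells, even if it is only formatting. If you delete those columns/rows, they will no longer be exported when you save as CSV.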

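If you don't know what is in there, a simple way to do this is to find the last row that contains "real" data, however you define that... perhaps a row is not real if it has nothing in columns A, B and E. Then start at the row after the last real row and delete everything down to the last row of UsedRange.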
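
The "last real row" rule can be sketched without Excel, on rows that have already been split into fields. This is only an illustration of the logic: the choice of columns A, B and E follows the suggestion above, and RealRowFinder and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the rule; not an Excel API.
public static class RealRowFinder
{
    // Columns A, B and E as zero-based indexes (an assumption taken from the text above).
    private static readonly int[] KeyColumns = { 0, 1, 4 };

    // A row counts as "real" if at least one key column holds non-blank text.
    public static bool IsRealRow(string[] fields) =>
        KeyColumns.Any(c => c < fields.Length && !string.IsNullOrWhiteSpace(fields[c]));

    // Keeps everything up to and including the last real row; trailing blank rows are dropped.
    public static List<string[]> TrimAfterLastRealRow(IReadOnlyList<string[]> rows)
    {
        int last = rows.Count - 1;
        while (last >= 0 && !IsRealRow(rows[last]))
            last--;
        return rows.Take(last + 1).ToList();
    }
}
```

Blank rows that sit between real rows are kept on purpose, since the rule only cuts off what comes after the last real row.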

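Alternatively, you can use Excel's built-in CountA function, which should be fast. If it returns 0 for a row, you can count on there being nothing in any cell of that row. Quick example: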

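Excel.Range last = sheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell,
    Type.Missing);

// Walk from the last used row up to row 1, so that deleting a row
// does not shift rows that have not been checked yet.
for (int row = last.Row; row > 0; row--)
{
    Excel.Range r = (Excel.Range)sheet.Cells[row, 1];
    // app is the Excel.Application instance from the question's code.
    double o = app.WorksheetFunction.CountA(r.EntireRow);
    if (o == 0)
        r.EntireRow.Delete();
}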

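Untested, but it should be 99% there... I think you need to go bottom-up, though I'm not 100% sure. My thinking is that if you don't, each deletion will cause the loop to skip a row.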
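
The concern about skipped rows is easy to check on a plain list, which shifts items up on removal in the same way a worksheet shifts rows up on deletion. The sketch below is only an analogy, with made-up names; it does not touch Excel.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class comparing the two loop directions.
public static class DeleteDirectionDemo
{
    // Forward pass: after RemoveAt(i) the next item slides into slot i,
    // and i++ then steps over it, so the second of two adjacent blanks survives.
    public static List<string> RemoveBlanksForward(IEnumerable<string> source)
    {
        var rows = new List<string>(source);
        for (int i = 0; i < rows.Count; i++)
            if (rows[i].Length == 0)
                rows.RemoveAt(i);
        return rows;
    }

    // Backward pass: removing item i only shifts items that were already visited.
    public static List<string> RemoveBlanksBackward(IEnumerable<string> source)
    {
        var rows = new List<string>(source);
        for (int i = rows.Count - 1; i >= 0; i--)
            if (rows[i].Length == 0)
                rows.RemoveAt(i);
        return rows;
    }
}
```

For the input "a", "", "", "b" the forward version leaves one blank behind (three items remain), while the backward version returns just "a" and "b".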
