I have a delimited text file; one of its columns is DocDate.
The DDate column looks like this:
20070222
20070221
(there are about 100 dates like this in the text file)
The file is delimited with |:
| DDate |
| 20070222 |
| 20070221 |
I need to convert it to this:
| DDate |
| 02/22/2007 |
| 02/21/2007 |
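I already have a set of Replace statements that I use to format this text file. It would be great if someone could show me how to fit the date conversion into that code.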
using (StreamReader stream = new StreamReader(File.Open(@"C:\nPrep\" + textBox1.Text + "\\CI\\ncr.txt", FileMode.Open)))
{
string fileText = stream.ReadToEnd();
fileText = fileText.Replace(@"BegAtt|EndAtt", "BegAtt#|EndAtt#");
fileText = fileText.Replace(@"Cc|*RFP", "CC|RFP");
fileText = fileText.Replace(@"<swme> ", string.Empty);
fileText = fileText.Replace(@" </swme>",";");
using (StreamWriter writer = new StreamWriter(File.Open(@"C:\" + textBox1.Text + "\\nc" + "\\Data\\ncr.txt", FileMode.Create)))
{
writer.Write(fileText);
}
}
}
Example:
Before the date conversion:
216442 | 216443 ||| 20080823 | EM
After the date conversion:
216442 | 216443 ||| 08/23/2008 | EM
Answer 0 (score: 5)
You can run each date string through a method like this:
private static string ReformatDate(string input)
{
return DateTime.ParseExact(input, "|yyyyMMdd|", CultureInfo.InvariantCulture)
.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
}
Example:
Console.WriteLine(ReformatDate("|20070222|")); // prints 02/22/2007
Update
A complete example, including the file parsing:
private const int DATE_COLUMN = 4;
static void Main(string[] args)
{
string inputFile = @"c:\temp\input.txt";
string outputFile = @"c:\temp\output.txt";
using (StreamReader reader = File.OpenText(inputFile))
// File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes behind
using (Stream outputStream = File.Create(outputFile))
using (StreamWriter writer = new StreamWriter(outputStream))
{
do
{
string line = reader.ReadLine();
if (line == null)
{
break;
}
writer.WriteLine(TransformLine(line));
} while (true);
}
File.Delete(inputFile);
File.Move(outputFile, inputFile);
}
private static char[] separator = "|".ToCharArray();
private static string TransformLine(string line)
{
string[] columns = line.Split(separator);
columns[DATE_COLUMN] = ReformatDate(columns[DATE_COLUMN]);
return string.Join("|", columns);
}
private static string ReformatDate(string input)
{
return DateTime.ParseExact(input, "yyyyMMdd", CultureInfo.InvariantCulture)
.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
}
It now replaces the original file with a file containing the converted lines.
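The example above assumes that every line carries an unpadded yyyyMMdd value in the date column. The sample in the question has a header row and spaces around the values, either of which would make ParseExact throw. A minimal sketch of a more tolerant line transform follows. It assumes that cells which do not parse should be left untouched; the class name LineTransformer and the dateColumn parameter are hypothetical and not part of the answer.

```csharp
using System;
using System.Globalization;

public static class LineTransformer
{
    // Reformats the yyyyMMdd value in the given column to MM/dd/yyyy.
    // Lines that are too short, or whose cell is not a valid date
    // (for example the header row), are returned unchanged.
    public static string TransformLine(string line, int dateColumn)
    {
        string[] columns = line.Split('|');
        if (dateColumn < 0 || dateColumn >= columns.Length)
        {
            return line;
        }

        DateTime date;
        bool parsed = DateTime.TryParseExact(
            columns[dateColumn].Trim(),   // tolerate padding around the value
            "yyyyMMdd",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out date);
        if (!parsed)
        {
            return line;
        }

        columns[dateColumn] = date.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
        return string.Join("|", columns);
    }
}
```

Note that a converted cell is written back without its original padding; if the spaces around the value matter, they would have to be re-added when the cell is replaced.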
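Answer 1 (score: 0)
Honestly, I wouldn't try anything clever here; the problem is quite simple.
The simplest way to do the conversion is to parse the original value into a C# DateTime and then format it again in the new style. You can use DateTime.ParseExact with a format string that matches the old style (DateTime.Parse does not take a format string), and follow it with a ToString call that uses a different format string.
DateTime.ParseExact
DateTime.ToString
DateTimeFormatInfo class (documents the custom format strings)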
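As for a general solution, I think I would map the columns onto a class with a few simple properties. If memory isn't an issue, build a list of these objects from the file data, format the dates, and then write the objects back out. I find this easier because it is more transparent and easier to debug than complicated regex statements and the like, and because it is much more maintainable.
The class would have properties that map to the columns (think of what an ORM does for a database). You can add different formatting options to it, override ToString in creative ways, add different validation/formatting/logic rules, and so on.
The point is this: if your formatting rules change, or the file format, or anything else, you have a simple maintenance path.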
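The class-per-row idea could look roughly like the sketch below. It is only an illustration under assumptions: the Record type, its property names, and the three-column layout (id, date, type) are hypothetical and simpler than the file in the question.

```csharp
using System;
using System.Globalization;

// Hypothetical row type: the property names are illustrative and
// assume a simplified three-column layout "id|yyyyMMdd|type".
public class Record
{
    public string Id { get; set; }
    public DateTime DocDate { get; set; }
    public string Type { get; set; }

    // Parses one delimited line into a Record.
    public static Record Parse(string line)
    {
        string[] parts = line.Split('|');
        if (parts.Length != 3)
        {
            throw new FormatException("Expected 3 columns: " + line);
        }
        return new Record
        {
            Id = parts[0].Trim(),
            DocDate = DateTime.ParseExact(parts[1].Trim(), "yyyyMMdd", CultureInfo.InvariantCulture),
            Type = parts[2].Trim()
        };
    }

    // Writes the row back out with the date in the new style.
    public override string ToString()
    {
        return string.Join("|", new[]
        {
            Id,
            DocDate.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture),
            Type
        });
    }
}
```

With this shape, a change to the output date format touches only ToString, and a change to the input layout touches only Parse.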
Edit: for a looser schema, you could use Dictionary objects in a few clever ways, like this:
foreach(Dictionary<string, string> row in rowList){
foreach(string columnName in ColumnArray){
WriteToken(row[columnName]);
}
}
I put together something quite generic for this purpose. It's a slow day today, I guess :)
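using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;
public class ListFormatter {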
// stores transformation delegates keyed by column name (multiple keys for each column is allowed)
public List<KeyValuePair<String, Func<String, String>>> Transforms = new List<KeyValuePair<String, Func<String, String>>>();
// method for tokenizing and writing back - encapsulate file format to some extent
public Func<String, String[]> GetTokensFromLine { get; set; }
public Func<IEnumerable<String>, String> GetLineFromTokens { get; set; }
public String ReservedColumnNameAnyColumn = String.Empty;
public String ReservedColumnNameWholeLine = "WholeLine";
public ListFormatter() {
// by default let's set up for '|' delimited tokens, client can overwrite however
GetTokensFromLine = s => { return s.Split('|'); };
GetLineFromTokens = l => string.Join(" | ", l);
}
public void FormatList(StreamReader inStream, StreamWriter outStream) {
// get the column names
var columns = GetTokensFromLine(inStream.ReadLine());
// TODO - validate that every column has a name
// write the column header to the output
outStream.WriteLine(GetLineFromTokens(columns));
// iterate through the stream
while (true) {
// get a line of text, run any transforms registered to work on the whole line
var line = RunTransforms(inStream.ReadLine(), GetRowTransforms());
if (line == null) break;
// get the row of tokens TODO - validate for number of tokens
var tokens = GetTokensFromLine(line);
// run transforms on the columns
for (var i = 0; i < tokens.Count(); i++ ) {
// trim the header name so that padded headers such as " DDate " still match their rules
tokens[i] = RunTransforms(tokens[i], GetColumnTransforms(columns[i].Trim()));
}
// write the new line to the output
outStream.WriteLine(GetLineFromTokens(tokens));
}
}
/// <summary>
/// Gets the transforms associated with a single column value
/// </summary>
/// <param name="name">The name.</param>
/// <returns></returns>
public IEnumerable<Func<String, String>> GetColumnTransforms(string name) {
return from kv in Transforms where kv.Key == ReservedColumnNameAnyColumn || kv.Key == name select kv.Value;
}
/// <summary>
/// Gets the transforms associated with the whole row
/// </summary>
/// <returns></returns>
public IEnumerable<Func<String, String>> GetRowTransforms() {
return from kv in Transforms where kv.Key == ReservedColumnNameWholeLine select kv.Value;
}
/// <summary>
/// Runs the transforms on a string
/// </summary>
/// <param name="item">The item.</param>
/// <param name="transformList">The transform list.</param>
/// <returns></returns>
public string RunTransforms(string item, IEnumerable<Func<String, String>> transformList) {
if (item != null) {
foreach (var func in transformList) {
item = func(item);
}
}
return item;
}
}
// usage example
public void FormatList() {
var formatter = new ListFormatter();
// add some rules
// formats every line of text
formatter.Transforms.Add(new KeyValuePair<string, Func<string, string>>(formatter.ReservedColumnNameWholeLine, s => s.Trim()));
// format every column entry
formatter.Transforms.Add(new KeyValuePair<string, Func<string, string>>(formatter.ReservedColumnNameAnyColumn, s => s.Trim()));
// format that date
formatter.Transforms.Add(new KeyValuePair<string, Func<string, string>>("DDate", s => DateTime.ParseExact(s, "oldformat", CultureInfo.InvariantCulture).ToString("newformat")));
// format
using (var reader = File.OpenText("infile"))
using(var outputStream = new StreamWriter(File.OpenWrite("outfile"))) {
formatter.FormatList(reader, outputStream);
}
}
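This lets you add any number of rules for a specific column, for all columns, and for the whole line of text. It uses the | delimiter by default, but that can be overridden. The formatter class itself works on streams, so any buffering and file management is left to the client.
The idea is to encapsulate the core functionality in something simple and reusable. For example, to add the other text replacements you just add another rule that applies to the whole line of text, or to each column value individually, whichever fits. The rules are separate from the formatting process and can be tested on their own. Here is how to configure the other replacements: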
formatter.Transforms.Add(new KeyValuePair<string, Func<string, string>>(formatter.ReservedColumnNameWholeLine, s => {
// Make other replacements.
s = s.Replace(@"BegAtt|EndAtt", "BegAtt#|EndAtt#");
s = s.Replace(@"Cc|*RFP", "CC|RFP");
s = s.Replace(@"<swme> ", string.Empty);
s = s.Replace(@" </swme>", ";");
return s;
}));
Answer 2 (score: 0)
I think this does what you need:
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.IO;
class Program
{
static void Main(string[] args)
{
string inputFilename = "input.txt";
string outputFilename = "output.txt";
string[] dateColumnNames = { "DDate" };
using (StreamReader stream = new StreamReader(File.Open(inputFilename, FileMode.Open)))
using (StreamWriter writer = new StreamWriter(File.Open(outputFilename, FileMode.Create)))
{
int[] dateColumns = new int[0];
while (true)
{
string line = stream.ReadLine();
if (line == null)
break;
// Split into columns.
string[] columns = line.Split('|');
// Find date columns.
int[] newDateColumns =
columns.Select((name, index) => new { Name = name, Index = index })
.Where(x => dateColumnNames.Contains(x.Name))
.Select(x => x.Index)
.ToArray();
if (newDateColumns.Length > 0)
dateColumns = newDateColumns;
// Replace dates.
foreach (int i in dateColumns)
{
if (columns.Length > i)
{
Regex regex = new Regex(@"(\d{4})(\d{2})(\d{2})");
columns[i] = regex.Replace(columns[i], "$2/$3/$1");
line = string.Join("|", columns);
}
}
// Make other replacements.
line = line.Replace(@"BegAtt|EndAtt", "BegAtt#|EndAtt#");
line = line.Replace(@"Cc|*RFP", "CC|RFP");
line = line.Replace(@"<swme> ", string.Empty);
line = line.Replace(@" </swme>", ";");
// Output line.
writer.WriteLine(line);
}
}
}
}
Sample input:
a|b|c|d|DDate|e
216442|20011223|||20080823|EM
216443|20011223|||20080824|EM
a|DDate|c|d|e|f
216442|20011223|||20080823|EM
<swme> Just a test </swme>
Output:
a|b|c|d|DDate|e
216442|20011223|||08/23/2008|EM
216443|20011223|||08/24/2008|EM
a|DDate|c|d|e|f
216442|12/23/2001|||20080823|EM
Just a test;
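Note that the position of the DDate column changes partway through the input, and the code follows it. You can also specify multiple date columns if you like; just change the dateColumnNames array.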