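Importing a custom text format without delimiters

Time: 2015-04-25 22:29:12

Tags: c# sql-server regex ssis etl

I would like to import this .txt file format into a SQL Server table, or convert each block of text into a pipe-delimited row.

Which tools or C# solutions would you suggest for this problem?

Any suggestions would be greatly appreciated.

Thanks.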

=================
INPUT (.txt file)
=================
ID: 37
Name: Josephy Murphy
Email: jmurphy@email.com
Description: bla, bla, bla, bla...

ID: 38
Name: Paul Newman
Email: pnewman@email.com
Description: bla, bla, bla, bla...

:
:

=========================
OUTPUT (SQL Server Table)
=========================

ID | Name           | Email             | Description  
37 | Josephy Murphy | jmurphy@email.com | bla, bla, bla, bla...
38 | Paul Newman    | pnewman@email.com | bla, bla, bla, bla...

:
: 

3 answers:

Answer 0 (score: 0)

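Parsing this file is really simple; I have been doing projects like this for 40 years. See the code below. It is a small state machine that skips ahead to the "OUTPUT (SQL Server Table)" header, steps over the separator and header lines, and then splits each pipe-delimited row that follows. I put the results into a DataTable.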

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.txt";
        enum States
        {
            FIND_OUTPUT,
            GET_SEPERATOR,
            GET_TABLE_HEADER,
            GET_DATA_TABLE,
            END
        }
        static void Main(string[] args)
        {
            DataTable dt = new DataTable();
            dt.Columns.Add("ID", typeof(int));
            dt.Columns.Add("Name", typeof(string));
            dt.Columns.Add("Email", typeof(string));
            dt.Columns.Add("Description", typeof(string));

            States state = States.FIND_OUTPUT;
            StreamReader reader = new StreamReader(FILENAME);
            string inputLine = "";
            while ((inputLine = reader.ReadLine()) != null)
            {
                inputLine = inputLine.Trim();
                if (inputLine.Length > 0)
                {
                    switch (state)
                    {
                        case States.FIND_OUTPUT:
                            if (inputLine.StartsWith("OUTPUT (SQL Server Table)"))
                                state = States.GET_SEPERATOR;
                            break;
                        case States.GET_SEPERATOR:
                            state = States.GET_TABLE_HEADER;
                            break;
                        case States.GET_TABLE_HEADER:
                            state = States.GET_DATA_TABLE;
                            break;
                        case States.GET_DATA_TABLE:
                            // Split on '|' and trim the padding around each field.
                            string[] dataArray = inputLine.Split('|').Select(f => f.Trim()).ToArray();
                            dt.Rows.Add(dataArray);
                            break;
                    }
                }
                else
                {
                    if (state == States.GET_DATA_TABLE)
                        break; //exit while loop if blank row at end of data table
                }
            }

            reader.Close();
        }
    }
}
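
The question's INPUT sample consists of blank-line-separated blocks of "Key: value" lines rather than a pipe-delimited table. As a rough sketch of the same line-by-line idea applied to that format, here written in Python for brevity (the names parse_blocks, to_pipe_row and FIELDS are illustrative and not part of the original answer):

```python
FIELDS = ("ID", "Name", "Email", "Description")


def parse_blocks(lines):
    """Yield one dict per block of 'Key: value' lines."""
    record = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current block, if one is open.
            if record:
                yield record
                record = {}
            continue
        # partition() splits at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in FIELDS:
            continue  # skip banners such as "=====" and stray ":" lines
        if key == "ID" and record:
            # A new ID also starts a new block when the blank line is missing.
            yield record
            record = {}
        record[key] = value.strip()
    if record:
        yield record


def to_pipe_row(record):
    """Join the known fields of one record into a pipe-delimited line."""
    return " | ".join(record.get(name, "") for name in FIELDS)


sample = """\
=================
INPUT (.txt file)
=================
ID: 37
Name: Josephy Murphy
Email: jmurphy@email.com
Description: bla, bla, bla, bla...

ID: 38
Name: Paul Newman
Email: pnewman@email.com
Description: bla, bla, bla, bla...

:
:
"""

rows = [to_pipe_row(r) for r in parse_blocks(sample.splitlines())]
for row in rows:
    print(row)
```

For a real file, passing an open file object instead of sample.splitlines() should work the same way, since the function only iterates over lines.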

Answer 1 (score: 0)

This is easy in Python:

input='''\
ID: 37
Name: Josephy Murphy
Email: jmurphy@email.com
Description: bla, bla, bla, bla...

ID: 38
Name: Paul Newman
Email: pnewman@email.com
Description: bla, bla, bla, bla...'''

import re
fields=('ID', 'Name', 'Email', 'Description')
out={k:[] for k in fields}
for m in re.finditer(r'(^ID.*?(?=^ID|\Z))', input, flags=re.S | re.M):
    # split(':', 1) splits at the first colon only, so a value may itself contain colons
    for k, v in [map(str.strip, line.split(':', 1)) for line in m.group(1).splitlines() if line.strip()]:
        out[k].append(v)

# you now have all the data in a structure that could be used with SQL
# just print to show...
fmt='{:3}| {:20}| {:20}| {:20}'
print(fmt.format(*fields))
for i in range(len(out['ID'])):
    print(fmt.format(*[out[k][i] for k in fields]))

Prints:

ID | Name                | Email               | Description         
37 | Josephy Murphy      | jmurphy@email.com   | bla, bla, bla, bla...
38 | Paul Newman         | pnewman@email.com   | bla, bla, bla, bla...
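
The printed table above is padded for reading. If the goal is an actual pipe-delimited file, one possible sketch uses the standard csv module with "|" as the delimiter, which also quotes any field that happens to contain a pipe. The out dict here is filled in by hand with the question's sample values so the block runs on its own, and write_pipe_delimited is a hypothetical helper name:

```python
import csv
import io

fields = ("ID", "Name", "Email", "Description")

# Column-oriented data shaped like the `out` dict built by the regex loop.
out = {
    "ID": ["37", "38"],
    "Name": ["Josephy Murphy", "Paul Newman"],
    "Email": ["jmurphy@email.com", "pnewman@email.com"],
    "Description": ["bla, bla, bla, bla...", "bla, bla, bla, bla..."],
}


def write_pipe_delimited(columns, stream):
    """Write a header line and one '|'-separated line per record."""
    writer = csv.writer(stream, delimiter="|", lineterminator="\n")
    writer.writerow(fields)
    # zip(*...) turns the per-column lists into per-record tuples.
    writer.writerows(zip(*(columns[name] for name in fields)))


buffer = io.StringIO()
write_pipe_delimited(out, buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead, an open file (opened with newline="" as the csv documentation recommends) can be passed in place of the StringIO buffer.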

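Answer 2 (score: 0)

I am now writing directly to SQL Server instead of to a DataTable. You need to fill in the connection string and put the name of your SQL table in the Insert SQL.

If you are really adding a lot of rows, I would consider using SQLCMD.EXE, which ships with SQL Server. It accepts any delimiter for the data and a SQL string. I have never used it with an Insert; I normally use it for Select statements. There are many different command-line executables that work with SQL Server. See the following web page: https://msdn.microsoft.com/en-us/library/ms162816.aspx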
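
As a minimal sketch of the "write each parsed row straight to the database" idea, the block below uses parameterized inserts in Python. Two loud assumptions: an in-memory SQLite database stands in for SQL Server so the example runs without a server, and the table name People is hypothetical. Against SQL Server the connection would instead come from a driver such as pyodbc with your own connection string; the insert statement keeps the same "?" placeholder style.

```python
import sqlite3

# Rows as they might come out of the parsing step (sample values from the question).
rows = [
    (37, "Josephy Murphy", "jmurphy@email.com", "bla, bla, bla, bla..."),
    (38, "Paul Newman", "pnewman@email.com", "bla, bla, bla, bla..."),
]

# Assumption: SQLite in memory replaces SQL Server for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE People ("
    "ID INTEGER PRIMARY KEY, Name TEXT, Email TEXT, Description TEXT)"
)

# Placeholders keep the values out of the SQL text, so commas, quotes,
# or pipes inside a description cannot break the statement.
with conn:  # commits on success, rolls back on error
    conn.executemany(
        "INSERT INTO People (ID, Name, Email, Description) VALUES (?, ?, ?, ?)",
        rows,
    )

inserted = conn.execute("SELECT ID, Name FROM People ORDER BY ID").fetchall()
print(inserted)
conn.close()
```

Batching the rows through executemany avoids building one SQL string per record; for very large files a dedicated bulk-load tool is likely the better fit.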


