Splitting records from a gzip file into columns when one column value is enclosed in double quotes

Time: 2016-09-16 12:31:34

Tags: python python-2.7

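My gzip file contains comma-separated columns, but when a column value is enclosed in double quotes, the commas inside it should be left intact. I wrote the following code: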

    input = gzip.open(file, "rb")
    reader = codecs.getreader("utf-8")
    input_file = reader(input)
    try:
        count = 0
        for line in input_file:
            try:
                # print 'count='
                # print count
                if len(line) != 0:
                    col = line.split(',')

The data in my file looks like this:

4798151,1137351,nam_p0,2762913,nam_r000,"NAM_Rack, Power & Cooling",3
4798151,1135623,nam_s0,2762914,nam_a0,"NAM_Advise, Transform & Manage",3

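When I split the data on commas, a comma inside double quotes should be ignored so that the quoted text ends up in a single column. I don't know how to add a condition that treats the text inside double quotes as one value. A quick response would be a great help. Thanks.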

1 answer:

Answer 0: (score: 0)

Use the csv module.

Demo

In this case, you can use this direct approach:

>>> import StringIO
>>> import csv
>>> line = '4798151,1137351,nam_p0,2762913,nam_r000,"NAM_Rack, Power & Cooling",3'
>>> handler = StringIO.StringIO(line)
>>> [row for row in csv.reader(handler, delimiter=',')]
[['4798151', '1137351', 'nam_p0', '2762913', 'nam_r000', 'NAM_Rack, Power & Cooling', '3']]
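
The demo above parses a single string. To apply the same idea to the gzip file itself, something like the following might work. This is only a sketch and it assumes Python 3, not the Python 2.7 of the question: there `gzip.open` can open the file in text mode and decode it, so `codecs` is not needed. The function name `read_gzip_csv` and the temporary file used for the demonstration are made up for illustration. On Python 2.7 the csv module works on byte strings, so one would instead pass the file opened with "rb" to `csv.reader` and decode the fields afterwards.

```python
import csv
import gzip
import os
import tempfile


def read_gzip_csv(path, encoding="utf-8"):
    """Yield each record of a gzip-compressed CSV file as a list of strings.

    Hypothetical helper, written for Python 3.
    """
    # "rt" decompresses and decodes in one step; newline="" leaves line-ending
    # handling to the csv module, as its documentation recommends.
    with gzip.open(path, "rt", encoding=encoding, newline="") as handle:
        for row in csv.reader(handle, delimiter=",", quotechar='"'):
            if row:  # csv.reader returns an empty list for a blank line
                yield row


if __name__ == "__main__":
    # Write the two sample records from the question to a temporary .gz file.
    sample = (
        '4798151,1137351,nam_p0,2762913,nam_r000,"NAM_Rack, Power & Cooling",3\n'
        '4798151,1135623,nam_s0,2762914,nam_a0,"NAM_Advise, Transform & Manage",3\n'
    )
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.open(path, "wt", encoding="utf-8", newline="") as out:
            out.write(sample)
        for col in read_gzip_csv(path):
            # Each record has 7 columns; the quoted text stays in one column.
            print(len(col), col[5])
    finally:
        os.remove(path)
```

Compared with `line.split(',')`, the csv reader keeps track of the quote character, so a comma inside a quoted value does not start a new column.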