Saving and reading a list of DataFrames

Time: 2019-10-07 09:31:29

Tags: python pandas list dataframe

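I have a list of DataFrames (each DataFrame has a time axis that always starts at 0 and ends at a different value) that I want to save as a .csv.

I want to be able to read the .csv file back in its original form, as a list of DataFrames.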

Since I couldn't figure out how to save a list of DataFrames directly, I concatenated the list and saved everything as a single DataFrame:

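pd.concat(data).to_csv(csvfile)

To read the .csv back, I tried the following:

df = pd.read_csv(csvfile)

# positions of all zeros in the saved index column
zero_indices = list(df.loc[df['Unnamed: 0'] == 0].index)

# append the number of rows so that the last DataFrame gets an end bound
zero_indices.append(len(df))

# ranges: tuples of consecutive entries in the list above
zero_ranges = [(zero_indices[i], zero_indices[i+1]) for i in range(len(zero_indices) - 1)]

The final step extracts the DataFrames into a list using these ranges.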

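The problem I have is that the index is included in the DataFrames of the final list, but what I actually want is for the "Unnamed: 0" column to be set as the index of each DataFrame in the final list.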

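1 answer:

Answer 0 (score: 1)

I'm not sure how you want to handle this, but here is what I understood from your problem statement. Let me know if it is what you are after:

We have two DataFrames: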

>>> import pandas as pd
>>> ee = {"Unnamed : 0" : [0,1,2,3,4,5,6,7,8],"price" : [43,43,14,6,4,2,6,4,2], "time" : [3,4,5,2,5,6,6,3,4], "hour" : [1,1,1,5,4,3,4,5,4]}
>>> one = pd.DataFrame.from_dict(ee)
>>> dd = {"Unnamed : 0" : [0,1,2,3,4,5],"price" : [23,4,32,4,3,234], "time" : [3,2,4,3,2,4], "hour" : [3,4,3,2,4,4]}
>>> two = pd.DataFrame.from_dict(dd)

They look like this:

print(one)
       Unnamed : 0  price  time  hour
    0            0     43     3     1
    1            1     43     4     1
    2            2     14     5     1
    3            3      6     2     5
    4            4      4     5     4
    5            5      2     6     3
    6            6      6     6     4
    7            7      4     3     5
    8            8      2     4     4

print(two)
         Unnamed : 0  price  time  hour
      0            0     23     3     3
      1            1      4     2     4
      2            2     32     4     3
      3            3      4     3     2
      4            4      3     2     4
      5            5    234     4     4

Now combine the two DataFrames into a list:

list_dfs = [one,two]
print(list_dfs)

[        Unnamed : 0  price  time  hour
     0            0     43     3     1
     1            1     43     4     1
     2            2     14     5     1
     3            3      6     2     5
     4            4      4     5     4
     5            5      2     6     3
     6            6      6     6     4
     7            7      4     3     5
     8            8      2     4     4,    
        Unnamed : 0  price  time  hour
     0            0     23     3     3
     1            1      4     2     4
     2            2     32     4     3
     3            3      4     3     2
     4            4      3     2     4
     5            5    234     4     4]

Use the DataFrame method set_index():

list_dfs_index = list(map(lambda x : x.set_index("Unnamed : 0"), list_dfs))
print(list_dfs_index)

[                price  time  hour
 Unnamed : 0
    0               43     3     1
    1               43     4     1
    2               14     5     1
    3                6     2     5
    4                4     5     4
    5                2     6     3
    6                6     6     4
    7                4     3     5
    8                2     4     4,              
                 price  time  hour
 Unnamed : 0
    0               23     3     3
    1                4     2     4
    2               32     4     3
    3                4     3     2
    4                3     2     4
    5              234     4     4]

Alternatively, you can use the same set_index function to set "Unnamed : 0" as the index of each DataFrame before putting it into the list.
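To connect this back to the CSV round trip in the question, here is a minimal sketch of one way it might be done. It assumes every DataFrame's index starts at 0 and that 0 appears only once per frame, as described in the question. The helper names save_frames and load_frames and the small example frames are hypothetical, and an in-memory buffer stands in for the csvfile path. Passing index_col=0 to read_csv turns the saved index column into the index directly, so no separate set_index call is needed afterwards.

```python
import io

import pandas as pd


def save_frames(frames, target):
    """Concatenate the frames and write them to one CSV, keeping each index."""
    pd.concat(frames).to_csv(target)


def load_frames(source):
    """Read the CSV back and split it wherever the saved index restarts at 0."""
    # index_col=0 uses the first CSV column (the saved index) as the index.
    df = pd.read_csv(source, index_col=0)
    # Row positions at which a new frame begins (saved index value is 0).
    starts = [pos for pos, label in enumerate(df.index) if label == 0]
    # Pair each start with the next start; the last frame ends at len(df).
    bounds = zip(starts, starts[1:] + [len(df)])
    return [df.iloc[begin:end] for begin, end in bounds]


# Two small example frames whose index starts at 0 and ends at different values.
one = pd.DataFrame({"price": [43, 14, 6], "hour": [1, 1, 5]})
two = pd.DataFrame({"price": [23, 4], "hour": [3, 4]})

buffer = io.StringIO()  # in-memory stand-in for the csvfile path
save_frames([one, two], buffer)
buffer.seek(0)
restored = load_frames(buffer)
```

If the frames could contain 0 more than once in their index, this splitting rule would no longer be reliable, and storing an explicit frame identifier alongside the data would be a safer design.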