Question

Why is my first index repeated when I write a multi-index DataFrame to CSV?

My DataFrame:

In [1]:
import numpy as np
import pandas as pd

# The column values are random 0/1 integers, so they differ from run to run.
df = pd.DataFrame({
    'index1': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'index2': [1, 2, 3, 4, 1, 2, 3, 4],
    'column1': np.random.randint(2, size=8),
    'column2': np.random.randint(2, size=8),
    'column3': np.random.randint(2, size=8)
}).set_index(['index1', 'index2'])

In [2]:
print(df)
               column1  column2  column3
index1 index2                           
A      1             1        1        1
       2             0        1        1
       3             1        0        1
       4             0        0        0
B      1             0        1        0
       2             1        1        0
       3             0        0        0
       4             1        1        1

When I write this DataFrame to a CSV file, the first index level ('A', 'B') is repeated on every row.

I want the CSV file to have exactly the same layout as the DataFrame printed above.
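For reference, the repetition described in the question can be reproduced with a short sketch. It assumes the fixed values shown in the printed DataFrame instead of random ones, and it calls to_csv without a path so that the CSV text comes back as a string rather than being written to disk.

```python
import pandas as pd

# Fixed values copied from the printed DataFrame (an assumption made here
# so that the result is reproducible; the question uses random values).
df = pd.DataFrame({
    'index1': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'index2': [1, 2, 3, 4, 1, 2, 3, 4],
    'column1': [1, 0, 1, 0, 0, 1, 0, 1],
    'column2': [1, 1, 0, 0, 1, 1, 0, 1],
    'column3': [1, 1, 1, 0, 0, 0, 0, 1],
}).set_index(['index1', 'index2'])

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df.to_csv()
print(csv_text)
```

Every data row in the printed text starts with its own 'A' or 'B' label, unlike the console display, which leaves repeated labels blank.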

Answer 1

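The first index is repeated because that is how the data is actually stored; pandas only hides the repeated labels when it displays the DataFrame, to make it easier to read. When you call "to_csv", the raw data is written out. To get what you want, you can reset the index and then replace the duplicated values in that column with empty strings.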

df.reset_index(inplace=True)
df.loc[df['index1'].duplicated(), 'index1'] = ''
df.to_csv('mycsv.csv', index=False)
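A variation on the snippet above, as a minimal sketch rather than the answer's own method: the helper name to_sparse_csv is hypothetical, and it swaps duplicated() for a comparison with the previous row. That way the original DataFrame is left untouched, and a label that reappears later in a separate run of rows (for example A, B, A) still gets its first occurrence in each run. The last lines show one consequence of blanking: when the file is read back, the empty cells arrive as NaN and can be forward-filled to recover the labels.

```python
import io
import pandas as pd

def to_sparse_csv(df, level=0):
    """Return CSV text with consecutive repeats of one index level blanked.

    Hypothetical helper for illustration; not part of pandas.
    """
    flat = df.reset_index()            # returns a copy, df itself is unchanged
    col = flat.columns[level]          # index levels come first after reset_index
    # True where a label equals the label in the row directly above it
    repeated = flat[col].eq(flat[col].shift())
    flat[col] = flat[col].astype(object).mask(repeated, '')
    return flat.to_csv(index=False)

# Small example frame with made-up values
df = pd.DataFrame({
    'index1': ['A', 'A', 'B', 'B'],
    'index2': [1, 2, 1, 2],
    'column1': [1, 0, 0, 1],
}).set_index(['index1', 'index2'])

text = to_sparse_csv(df)
print(text)

# Reading the file back: blank cells become NaN, so forward-fill them
restored = pd.read_csv(io.StringIO(text))
restored['index1'] = restored['index1'].ffill()
restored = restored.set_index(['index1', 'index2'])
```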

pandas multi-index to CSV file
