Indexing with a multi-index DataFrame in pandas

Time: 2017-03-26 09:01:30

Tags: python pandas dataframe indexing multi-index

Consider the following example data:

import numpy as np
import pandas as pd

data = {"Taxon": ["Firmicutes"]*5,
        "Patient": range(5),
        "Tissue": np.random.randint(0, 1000, size=5),
        "Stool": np.random.randint(0, 1000, size=5)}

df = pd.DataFrame(data).set_index(["Taxon", "Patient"])
print(df)

                    Stool  Tissue
Taxon      Patient               
Firmicutes 0          740     389
           1          786     815
           2          178     265
           3          841     484
           4          211     534

So how can I query the DataFrame df using only the second index level? For example, I want all the data associated with Patient 2.

I have already tried df[df.index.get_level_values(1) == 2], and it works fine. But is there a way to achieve the same result with the loc or iloc indexing methods?
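For reference, here is a minimal, self-contained sketch of the mask-based approach described in the question. The seeded generator is an assumption added only so the values are reproducible, and the level is addressed by its name "Patient" rather than by position 1; both spellings select the same level here.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption for reproducibility, not part of the question.
rng = np.random.default_rng(0)
data = {
    "Taxon": ["Firmicutes"] * 5,
    "Patient": range(5),
    "Tissue": rng.integers(0, 1000, size=5),
    "Stool": rng.integers(0, 1000, size=5),
}
df = pd.DataFrame(data).set_index(["Taxon", "Patient"])

# Build a boolean mask from the second index level and filter the rows with it.
mask = df.index.get_level_values("Patient") == 2
patient_2 = df[mask]
print(patient_2)
```

The mask approach keeps both index levels in the result, which makes it a useful baseline when comparing against loc-based alternatives.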

2 Answers:

Answer 0: (score: 1)

I think the simplest way is to use xs:

import numpy as np
import pandas as pd

np.random.seed(100)
names = ['Taxon','Patient']
mux = pd.MultiIndex.from_product([['Firmicutes', 'another'], range(1, 6)], names=names)
df = pd.DataFrame(np.random.randint(10, size=(10,2)), columns=['Tissue','Stool'], index=mux)
print (df)
                    Tissue  Stool
Taxon      Patient               
Firmicutes 1             8      8
           2             3      7
           3             7      0
           4             4      2
           5             5      2
another    1             2      2
           2             1      0
           3             8      4
           4             0      9
           5             6      2
print (df.xs(2, level=1))
            Tissue  Stool
Taxon                    
Firmicutes       3      7
another          1      0

# to keep the Patient level in the result as well
print (df.xs(2, level=1, drop_level=False))
                    Tissue  Stool
Taxon      Patient               
Firmicutes 2             3      7
another    2             1      0

A solution with loc; the axis can be specified:

print (df.loc(axis=0)[:,2])
                    Tissue  Stool
Taxon      Patient               
Firmicutes 2             3      7
another    2             1      0
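As a rough check that the two approaches in this answer select the same rows, the sketch below rebuilds the same frame and compares them. Passing level="Patient" (the level name) instead of level=1 is my substitution; it refers to the same level in this frame.

```python
import numpy as np
import pandas as pd

np.random.seed(100)
mux = pd.MultiIndex.from_product(
    [["Firmicutes", "another"], range(1, 6)], names=["Taxon", "Patient"]
)
df = pd.DataFrame(
    np.random.randint(10, size=(10, 2)), columns=["Tissue", "Stool"], index=mux
)

# xs, addressing the level by name and keeping it in the result.
by_xs = df.xs(2, level="Patient", drop_level=False)

# loc with an explicit axis, slicing every Taxon and picking Patient 2.
by_loc = df.loc(axis=0)[:, 2]

# Compare the two selections.
print(by_xs.equals(by_loc))
```

Addressing levels by name tends to be more robust than positions if the index levels are later reordered.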

Answer 1: (score: 0)

Yes, use pd.IndexSlice, which is exactly what you are looking for. See the pandas documentation on MultiIndex and advanced indexing.

Some dummy data:

data = {"Taxon": ["Firmicutes"]*5,
        "Patient": range(5),
        "Tissue": np.random.randint(0, 1000, size=5),
        "Stool": np.random.randint(0, 1000, size=5)}

df = pd.DataFrame(data).set_index(["Taxon", "Patient"])
print(df)

                    Stool  Tissue
Taxon      Patient               
Firmicutes 0          158     137
           1          697     980
           2          751     759
           3          171     556
           4          701     620

You can write it explicitly:

df.loc[(slice(None), 2), :]

                    Stool  Tissue
Taxon      Patient               
Firmicutes 2          751     759

Or you can use the more readable pd.IndexSlice:

df.loc[pd.IndexSlice[:, 2], :]
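The pd.IndexSlice form also extends to several second-level labels at once. The following is a sketch under assumptions: the seeded generator and the variable names (idx, one, several, ranged) are mine, and label slices rely on the index being sorted, which holds for this frame.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption for reproducibility, not part of the answer.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "Taxon": ["Firmicutes"] * 5,
        "Patient": range(5),
        "Tissue": rng.integers(0, 1000, size=5),
        "Stool": rng.integers(0, 1000, size=5),
    }
).set_index(["Taxon", "Patient"])

idx = pd.IndexSlice

# A single patient: every Taxon, Patient == 2.
one = df.loc[idx[:, 2], :]

# A list of patients.
several = df.loc[idx[:, [1, 3]], :]

# A label slice; with loc both endpoints are included (patients 1, 2 and 3).
ranged = df.loc[idx[:, 1:3], :]

print(one)
print(several)
print(ranged)
```

On an unsorted MultiIndex, slicing like this can raise an error; calling df.sort_index() first is the usual remedy.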