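Please help me figure out how to do this. I have a DataFrame. Its "Indicator" column contains many different parameters (strings), but I only need "Life Satisfaction". I don't know how to remove the other indicators, such as "Dwellings without basic facilities", along with their corresponding values and countries.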
import numpy as np
import pandas as pd
oecd_bli = pd.read_csv("/Users/vladelec/Desktop/Life.csv")
df = pd.DataFrame(oecd_bli)
df.drop(df.columns[[0,2,4,5,6,7,8,9,10,11,12,13,15,16]], axis=1, inplace=True)
# dropped the other columns that I do not need
A screenshot of my DataFrame was attached here.
Answer 0 (score: 1)
You can load the data as follows:
file_name = "/Users/vladelec/Desktop/Life.csv"
# Columns you want to load
keep_cols = ['Country', 'Indicator']
# pd.read_csv() will load the data into a pd.DataFrame
oecd_bli = pd.read_csv(file_name, usecols=keep_cols)
If you only want the "Life Satisfaction" Indicator, you can do the following:
oecd_bli = oecd_bli[oecd_bli['Indicator'] == "Life Satisfaction"]
If you want to keep more than one Indicator, you can do this:
keep_indicators = [
    "Life Satisfaction",
    "Crime Indicator",
]
oecd_bli = oecd_bli[oecd_bli['Indicator'].isin(keep_indicators)]
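To try the load-then-filter steps above without the original file, here is a minimal sketch that reads a small in-memory CSV in place of Life.csv. The sample rows, the "Unit" column, and the "Value" column name are assumptions made up for illustration. Only "Country" and "Indicator" appear in the code above, so check the real file's header before adding other names to usecols.

```python
import io

import pandas as pd

# Hypothetical stand-in for Life.csv; rows and the "Unit"/"Value" columns are made up.
csv_text = """Country,Indicator,Unit,Value
Australia,Dwellings without basic facilities,Percentage,1.1
Australia,Life Satisfaction,Average score,7.3
Austria,Dwellings without basic facilities,Percentage,1.0
Austria,Life Satisfaction,Average score,7.0
"""

# Load only the columns of interest (here the values are kept as well).
keep_cols = ["Country", "Indicator", "Value"]
sample = pd.read_csv(io.StringIO(csv_text), usecols=keep_cols)

# Keep only the rows whose Indicator is exactly "Life Satisfaction".
life = sample[sample["Indicator"] == "Life Satisfaction"].reset_index(drop=True)
print(life)
```

Calling reset_index(drop=True) is optional. It only renumbers the remaining rows from 0 so the filtered frame does not carry gaps in its index.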
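Answer 1 (score: 0)
@GiantsLoveDeathMetal makes a good point. In principle, you can read the raw data in as oecd_bli and then select the subset of the DataFrame that meets a given condition.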
Demo
import pandas as pd
# Given a DataFrame of raw data
d = {
    "Country": pd.Series(["Australia", "Austria", "Fiji", "Japan"]),
    "Indicator": pd.Series(["Dwellings ...", "Dwellings ...", "Life ...", "Life ..."]),
    "Value": pd.Series([1.1, 1.0, 2.2, 2.9]),
}
oecd_bli = pd.DataFrame(d, columns=["Country", "Indicator", "Value"])
oecd_bli
# Select rows starting with "Life" in column "Indicator"
condition = oecd_bli["Indicator"].str.startswith("Life")
subset = oecd_bli[condition]
subset
Alternatively, select the subset with label-based indexing via .loc:
subset = oecd_bli.loc[condition, :]
Here, .loc expects [<rows>, <columns>], so this returns the rows that meet the condition, with all columns.
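Because the second slot of .loc selects columns, the same call can also trim the columns in one step. A small sketch, rebuilding the demo data so that it runs on its own (the choice of "Country" and "Value" as the columns to keep is just an example):

```python
import pandas as pd

# Same demo data as above, rebuilt so this snippet is self-contained.
oecd_bli = pd.DataFrame({
    "Country": ["Australia", "Austria", "Fiji", "Japan"],
    "Indicator": ["Dwellings ...", "Dwellings ...", "Life ...", "Life ..."],
    "Value": [1.1, 1.0, 2.2, 2.9],
})

condition = oecd_bli["Indicator"].str.startswith("Life")

# Rows where the condition is True, and only the two named columns.
subset = oecd_bli.loc[condition, ["Country", "Value"]]
print(subset)
```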
Details
Note that the result contains every row for which the condition is True. This works because a DataFrame can be indexed with a boolean array.
Example of a boolean array:
>>> condition = oecd_bli["Indicator"].str.startswith("Life")
>>> condition
0 False
1 False
2 True
3 True
Name: Indicator, dtype: bool
Other ways to set up a condition:
>>> condition = oecd_bli["Indicator"] == "Life ..."
>>> condition = ~oecd_bli["Indicator"].str.startswith("Dwell")
>>> condition = oecd_bli["Indicator"].isin(["Life ...", "Crime ..."])
>>> condition = (oecd_bli["Indicator"] == "Life ...") | (oecd_bli["Indicator"] == "Crime ...")
(These use, in order, equality with ==, negation with ~, membership with isin, and multiple comparisons combined with |, &, etc.)
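As a runnable check of these conditions, the sketch below rebuilds the demo frame with one extra, made-up "Crime ..." row (an assumption, added only so that isin and | have something to match). It shows that the isin and | forms select the same rows. Note the parentheses around each comparison: | and & bind more tightly than == and >, so leaving them out changes the meaning or raises an error.

```python
import pandas as pd

# Demo data plus one hypothetical "Crime ..." row for illustration.
df = pd.DataFrame({
    "Country": ["Australia", "Austria", "Fiji", "Japan", "Japan"],
    "Indicator": ["Dwellings ...", "Dwellings ...", "Life ...", "Life ...", "Crime ..."],
    "Value": [1.1, 1.0, 2.2, 2.9, 0.3],
})

# Membership test and chained OR describe the same set of rows.
by_isin = df["Indicator"].isin(["Life ...", "Crime ..."])
by_or = (df["Indicator"] == "Life ...") | (df["Indicator"] == "Crime ...")
same_rows = bool((by_isin == by_or).all())

# & keeps only the rows where both comparisons are True.
life_high = df[(df["Indicator"] == "Life ...") & (df["Value"] > 2.5)]

print(same_rows)
print(life_high)
```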