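Difficult DataFrame lookup

Time: 2019-03-22 07:43:56

Tags: python pandas dataframe

I'm fairly sure this question has been asked before, so I'd appreciate it if someone could point me in the right direction.

I have two DataFrames. DF1: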

+----------+-----------+------------+-------------+--------------------+
| Survived |  Surname  | FamilySize | NumSurvived | FamilySurvivalRate |
+----------+-----------+------------+-------------+--------------------+
|        0 | Braund    |          2 |           0 | 0                  |
|        1 | Cumings   |          1 |           1 | 1                  |
|        1 | Heikkinen |          1 |           1 | 1                  |
|        1 | Futrelle  |          2 |           1 | 0.5                |
|        0 | Allen     |          2 |           1 | 0.5                |
|        0 | Moran     |          3 |           1 | 0.333333333        |
|        0 | McCarthy  |          1 |           0 | 0                  |
|        0 | Palsson   |          4 |           0 | 0                  |
+----------+-----------+------------+-------------+--------------------+

and DF2:

+----------+-----------+------------+-------------+--------------------+
| Survived |  Surname  | FamilySize | NumSurvived | FamilySurvivalRate |
+----------+-----------+------------+-------------+--------------------+
|        0 | Braund    |          2 |           0 |                    |
|        1 | Cumings   |          1 |           1 |                    |
|        1 | Heikkinen |          1 |           1 |                    |
|        1 | Futrelle  |          2 |           1 |                    |
|        0 | Allen     |          2 |           1 |                    |
|        0 | Moran     |          3 |           1 |                    |
|        0 | McCarthy  |          1 |           0 |                    |
|        0 | Palsson   |          4 |           0 |                    |
+----------+-----------+------------+-------------+--------------------+

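For each surname in DF2, I need to look up that surname's FamilySurvivalRate in DF1 and put the value into DF2. If the surname is not in DF1, the value must be 0.

Thanks!

5 answers:

Answer 0: (score: 1)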

Use Series.map with a Series created from df1. In the sample df2 below, one surname has been changed so that it has no match:

print (df2)
   Survived    Surname  FamilySize  NumSurvived
0         0     Braund           2            0
1         1   Cumings1           1            1  <- change surname for no match
2         1  Heikkinen           1            1
3         1   Futrelle           2            1
4         0      Allen           2            1
5         0      Moran           3            1
6         0   McCarthy           1            0
7         0    Palsson           4            0

Map each surname to its rate, then replace the unmatched values with 0:

s = df1.set_index('Surname')['FamilySurvivalRate']
df2['FamilySurvivalRate'] = df2['Surname'].map(s).fillna(0)
print (df2)
   Survived    Surname  FamilySize  NumSurvived  FamilySurvivalRate
0         0     Braund           2            0            0.000000
1         1   Cumings1           1            1            0.000000
2         1  Heikkinen           1            1            1.000000
3         1   Futrelle           2            1            0.500000
4         0      Allen           2            1            0.500000
5         0      Moran           3            1            0.333333
6         0   McCarthy           1            0            0.000000
7         0    Palsson           4            0            0.000000
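One caveat with the Series.map lookup: it relies on the Surname index of the lookup Series being unique. That holds for the sample DF1, but it may not hold for a table with one row per passenger. A minimal sketch, assuming duplicated surnames always carry the same FamilySurvivalRate (the small frames below are made-up illustration data, not the asker's dataset), is to drop the duplicates before building the lookup:

```python
import pandas as pd

# Hypothetical sample data: 'Allen' appears twice in df1 with the same rate.
df1 = pd.DataFrame({
    "Surname": ["Braund", "Allen", "Allen", "Moran"],
    "FamilySurvivalRate": [0.0, 0.5, 0.5, 1 / 3],
})
df2 = pd.DataFrame({"Surname": ["Braund", "Allen", "Cumings1", "Moran"]})

# Keep one row per surname so the lookup index is unique.
lookup = (
    df1.drop_duplicates(subset="Surname")
       .set_index("Surname")["FamilySurvivalRate"]
)

# Surnames missing from the lookup map to NaN, which fillna(0) turns into 0.
df2["FamilySurvivalRate"] = df2["Surname"].map(lookup).fillna(0)
print(df2)
```

If the rates for a repeated surname can differ, dropping duplicates silently keeps the first one, so an explicit aggregation would be a safer choice in that case.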

Answer 1: (score: 0)

You need to merge the two DataFrames based on the entries present in DF2, then fill the missing values with 0:

(
    df2
    # Remove FamilySurvivalRate from df2, as it is of no interest
    .drop(columns=["FamilySurvivalRate"])
    # Retrieve possibly existing values from df1
    # (with no `on` argument, merge joins on all columns the frames share)
    .merge(df1, how="left")
    # Fill missing values with 0
    .fillna({"FamilySurvivalRate": 0})
)

Answer 2: (score: 0)

You can try the following:

DF2.loc[DF2['Surname']==DF1['Surname'],['FamilySurvivalRate']] = DF1['FamilySurvivalRate']

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
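Note that `DF2['Surname'] == DF1['Surname']` compares the two columns label by label, so it only finds surnames that sit in the same row of both frames, and as far as I know pandas raises a ValueError when the two Series are not identically labelled. It also leaves unmatched rows unset rather than 0. A sketch of a .loc-based variant that does not depend on row order, using made-up sample frames and the hypothetical helper names `found` and `rates`:

```python
import pandas as pd

# Hypothetical sample data, deliberately in a different order and length.
DF1 = pd.DataFrame({
    "Surname": ["Moran", "Braund", "Allen"],
    "FamilySurvivalRate": [1 / 3, 0.0, 0.5],
})
DF2 = pd.DataFrame({"Surname": ["Braund", "Cumings1", "Allen", "Moran"]})

# Start from the default of 0 required for surnames absent from DF1.
DF2["FamilySurvivalRate"] = 0.0

# Boolean mask of DF2 rows whose surname occurs anywhere in DF1.
found = DF2["Surname"].isin(DF1["Surname"])

# Look the rates up by surname (not by row position) for the matching rows.
rates = DF1.set_index("Surname")["FamilySurvivalRate"]
DF2.loc[found, "FamilySurvivalRate"] = DF2.loc[found, "Surname"].map(rates)
print(DF2)
```

This sketch assumes the surnames in DF1 are unique.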

Answer 3: (score: 0)

Try this; I hope it solves your problem.

[code not preserved in the source]

Answer 4: (score: 0)

I think you can achieve the same result with merge():

df2.merge(df1[["Surname", "FamilySurvivalRate"]], how="left", on="Surname").fillna(0)
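Two details matter when this merge is applied to the DF2 from the question. DF2 already has an (empty) FamilySurvivalRate column, so merging directly would produce suffixed FamilySurvivalRate_x and FamilySurvivalRate_y columns, and a bare fillna(0) fills NaN in every column, not just the rate. A minimal sketch that handles both, using made-up sample frames and a hypothetical `result` variable:

```python
import pandas as pd

df1 = pd.DataFrame({
    "Surname": ["Braund", "Allen", "Moran"],
    "FamilySurvivalRate": [0.0, 0.5, 1 / 3],
})
# df2 as in the question: the rate column exists but is empty.
df2 = pd.DataFrame({
    "Surname": ["Braund", "Cumings1", "Allen", "Moran"],
    "NumSurvived": [0, 1, 1, 1],
    "FamilySurvivalRate": [float("nan")] * 4,
})

result = (
    # Drop the empty column first to avoid _x/_y suffixed duplicates.
    df2.drop(columns=["FamilySurvivalRate"])
       # A left join keeps every row of df2, matched or not.
       .merge(df1[["Surname", "FamilySurvivalRate"]], how="left", on="Surname")
       # Fill only the looked-up column, leaving other columns untouched.
       .fillna({"FamilySurvivalRate": 0})
)
print(result)
```

Unlike Series.map, merge returns a new DataFrame, so the result has to be assigned back if df2 itself should be updated.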