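Change all values below a row using iteration

Time: 2018-08-19 00:53:52

Tags: python pandas dataframe contains

I'm working with a df that represents US regions and also contains the states. Each state has [edit] next to it. All the regions between two states belong to the state above them. I thought this should work, but for some reason it doesn't change the values in the df... Do you know what is happening here? How would you do it?

Here is the df: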

0                      Alabama[edit]
1                            Auburn 
2                          Florence 
3                      Jacksonville 
4                        Livingston 
5                        Montevallo 
6                              Troy 
7                        Tuscaloosa 
8                          Tuskegee 
9                       Alaska[edit]
10                        Fairbanks 
11                     Arizona[edit]
12                        Flagstaff 
13                            Tempe 
14                           Tucson 
15                    Arkansas[edit]
16                      Arkadelphia 
17                           Conway 
18                     Fayetteville 
19                        Jonesboro 
20                         Magnolia 
21                       Monticello 
22                     Russellville 
23                           Searcy 
24                  California[edit]
25                           Angwin 
26                           Arcata 
27                         Berkeley 
28                            Chico 
29                        Claremont 

Here is my solution, which does not change the df:

df['state'] = 'replace this'
edit = '\[edit\]'
for index, row in df.iterrows():
    if edit in row['RegionName']:
        st = df.loc[index, ['RegionName']]
        df.loc[index, ['RegionName']] = None
        df.iloc[index:, 1] = st

4 Answers:

Answer 0 (score: 1)

Assuming your column name is region, you can use str.extract:

df.assign(
    state=df.region.str.extract(r'(.*?)\[edit\]').ffill()
).mask(df.region.str.endswith('[edit]')).dropna()

          region       state
1         Auburn     Alabama
2       Florence     Alabama
3   Jacksonville     Alabama
4     Livingston     Alabama
5     Montevallo     Alabama
6           Troy     Alabama
7     Tuscaloosa     Alabama
8       Tuskegee     Alabama
10     Fairbanks      Alaska
12     Flagstaff     Arizona
13         Tempe     Arizona
14        Tucson     Arizona
16   Arkadelphia    Arkansas
17        Conway    Arkansas
18  Fayetteville    Arkansas
19     Jonesboro    Arkansas
20      Magnolia    Arkansas
21    Monticello    Arkansas
22  Russellville    Arkansas
23        Searcy    Arkansas
25        Angwin  California
26        Arcata  California
27      Berkeley  California
28         Chico  California
29     Claremont  California

If you want to keep the states in the region column, just remove the mask.
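To see what the extraction step does on its own, here is a minimal sketch on a small hypothetical sample (the five rows and the intermediate names extracted and result are assumptions for illustration, not part of the answer). It passes expand=False so that str.extract returns a Series, and it filters the header rows with isna() rather than mask/dropna.

```python
import pandas as pd

# Hypothetical sample that mirrors the layout in the question.
df = pd.DataFrame(
    {"region": ["Alabama[edit]", "Auburn", "Troy", "Alaska[edit]", "Fairbanks"]}
)

# Rows without "[edit]" do not match the pattern and come back as NaN.
extracted = df["region"].str.extract(r"(.*?)\[edit\]", expand=False)

# Forward-fill carries each state name down to the rows below it.
state = extracted.ffill()

# Keep only the rows that were not state headers.
result = df.assign(state=state)[extracted.isna()]
print(result)
```

Because the rows are filtered by position in the original frame, the index keeps its gaps, as in the output above.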

Answer 1 (score: 0)

If I understand you correctly, here is a solution that avoids an explicit loop.

# Create a new column of state names with NaN in any
# row that did not contain a state name flagged with "edit"
df['state'] = df[df['RegionName'].str.contains('edit')]['RegionName']

# Forward-fill the NaNs in the state column
df = df.ffill()

# Delete rows where RegionName == state and
# reset index to default integers
df = df[df.iloc[:, 0] != df.iloc[:, 1]].reset_index(drop=True)

# Delete "[edit]" flag from strings
df['state'] = df['state'].str.replace('[edit]', '', regex=False)

# Result:
df
      RegionName       state
0         Auburn     Alabama
1       Florence     Alabama
2   Jacksonville     Alabama
3     Livingston     Alabama
4     Montevallo     Alabama
5           Troy     Alabama
6     Tuscaloosa     Alabama
7       Tuskegee     Alabama
8      Fairbanks      Alaska
9      Flagstaff     Arizona
10         Tempe     Arizona
11        Tucson     Arizona
12   Arkadelphia    Arkansas
13        Conway    Arkansas
14  Fayetteville    Arkansas
15     Jonesboro    Arkansas
16      Magnolia    Arkansas
17    Monticello    Arkansas
18  Russellville    Arkansas
19        Searcy    Arkansas
20        Angwin  California
21        Arcata  California
22      Berkeley  California
23         Chico  California
24     Claremont  California

Answer 2 (score: 0)

Try the following code:

import pandas as pd
import numpy as np
df['State']=df['RegionName']
df.loc[~df['RegionName'].str.contains('[edit]', regex=False),'State']=np.nan
df['State']=df['State'].str.replace('[edit]','', regex=False).ffill()
print(df)
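A note on literal versus regex matching, since it affects most of the snippets on this page. As a regular expression, '[edit]' is a character class that matches any single one of e, d, i, t, so it has to be escaped or matched literally. Plain Python `in`, on the other hand, is always a literal test, which is probably why the `if edit in row['RegionName']` check in the question never fires with the escaped pattern. A small sketch with made-up sample values:

```python
import pandas as pd

# Hypothetical sample values for illustration.
names = pd.Series(["Alabama[edit]", "Florence", "Troy"])

# As a regex, "[edit]" means "any one of the characters e, d, i, t".
as_regex = names.str.contains("[edit]", regex=True)

# As a literal, only the exact substring "[edit]" matches.
as_literal = names.str.contains("[edit]", regex=False)

# `in` compares literally, so the backslashes are searched for as characters.
escaped_in = r"\[edit\]" in "Alabama[edit]"
plain_in = "[edit]" in "Alabama[edit]"

print(as_regex.tolist(), as_literal.tolist(), escaped_in, plain_in)
```

Passing regex explicitly to str.contains and str.replace also avoids depending on the default, which has differed between pandas versions for str.replace.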

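Answer 3 (score: 0)

Create a mask that identifies the states. Use it to create a new column for the states, forward-fill it, and select only the rows that the mask excludes.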

mask = df.region.str.endswith('[edit]')
df.loc[mask, 'state'] = df.region[mask].str.replace('[edit]', '', regex=False)
df.state = df.state.ffill()
df[~mask]
# outputs:
          region       state
1         Auburn     Alabama
2       Florence     Alabama
3   Jacksonville     Alabama
4     Livingston     Alabama
5     Montevallo     Alabama
6           Troy     Alabama
7     Tuscaloosa     Alabama
8       Tuskegee     Alabama
10     Fairbanks      Alaska
12     Flagstaff     Arizona
13         Tempe     Arizona
14        Tucson     Arizona
16   Arkadelphia    Arkansas
17        Conway    Arkansas
18  Fayetteville    Arkansas
19     Jonesboro    Arkansas
20      Magnolia    Arkansas
21    Monticello    Arkansas
22  Russellville    Arkansas
23        Searcy    Arkansas
25        Angwin  California
26        Arcata  California
27      Berkeley  California
28         Chico  California
29     Claremont  California
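The answers here use different column names (RegionName and region), so it may be convenient to wrap the mask approach in a small helper. The sketch below is one possible way to do that; the function name add_state_column and its parameters are hypothetical, and it returns a new frame instead of modifying the input.

```python
import pandas as pd


def add_state_column(df, column="RegionName", marker="[edit]"):
    """Return a copy with a forward-filled state column and the header rows dropped."""
    is_state = df[column].str.endswith(marker)
    # Keep names only on header rows, strip the marker, then fill downward.
    state = (
        df[column]
        .where(is_state)
        .str.replace(marker, "", regex=False)
        .ffill()
    )
    return df.assign(state=state)[~is_state].reset_index(drop=True)


# Hypothetical sample for illustration.
sample = pd.DataFrame(
    {"RegionName": ["Alabama[edit]", "Auburn", "Alaska[edit]", "Fairbanks"]}
)
out = add_state_column(sample)
print(out)
```

Drop the reset_index call if you prefer to keep the original row labels, as in the output above.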