I have a data frame with two columns, State and Code, and each column has missing values.
import pandas as pd
df = pd.DataFrame([['Alabama', 'AL'], ['Alaska', 'AK'], ['Arizona', 'AZ'], ['Arkansas', 'AR'], ['Iowa', 'IA'], ['Hawaii', 'HI'], ['Idaho', 'ID'], ['Alabama', ''], ['', 'IA'], ['Alaska', ''], ['', 'AZ']], columns=['State', 'Code'])
Missing values
      State Code
7   Alabama
8             IA
9    Alaska
10            AZ
What I tried
state_code_dict = {
    'Alabama': 'AL',
    'Alaska': 'AK',
    'Arizona': 'AZ',
    'Arkansas': 'AR',
    'Iowa': 'IA',
    'Hawaii': 'HI',
    'Idaho': 'ID',
}
def state_code(x):
    # Look up the code from the state only when the code is blank.
    if x['Code'] == '':
        return state_code_dict[x['State']]
    else:
        return x['Code']
df['Code'] = df.apply(state_code, axis=1)
This sets the missing values in Code. I also need to update this function to set State, and I am trying to simplify it.
Required output
Every row should have both State and Code filled in, so that, for example, row 7 becomes Alabama / AL and row 8 becomes Iowa / IA.
Answer 0 (score: 4)
IIUC, you can use map.
Map Code first, then State, using a boolean mask so that values are assigned only where the field is empty.
mask = df.Code == ''
df.loc[mask, 'Code'] = df[mask].State.map(state_code_dict)
mask = df.State == ''
df.loc[mask, 'State'] = df[mask].Code.map({v:k for k,v in state_code_dict.items()})
State Code
0 Alabama AL
1 Alaska AK
2 Arizona AZ
3 Arkansas AR
4 Iowa IA
5 Hawaii HI
6 Idaho ID
7 Alabama AL
8 Iowa IA
9 Alaska AK
10 Arizona AZ
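For reuse, the two masked assignments above can be wrapped in a small helper. This is only a sketch and not part of the original answer: the function name `fill_pair` and its parameters are hypothetical, and it assumes that missing values are empty strings and that the mapping is one-to-one.

```python
import pandas as pd

def fill_pair(df, key_col, value_col, mapping):
    """Fill blanks in key_col/value_col from a one-to-one dict (hypothetical helper)."""
    out = df.copy()
    reverse = {v: k for k, v in mapping.items()}
    # Fill blank values from their keys.
    mask = out[value_col] == ''
    out.loc[mask, value_col] = out.loc[mask, key_col].map(mapping)
    # Fill blank keys from their values.
    mask = out[key_col] == ''
    out.loc[mask, key_col] = out.loc[mask, value_col].map(reverse)
    return out

state_code_dict = {'Alabama': 'AL', 'Alaska': 'AK', 'Arizona': 'AZ', 'Iowa': 'IA'}
df = pd.DataFrame(
    [['Alabama', 'AL'], ['Alabama', ''], ['', 'IA'], ['Alaska', ''], ['', 'AZ']],
    columns=['State', 'Code'],
)
filled = fill_pair(df, 'State', 'Code', state_code_dict)
print(filled)
```

Working on a copy keeps the original frame untouched, which makes it easy to compare before and after.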
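Answer 1 (score: 4)
You can replace the blank strings with np.nan and then use fillna together with pd.Series.map. This is similar to @RafaelC's answer, but implemented differently.
import numpy as np
code_state_dict = {v: k for k, v in state_code_dict.items()}
df = df.replace('', np.nan)
# Assign the result back; inplace=True on a column selection triggers chained-assignment warnings in recent pandas.
df['Code'] = df['Code'].fillna(df['State'].map(state_code_dict))
df['State'] = df['State'].fillna(df['Code'].map(code_state_dict))
print(df)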
State Code
0 Alabama AL
1 Alaska AK
2 Arizona AZ
3 Arkansas AR
4 Iowa IA
5 Hawaii HI
6 Idaho ID
7 Alabama AL
8 Iowa IA
9 Alaska AK
10 Arizona AZ
Answer 2 (score: 1)
To fill in Code:
df['Code'] = df.apply(lambda x: x['Code'] if x['Code']!='' else state_code_dict[x['State']],axis=1)
To fill in State:
state_code_dict2 = {v: k for k, v in state_code_dict.items()}
df['State'] = df.apply(lambda x: x['State'] if x['State']!='' else state_code_dict2[x['Code']],axis=1)
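One caveat with indexing the dictionary directly: it raises a `KeyError` when a row's key is not in the dictionary (for example, when both columns are blank), whereas `Series.map` leaves `NaN` in that case. A minimal sketch of a more forgiving variant, assuming you would rather keep the blank than fail; the sample row with both fields blank is made up for illustration.

```python
import pandas as pd

state_code_dict = {'Alabama': 'AL', 'Alaska': 'AK'}
# Illustrative data: the last row has both fields blank.
df = pd.DataFrame([['Alabama', ''], ['Alaska', 'AK'], ['', '']], columns=['State', 'Code'])

# dict.get returns the fallback (here an empty string) instead of raising KeyError.
df['Code'] = df.apply(
    lambda x: x['Code'] if x['Code'] != '' else state_code_dict.get(x['State'], ''),
    axis=1,
)
print(df)
```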
Answer 3 (score: 0)
This is similar to the question "Filling a series based on key value pairs".
Using your data:
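import numpy as np
out = df.replace('', np.nan).sort_values(by=['State', 'Code'], ascending=False)
# GroupBy.ffill() leaves out the grouping column in current pandas, so fill one column at a time.
out['Code'] = out['Code'].fillna(out.groupby('State')['Code'].ffill())
out['State'] = out['State'].fillna(out.groupby('Code')['State'].ffill())
out[['Code', 'State']]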
Output:
Code State
4 IA Iowa
6 ID Idaho
5 HI Hawaii
3 AR Arkansas
2 AZ Arizona
1 AK Alaska
9 AK Alaska
0 AL Alabama
7 AL Alabama
8 IA Iowa
10 AZ Arizona
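If no dictionary is available up front, the lookup can be derived from the rows that already have both values, which is the idea the group-based fill relies on. A minimal sketch, assuming blanks are empty strings and each state has exactly one code; the variable names are illustrative and not from the original answers.

```python
import pandas as pd

df = pd.DataFrame(
    [['Alabama', 'AL'], ['Alaska', 'AK'], ['Arizona', 'AZ'], ['Iowa', 'IA'],
     ['Alabama', ''], ['', 'IA'], ['Alaska', ''], ['', 'AZ']],
    columns=['State', 'Code'],
)

# Build the lookups from rows where both columns are present.
complete = df[(df['State'] != '') & (df['Code'] != '')].drop_duplicates()
state_to_code = dict(zip(complete['State'], complete['Code']))
code_to_state = dict(zip(complete['Code'], complete['State']))

# Treat blanks as missing, then fill each column from the other.
filled = df.mask(df == '')
filled['Code'] = filled['Code'].fillna(filled['State'].map(state_to_code))
filled['State'] = filled['State'].fillna(filled['Code'].map(code_to_state))
print(filled)
```

Rows whose state or code never appears in a complete row stay missing, since there is nothing to infer them from.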