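I have a pandas DataFrame with several integer columns, and a corresponding dictionary of the form {column: {integer: string_label}}.
I am trying to create a DataFrame in which the integers have been replaced by their labels. The code below is the closest I have come, but the output is somewhat unexpected.
Code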
import pandas as pd
import numpy as np
df = pd.DataFrame({'a':[1,2,3,3,8],'b':[8,8,8,8,7]})
dic = {'a':{1:"label1",2:"label2",3:"label3"}, 'b':{8:'label8',7:'label7'}}
# Build one converter per column
converters = {column: lambda x: dic[column][x] if x in dic[column].keys() else np.nan
              for column in dic.keys()}
# Apply each column's converter, leaving columns without a converter unchanged
new = pd.DataFrame.from_dict({col: series.apply(converters[col])
                              if col in converters else series
                              for col, series in df.items()})
print(new)
#Output:
# a b
# 0 NaN label8
# 1 NaN label8
# 2 NaN label8
# 3 NaN label8
# 4 label8 label7
Answer 0 (score: 0)
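The problem is that you use the variable column inside the lambda function. A lambda does not store the variable's value when it is defined; it uses whatever the variable holds at the time of the call (in series.apply(converters[col])), which can be anything. In fact, if you run the code, you will sometimes find that it produces different results.
Possible solution: bind the current value of column when each lambda is created, so that the function stored in converters[col] keeps its own column's mapping.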
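A minimal sketch of one way to do this, assuming the fix is to capture the per-column mapping through a default argument. This is a common Python idiom and not necessarily the code this answer originally showed; the names late and mapping are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 3, 8], 'b': [8, 8, 8, 8, 7]})
dic = {'a': {1: "label1", 2: "label2", 3: "label3"},
       'b': {8: 'label8', 7: 'label7'}}

# Late binding: each lambda looks up `column` only when it is called,
# so every one of them sees the last key the comprehension visited.
late = {column: (lambda x: column) for column in dic}

# A default argument is evaluated when the lambda is defined,
# so each lambda keeps the mapping of its own column (hypothetical name: mapping).
converters = {column: (lambda x, mapping=dic[column]: mapping.get(x, np.nan))
              for column in dic}

new = pd.DataFrame({col: series.apply(converters[col])
                    if col in converters else series
                    for col, series in df.items()})
print(new)
```

With the default argument, the unmapped value 8 in column a becomes NaN while the other entries receive the labels of their own column.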
Answer 1 (score: 0)
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 3, 8], 'b': [8, 8, 8, 8, 7]})
dic = {'a': {1: "label1", 2: "label2", 3: "label3"},
       'b': {8: 'label8', 7: 'label7'}}
# Replace each integer by its label, column by column
df = df.replace(dic)
# Labels that are valid for each column
allowed = {k: list(v.values()) for k, v in dic.items()}
for col_name, allowed_col_vals in allowed.items():
    # Replace values that are not allowed labels by NaN;
    # .loc avoids chained assignment
    df.loc[~df[col_name].isin(allowed_col_vals), col_name] = np.nan
The result ends up like this:
df
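As an alternative sketch, assuming that integers without a label should simply become NaN, Series.map with a dictionary performs the replacement and the NaN fill in one step. The name labeled is illustrative and does not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 3, 8], 'b': [8, 8, 8, 8, 7]})
dic = {'a': {1: "label1", 2: "label2", 3: "label3"},
       'b': {8: 'label8', 7: 'label7'}}

# Series.map with a dict yields NaN for every value that is not a key,
# so no separate clean-up pass is needed.
labeled = df.copy()
for col, mapping in dic.items():
    labeled[col] = df[col].map(mapping)
print(labeled)
```

Columns that have no entry in dic are left untouched because the loop only visits the keys of dic.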