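I am new to Python. In my project I need to concatenate multiple columns of a pandas DataFrame to create a derived column. My DataFrame contains several columns that hold only TRUE and FALSE values. I am using the following code for the concatenation: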
df_input["combined"] = [' '.join(row) for row in df_input[df_input.columns[0:]].values]
When I run the code I get the following error:
TypeError: sequence item 3: expected str instance, bool found
Could an expert please help me resolve this?
Thanks in advance.
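Answer 0 (score: 2)
str.join only accepts strings, so the bool values in each row have to be converted first. Let's try astype: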
df_input["combined"] = [' '.join(row.astype(str)) for row in df_input[df_input.columns[0:]].values]
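As a hedged alternative to the list comprehension, the same idea can be sketched by casting the whole frame to str once and then joining each row. The sample data and the column names (name, flag_a, flag_b) are hypothetical and only stand in for the asker's DataFrame.

```python
import pandas as pd

# Hypothetical sample data with mixed str and bool columns (illustration only).
df_input = pd.DataFrame({
    "name": ["hello", "world"],
    "flag_a": [True, False],
    "flag_b": [False, False],
})

# Cast every column to str once, then join the values of each row with a space.
combined = df_input.astype(str).agg(" ".join, axis=1)
df_input["combined"] = combined

print(df_input["combined"].tolist())
```

Building the joined Series before assigning it keeps the new "combined" column out of its own input.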
Answer 1 (score: 1)
You can convert the bool column with astype(str) and use the vectorized version to concatenate the columns, as shown below:
from io import StringIO  # Python 3; the Python 2 form was "from StringIO import StringIO"
import pandas as pd
st = """
col1|col2|col3
1|hello|True
4|world|False
7|!|True
"""
df = pd.read_csv(StringIO(st), sep="|")
print("my sample dataframe")
print(df.head())
print("current columns data types")
print(df.dtypes)
print("combining all columns with mixed datatypes")
df["combined"] = df["col1"].astype(str)+" "+df["col2"]+ " " +df["col3"].astype(str)
print("here's how the data looks now")
print(df.head())
print("here are the new columns datatypes")
print(df.dtypes)
Output of the script:
my sample dataframe
   col1   col2   col3
0     1  hello   True
1     4  world  False
2     7      !   True
current columns data types
col1     int64
col2    object
col3      bool
dtype: object
combining all columns with mixed datatypes
here's how the data looks now
   col1   col2   col3       combined
0     1  hello   True   1 hello True
1     4  world  False  4 world False
2     7      !   True       7 ! True
here are the new columns datatypes
col1         int64
col2        object
col3          bool
combined    object
dtype: object
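As you can see, the new combined column contains the concatenated data.
To perform the concatenation dynamically, without naming each column by hand, the previous example can be edited as follows: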
from io import StringIO  # Python 3; the Python 2 form was "from StringIO import StringIO"
import pandas as pd
st = """
col1|col2|col3
1|hello|True
4|world|False
7|!|True
"""
df = pd.read_csv(StringIO(st), sep="|")
print("my sample dataframe")
print(df.head())
print("current columns data types")
print(df.dtypes)
print("combining all columns with mixed datatypes")
#df["combined"] = df["col1"].astype(str)+" "+df["col2"]+ " " +df["col3"].astype(str)
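# Snapshot the column names before "combined" is added to the frame
all_columns = list(df.columns)
df["combined"] = ""
for column_name in all_columns:
    print("current column {column_name}".format(column_name=column_name))
    df["combined"] = df["combined"] + " " + df[column_name].astype(str)
# Drop the single leading separator added on the first iteration
df["combined"] = df["combined"].str[1:]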
print("here's how the data looks now")
print(df.head())
print("here are the new columns datatypes")
print(df.dtypes)
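The loop above can also be written without an accumulator column. The following is a minimal sketch, not the answerer's code. It assumes the same sample data and uses Series.str.cat, which puts the separator only between columns, so there is no leading space to trim. The names as_text, first and rest are introduced here for illustration.

```python
from io import StringIO
import pandas as pd

st = """col1|col2|col3
1|hello|True
4|world|False
7|!|True
"""
df = pd.read_csv(StringIO(st), sep="|")

# Convert every column to str so int and bool values can be concatenated.
as_text = df.astype(str)

# Split into the first column and the remaining ones; str.cat joins them
# row by row, inserting the separator only between values.
first = as_text.columns[0]
rest = list(as_text.columns[1:])
df["combined"] = as_text[first].str.cat(as_text[rest], sep=" ")

print(df["combined"].tolist())
```

The original columns keep their dtypes because the str conversion happens on a copy.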