Question

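I have a DataFrame with several columns. I want to split the values in some of those columns on "/" and pick a part by index. Below is the list of columns whose data I want to split.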

E.g.: split_columns = ['Fuel', 'Air Pollution Score', 'City MPG', 'Hwy MPG', 'Cmb MPG', 'Greenhouse Gas Score']

For example, if "Fuel" contains a value such as "Ethanol/Gas", it should be split at the "/".

Here is my code:

split_columns = ['Fuel', 'Air Pollution Score', 'City MPG', 'Hwy MPG', 'Cmb MPG', 'Greenhouse Gas Score']

for c in split_columns:
  df1[c] = df1[c].apply(lambda x: x.split("/")[0])
  df2[c] = df2[c].apply(lambda x: x.split("/")[1])

When I run the code above, I get the error "index out of range".

Answer 1

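This just means that some values in those columns do not contain "/". When there is no "/", split returns a list with only one element, but you are accessing x.split("/")[1], which raises the index error. To fix it, either check whether "/" is in x, or check the length of the split result: if it is greater than 1, a "/" was present.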
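A minimal sketch of the length check described above. The helper name split_part and the sample value "Gasoline" are made up for illustration; only "Ethanol/Gas" comes from the question.

```python
import pandas as pd

def split_part(value, index, sep="/"):
    """Return the piece at `index` after splitting on `sep`, or None if it does not exist."""
    parts = str(value).split(sep)
    # Only index into the list when it is long enough; this avoids the IndexError.
    return parts[index] if len(parts) > index else None

# Hypothetical sample data: one value with "/" and one without.
df = pd.DataFrame({"Fuel": ["Ethanol/Gas", "Gasoline"]})

first = df["Fuel"].apply(lambda x: split_part(x, 0))
second = df["Fuel"].apply(lambda x: split_part(x, 1))
```

Here the row without "/" yields a missing value for the second part instead of failing.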

Answer 2

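I suggest using Series.str.split and then indexing with str[0] and str[1] to select the first and second elements of the resulting lists.

If "/" is not present, str[1] gives NaN instead of raising an IndexError.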

for c in split_columns:
  df1[c] = df1[c].astype(str).str.split("/").str[0]
  df2[c] = df2[c].astype(str).str.split("/").str[1]
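To see the NaN behavior on its own, here is a small sketch on a standalone Series. The sample values are assumptions for illustration; "Gasoline" stands in for any value without "/".

```python
import pandas as pd

# Hypothetical sample data: the second value has no "/".
s = pd.Series(["Ethanol/Gas", "Gasoline"])

parts = s.str.split("/")   # each element becomes a list of pieces
first = parts.str[0]       # first piece of every list
second = parts.str[1]      # second piece, NaN where the list has only one element

print(first.tolist())
print(second.isna().tolist())
```

No exception is raised for the row without "/"; its second part is simply missing.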

Answer 3

It is an indexing problem. I found 2 solutions:

1) I split this into two parts (in 2 Jupyter cells), and the error went away.

for c in split_columns:
  df1[c] = df1[c].apply(lambda x: x.split("/")[0])

for c in split_columns:
  df2[c] = df2[c].apply(lambda x: x.split("/")[1])

2) I changed the second index to [0]:

for c in split_columns:
  df1[c] = df1[c].apply(lambda x: x.split("/")[0])
  df2[c] = df2[c].apply(lambda x: x.split("/")[0])

How to split by the "/" delimiter using indexing in Python?
