Question

I have 3 DataFrames:

df1

A B C
1 1 1
2 2 2

df2

A B C
3 3 3
4 4 4

df3

A B
5 5

So I want to combine all of these DataFrames into the following DataFrame:

I tried both pd.concat([df1,df2,df3]) and {{1}} with axis=0, but neither worked as expected.

Answer 1

  import java.util.*;

  public class JavaFiddle
  {
    public static void main(String[] args)
    {
      // starting here with the result of your regex operation
      String[] yourOutputArray = {"(z", "/", "(2+t))", "/", "((2.0+var)", "*", "(x/y))"};

      List<String> listy = new ArrayList<String>();
      String s = "";
      for(int i = 0; i < yourOutputArray.length; i++){
          s += yourOutputArray[i];

          if(characterCount(s,"(") > 0 && characterCount(s,"(") == characterCount(s,")")){
              listy.add(s);
              s = "";
          }else if(s.length() > 0 && characterCount(s,"(") == 0){
              // this must be an operator we want to split on
              listy.add(s);
              s = "";
          }
      }

      // listy is the result you are looking for
      System.out.println(listy); // [(z/(2+t)), /, ((2.0+var)*(x/y))]
    }

    public static int characterCount(String input, String character){
        return input.length() - input.replace(character, "").length();
    }
  }

Answer 2

If the shared column names are identical, this works fine; the common columns are aligned correctly:

print (df1.columns.tolist())
['A', 'B', 'C']
print (df2.columns.tolist())
['A', 'B', 'C']
print (df3.columns.tolist())
['A', 'B']

If the column names may contain trailing whitespace, you can use str.strip:

print (df1.columns.tolist())
['A', 'B ', 'C']

df1.columns = df1.columns.str.strip()

print (df1.columns.tolist())
['A', 'B', 'C']
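
To see why a stray space matters, here is a minimal sketch. The frames are rebuilt by hand from the question's data, and the trailing space in "B " is an assumption made purely for illustration. It compares the concatenated columns before and after stripping:

```python
import pandas as pd

# Hypothetical version of df1 whose second column label has a trailing space.
df1 = pd.DataFrame({"A": [1, 2], "B ": [1, 2], "C": [1, 2]})
df3 = pd.DataFrame({"A": [5], "B": [5]})

# Without stripping, "B " and "B" are different labels, so concat keeps both.
before = pd.concat([df1, df3], ignore_index=True, sort=False)

# Strip whitespace from the labels, then concatenate again.
df1.columns = df1.columns.str.strip()
after = pd.concat([df1, df3], ignore_index=True, sort=False)

print(before.columns.tolist())  # contains both "B " and "B"
print(after.columns.tolist())   # a single "B" column
```

If the first list shows two B-like columns, mismatched labels are the likely reason the rows did not line up.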

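The ignore_index=True parameter gives the result a default RangeIndex after concat, which avoids duplicate index values, and the sort parameter avoids the FutureWarning that older pandas versions emit when the columns are not aligned and sort is not passed: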

df = pd.concat([df1,df2,df3], ignore_index=True, sort=True)
print (df)
   A  B    C
0  1  1  1.0
1  2  2  2.0
2  3  3  3.0
3  4  4  4.0
4  5  5  NaN
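
For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the three frames from the question (the constructor calls are my own reconstruction of the data shown there) and runs the same concat:

```python
import pandas as pd

# The three frames from the question, rebuilt by hand.
df1 = pd.DataFrame({"A": [1, 2], "B": [1, 2], "C": [1, 2]})
df2 = pd.DataFrame({"A": [3, 4], "B": [3, 4], "C": [3, 4]})
df3 = pd.DataFrame({"A": [5], "B": [5]})  # no column C

# Rows are stacked, columns are matched by name, and the missing C value
# for the df3 row is filled with NaN.
df = pd.concat([df1, df2, df3], ignore_index=True, sort=True)

print(df)
print(df.dtypes)  # C is float64 because NaN cannot be stored in an int64 column
```

The C column is upcast to float because of the NaN. If integer display matters, converting it to the nullable Int64 dtype is one option.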

Answer 3

I think you need to tell concat() to ignore the index.
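
The original snippet for this answer did not survive, so the following is only a sketch of what ignoring the index in concat does, using hand-built frames modeled on the question's df1 and df3:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": [1, 2], "C": [1, 2]})
df3 = pd.DataFrame({"A": [5], "B": [5]})

# Default behaviour: each frame keeps its own row labels, so 0 appears twice.
kept = pd.concat([df1, df3], sort=False)

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([df1, df3], ignore_index=True, sort=False)

print(kept.index.tolist())        # [0, 1, 0]
print(renumbered.index.tolist())  # [0, 1, 2]
```

Note that ignore_index only affects the row labels; columns are still aligned by name either way.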

Concatenating multiple pandas DataFrames when the columns are not aligned
