I have 3 DataFrames:
df1
A B C
1 1 1
2 2 2
df2
A B C
3 3 3
4 4 4
df3
A B
5 5
I want to combine all of them into the following DataFrame:
A B C
1 1 1
2 2 2
3 3 3
4 4 4
5 5 NaN
I tried both pd.concat([df1,df2,df3]) with axis=0 and {{1}}, but neither worked as expected.
Answer 0 (score: 2)
import java.util.*;

public class JavaFiddle
{
    public static void main(String[] args)
    {
        // starting here with the result of your regex operation
        String[] yourOutputArray = {"(z", "/", "(2+t))", "/", "((2.0+var)", "*", "(x/y))"};
        List<String> listy = new ArrayList<String>();
        String s = "";
        for(int i = 0; i < yourOutputArray.length; i++){
            s += yourOutputArray[i];
            if(characterCount(s,"(") > 0 && characterCount(s,"(") == characterCount(s,")")){
                listy.add(s);
                s = "";
            }else if(s.length() > 0 && characterCount(s,"(") == 0){
                // this must be an operator we want to split on
                listy.add(s);
                s = "";
            }
        }
        // listy is the result you are looking for
        System.out.println(listy); // [(z/(2+t)), /, ((2.0+var)*(x/y))]
    }

    public static int characterCount(String input, String character){
        return input.length() - input.replace(character, "").length();
    }
}
Answer 1 (score: 1)
If the common column names are identical, it works fine and the common columns are aligned correctly:
print (df1.columns.tolist())
['A', 'B', 'C']
print (df2.columns.tolist())
['A', 'B', 'C']
print (df3.columns.tolist())
['A', 'B']
If the column names might contain trailing whitespace, you can use str.strip:
print (df1.columns.tolist())
['A', 'B ', 'C']
df1.columns = df1.columns.str.strip()
print (df1.columns.tolist())
['A', 'B', 'C']
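To illustrate why the whitespace matters, here is a minimal sketch. The frames are made up for the example (they are not the asker's actual data), and it assumes only the name 'B ' carries a trailing space. concat aligns columns by exact name, so 'B ' and 'B' end up as two separate columns until the names are stripped.

```python
import pandas as pd

# Hypothetical frames: df1 has a trailing space in the column name 'B '
df1 = pd.DataFrame({"A": [1, 2], "B ": [1, 2], "C": [1, 2]})
df3 = pd.DataFrame({"A": [5], "B": [5]})

# Without cleaning, 'B ' and 'B' are treated as different columns
misaligned = pd.concat([df1, df3], ignore_index=True, sort=True)
print(misaligned.columns.tolist())

# After stripping, the names match and the columns line up
df1.columns = df1.columns.str.strip()
aligned = pd.concat([df1, df3], ignore_index=True, sort=True)
print(aligned.columns.tolist())
```

If the first printed list contains both 'B' and 'B ', stray whitespace in the column names is the likely cause of the unexpected result.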
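In addition, the parameter ignore_index=True gives the result a default RangeIndex after concat, which avoids duplicate index values, and the parameter sort avoids the FutureWarning: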
df = pd.concat([df1,df2,df3], ignore_index=True, sort=True)
print (df)
   A  B    C
0  1  1  1.0
1  2  2  2.0
2  3  3  3.0
3  4  4  4.0
4  5  5  NaN
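For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the three frames from the question (the constructor calls are my assumption about how the data was created) and runs the same concat. It also prints the dtypes, which shows why C appears as 1.0, 2.0 and so on: the missing value in the last row is stored as NaN, so that column becomes float.

```python
import pandas as pd

# Frames rebuilt from the question; df3 has no column C
df1 = pd.DataFrame({"A": [1, 2], "B": [1, 2], "C": [1, 2]})
df2 = pd.DataFrame({"A": [3, 4], "B": [3, 4], "C": [3, 4]})
df3 = pd.DataFrame({"A": [5], "B": [5]})

# Stack the rows, renumber the index and sort the column names
df = pd.concat([df1, df2, df3], ignore_index=True, sort=True)
print(df)

# A and B stay integer; C is float because it holds a NaN
print(df.dtypes)
```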
Answer 2 (score: 1)
I think you need to tell concat to ignore the index.
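The code for this answer did not survive, so the following is only a sketch of what ignoring the index could look like, using frames rebuilt from the question. It compares the index labels with and without ignore_index=True.

```python
import pandas as pd

# Frames rebuilt from the question
df1 = pd.DataFrame({"A": [1, 2], "B": [1, 2], "C": [1, 2]})
df2 = pd.DataFrame({"A": [3, 4], "B": [3, 4], "C": [3, 4]})
df3 = pd.DataFrame({"A": [5], "B": [5]})

# Default behaviour: each frame keeps its own row labels, so labels repeat
stacked = pd.concat([df1, df2, df3])
print(stacked.index.tolist())

# ignore_index=True discards the old labels and numbers the rows from 0
renumbered = pd.concat([df1, df2, df3], ignore_index=True)
print(renumbered.index.tolist())
```

The repeated labels in the first result are harmless for display, but they make label-based lookups such as stacked.loc[0] return several rows.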