Question

In my code, I get output like this:

A B C
1 1 1
A B C
2 2 2
A B C
3 3 3

I need to join these columns (DataFrames) into one big DataFrame.

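Merging DataFrames that come from different files is easy, for example with pd.merge(df1, df2), but how do I do it when the DataFrames come from a single file? Thanks for any advice!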

EDIT: To get my data, I convert the lines of my dataset into DataFrames, and the output shows a separate DataFrame for each line. My code:

from io import StringIO
import pandas as pd

def coordinates():
    with open('file.txt') as file:
        for lines in file:
            # I need only these fields from each line
            lines = StringIO(lines[35:61])
            abc = pd.read_csv(lines, sep=' ', header=None)
            abc.columns = ['A', 'B', 'C', 'D', 'E', 'F']
            print(abc)

coordinates()

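EDIT2: The suggestion from s_vishnu only works for a prepared file that contains the same header repeated several times. In my case, the code generates many DataFrames from the file, and the row after each header has the index value 0. There are a lot of DataFrames, and each one has only a single row.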

EDIT3: My file.txt contains a large number of lines, each about 80 characters long:

AAA S S SSDAS ASDJAI A 234 33 43 234 2342999 2.31 22 33
SSS S D W2UUQ Q231WQ A 222 11 23 123 1231299 2.31 22 11

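I need only part of the information from these lines, which is why I use lines =StringIO(lines[35:61]) to extract it. In this example I would need characters [30:55], and I create the DataFrame with columns=['A', 'B', 'C','D','E','F'] and sep=' '.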

Answer 1

my_test.csv:

A, B, C
1, 1 ,1
A, B, C
2, 2, 2
A, B, C
3, 3, 3

Use slicing with a step.

import pandas as pd
df = pd.read_csv("my_test.csv")
df=df[::2]
print(df)

Output:

   A    B   C
0  1   1    1
2  2    2   2
4  3    3   3

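df = df[::2] uses extended slicing. In df[::2], the 2 is the step: starting at row 0, it keeps every second row. Because the first header line is consumed as the column names, the rows at positions 0, 2, 4, ... are the data rows and the rows in between are the repeated headers.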
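To illustrate why the step of 2 picks out the data rows, here is a minimal sketch. It assumes a cleaned-up copy of my_test.csv (no stray spaces), held in a string instead of a file. The header-filtering line is an alternative not mentioned in the answer; it avoids relying on row positions.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for my_test.csv, without stray spaces.
raw = "A,B,C\n1,1,1\nA,B,C\n2,2,2\nA,B,C\n3,3,3\n"
df = pd.read_csv(io.StringIO(raw))

# The first header line becomes the column names, so the remaining rows
# alternate: data, header, data, header, data.
every_second = df[::2]
print(every_second)

# Alternative: drop every row that repeats the header, wherever it appears.
no_headers = df[df["A"] != "A"]
print(no_headers)
```

Both selections keep the same rows here. The filter version should also cope with files where the header is not repeated at perfectly regular intervals.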

But note the index values. They also go in steps of 2, i.e. 0, 2, 4, ... To renumber the index, just do this:

import pandas as pd
df = pd.read_csv("my_test.csv")
df=df[::2]

df.index = range(len(df['A']))
print(df)

Output:

   A    B   C
0  1   1    1
1  2    2   2
2  3    3   3

So you get the values you wanted.
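As a possible variation on the index reset, pandas also provides reset_index(drop=True). The sketch below assumes the same cleaned-up sample data held in a string. It also converts the columns back to numbers, because the repeated header rows cause every column to be read as text.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for my_test.csv, without stray spaces.
raw = "A,B,C\n1,1,1\nA,B,C\n2,2,2\nA,B,C\n3,3,3\n"
df = pd.read_csv(io.StringIO(raw))[::2]

# Renumber the rows 0..n-1 and discard the old labels 0, 2, 4.
df = df.reset_index(drop=True)

# Each column was read as text because of the repeated headers,
# so convert the columns back to numeric types.
df = df.apply(pd.to_numeric)
print(df)
```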

Answer 2

I found a solution. I changed the beginning of my code, and that helped:

import pandas as pd

def coordinates():
    with open('file.txt') as abc:
        lines = abc.readlines()
    for line in lines:
        # I just cut the lines from the beginning and from the end,
        # so I don't need to take data from the middle
        abc2 = line[20:-7]
        abc3 = abc2.split()
        pd.DataFrame(abc3)  # note: this DataFrame is not stored anywhere
        print(abc3)

coordinates()
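If the goal is a single DataFrame rather than one printout per line, one option is to collect the split fields of every line in a list and build the DataFrame once at the end. This is only a sketch: it uses the two sample lines and the [30:55] slice from EDIT3 of the question instead of reading file.txt, and the start and stop parameters are hypothetical names introduced for illustration.

```python
import pandas as pd

# Sample lines from the question; a real run would iterate over open('file.txt').
sample_lines = [
    "AAA S S SSDAS ASDJAI A 234 33 43 234 2342999 2.31 22 33\n",
    "SSS S D W2UUQ Q231WQ A 222 11 23 123 1231299 2.31 22 11\n",
]

def coordinates(lines, start=30, stop=55):
    # Cut the wanted character range out of each line and split it on
    # whitespace, giving one list of six fields per line.
    rows = [line[start:stop].split() for line in lines]
    # Build a single DataFrame from all rows at once.
    df = pd.DataFrame(rows, columns=["A", "B", "C", "D", "E", "F"])
    # The fields are still strings, so convert each column to numbers.
    return df.apply(pd.to_numeric)

df = coordinates(sample_lines)
print(df)
```

Building the DataFrame once from a list of rows tends to be simpler and faster than creating a one-row DataFrame per line and merging them afterwards. If one-row DataFrames already exist, pd.concat with ignore_index=True is the usual way to stack them.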

How to merge multiple DataFrames from one file in Python
