I have the following dataset:
1 8 15 22
2 9 16 23
3 10 17 24
4 11 18 25
5 12 19 26
6 13 20 27
7 14 21 28
I would like to get the following result:
1
2
3
4
5
6
7
8
...
23
24
25
26
27
28
So I want to iterate over all the columns of the dataset and append each column to the first one.
import pandas as pd
# header=None: the file has no header row
df = pd.read_csv("data.csv", delimiter=";", header=None)
number_of_columns = len(df.columns)
print(number_of_columns)
for i in range(1, number_of_columns):
    df1 = df.iloc[:, i]
    # df2 is rebuilt from df on every pass, so earlier results are lost
    df2 = pd.concat([df, df1], ignore_index=True)
print(df2)
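Only the last column ends up concatenated in the final DataFrame. I understand that df2 is overwritten on every iteration of the for loop.
So how can I "save" df2 after each iteration so that every column gets concatenated?
Thanks a lot!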
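For reference, here is a minimal sketch of one way to keep the loop and still retain every column: collect the columns in a list and call pd.concat once after the loop. The DataFrame is built inline as a stand-in for data.csv (an assumption, since the file itself is not shown), and the names columns and result are illustrative.

```python
import pandas as pd

# Stand-in for the frame read from data.csv (assumed: 7 rows x 4 columns,
# with values 1..28 running down each column as in the question).
df = pd.DataFrame({i: list(range(7 * i + 1, 7 * i + 8)) for i in range(4)})

# Collect every column as a Series instead of overwriting one variable.
columns = []
for i in range(len(df.columns)):
    columns.append(df.iloc[:, i])

# Concatenate once, after the loop, and renumber the index 0..27.
result = pd.concat(columns, ignore_index=True)
print(result.tolist())
```

Concatenating once at the end also avoids copying the growing result on every iteration.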
Answer 0 (score: 2)
Row by row (reading across each row):
stack + tolist
df.stack().tolist()
[1, 8, 15, 22, 2, 9, 16, 23, 3, 10, 17, 24, 4, 11,
18, 25, 5, 12, 19, 26, 6, 13, 20, 27, 7, 14, 21, 28]
Column by column (reading down each column, which is the order asked for):
melt
df.melt().value.tolist()
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28]
unstack + tolist
df.unstack().tolist()
# outputs the same as above
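To make the difference in ordering easy to check, here is a small self-contained sketch that rebuilds the sample frame in memory and compares the three calls. Building the frame with NumPy is an assumption made for illustration; in the question it comes from data.csv.

```python
import numpy as np
import pandas as pd

# Rebuild the 7x4 sample frame: values 1..28 running down each column.
df = pd.DataFrame(np.arange(1, 29).reshape(4, 7).T)

row_order = df.stack().tolist()                # reads across each row
col_order_melt = df.melt()["value"].tolist()   # reads down each column
col_order_unstack = df.unstack().tolist()      # same order as melt

print(row_order[:8])       # [1, 8, 15, 22, 2, 9, 16, 23]
print(col_order_melt[:8])  # [1, 2, 3, 4, 5, 6, 7, 8]
```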
Answer 1 (score: 1)
You can also do it like this:
import numpy as np
txt = '''1 8 15 22
2 9 16 23
3 10 17 24
4 11 18 25
5 12 19 26
6 13 20 27
7 14 21 28'''
# Parse all whitespace-separated numbers into a flat 1-D array
arr1 = np.fromstring(txt, dtype=int, sep=' ')
# order='F' flattens column by column; use order='C' for row by row
arr1.reshape(7, -1).flatten(order='F')
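If you would rather not hard-code the row count in reshape, a possible variant is to let np.loadtxt infer the 2-D shape from the text. This is only a sketch; the names arr, col_wise and row_wise are illustrative.

```python
from io import StringIO

import numpy as np

txt = """1 8 15 22
2 9 16 23
3 10 17 24
4 11 18 25
5 12 19 26
6 13 20 27
7 14 21 28"""

# loadtxt splits on whitespace and infers the (rows, columns) shape.
arr = np.loadtxt(StringIO(txt), dtype=int)

col_wise = arr.flatten(order="F")  # down each column: 1, 2, 3, ...
row_wise = arr.flatten(order="C")  # across each row: 1, 8, 15, 22, ...
print(arr.shape)
print(col_wise.tolist())
```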
Answer 2 (score: 1)
You may not need a third-party library. You can use the csv and itertools modules from the standard library to return a list of numbers:
from io import StringIO
from itertools import chain
import csv
mystr = StringIO("""1 8 15 22
2 9 16 23
3 10 17 24
4 11 18 25
5 12 19 26
6 13 20 27
7 14 21 28""")
with mystr as fin:
    reader = csv.reader(fin, skipinitialspace=True, delimiter=' ')
    # zip(*reader) transposes rows into columns; chain flattens them
    res = list(map(int, chain.from_iterable(zip(*reader))))
print(res)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28]
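The key step is zip(*rows), which turns a list of rows into a sequence of columns. A stripped-down sketch using str.split instead of csv.reader (an assumption that the values are separated by plain whitespace) shows the intermediate result:

```python
from itertools import chain

text = """1 8 15 22
2 9 16 23
3 10 17 24
4 11 18 25
5 12 19 26
6 13 20 27
7 14 21 28"""

# One list of string tokens per input line.
rows = [line.split() for line in text.splitlines()]

# Transpose: each tuple now holds one column of the original data.
columns = list(zip(*rows))
print(columns[0])  # ('1', '2', '3', '4', '5', '6', '7')

# Flatten the columns in order and convert the tokens to integers.
flat = [int(token) for token in chain.from_iterable(columns)]
print(flat)
```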
Answer 3 (score: 1)
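Simply use NumPy's flatten() on the underlying array. Its default order is row by row, so pass order='F' to go column by column:
pd.Series(df.values.flatten(order='F'))
or
pd.Series(df.unstack().values)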