Formatting data so it stays monotonically increasing in Python

Date: 2018-08-16 13:19:51

Tags: python python-3.x pandas dataframe data-manipulation

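I have formatted my data the way I need it. However, the final data (the DataFrame) is not monotonically increasing, even though the input data is monotonically increasing in its first column (freq). The input file is Data_input_truncated.txt. My Python code is below: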

import pandas as pd

# create a DataFrame from the whitespace-delimited file, with columns freq and v
df = pd.read_csv('Data_input.txt', sep=r"\s+", names=['freq','v'])

# boolean mask that marks the header rows (v ends with ')'); these name the new columns
m = df['v'].str.endswith(')')
# new column g: keep the header values and forward-fill them over the data rows
df['g'] = df['v'].where(m).ffill()
# remember the original order of the new columns
cols = df['g'].unique()
# remove the header rows, where v and g hold the same value
df = df[df['v'] != df['g']]
# reshape by pivoting, then restore the original column order with reindex
df = df.pivot(index='freq', columns='g', values='v').rename_axis(None, axis=1).reindex(columns=cols).reset_index()

df.columns = [x.replace('(','').replace(')','').replace(',',':') for x in df.columns]
df.to_csv('target.txt', index=False, sep='\t')

The resulting target.txt is not monotonic. How can I make it monotonic before saving it to the file?

I am using Spyder 3.2.6 (Anaconda), which bundles Python 3.6.4 64-bit.

1 Answer:

Answer 0: (score: 0)

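The problem is that your freq values are str rather than float, so the pivot reorders the rows alphabetically instead of numerically. One option is to convert the freq column to float before pivoting; if writing the values in scientific notation matters, you can then set the float_format parameter in the to_csv call: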

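# ... same code as before ...
# remove the header rows, where v and g hold the same value
# (.copy() avoids a SettingWithCopyWarning on the assignment below)
df = df[df['v'] != df['g']].copy()
# convert freq to float so the pivot sorts it numerically
df['freq'] = df['freq'].astype(float)

# reshape by pivoting, then restore the original column order with reindex
df = df.pivot(index='freq', columns='g', values='v').rename_axis(None, axis=1).reindex(columns=cols).reset_index()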

df.columns = [x.replace('(','').replace(')','').replace(',',':') for x in df.columns]
df.to_csv('target.txt', index=False, sep='\t', float_format='%.17E' ) # add float_format='%.17E'

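Note that float_format='%.17E' means scientific notation with 17 digits after the decimal point, as in the input. If that precision does not matter to you, change it to whatever format you need.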
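To see what this format string produces, here is a minimal sketch on a made-up two-row frame (the `demo` frame and its values are assumptions for illustration, not the real input). It also shows that float_format only affects float columns, so a column that still holds strings is written unchanged.

```python
import io
import pandas as pd

# Hypothetical frame: 'freq' is float, 'v' is still a string column.
demo = pd.DataFrame({'freq': [0.0, 1.0e6], 'v': ['4.07E0', '4.43E0']})

# Write to an in-memory buffer instead of a file so the text can be inspected.
buf = io.StringIO()
demo.to_csv(buf, index=False, sep='\t', float_format='%.17E')
csv_lines = buf.getvalue().splitlines()
print(csv_lines)

# The same format string applied directly with the % operator.
formatted = '%.17E' % 1.0e6
print(formatted)
```

Fewer digits, for example '%.6E', would work the same way if the full 17 are not needed.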

Edit: I get this result in target.txt (first 5 rows and 3 columns):

freq    R1:1    R1:2
0.00000000000000000E+00 4.07868642871600962E0   3.12094533520232087E-13
1.00000000000000000E+06 4.43516799439728793E0   4.58503433913467795E-3
2.00000000000000000E+06 4.54224931058591253E0   1.21517855438593236E-2
3.00000000000000000E+06 4.63952376349496909E0   2.10017318391844077E-2
4.00000000000000000E+06 4.74002677709486608E0   3.05258806632440871E-2
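The ordering effect described in this answer can be reproduced on a small scale. The sketch below uses invented values (`freq_strings`, the single column name 'R(1,1)' and the placeholder v values are all assumptions) and pivots the same long-format frame twice, once with freq as strings and once with freq as floats.

```python
import pandas as pd

# Hypothetical frequencies, already in increasing numeric order.
freq_strings = ['0.0E+00', '1.0E+06', '2.0E+06', '1.0E+07']
long_df = pd.DataFrame({
    'freq': freq_strings,
    'g': ['R(1,1)'] * 4,
    'v': ['a', 'b', 'c', 'd'],
})

# Pivot with freq as str: the resulting index is sorted as text.
wide_str = long_df.pivot(index='freq', columns='g', values='v')
str_order = list(wide_str.index)

# Pivot with freq as float: the resulting index is sorted numerically.
float_df = long_df.assign(freq=long_df['freq'].astype(float))
wide_float = float_df.pivot(index='freq', columns='g', values='v')
float_order = list(wide_float.index)

print(str_order)    # '1.0E+07' lands before '2.0E+06'
print(float_order)  # numeric order is preserved
```

With strings, '1.0E+07' sorts ahead of '2.0E+06' because the comparison is made character by character, which is why the output stops being monotonic.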