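Let's say I have an Excel document with the following format. I'm reading said Excel document with pandas and plotting the data using matplotlib and numpy. Everything is great!
But now I need more constraints. I want to constrain my data so that I only keep rows with specific zenith and azimuth angles. More specifically: I only want zenith values between 30 and 90, and azimuth values between 30 and 330.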
Air Quality Data
Azimuth Zenith Ozone Amount
230 50 12
0 81 10
70 35 7
110 90 17
270 45 23
330 45 13
345 47 6
175 82 7
220 7 8
Here is an example of the kind of constraint I'm looking for.
Air Quality Data
Azimuth Zenith Ozone Amount
230 50 12
70 35 7
110 90 17
270 45 23
330 45 13
175 82 7
Below is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime
P_file = file1
out_file = file2
out_file2 = file3
data = pd.read_csv(file1,header=None,sep=' ')
df=pd.DataFrame(data=data)
df.to_csv(file2,sep=',',header = [...]) ## 19 headers. The four that matter for this question are 'DateTime', 'Zenith', 'Azimuth', and 'Ozone Amount'.
df=pd.read_csv(file2,header='infer')
mask = df[df['DateTime'].str.contains('20141201')] ## In this line I'm sorting for anything containing the locator for the given day.
mask.to_csv(file2) ##I'm now updating file 2 so that it only has the data I want sorted for.
data2 = pd.read_csv(file2,header='infer')
df2=pd.DataFrame(data=data2)
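def tojuliandate(date):
    ## changes a date of format %Y%m%dT%H%M%SZ to Julian date format %y%j
    return datetime.datetime.strptime(date, '%Y%m%dT%H%M%SZ').strftime('%y%j')
def timeofday(date):
    ## changes %Y%m%dT%H%M%SZ to %H%M%S for narrower views of the data
    return datetime.datetime.strptime(date, '%Y%m%dT%H%M%SZ').strftime('%H%M%S')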
df2['Time of Day'] = df2['DateTime'].apply(timeofday)
df2.to_csv(file2) ##adds a column for "timeofday" to the file
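So basically, up to this point, this is all the code that builds the CSV I want to filter. How would I go about keeping only the rows whose 'Zenith' and 'Azimuth' values meet the criteria I specified above?
I know I need if statements to do this. I tried something along those lines, but it didn't work, and I'm looking for some help: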
Answer 0: (score: 3)
You can use Series.between:
df[(df['Zenith'].between(30, 90)) & (df['Azimuth'].between(30, 330))]
This yields:
Azimuth Zenith Ozone Amount
0 230 50 12
2 70 35 7
3 110 90 17
4 270 45 23
5 330 45 13
7 175 82 7
Note that by default both the lower and upper bounds are inclusive (inclusive=True, spelled inclusive="both" in pandas 1.3 and later).
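To make the effect of the inclusive bounds concrete, here is a minimal sketch that rebuilds the question's sample table by hand (the DataFrame literal and the variable names `in_range`, `filtered`, and `strict` are illustrative assumptions, not part of the original code). It also assumes pandas 1.3 or later, where `inclusive` accepts the strings "both", "neither", "left", and "right".

```python
import pandas as pd

# Sample rows copied from the question's table.
df = pd.DataFrame({
    "Azimuth": [230, 0, 70, 110, 270, 330, 345, 175, 220],
    "Zenith": [50, 81, 35, 90, 45, 45, 47, 82, 7],
    "Ozone Amount": [12, 10, 7, 17, 23, 13, 6, 7, 8],
})

# Default bounds are inclusive, so Zenith == 90 and Azimuth == 330 are kept.
in_range = df["Zenith"].between(30, 90) & df["Azimuth"].between(30, 330)
filtered = df[in_range]
print(filtered)

# Exclusive bounds (assumes pandas >= 1.3) drop rows sitting exactly on a limit.
strict = df[
    df["Zenith"].between(30, 90, inclusive="neither")
    & df["Azimuth"].between(30, 330, inclusive="neither")
]
print(strict)
```

With inclusive bounds the rows at indices 3 (Zenith 90) and 5 (Azimuth 330) survive, which matches the output the question asks for; with exclusive bounds they are dropped.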
Answer 1: (score: 2)
You can write only those entries of the data frame that meet your boundary conditions to the file:
# replace the line df.to_csv(...) in your example with
df[((df['Zenith'] >= 30) & (df['Zenith'] <= 90)) &
   ((df['Azimuth'] >= 30) & (df['Azimuth'] <= 330))].to_csv('my_csv.csv')
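One detail worth spelling out: element-wise masks have to be combined with `&`, because Python's `and` asks each Series for a single truth value and pandas raises a ValueError. The sketch below shows both behaviours on the question's sample rows. The DataFrame literal and the names `zenith_ok`, `azimuth_ok`, `kept`, and `csv_text` are assumptions for illustration, and `to_csv` is called without a path so it returns the CSV text instead of writing a file.

```python
import pandas as pd

# Sample rows copied from the question's table.
df = pd.DataFrame({
    "Azimuth": [230, 0, 70, 110, 270, 330, 345, 175, 220],
    "Zenith": [50, 81, 35, 90, 45, 45, 47, 82, 7],
    "Ozone Amount": [12, 10, 7, 17, 23, 13, 6, 7, 8],
})

zenith_ok = (df["Zenith"] >= 30) & (df["Zenith"] <= 90)
azimuth_ok = (df["Azimuth"] >= 30) & (df["Azimuth"] <= 330)

# `and` needs one truth value per operand, which a Series cannot provide.
try:
    df[zenith_ok and azimuth_ok]
    and_failed = False
except ValueError:
    and_failed = True
print("`and` raised ValueError:", and_failed)

# `&` combines the two masks element by element.
kept = df[zenith_ok & azimuth_ok]

# With no path argument, to_csv returns the CSV as a string.
csv_text = kept.to_csv(index=False)
print(csv_text)
```

Passing a filename such as 'my_csv.csv' instead of leaving the path empty writes the same rows to disk.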
Answer 2: (score: 2)
df_new = df.query('30 <= Zenith <= 90 and 30 <= Azimuth <= 330')
print(df_new)
Azimuth Zenith OzoneAmount
0 230 50 12
2 70 35 7
3 110 90 17
4 270 45 23
5 330 45 13
7 175 82 7
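If the bounds live in variables rather than in the query string, `query` can reference them with an `@` prefix, and column names containing spaces (such as 'Ozone Amount') can be quoted with backticks in pandas 0.25 and later. The following is a sketch under those assumptions; the DataFrame literal, the variable names, and the ozone threshold of 10 are hypothetical and only there for illustration.

```python
import pandas as pd

# Sample rows copied from the question's table.
df = pd.DataFrame({
    "Azimuth": [230, 0, 70, 110, 270, 330, 345, 175, 220],
    "Zenith": [50, 81, 35, 90, 45, 45, 47, 82, 7],
    "Ozone Amount": [12, 10, 7, 17, 23, 13, 6, 7, 8],
})

# Bounds from the question, kept in variables so they are easy to change.
zenith_min, zenith_max = 30, 90
azimuth_min, azimuth_max = 30, 330

# `@name` refers to a Python variable in the calling scope.
df_new = df.query(
    "@zenith_min <= Zenith <= @zenith_max"
    " and @azimuth_min <= Azimuth <= @azimuth_max"
)
print(df_new)

# Backticks quote a column name that contains a space.
# The threshold 10 is a made-up value for illustration.
high_ozone = df_new.query("`Ozone Amount` > 10")
print(high_ozone)
```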
Answer 3: (score: 2)
df[(df["Zenith"]>=30) & (df["Zenith"]<=90) & (df["Azimuth"]>=30) & (df["Azimuth"]<=330)]
This is basically a duplicate of "Efficient way to apply multiple filters to pandas DataFrame or Series".
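When the number of range filters grows, one possible way to avoid a long chain of `&` is to build the masks in a loop and fold them together. This is only a sketch of that idea, not the approach from the linked question: the `bounds` dictionary, the DataFrame literal, and the variable names are assumptions made for illustration.

```python
from functools import reduce
import operator

import pandas as pd

# Sample rows copied from the question's table.
df = pd.DataFrame({
    "Azimuth": [230, 0, 70, 110, 270, 330, 345, 175, 220],
    "Zenith": [50, 81, 35, 90, 45, 45, 47, 82, 7],
    "Ozone Amount": [12, 10, 7, 17, 23, 13, 6, 7, 8],
})

# Hypothetical mapping of column name -> (lower bound, upper bound), both inclusive.
bounds = {"Zenith": (30, 90), "Azimuth": (30, 330)}

# One boolean mask per column, then AND them all together element-wise.
masks = [df[column].between(low, high) for column, (low, high) in bounds.items()]
combined = reduce(operator.and_, masks)

result = df[combined]
print(result)
```

Adding another constraint then only means adding an entry to `bounds`.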