Apologies if this question is too basic.
I am trying to write a for loop that enters a 1 or a 0 into a pandas DataFrame depending on a condition.
import pandas as pd
def checkHour6(time):
    val = 0
    if time == 6:
        val = 1
    return val
def checkHour7(time):
    val = 0
    if time == 7:
        val = 1
    return val
def checkHour8(time):
    val = 0
    if time == 8:
        val = 1
    return val
def checkHour9(time):
    val = 0
    if time == 9:
        val = 1
    return val
def checkHour10(time):
    val = 0
    if time == 10:
        val = 1
    return val
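The for loop I am attempting counts from 0 to 23, and I am trying to build the pandas DataFrame as the loop runs, entering a 1 or 0 as appropriate, but I am missing something basic because the final df is an empty DataFrame.
Creating the empty df: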
df = pd.DataFrame({'hour_6':[], 'hour_7':[], 'hour_8':[], 'hour_9':[], 'hour_10':[]})
The for loop:
hour = -1
for i in range(24):
    stuff = []
    hour = hour + 1
    stuff.append(checkHour6(hour))
    stuff.append(checkHour7(hour))
    stuff.append(checkHour8(hour))
    stuff.append(checkHour9(hour))
    stuff.append(checkHour10(hour))
    df.append(stuff)
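Answer 0 (score: 1)
I suggest the following: have the checkHour() function take the hour as a parameter. Also, according to the pandas.DataFrame.append() documentation, the other parameter must be a DataFrame or a Series/dict-like object, or a list of these, so a plain list will not work. Note too that append() returns a new DataFrame rather than modifying df in place, so its result has to be assigned back to df. The code is as follows: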
def checkHour(time, hour):
    val = 0
    if time == hour:
        val = 1
    return val
df = pd.DataFrame({'hour_6':[], 'hour_7':[], 'hour_8':[], 'hour_9':[], 'hour_10':[]})
hour = -1
for i in range(24):
    stuff = {}
    hour = hour + 1
    stuff['hour_6'] = checkHour(hour, 6)
    stuff['hour_7'] = checkHour(hour, 7)
    stuff['hour_8'] = checkHour(hour, 8)
    stuff['hour_9'] = checkHour(hour, 9)
    stuff['hour_10'] = checkHour(hour, 10)
    df = df.append(stuff, ignore_index=True)
The result is as follows:
>>> print(df)
hour_6 hour_7 hour_8 hour_9 hour_10
0 0.0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0
4 0.0 0.0 0.0 0.0 0.0
5 0.0 0.0 0.0 0.0 0.0
6 1.0 0.0 0.0 0.0 0.0
7 0.0 1.0 0.0 0.0 0.0
8 0.0 0.0 1.0 0.0 0.0
9 0.0 0.0 0.0 1.0 0.0
10 0.0 0.0 0.0 0.0 1.0
11 0.0 0.0 0.0 0.0 0.0
12 0.0 0.0 0.0 0.0 0.0
13 0.0 0.0 0.0 0.0 0.0
14 0.0 0.0 0.0 0.0 0.0
15 0.0 0.0 0.0 0.0 0.0
16 0.0 0.0 0.0 0.0 0.0
17 0.0 0.0 0.0 0.0 0.0
18 0.0 0.0 0.0 0.0 0.0
19 0.0 0.0 0.0 0.0 0.0
20 0.0 0.0 0.0 0.0 0.0
21 0.0 0.0 0.0 0.0 0.0
22 0.0 0.0 0.0 0.0 0.0
23 0.0 0.0 0.0 0.0 0.0
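DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above will not run on current versions. A rough equivalent is sketched below, assuming pd.concat is an acceptable substitute; the name check_hour and the dict comprehension are illustrative and not part of the original answer.

```python
import pandas as pd

def check_hour(time, hour):
    """Return 1 when time equals hour, else 0 (hypothetical helper)."""
    return 1 if time == hour else 0

hours_wanted = range(6, 11)

# Build one single-row DataFrame per hour of the day ...
rows = []
for hour in range(24):
    row = {f"hour_{h}": check_hour(hour, h) for h in hours_wanted}
    rows.append(pd.DataFrame([row]))

# ... and concatenate them once, instead of growing df inside the loop.
df = pd.concat(rows, ignore_index=True)
print(df.head(8))
```

Concatenating once at the end keeps the integer dtype, whereas appending to an empty float-typed frame is what produced the 0.0 / 1.0 values in the output above.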
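Edit:
As @Parfait pointed out, using pandas.DataFrame.append() inside a for loop is a bad idea because it leads to quadratic copying. To avoid this, you can build a list of dictionaries (the future DataFrame rows) and then call pd.DataFrame() to create a DataFrame from it. The code is as follows: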
def checkHour(time, hour):
    val = 0
    if time == hour:
        val = 1
    return val
data = []
hour = -1
for i in range(24):
    stuff = {}
    hour = hour + 1
    stuff['hour_6'] = checkHour(hour, 6)
    stuff['hour_7'] = checkHour(hour, 7)
    stuff['hour_8'] = checkHour(hour, 8)
    stuff['hour_9'] = checkHour(hour, 9)
    stuff['hour_10'] = checkHour(hour, 10)
    data.append(stuff)
df = pd.DataFrame(data)
The result is as follows:
>>> print(df)
hour_6 hour_7 hour_8 hour_9 hour_10
0 0 0 0 0 0
1 0 0 0 0 0
2 0 0 0 0 0
3 0 0 0 0 0
4 0 0 0 0 0
5 0 0 0 0 0
6 1 0 0 0 0
7 0 1 0 0 0
8 0 0 1 0 0
9 0 0 0 1 0
10 0 0 0 0 1
11 0 0 0 0 0
12 0 0 0 0 0
13 0 0 0 0 0
14 0 0 0 0 0
15 0 0 0 0 0
16 0 0 0 0 0
17 0 0 0 0 0
18 0 0 0 0 0
19 0 0 0 0 0
20 0 0 0 0 0
21 0 0 0 0 0
22 0 0 0 0 0
23 0 0 0 0 0
Answer 1 (score: 1)
Another very simple way to create the DataFrame is to use the pandas.get_dummies() function.
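A minimal sketch of how get_dummies() could be applied here, not necessarily the answerer's exact code. It assumes the hours are held in a Series named hours (a hypothetical name) and that only the hour_6 to hour_10 columns are wanted.

```python
import pandas as pd

# One entry per hour of the day (assumed input; the name is illustrative).
hours = pd.Series(range(24))

# One indicator column per distinct hour: hour_0 ... hour_23.
# dtype=int requests 0/1 integers instead of the default indicator dtype.
dummies = pd.get_dummies(hours, prefix="hour", dtype=int)

# Keep only the five columns from the question.
df = dummies[[f"hour_{h}" for h in range(6, 11)]]
print(df.iloc[5:11])
```

Selecting the columns by name afterwards avoids any helper functions or explicit loops over rows.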
Answer 2 (score: 0)
A quick look at the empty-DataFrame problem:
hour = -1
stuff = []
for i in range(24):
    hour = hour + 1
    stuff.append(checkHour6(hour))
    stuff.append(checkHour7(hour))
    stuff.append(checkHour8(hour))
    stuff.append(checkHour9(hour))
    stuff.append(checkHour10(hour))
    df.append(stuff)
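That said, there is probably a better solution for the process as a whole.
Answer 3 (score: 0)
Start with a data column (here, the hours); all the other comparisons can then be computed from it.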
import pandas as pd
df = pd.DataFrame(range(24), columns=['data'])
for time in range(6,11):
    df[f'hour_{time}'] = df['data']%24==time
df = df.astype(int)
If desired, the data column can be dropped afterwards.
data hour_6 hour_7 hour_8 hour_9 hour_10
0 0 0 0 0 0 0
1 1 0 0 0 0 0
2 2 0 0 0 0 0
3 3 0 0 0 0 0
4 4 0 0 0 0 0
5 5 0 0 0 0 0
6 6 1 0 0 0 0
7 7 0 1 0 0 0
8 8 0 0 1 0 0
9 9 0 0 0 1 0
10 10 0 0 0 0 1
11 11 0 0 0 0 0
12 12 0 0 0 0 0
13 13 0 0 0 0 0
14 14 0 0 0 0 0
15 15 0 0 0 0 0
16 16 0 0 0 0 0
17 17 0 0 0 0 0
18 18 0 0 0 0 0
19 19 0 0 0 0 0
20 20 0 0 0 0 0
21 21 0 0 0 0 0
22 22 0 0 0 0 0
23 23 0 0 0 0 0
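Dropping the helper column mentioned above could look like the following sketch. It rebuilds the same frame so that it runs on its own, and it converts each comparison to int directly, which is an assumed variation on the answer's final astype(int) call.

```python
import pandas as pd

df = pd.DataFrame(range(24), columns=["data"])

# Each comparison yields a boolean Series; astype(int) turns it into 0/1.
for time in range(6, 11):
    df[f"hour_{time}"] = (df["data"] % 24 == time).astype(int)

# Remove the helper column once the indicator columns exist.
df = df.drop(columns="data")
print(df.iloc[5:11])
```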
Answer 4 (score: 0)
Because the object models in pandas and numpy differ from general Python, consider avoiding building objects in a loop the way you would with simple iterables such as list or dict.
In fact, you can use DataFrame.pivot to handle your setup, a column of 24 consecutive integers, without any functions or loops! You can easily return more hour columns (i.e., hour_0 to hour_23), or reindex to get the five columns you need:
Data
df = (pd.DataFrame({'hour': ['hour' for _ in range(24)]})
        .assign(hour = lambda x: x['hour'] + '_' + pd.Series(range(24)).astype('str'),
                num = 1)
     )
df.head(5)
#      hour  num
# 0  hour_0    1
# 1  hour_1    1
# 2  hour_2    1
# 3  hour_3    1
# 4  hour_4    1
Pivot
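A sketch of the pivot-and-reindex step described in this answer, under stated assumptions: the explicit time column used as the pivot index and the names long_df, wide, wanted and result are illustrative and do not come from the original answer.

```python
import pandas as pd

# Long-format data: one row per hour of the day, labelled hour_0 ... hour_23.
long_df = pd.DataFrame({
    "time": range(24),
    "hour": [f"hour_{h}" for h in range(24)],
    "num": 1,
})

# Pivot to wide format: one column per label. Cells with no matching row
# come back as NaN, so fill them with 0 and convert to integers.
wide = (
    long_df.pivot(index="time", columns="hour", values="num")
    .fillna(0)
    .astype(int)
)

# Reindex the columns to keep only the five hours from the question.
wanted = [f"hour_{h}" for h in range(6, 11)]
result = wide.reindex(columns=wanted)
print(result.iloc[5:11])
```

The pivoted columns come out in lexicographic order (hour_0, hour_1, hour_10, ...), so reindexing with an explicit list also fixes the column order.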