I have a 40MB DataFrame 'dfScore' that I am writing to .xlsx. The code is as follows:
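writer = pandas.ExcelWriter('test.xlsx', engine='xlsxwriter')
dfScore.to_excel(writer, sheet_name='Sheet1')
writer.save()
The call dfScore.to_excel takes almost an hour, and writer.save() takes another hour. Is this normal? Is there a good way to get this under 10 minutes?
I have already searched Stack Overflow, but none of the suggestions I found solved my problem.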
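Answer 0 (score: 1)
Why not save it as .csv? I have worked with heavier DataFrames on my personal laptop and ran into the same problem when writing to xlsx.
your_dataframe.to_csv('my_file.csv', encoding='utf-8', columns=list_of_dataframe_columns)
You can then simply convert the file to .xlsx using MS Excel or an online converter.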
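Answer 1 (score: 0)
"The call dfScore.to_excel takes almost an hour, and writer.save() takes another hour. Is this normal?"
That sounds too high. I ran an XlsxWriter test writing 1,000,000 rows x 5 columns and it took about 100 seconds. The time will vary with the CPU and memory of the test machine, but 1 hour (3,600 seconds) is 36 times slower, which does not seem right.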
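One way to check whether your own machine is in the same range is to time a write on a small sample and scale the result up. The sketch below is only an illustration: the helper names time_write and extrapolate are made up here, the in-memory CSV writer is a stand-in for the real Excel write, and linear scaling is an assumption that gives a rough estimate at best.

```python
import csv
import io
import time


def time_write(write_fn, rows):
    """Return the wall-clock seconds taken by write_fn(rows)."""
    start = time.perf_counter()
    write_fn(rows)
    return time.perf_counter() - start


def extrapolate(sample_seconds, sample_rows, total_rows):
    """Scale a sample timing linearly to total_rows (rough estimate only)."""
    return sample_seconds * total_rows / sample_rows


def write_csv_in_memory(rows):
    # Stand-in writer (assumption): swap in the real Excel write to benchmark it.
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)


# 10,000 rows x 5 columns of sample data
sample = [(i, i * 2, i * 3, i * 4, i * 5) for i in range(10_000)]
elapsed = time_write(write_csv_in_memory, sample)
estimate = extrapolate(elapsed, len(sample), 1_000_000)
print(f"sample: {elapsed:.4f}s, estimated for 1,000,000 rows: {estimate:.1f}s")
```

If the estimate for the real writer comes out far above the roughly 100 seconds mentioned above, the slowdown is more likely in the data or the environment than in the row count alone.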
Note that Excel and XlsxWriter support only 1,048,576 rows per worksheet, so you are effectively throwing away 3/4 of your data and wasting time doing it.
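If the data really does exceed that limit, one option is to split it across several worksheets. The following is a minimal sketch, not a tested recipe for this dataset: sheet_ranges is a hypothetical helper, and it assumes one header row per sheet, which leaves 1,048,575 rows for data.

```python
EXCEL_MAX_ROWS = 1_048_576  # per-worksheet row limit in the .xlsx format


def sheet_ranges(n_rows, header_rows=1, max_rows=EXCEL_MAX_ROWS):
    """Return (start, stop) row ranges so each chunk plus its header fits on one sheet."""
    per_sheet = max_rows - header_rows
    if per_sheet <= 0:
        raise ValueError("header_rows must be smaller than max_rows")
    return [
        (start, min(start + per_sheet, n_rows))
        for start in range(0, n_rows, per_sheet)
    ]


# Example: 4,000,000 rows need four sheets when each sheet has a header row.
ranges = sheet_ranges(4_000_000)
print(len(ranges), ranges[0], ranges[-1])

# With pandas, each range could then be written to its own sheet, e.g.:
#   for i, (start, stop) in enumerate(ranges, start=1):
#       dfScore.iloc[start:stop].to_excel(writer, sheet_name=f"Sheet{i}")
```

Splitting does not make the write faster by itself; it only avoids spending time on rows that would otherwise not fit on the sheet.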
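"Is there a good way to get this under 10 minutes?"
For pure XlsxWriter programs, pypy gives a good speedup. For example, rerunning my 1,000,000 rows x 5 columns test case with pypy brought the time down from 99.15 seconds to 16.49 seconds. I don't know whether pandas works with pypy.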