Predicting the occurrence of frequent values

Time: 2018-05-31 02:27:24

Tags: python python-3.x

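Hello, I've been given a set of sales data and asked to predict, based on the current values or the values I sell most often, how many similar sales can be made.

I was going to use Excel and Tableau, but I'd like to use the power of Python to get the task done.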

<html>
  <head></head>
  <body>
    <h2>SALES DATA</h2>
<table width="211">
<tbody>
<tr>
<td>SALE 1</td>
<td>12</td>
</tr>
<tr>
<td>SALE 2</td>
<td>14</td>
</tr>
<tr>
<td>SALE 3</td>
<td>2</td>
</tr>
<tr>
<td>SALE 4</td>
<td>18</td>
</tr>
<tr>
<td>SALE 5</td>
<td>23</td>
</tr>
<tr>
<td>SALE 6</td>
<td>19</td>
</tr>
<tr>
<td>SALE 7</td>
<td>25</td>
</tr>
<tr>
<td>SALE 8</td>
<td>17</td>
</tr>
<tr>
<td>SALE 9</td>
<td>9</td>
</tr>
<tr>
<td>SALE 10</td>
<td>26</td>
</tr>
<tr>
<td>SALE 11</td>
<td>16</td>
</tr>
<tr>
<td>SALE 12</td>
<td>34</td>
</tr>
<tr>
<td>SALE 13</td>
<td>3</td>
</tr>
<tr>
<td>SALE 14</td>
<td>31</td>
</tr>
<tr>
<td>SALE 15</td>
<td>25</td>
</tr>
<tr>
<td>SALE 16</td>
<td>9</td>
</tr>
<tr>
<td>SALE 17</td>
<td>6</td>
</tr>
<tr>
<td>SALE 18</td>
<td>24</td>
</tr>
<tr>
<td>SALE 19</td>
<td>15</td>
</tr>
<tr>
<td>SALE 20</td>
<td>14</td>
</tr>
<tr>
<td>SALE 21</td>
<td>36</td>
</tr>
<tr>
<td>SALE 22</td>
<td>32</td>
</tr>
<tr>
<td>SALE 23</td>
<td>35</td>
</tr>
<tr>
<td>SALE 24</td>
<td>14</td>
</tr>
<tr>
<td>SALE 25</td>
<td>19</td>
</tr>
<tr>
<td>SALE 26</td>
<td>20</td>
</tr>
<tr>
<td>SALE 27</td>
<td>6</td>
</tr>
<tr>
<td>SALE 28</td>
<td>1</td>
</tr>
<tr>
<td>SALE 29</td>
<td>24</td>
</tr>
<tr>
<td>SALE 30</td>
<td>3</td>
</tr>
<tr>
<td>SALE 31</td>
<td>16</td>
</tr>
<tr>
<td>SALE 32</td>
<td>13</td>
</tr>
<tr>
<td>SALE 33</td>
<td>14</td>
</tr>
<tr>
<td>SALE 34</td>
<td>35</td>
</tr>
<tr>
<td>SALE 35</td>
<td>34</td>
</tr>
<tr>
<td>SALE 36</td>
<td>15</td>
</tr>
<tr>
<td>SALE 37</td>
<td>24</td>
</tr>
<tr>
<td>SALE 38</td>
<td>19</td>
</tr>
<tr>
<td>SALE 39</td>
<td>9</td>
</tr>
<tr>
<td>SALE 40</td>
<td>12</td>
</tr>
<tr>
<td>SALE 41</td>
<td>9</td>
</tr>
<tr>
<td>SALE 42</td>
<td>5</td>
</tr>
<tr>
<td>SALE 43</td>
<td>32</td>
</tr>
<tr>
<td>SALE 44</td>
<td>13</td>
</tr>
<tr>
<td>SALE 45</td>
<td>33</td>
</tr>
<tr>
<td>SALE 46</td>
<td>4</td>
</tr>
<tr>
<td>SALE 47</td>
<td>25</td>
</tr>
<tr>
<td>SALE 48</td>
<td>7</td>
</tr>
<tr>
<td>SALE 49</td>
<td>16</td>
</tr>
<tr>
<td>SALE 50</td>
<td>32</td>
</tr>
<tr>
<td>SALE 51</td>
<td>33</td>
</tr>
<tr>
<td>SALE 52</td>
<td>14</td>
</tr>
<tr>
<td>SALE 53</td>
<td>31</td>
</tr>
<tr>
<td>SALE 54</td>
<td>18</td>
</tr>
<tr>
<td>SALE 55</td>
<td>17</td>
</tr>
<tr>
<td>SALE 56</td>
<td>5</td>
</tr>
<tr>
<td>SALE 57</td>
<td>25</td>
</tr>
<tr>
<td>SALE 58</td>
<td>22</td>
</tr>
<tr>
<td>SALE 59</td>
<td>8</td>
</tr>
<tr>
<td>SALE 60</td>
<td>18</td>
</tr>
<tr>
<td>SALE 61</td>
<td>23</td>
</tr>
<tr>
<td>SALE 62</td>
<td>10</td>
</tr>
</tbody>
</table>
  </body>
</html>

Can you help?

1 answer:

Answer 0 (score: 1)

This may help:

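from xml.etree.ElementTree import ElementTree
from collections import defaultdict

# Parse the saved page; this works only because the HTML above is well-formed XML.
mydoc = ElementTree(file='Stack.html')
sales = []
values = []

# Each table row holds the sale label in its first cell and the value in its second.
for row in mydoc.findall('.//tr'):
    sales.append(row.find('.//td[1]').text)
    values.append(row.find('.//td[2]').text)

# Group the sale labels by value, so frequently occurring values get longer lists.
mydict = defaultdict(list)
for value, sale in zip(values, sales):
    mydict[value].append(sale)
print(mydict.items())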

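You can iterate over the grouped values with itervalues (preferred; in Python 3 this is `mydict.values()`). To get the values from a website, you could use BeautifulSoup or NLTK.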
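Once the values are extracted, one simple way to turn "frequent values" into a rough forecast is to count how often each value occurs and scale that share to a number of future sales. The sketch below is only an illustration under a strong assumption: that future sales follow the same empirical distribution as the 62 sales in the question. The function name `frequency_forecast` and the `future_sales` parameter are made up for this example, and the list is typed in from the table rather than parsed.

```python
from collections import Counter

# Sale values copied from the table in the question (SALE 1 .. SALE 62).
values = [
    12, 14, 2, 18, 23, 19, 25, 17, 9, 26,
    16, 34, 3, 31, 25, 9, 6, 24, 15, 14,
    36, 32, 35, 14, 19, 20, 6, 1, 24, 3,
    16, 13, 14, 35, 34, 15, 24, 19, 9, 12,
    9, 5, 32, 13, 33, 4, 25, 7, 16, 32,
    33, 14, 31, 18, 17, 5, 25, 22, 8, 18,
    23, 10,
]


def frequency_forecast(values, future_sales):
    """Map each value to (count, share, expected count in future_sales).

    Hypothetical helper: assumes future sales are distributed like the
    observed ones, which is a naive baseline rather than a real model.
    """
    counts = Counter(values)
    total = len(values)
    # most_common() orders by count, so the most frequent value comes first.
    return {
        value: (count, count / total, future_sales * count / total)
        for value, count in counts.most_common()
    }


forecast = frequency_forecast(values, future_sales=100)

# Show the three most frequent values and their naive expected counts.
for value, (count, share, expected) in list(forecast.items())[:3]:
    print(f"value {value}: seen {count}x, share {share:.1%}, "
          f"expected about {expected:.1f} of the next 100 sales")
```

With only 62 observations most values occur just a few times, so these expected counts are very uncertain and should be read as a baseline, not as a prediction to rely on.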