Category;currency;sellerRating;Duration;endDay;ClosePrice;OpenPrice;Competitive?
Music/Movie/Game;US;3249;5;Mon;0,01;0,01;No
Music/Movie/Game;US;3249;5;Mon;0,01;0,01;No
Music/Movie/Game;US;3249;5;Mon;0,01;0,01;No
Music/Movie/Game;US;3249;5;Mon;0,01;0,01;No
Music/Movie/Game;US;3249;5;Mon;0,01;0,01;No
Music/Movie/Game;US;3249;5;Mon;0,01;0,01;No
Music/Movie/Game;US;3249;5;Mon;0,01;0,01;No
Automotive;US;3115;7;Tue;0,01;0,01;No
Automotive;US;3115;7;Tue;0,01;0,01;No
Automotive;US;3115;7;Tue;0,01;0,01;Yes
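The actual file contains no blank lines; otherwise an error would be raised. I want to calculate the standard deviation for each category.
I tried using statistics.stdev(), but it does not work. Can anyone help me? If you have an answer, please explain it so that I can learn from it.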
from csv import DictReader
from collections import defaultdict
from statistics import median
from locale import setlocale
from locale import LC_ALL
from locale import atof
setlocale(LC_ALL, 'Dutch_Netherlands.1252')
median_names = 'sellerRating', 'Duration', 'ClosePrice', 'OpenPrice'
print ("Mediaan : ")
data = defaultdict(list)
with open('bijlage.txt') as f:
    csvreader = DictReader(f, delimiter=';')
    for dic in csvreader:
        for header, value in dic.items():
            data[header].append(value)
for median_name in median_names:
    med = median(map(atof, data[median_name]))
    print('{:<13} {:>10}'.format(median_name, med))
from collections import defaultdict
import csv
import locale
import statistics
from pprint import pprint, pformat
locale.setlocale(locale.LC_ALL, 'Dutch_Netherlands.1252')
avg_names = 'sellerRating', 'Duration', 'ClosePrice', 'OpenPrice'
averages = {avg_name: 0 for avg_name in avg_names}
seller_ratings = defaultdict(list)
num_values = 0
with open('bijlage.txt', newline='') as bestand:
    csvreader = csv.DictReader(bestand, delimiter=';')
    for row in csvreader:
        num_values += 1
        for avg_name in avg_names:
            averages[avg_name] += locale.atof(row[avg_name])
        seller_ratings[row['Category']].append(locale.atof(row['sellerRating']))
for avg_name, total in averages.items():
    averages[avg_name] = total / num_values
print()
print('Averages:')
for avg_name in avg_names:
    rounded = locale.format_string('%.2f', round(averages[avg_name], 2),
                                   grouping=True)
    print(' {:<13} {:>10}'.format(avg_name, rounded))
modes = {}
for category, values in seller_ratings.items():
    try:
        modes[category] = statistics.mode(values)
    except statistics.StatisticsError:
        modes[category] = None # No unique mode.
print()
print('Modes:')
for category, mode in modes.items():
    if mode is None:
        print(' {:<20} {:>10}'.format(category, '-'))
    else:
        rounded = locale.format_string('%.2f', round(mode, 2), grouping=True)
        print(' {:<20} {:>10}'.format(category, rounded))
Answer 0 (score: 2)
An answer to one of your previous questions already described how to get the average, the median and similar statistics: https://stackoverflow.com/a/54021108/8181134
Using the same approach, but with the .std() method, you can get the standard deviation:
import pandas as pd
df = pd.read_csv('bijlage.csv', delimiter=';', decimal=',') # 'bijlage.txt' in your case
sellerRating_std = df['sellerRating'].std()
print('Seller rating standard deviation: {}'.format(sellerRating_std))
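Since the question asks for the standard deviation per category, the same idea can probably be extended with groupby. The following is only a sketch: the inline sample data is hypothetical (it is not the asker's file) and is read from a string so the snippet runs on its own. Note that pandas' .std() computes the sample standard deviation (ddof=1) by default.

```python
import io

import pandas as pd

# Hypothetical sample in the same layout as the question's file:
# ';' as delimiter and ',' as decimal separator.
SAMPLE = """Category;sellerRating;ClosePrice
A;10;1,50
A;20;2,50
A;30;3,50
B;5;0,01
B;7;0,03
"""

df = pd.read_csv(io.StringIO(SAMPLE), sep=';', decimal=',')

# One standard deviation per category for a single column.
std_by_category = df.groupby('Category')['sellerRating'].std()
print(std_by_category)

# Several numeric columns at once.
std_table = df.groupby('Category')[['sellerRating', 'ClosePrice']].std()
print(std_table)
```

With the real file, replacing io.StringIO(SAMPLE) by the file name should be the only change needed.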
Answer 1 (score: 0)
First of all, note that median_names = 'sellerRating', 'Duration', 'ClosePrice', 'OpenPrice'
already creates a tuple: the commas make the tuple, not the parentheses.
Writing the parentheses explicitly makes it clearer that you are assigning a tuple to iterate over later, like this:
median_names = ('sellerRating', 'Duration', 'ClosePrice', 'OpenPrice')
Having done that, you can compute the standard deviation with statistics.stdev just like you computed the median:
from csv import DictReader
from collections import defaultdict
from statistics import stdev
from locale import setlocale
from locale import LC_ALL
from locale import atof
setlocale(LC_ALL, 'Dutch_Netherlands.1252')
stddev_names = ('sellerRating', 'Duration', 'ClosePrice', 'OpenPrice')
print ("std dev : ")
# Collect every column as a list of raw strings, keyed by header.
data = defaultdict(list)
with open('bijlage.txt') as f:
    csvreader = DictReader(f, delimiter=';')
    for dic in csvreader:
        for header, value in dic.items():
            data[header].append(value)
# Convert each column with the locale-aware atof and print its sample standard deviation.
for name in stddev_names:
    stddev_val = stdev(map(atof, data[name]))
    print('{:<13} {:>10}'.format(name, stddev_val))
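The code above gives one standard deviation per column over the whole file. For the per-category figure the question asks about, the values can be grouped by Category first. This is a minimal sketch under a few assumptions: the sample rows are hypothetical, the helper names parse_number and stdev_by_category are made up for illustration, and the decimal comma is converted with str.replace instead of locale.atof so the snippet does not depend on an installed locale (it would not handle thousands separators).

```python
import csv
import io
from collections import defaultdict
from statistics import StatisticsError, stdev

# Hypothetical sample in the same ';'-delimited, decimal-comma layout.
SAMPLE = """Category;sellerRating;ClosePrice
Automotive;3115;0,01
Automotive;3200;0,05
Automotive;3300;0,09
Music/Movie/Game;3249;0,01
Music/Movie/Game;3251;0,03
Books;100;1,00
"""


def parse_number(text):
    """Convert a decimal-comma string such as '0,01' to a float."""
    return float(text.replace(',', '.'))


def stdev_by_category(file_obj, column):
    """Return {category: sample standard deviation of column, or None}."""
    groups = defaultdict(list)
    for row in csv.DictReader(file_obj, delimiter=';'):
        groups[row['Category']].append(parse_number(row[column]))
    result = {}
    for category, values in groups.items():
        try:
            result[category] = stdev(values)
        except StatisticsError:
            # stdev needs at least two data points.
            result[category] = None
    return result


close_price_std = stdev_by_category(io.StringIO(SAMPLE), 'ClosePrice')
for category, value in close_price_std.items():
    shown = '-' if value is None else '{:.4f}'.format(value)
    print('{:<20} {:>10}'.format(category, shown))
```

Passing the handle from open('bijlage.txt', newline='') instead of the io.StringIO object should work the same way for the real file.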
Answer 2 (score: 0)
To use the statistics module, keep your first approach (the one you used for the median):
setlocale(LC_ALL, 'Dutch_Netherlands.1252')
median_names = 'sellerRating', 'Duration', 'ClosePrice', 'OpenPrice'
print ("Mediaan : ")
data = defaultdict(list)
with open('bijlage.txt') as f:
    csvreader = DictReader(f, delimiter=';')
    for dic in csvreader:
        for header, value in dic.items():
            data[header].append(value)
for median_name in median_names:
    med = median(map(atof, data[median_name]))
    print('{:<13} {:>10}'.format(median_name, med))
This part is unchanged. You only need to compute stdev right after it, because you can reuse the same data dictionary of lists:
from statistics import stdev
print("\nStd Dev (sample)")
for median_name in median_names:
    std = stdev(map(atof, data[median_name]))
    print('{:<13} {:>10}'.format(median_name, std))
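The "(sample)" label matters: statistics.stdev divides by n - 1, while statistics.pstdev treats the data as the whole population and divides by n. A small sketch with made-up numbers (not taken from the question's file) shows the difference:

```python
from statistics import pstdev, stdev

# Hypothetical values: mean is 5, squared deviations sum to 32.
ratings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample_sd = stdev(ratings)       # sqrt(32 / 7), divides by n - 1
population_sd = pstdev(ratings)  # sqrt(32 / 8), divides by n

print('sample    : {:.4f}'.format(sample_sd))
print('population: {:.4f}'.format(population_sd))
```

Which one is appropriate depends on whether the file is regarded as a sample of auctions or as the complete set of interest.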