How to calculate confidence intervals and plot them on a bar chart

Time: 2019-08-01 20:58:32

Tags: matlab plot bar-chart confidence-interval

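How can I plot a bar chart for

data = 1x10 cell, where each cell holds a matrix of a different size, e.g. 3x100, 3x40, 66x2, etc.?

My goal is a bar chart with 10 groups of bars, three bars per group, one for each of the values. Each bar should show the median of the values, and I also want to calculate the confidence interval and show it on the bar.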

picture of how I want it to look

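This example has no grouped bars; it is only meant to show how I want the confidence intervals displayed. On this site I found an example whose solution uses the following line: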

e1 = errorbar(mean(data), ci95);

However, this fails for me because ci95 cannot be found.

So, is there another efficient way to do this without installing or downloading anything extra?

2 Answers:

Answer 0: (score: 1)

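I am not sure what your data looks like. In your question you say that the cell elements hold data of different dimensions, such as "3x100, 3x40, 66x2", so I assume that the data may be arranged either in columns or in rows, and that not every cell needs three bars.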

Since you did not provide a small sample of data to test with, I generate some artificial data:

data = cell(1,10);

% Random length of the data
l = randi(500, 10, 1) + 50;  

% Random "width" of the data, with 3 more likely
w = randi(4, 10, 1);
w(w==4) = 3;
% random "direction" of the data
d = randi(2, 10, 1);

% sigma of the data (in fraction of mean)
sigma = rand(10,1) / 3;

% means of the data
dmean = randi(150,10,1);
dsigma = dmean.*sigma;

for c = 1 : 10
    if d(c) == 1
        data{c} = randn(l(c), w(c)) .* dsigma(c) + dmean(c);
    else
        data{c} = randn(w(c), l(c)) .* dsigma(c) + dmean(c);
    end
end

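The next point is this part of your question: "On the bar, I want it to be shown the median of the values, and I want to calculate the confidence interval and show it additionally."

Are you really sure you want to plot the median? The median of a data set is not connected to its variance, so error bars of this kind are not needed for it. I guess you want to show the mean. If you really want to show the median, a box plot might be a better alternative.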
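If the median is still what you want on the bars, one possible way to attach an interval to it without any toolbox is a percentile bootstrap. The following is only a sketch of that swapped-in technique, not part of the code in this answer: the sample xBoot, the resample count nBoot and all other variable names are assumptions made for illustration.

```matlab
% Hypothetical sample; replace xBoot with one column of your own data
rng(0);                               % fixed seed so the sketch is reproducible
xBoot = randn(200, 1) * 10 + 50;

nBoot = 2000;                         % number of bootstrap resamples (assumed value)
nObs = numel(xBoot);
bootMed = zeros(nBoot, 1);
for k = 1:nBoot
    idx = randi(nObs, nObs, 1);       % draw indices with replacement
    bootMed(k) = median(xBoot(idx));  % median of this resample
end

% Percentile interval read off the sorted bootstrap medians
bootMed = sort(bootMed);
ciLow  = bootMed(max(1, round(0.025 * nBoot)));
ciHigh = bootMed(round(0.975 * nBoot));
med = median(xBoot);

% Asymmetric error bar lengths, usable as errorbar(x, y, neg, pos)
negLen = med - ciLow;
posLen = ciHigh - med;
```

The interval is generally asymmetric around the median, which is why the sketch keeps separate lower and upper lengths instead of a single half-width.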

The following code computes the means and plots them in a bar chart:

means = zeros(numel(data),3);
stds = zeros(numel(data),3);
n = zeros(numel(data),3);
for c = 1:numel(data)
    d = data{c};
    if size(d,1) < size(d,2)
        d = d';
    end
    cols = size(d,2);
    means(c, 1:cols) = nanmean(d);
    stds(c, 1:cols) = nanstd(d);
    n(c, 1:cols) = sum(~isnan((d)));
end

b = bar(means);
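Note that nanmean and nanstd belong to the Statistics and Machine Learning Toolbox. Since the question asks for a solution that needs nothing extra, here is a minimal sketch of the same per-column statistics using only base MATLAB (the 'omitnan' flag is available from R2015a on). The matrix dEx and the other names are hypothetical and chosen so they do not overwrite the variables above.

```matlab
% Hypothetical 5x3 block with one missing value
dEx = [1 2 3; 2 4 6; 3 6 NaN; 4 8 12; 5 10 15];

mEx = mean(dEx, 1, 'omitnan');      % column means, ignoring NaN
sEx = std(dEx, 0, 1, 'omitnan');    % column sample standard deviations (n-1)
nEx = sum(~isnan(dEx), 1);          % number of valid values per column
```

In the loop above, the two toolbox calls could then be swapped for mean(d, 1, 'omitnan') and std(d, 0, 1, 'omitnan').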

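Now we need to compute the length of the error bars. Typical choices are the standard deviation of the data (already computed by the code above and stored in stds), the standard error, or the 95% confidence interval (which is 1.96 times the standard error, assuming the underlying data follows a normal distribution).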

% for standard deviation use stds

% for standard error
ste = stds./sqrt(n);

% for 95% confidence interval
ci95 = 1.96 * ste;
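As a small worked example of these formulas, here is a sketch on a hypothetical eight-value sample. All names ending in Demo are assumptions made for illustration, so they do not overwrite the variables used above.

```matlab
xDemo = [2 4 4 4 5 5 7 9];                % hypothetical sample
nDemo = numel(xDemo);
mDemo = mean(xDemo);                      % sample mean
sDemo = std(xDemo);                       % sample standard deviation (n-1)
steDemo = sDemo / sqrt(nDemo);            % standard error of the mean
ciDemo = 1.96 * steDemo;                  % half-width of the 95% interval
intervalDemo = [mDemo - ciDemo, mDemo + ciDemo];
```

The factor 1.96 is the large-sample normal quantile; for small samples a Student t quantile is generally used instead, which gives a somewhat wider interval. Also note that in the code above, columns that a cell does not fill have n = 0, so their entries in ste and ci95 become NaN.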

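The last step is to plot the error bars. Here I chose ci95, as you asked in your question; if you want something else, simply change the variable in the call to errorbar: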

for c = 1:3
    % x position of each bar in series c: group position plus series offset
    e = errorbar(b(c).XData + b(c).XOffset, means(:,c), ci95(:, c));
    % draw only the error bars, without a line connecting them
    e.LineStyle = 'none';
end


Answer 1: (score: 1)

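I found that Patrick Happel's answer did not work, because the figure window (and therefore the variable b) was cleared by the subsequent call to errorbar. Simply adding a hold on command takes care of this. To avoid confusion, here is a new answer that reproduces all of Patrick's original code with my small change: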

%% Old answer
%Just to be safe, let's clear everything
clear all

data = cell(1,10);

% Random length of the data
l = randi(500, 10, 1) + 50;  

% Random "width" of the data, with 3 more likely
w = randi(4, 10, 1);
w(w==4) = 3;
% random "direction" of the data
d = randi(2, 10, 1);

% sigma of the data (in fraction of mean)
sigma = rand(10,1) / 3;

% means of the data
dmean = randi(150,10,1);
dsigma = dmean.*sigma;

for c = 1 : 10
    if d(c) == 1
        data{c} = randn(l(c), w(c)) .* dsigma(c) + dmean(c);
    else
        data{c} = randn(w(c), l(c)) .* dsigma(c) + dmean(c);
    end
end
%============================================
%Next thing is 
%    On the bar, I want it to be shown the median of the values, and I
%    want to calculate the confidence interval and show it additionally.
%
%Are you really sure you want to plot the median? The median of some data
%is not connected to the variance of the data, and thus no type of error
%bars are required. I guess you want to show the mean. If you really want
%to show the median, a box plot might be a better alternative.
%
%The following code computes and plots the mean in a bar plot:
%============================================
means = zeros(numel(data),3);
stds = zeros(numel(data),3);
n = zeros(numel(data),3);
for c = 1:numel(data)
    d = data{c};
    if size(d,1) < size(d,2)
        d = d';
    end
    cols = size(d,2);
    means(c, 1:cols) = nanmean(d);
    stds(c, 1:cols) = nanstd(d);
    n(c, 1:cols) = sum(~isnan((d)));
end

b = bar(means);

%% New code
%This ensures that b continues to reference existing data in the next for
%loop, as the graphics objects can otherwise be deleted.  
hold on
%% Continuing Patrick Happel's answer
%============================================
%Now, we need to compute the length of the error bars. Typical choices are
%the standard deviation of the data (already computed by the code above,
%stored in stds), the standard error or the 95% confidence interval (which
%is 1.96 times the standard error, assuming the underlying data
%follows a normal distribution).
%============================================
% for standard deviation use stds

% for standard error
ste = stds./sqrt(n);

% for 95% confidence interval
ci95 = 1.96 * ste;
%============================================
%Last thing is to plot the error bars. Here I chose the ci95 as you asked
%in your question, if you want to change that, simply change the variable
%in the call to errorbar:
%============================================
for c = 1:3
    % x position of each bar in series c: group position plus series offset
    e = errorbar(b(c).XData + b(c).XOffset, means(:,c), ci95(:, c));
    % draw only the error bars, without a line connecting them
    e.LineStyle = 'none';
end
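XOffset is not a documented property of bar objects. On MATLAB R2019b or later, the documented XEndPoints property gives the bar tip positions directly, so the final loop could be sketched as follows. This is a variant under that version assumption, not the code of either answer; demoMeans and demoCI are hypothetical values standing in for means and ci95.

```matlab
% Hypothetical means and interval half-widths: 4 groups, 3 bars per group
demoMeans = [10 12 9; 20 18 22; 15 17 16; 30 28 31];
demoCI    = [1 2 1; 2 1 3; 1 1 2; 3 2 2];

b = bar(demoMeans);
hold on                                   % keep the bars when adding error bars
for c = 1:numel(b)
    % XEndPoints (R2019b or later) holds the x position of each bar tip
    e = errorbar(b(c).XEndPoints, demoMeans(:, c).', demoCI(:, c).');
    e.LineStyle = 'none';                 % error bars only, no connecting line
    e.Color = 'k';                        % black bars for contrast
end
hold off
```

Looping over numel(b) instead of a fixed 3 keeps the sketch valid when the number of bars per group changes.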