Mean

0.1 Introduction

Measure of: Central tendency

0.2 Frequentist

AKA: Arithmetic mean; average; \(\bar{x}\) (sample mean); \(\mu\) (population mean); \(\mu_x\) (population mean)

Distinct from: Geometric mean (GM); Harmonic mean (HM); generalized mean / power mean; weighted arithmetic mean

English: Take a list of numbers, sum those numbers, and then divide by the number of numbers.

Formalization:

\[ \bar{x}=\frac{1}{n}\left(\sum^{n}_{i=1}x_i\right)=\frac{x_1+x_2+\cdots+x_n}{n} \]

Cites: Wikipedia ; Wikidata ; Wolfram

Code

Documentation: mean: Arithmetic Mean

Examples:

x = c(1,2,3,4)
x
[1] 1 2 3 4
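#Algorithm: sum the non-missing values and divide by how many there are
#(length(x) would also count NA values, so count the non-missing ones instead)
x_bar = sum(x, na.rm=TRUE)/sum(!is.na(x))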
x_bar
[1] 2.5
#Base Function
x_bar = mean(x, na.rm=TRUE)
x_bar
[1] 2.5

Documentation: numpy.mean

Examples:

x = [1,2,3,4]
print(x)
[1, 2, 3, 4]
#Algorithm
x_bar = sum(x)/len(x)
x_bar
2.5
#statistics Function
import statistics
x_bar = statistics.mean(x)
x_bar
2.5
#scipy Function (deprecated; the warning below recommends numpy.mean instead)
import scipy
x_bar = scipy.mean(x) 
<string>:1: DeprecationWarning: scipy.mean is deprecated and will be removed in SciPy 2.0.0, use numpy.mean instead
x_bar
2.5
#numpy Function
import numpy as np
x = np.array(x)
x_bar = x.mean()
x_bar
2.5

Documentation: PostgreSQL AVG Function

library(DBI)
# Create an ephemeral in-memory RSQLite database
#con <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")
#dbListTables(con)
#dbWriteTable(con, "mtcars", mtcars)
#dbListTables(con)

#Installing RPostgres needs libpq. If you see "Configuration failed because libpq was not found", try installing:
#* deb: libpq-dev libssl-dev (Debian, Ubuntu, etc)
#install.packages('RPostgres')
#remotes::install_github("r-dbi/RPostgres")
#Installation took a long time for me because my file permissions were broken
#pg_lsclusters
require(RPostgres)
Loading required package: RPostgres
# Connect to the default postgres database
# I had to follow these instructions and create both a PostgreSQL user and a database named after my Ubuntu username
# https://www.digitalocean.com/community/tutorials/how-to-install-postgresql-on-ubuntu-20-04-quickstart
con <- dbConnect(RPostgres::Postgres())

DROP TABLE IF EXISTS t1;

CREATE TABLE IF NOT EXISTS t1 (
    id serial PRIMARY KEY,
    amount INTEGER
);
INSERT INTO t1 (amount)
VALUES
    (10),
    (NULL),
    (30);
SELECT  * FROM  t1;
3 records
id amount
1 10
2 NA
3 30

-- AVG skips NULL values, so this averages only 10 and 30
SELECT AVG(amount)::numeric(10,2)
FROM t1;
1 records
avg
20
import torch

0.3 Bayesian

Bayesian average; Solving an age-old problem using Bayesian Average; Of bayesian average and star ratings; Bayesian Average Ratings

English: The Bayesian average is a weighted average of a prior and the observed sample average. When would you want this? When you have strong beliefs about the true mean, or when the sample size is too small to reliably calculate a mean. For example, on a movie rating website, a movie with only a single 5-star rating would otherwise rank higher than The Godfather, which has over 100 ratings, almost all of them 5 stars.

Formalization:

\[ \bar{x}=\frac{C m+\sum^{n}_{i=1}x_i}{C+n} \]

Where \(m\) is the prior for the true mean, \(n\) is the number of observations, and \(C\) is a constant that sets the weight of the prior: roughly how many observations would be necessary to reliably estimate a sample mean.

Code
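The formula above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `bayesian_average`, the prior mean of 3.0, the prior weight of 10, and the rating lists are all made up for the movie example.

```python
def bayesian_average(values, prior_mean, prior_weight):
    """Return (C*m + sum(x)) / (C + n), with m = prior_mean and C = prior_weight."""
    n = len(values)
    if prior_weight + n == 0:
        raise ValueError("need at least one observation or a positive prior weight")
    return (prior_weight * prior_mean + sum(values)) / (prior_weight + n)


# Hypothetical ratings: one new movie with a single 5-star rating,
# and a classic with 100 ratings, almost all of them 5 stars.
new_movie = [5]
classic = [5] * 95 + [4] * 5

m = 3.0  # assumed prior for the true mean rating
C = 10   # assumed prior weight, counted in "pseudo-ratings"

print(round(bayesian_average(new_movie, m, C), 2))  # 3.18
print(round(bayesian_average(classic, m, C), 2))    # 4.77
```

With a single rating the prior dominates, so the new movie stays near 3. With 100 ratings the data dominate, so the classic stays near its sample mean of 4.95. Setting `prior_weight` to 0 recovers the ordinary arithmetic mean.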