audio compression

uncompressed size = sample rate * bit depth * number of channels * time

44,100 samples/second * 16 bits * 2 channels * 210 seconds

1. How much space does a song take?

2 bytes per sample (16 bits of dynamic range) * 2 channels * 44,100 samples/second * 210 seconds per song

(* 2 2 44100 210)

37044000

37,044,000 bytes: about 37 MB.
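
The same arithmetic as a small Python sketch. The function name `pcm_size_bytes` is made up for illustration; the numbers are the ones used above.

```python
def pcm_size_bytes(sample_rate, bytes_per_sample, channels, seconds):
    """Uncompressed PCM size: every sample of every channel is stored in full."""
    return sample_rate * bytes_per_sample * channels * seconds


size = pcm_size_bytes(44_100, 2, 2, 210)
print(size)              # 37044000 bytes
print(size / 1_000_000)  # 37.044 (decimal megabytes)
```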

16 bits of dynamic range is probably overkill.

Human hearing is basically logarithmic:
you perceive a multiplicative increase in amplitude as an additive increase in volume.
Upshot: small differences in amplitude matter less at higher amplitudes.
So we could just throw away a lot of the resolution at higher amplitudes.

https://en.wikipedia.org/wiki/Mu-law_algorithm
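
The idea above (coarser steps at higher amplitudes) can be sketched with the continuous mu-law formula. This is a minimal illustration, not the exact bit layout used by telephony codecs: the helper names and the simple 8-bit quantization are assumptions.

```python
import math

MU = 255  # the value conventionally used for 8-bit mu-law


def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1], stretching small amplitudes."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def encode_8bit(sample16):
    """16-bit signed sample -> 8-bit code in 0..255 (illustrative quantizer)."""
    y = mu_law_compress(sample16 / 32768)
    return round((y + 1) / 2 * 255)


def decode_8bit(code):
    """8-bit code -> approximate 16-bit signed sample."""
    y = code / 255 * 2 - 1
    value = round(mu_law_expand(y) * 32768)
    return max(-32768, min(32767, value))  # clamp to the 16-bit range


for sample in (100, 20000):
    restored = decode_8bit(encode_8bit(sample))
    print(sample, restored, abs(sample - restored))
```

The round-trip error is a few units for the quiet sample and a few hundred for the loud one: half the bits, with the resolution spent where hearing is most sensitive to it.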

In most songs, the two channels are going to be highly correlated.
We can instead encode them as a single channel plus a difference side-channel.
The difference channel is going to be mostly low-amplitude, so it can be highly compressed.
Optimistically (if the difference channel compresses to almost nothing), this gets us close to 2x compression.
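
A minimal sketch of the single-channel-plus-difference idea, assuming integer samples. The function names are hypothetical. The average is rounded down, and the lost bit is recovered from the difference, so the round trip is exact.

```python
def to_mid_side(left, right):
    """Split two channels into an average (mid) and a difference (side)."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Exact inverse of to_mid_side."""
    left, right = [], []
    for m, s in zip(mid, side):
        # l + r and l - r have the same low bit, so side restores what >> 1 dropped.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


left = [1000, 1200, -800, 15]
right = [998, 1203, -801, 15]
mid, side = to_mid_side(left, right)
print(side)  # [2, -3, 1, 0]
```

On its own this saves nothing: it only moves the redundancy into a channel of small numbers, which a later coding stage can store in fewer bits.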

Maybe some applications don't need 44,100 samples/second.

Plain ol' telephone service (POTS): digital telephone networks sample speech at 8,000 samples/second.
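
Lowering the sample rate means keeping fewer samples per second. A rate of fs samples/second can only represent frequencies below fs/2, so the signal has to be low-pass filtered before samples are dropped. A minimal sketch, assuming an integer reduction factor and using a plain average as a very crude filter (real resamplers use much better filters; `downsample` is a hypothetical helper):

```python
def downsample(samples, factor):
    """Average each group of `factor` samples and keep one value per group.

    The average acts as a crude low-pass filter. Leftover samples at the
    end that do not fill a whole group are dropped.
    """
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) // factor for i in range(0, usable, factor)]


print(downsample([0, 2, 4, 6, 8, 10, 12, 14, 1], 4))  # [3, 11]
```

Reducing the rate by a factor of 4 cuts the data by 4x, at the cost of everything above the new fs/2.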

Information theory tells us that

COMPRESSIBLE = NOT RANDOM
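
A quick way to see this, as a sketch: run a general-purpose compressor (here Python's standard `zlib`) over random bytes and over a pure tone. The sizes and the 441 Hz tone are arbitrary choices for illustration.

```python
import math
import os
import zlib

N = 44_100  # one second of 8-bit mono samples (illustrative size)

# Random bytes: no structure for the compressor to exploit.
noise = os.urandom(N)

# A 441 Hz sine at 44,100 samples/second: 100 samples per period,
# so the byte pattern essentially repeats.
tone = bytes(
    round(127.5 + 127 * math.sin(2 * math.pi * 441 * n / N)) for n in range(N)
)

noise_ratio = len(zlib.compress(noise, 9)) / N
tone_ratio = len(zlib.compress(tone, 9)) / N
print(f"noise: {noise_ratio:.3f} of original size")
print(f"tone:  {tone_ratio:.3f} of original size")
```

The noise does not shrink (it typically grows by a few bytes of overhead), while the tone shrinks to a small fraction of its original size.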

Random samples constitute “noise”.
Conversely, if the signal is not random, we ought to be able to produce a model that predicts sample values using less data than the samples themselves.
(Probably not perfectly.)
(And probably in small chunks.)
If the model is good enough, we can just forget the original signal and keep only the model (lossy).
Otherwise, we can store the model + the residue (difference between model and signal), which reconstructs the signal exactly (lossless).

Maybe we just use splines?

Model + residue is roughly how lossless compression schemes like FLAC work.
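
A minimal sketch of model + residue, assuming a very simple model: predict each sample by extending the straight line through the previous two. The predictor choice and function names are assumptions for illustration, not FLAC's actual format. Real codecs also pick a model per block and entropy-code the residue, which is not shown here.

```python
import math


def encode_residuals(samples):
    """Store only the difference between each sample and its prediction."""
    residuals = []
    for n, x in enumerate(samples):
        # Warm-up: the first two samples have no history, so predict 0.
        prediction = 0 if n < 2 else 2 * samples[n - 1] - samples[n - 2]
        residuals.append(x - prediction)
    return residuals


def decode_residuals(residuals):
    """Rebuild the samples exactly by re-running the same predictor."""
    samples = []
    for n, r in enumerate(residuals):
        prediction = 0 if n < 2 else 2 * samples[n - 1] - samples[n - 2]
        samples.append(prediction + r)
    return samples


# A smooth test signal: amplitude 10,000, 100 samples per period.
signal = [round(10_000 * math.sin(2 * math.pi * n / 100)) for n in range(200)]
residuals = encode_residuals(signal)

print(max(abs(x) for x in signal))          # needs about 15 bits per sample
print(max(abs(r) for r in residuals[2:]))   # much smaller: fits in a few bits
```

The decoder gets the original back bit for bit, but the numbers it has to store after the warm-up samples are far smaller than the samples themselves.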

A loud sound will mask quieter sounds at nearby frequencies: you won't hear them (well).

MP3 (and related schemes like Ogg Vorbis) relies heavily on this masking effect.
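
A toy sketch of masking, to make the idea concrete. Everything here is an assumption for illustration: the signal is given as a list of (frequency, level) pairs, and the rule "masked if a component within 200 Hz is at least 20 dB louder" uses made-up numbers. Real psychoacoustic models are far more detailed and are not reproduced here.

```python
def audible_components(components, bandwidth_hz=200.0, masking_margin_db=20.0):
    """Drop components that a much louder neighbor would mask.

    components: list of (frequency_hz, level_db) pairs.
    Toy rule: a component is masked if another component within
    bandwidth_hz is louder by at least masking_margin_db.
    """
    kept = []
    for freq, level in components:
        masked = any(
            abs(freq - other_freq) <= bandwidth_hz
            and other_level - level >= masking_margin_db
            for other_freq, other_level in components
        )
        if not masked:
            kept.append((freq, level))
    return kept


spectrum = [(1000, 80), (1100, 50), (5000, 50)]
print(audible_components(spectrum))  # [(1000, 80), (5000, 50)]
```

The quiet 1,100 Hz component sits next to a loud one and is dropped; the equally quiet 5,000 Hz component has no loud neighbor and is kept.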