As I see it, there are three slightly overlapping frequency ranges of interest in what we call "music" ...
The first is what I would call "Temporal" - this is the frequency at which individual musical events take place. Intuitively this is perceived as tempo/beats/rhythm. Temporal events occur in a range of frequencies with a low end of essentially "0" (Cage's 4'33" for example) up to perhaps 10-20 Hz. Above a certain (fairly low) frequency the individual events are perceived as cyclic (the "second" range - see below) and are no longer really temporal in the sense I've identified here.
The second range is what most folks would call "Cyclic" - with fundamental frequencies in the range of 16Hz to 4200Hz. Virtually all musical instruments used in modern compositions produce sounds within this range. This is not to say this is the only range of sounds they produce - it is just to say that the fundamental frequency is within this range.
The final range is what I would call "Timbreal" (OK - I made up that word). This is the range of frequencies that comprise the overtones and the initial subharmonics of metallic percussion. It spans maybe 2000Hz on the low end to 20kHz and possibly beyond. It is important to note here that every instrument generates its own somewhat unique upper harmonic signature.
For the "Best Sampling Rate" question I think that both the Cyclic and the Timbreal ranges are involved.
The important distinction here is that nearly everyone can hear the Cyclic range "critically". By this I mean that most people can hear fairly subtle anomalies in this region. As a result, having a detailed representation of this frequency range is very important to just about everyone.
The Timbreal range is far more challenging to hear with the same "critical" ear. There are people in the world who can hear in this range with great precision. I'm just not one of them.
IMO, for the vast majority of listeners the Cyclic range (16Hz-4200Hz) is way more important than the Timbreal range.
And it turns out, for example, that even at 44K1 Samples/Second all frequencies in the Cyclic range are captured with less than 0.5 dB error (off of peak), and that means what I claim we perceive as "music" is captured quite well at the 44K1 sample rate. Frequencies in the Timbreal range are not recorded with the same accuracy but the errors in this range are not nearly as important - as long as the fundamental frequencies are captured properly the music sounds pretty darn good.
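For anyone who wants to check the 0.5 dB figure, here is a minimal Python sketch. It assumes "error (off of peak)" means the worst-case gap between a sine wave's true peak and the nearest raw sample, i.e. the peak landing exactly halfway between two samples. That reading is my assumption, and the function name is made up for illustration. It only looks at raw sample values, not at what a reconstruction filter does afterwards.

```python
import math


def worst_case_peak_error_db(freq_hz, sample_rate_hz):
    """Worst-case level (in dB, relative to the true peak) of the raw
    sample nearest to the peak of a sine wave.

    Assumption: the peak falls exactly halfway between two samples.
    Only meaningful for freq_hz below sample_rate_hz / 2.
    """
    # Phase advance between two consecutive samples, in radians.
    phase_step = 2.0 * math.pi * freq_hz / sample_rate_hz
    # The nearest sample is then half a step away from the peak.
    return 20.0 * math.log10(math.cos(phase_step / 2.0))


if __name__ == "__main__":
    # A few frequencies across the Cyclic range (16Hz-4200Hz) at 44K1.
    for f in (16, 440, 1000, 4200):
        err = worst_case_peak_error_db(f, 44100)
        print(f"{f:>5} Hz: {err:.3f} dB")
```

Under that assumption the top of the Cyclic range (4200Hz) comes out at roughly -0.39 dB at 44K1, and everything lower is closer to 0, which is in line with the "less than 0.5 dB" figure.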
So why would anybody want to sample at a higher rate? The answer I give these days is two-fold ...
The first consideration is "Transients, Cymbals and Synths" because these are the three "musical" things that have significant and potentially important energy in the Timbreal region. A fast-rising transient could have energy above 20kHz. A spectrogram of a cymbal reveals lots of energy above 10kHz. The complex waveforms produced by modern synths often contain energy above 8kHz - particularly during filter "sweeps".
The second consideration is Transition Band Width - explaining this in detail is difficult and kinda puts people to sleep. Briefly, the Transition Band Width is the gap between the highest frequency you want to hear (the top of the "pass band") and the Nyquist-Shannon limit, which is Sample Rate / 2. Check this out:
Code:
S/S    Transition Band            Cents   dB/Oct  Fltr
44k1   22050 / 20000 = 1.1025     168.9   568.27    48
48k    24000 / 20000 = 1.2000     315.6   304.14    26
60k    30000 / 20000 = 1.5000     702.0   136.76    12
64k    32000 / 20000 = 1.6000     813.7   117.98    10
88k2   44100 / 20000 = 2.2050    1368.9    70.13     6
96k    48000 / 20000 = 2.4000    1515.6    63.34     6
S/S - Samples per Second
Transition Band - region between the highest frequency in the pass band (20000Hz here) and the Nyquist-Shannon limit (SR/2), shown as the ratio of the two
Cents - Transition Band width in 100ths of a semitone
dB/Oct - filter slope needed across the Transition Band for aliasing to be -80dB
Fltr - Number of 12dB/Oct stages for -80dB anti-aliasing
Note that going from 44K1 to 48K almost doubles the Transition Band Width. Going from 44K1 to 96K is nearly a 9X improvement (measured in cents). These improvements translate into less complex anti-aliasing filters as shown above. I should add that modern converters (ADCs and DACs) use internal oversampling to reduce the complexity of the anti-aliasing and reconstruction filters and this seems to help a lot, at least on paper.
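If you want to reproduce the table or try other sample rates, the arithmetic can be sketched in a few lines of Python. This is just my reading of how the columns relate: a 20000Hz pass band, -80dB at the Nyquist limit, and 12dB/Oct per filter stage, rounded up. The function and parameter names are made up for illustration.

```python
import math


def transition_band(sample_rate_hz, passband_hz=20000.0,
                    atten_db=80.0, stage_db_per_oct=12.0):
    """Return (ratio, cents, dB/Oct, stages) for one sample rate.

    Assumptions: the filter must fall by atten_db between passband_hz
    and the Nyquist limit, and each stage contributes stage_db_per_oct.
    """
    nyquist_hz = sample_rate_hz / 2.0
    ratio = nyquist_hz / passband_hz          # e.g. 22050 / 20000
    octaves = math.log2(ratio)                # transition width in octaves
    cents = 1200.0 * octaves                  # 1200 cents per octave
    slope_db_per_oct = atten_db / octaves     # slope needed over that width
    stages = math.ceil(slope_db_per_oct / stage_db_per_oct)
    return ratio, cents, slope_db_per_oct, stages


if __name__ == "__main__":
    print("S/S      Ratio    Cents   dB/Oct  Fltr")
    for sr in (44100, 48000, 60000, 64000, 88200, 96000):
        ratio, cents, slope, stages = transition_band(sr)
        print(f"{sr:<7} {ratio:6.4f} {cents:8.1f} {slope:8.2f} {stages:5d}")
```

Changing passband_hz or atten_db shows how sensitive the filter requirement is to those two choices.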
peace y'all
pj