Lab 6 (Volume Algorithm)

For 6th SPO600 lab, I will be looking at various ways to change volume of sound using C programming language. This task may sound trivial, but choosing the right approach can save energy and extend battery life on some devices.

Let’s first take a look at digital representation of sound. Uncompressed digital sound is typically stored as signed 16-bit integer signal samples with 2 sound streams: one for the left and one for the right stereo channels, at typical sample rates of 44.1 or 48 thousand samples per second, for a total of 88.2 or 96 thousand samples per second. That is a lot of information and scaling it up or down can take some time (and energy!). I don’t know why, but I have that annoying habit of constantly changing volume while listening to some music with my cell phone (now I know why my phone battery doesn’t last that long). Changing the volume of sound is done by scaling each sample by a volume factor that can range from 0.0000 to 1.0000. You can imagine that applying this operation even to 1-minute track can put some pressure on CPU.

For this lab, I will present 3 algorithms, give some performance recordings and test code on both AArch64 & x86_64 machines. I will use 500 000 000 length array that stores some ‘fake’ sound samples and scale each sample by the volume factor 0.75.

Naive Approach

I will start with the simplest algorithm that multiplies audio samples by the floating point volume factor 0.75. The code does nothing truly special, it simply populates array with random data and scales every value. Note that I added redundant 3rd loop that sums all the sample values in order to prevent compiler optimiser from removing my scaling code.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#define LENGTH 500000000
#define VOLUME 0.75

int main() {

        srand(1);
        int16_t* arr;

        size_t i;
        long long sum = 0;

        arr = calloc(LENGTH, sizeof(int16_t));
        // Populate Array with Audio Samples
        for (i = 0; i< LENGTH; i++) {
                arr[i] = rand();
        }
        
        start();
        // 'Bring Volume Down'
        for (i=0;i<LENGTH;i++){
                arr[i] = (int16_t)(arr[i] * VOLUME);
        }
        stop();
        
        // Prevent compiler from eliminating my efforts
        for (i=0;i<LENGTH;i++) {
                sum += arr[i];
        }
                
        printf("Sum :  %ld \n", sum);
        free(arr);
        return 0;
}

Pre-calculated Look-up Table

For the second algorithm, I will create a lookup table of all possible sample values multiplied by the volume factor (0.75) and make code retrieve each sample in that table to get the scaled value. So, essentially, we move multiplication out of our second loop and use pre-calculated audio samples instead.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#define LENGTH 500000000
#define VOLUME 0.75
#define LOOKUP_SIZE 56636

int main() {

        srand(1);
        int16_t* arr;
        int16_t lookup[LOOKUP_SIZE];

        size_t i;

        long long sum = 0;
        arr = calloc(LENGTH, 16);

        // Populate Array with Audio Samples
        for (i = 0; i< LENGTH; i++) {
                arr[i] = rand();
        }

        // Pre-calculate Look-Up table
        for (i = 0; i< LOOKUP_SIZE; i++){
                lookup[i] = (int16_t)(i * VOLUME);
        }
        
        start();
        // Use lookup table to get scaled adudio samples
        for (i=0;i<LENGTH;i++){
                arr[i] = lookup[(u_int16_t)arr[i]];
        }
        stop();
        
        // Prevent compiler from eliminating my efforts
        for (i=0;i<LENGTH;i++) {
                sum += arr[i];
        }

        printf("Sum: %ld \n", sum);
        free(arr);
        return 0;
}

Note that int16_t lookup table is of size 65,535. This is because the range of unsigned uint16_t is 0… 65,535. All values from 0 to 65,535 get multiplied by the scalar factor (0.75). However, this creates a problem since our audio sample array can store both negative and positive values (signed data type range is -32,768 .. 32,767) and we can’t use negative integers as array index. In the second loop, audio sample value is cast to unsigned 16-bit int so it can be used as array index. After that, unsigned integer from the lookup table is cast back to signed audio sample.

Fix-point Integer Algorithm

For the last algorithm, I will move all the arithmetic back into the loop dedicated to volume calculations. I will convert volume scalar factor (0.75) to a fixed-point integer by multiplying it by 256 and then scale audio array as (arr[i]*volume_in_int)>>8 where right bit-shifting by 8 is meant to recover scaled value (previously shifted to the left). This should remove any floating-point calculations and thus result in better performance.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#define LENGTH 500000000
#define VOLUME 0.75

int main() {

        srand(1);
        int16_t* arr;
        size_t i;

        // Volume in int
        int16_t volume_in_int = VOLUME * 256;
        long long sum = 0;
        arr = calloc(LENGTH, 16);

        // Populate Array with Audio Samples
        for (i = 0; i< LENGTH; i++) {
                arr[i] = rand();
        }
        
        start();
        // Volume Calculation
        for (i=0;i<LENGTH;i++){
                arr[i] = ((arr[i]*volume_in_int)>>8);
        }
        stop();

        // Prevent compiler from eliminating my efforts
        for (i=0;i<LENGTH;i++) {
                sum += arr[i];
        }

        printf("Sum: %ld \n", sum);
        free(arr);
        return 0;
}

Performance Results:

Table below shows the runtime time results of 3 algorithms on 2 different architectures. It seems like fixed-point method was the fastest one : it performed better on both x86_64 and AArch64 which proves that floating-point based arithmetic can indeed hinder software performance. Look-up table solution showed slightly better results than the naive approach but it required additional memory for pre-calculated audio samples. Memory consumption may cause some problems on smaller devices, on the test aarch machine the algorithm did pretty well.

  x86_64 AArch64
Naive 1501 ms 1829 ms
Look-Up 1130 ms 1502 ms
Fix-Point Int 750 ms 763 ms
Written on November 20, 2017