I. Motivation and Background
One of the first issues addressed in this thesis was how a C++ program should read from and write to standard audio files. The simplicity, flexibility, and robustness of the supporting code for these tasks were of utmost importance, as most code written in this thesis (with the exception of real-time efforts) used AIFF soundfiles for both input and output of data.
Although much code already exists in C that deals with audio file manipulation, many programming interfaces involved with such code suffer from several problems: many architectures are not structured for real-time input/output; the complexity of the code dealing with file I/O is often too high; random access of soundfiles is not often supported, etc.
This appendix documents an object-oriented C++ class that aims to alleviate some of these problems. Specifically, class Aiff_file is a C++ class which models a standard AIFF (Audio Interchange File Format) digital audio file. class Aiff_file can handle any sample rate, sample bit-width, and number of channels accommodated by the AIFF specification; furthermore, normalized and non-normalized values can be read from and written to a file using overloaded member functions. class Aiff_file provides random access to the soundfile, with switchable forward and reverse reading of the file. class Aiff_file is internally buffered for faster input and output of data. class Aiff_file also employs a unique console-like interface so that the user can easily check the human-readable error codes produced by the class.
This code was written during Fall 1995, for
Music 103 (Analysis, Synthesis, and Perception of Timbre, taught
by Charles Dodge) at Dartmouth College.
II. Specific design issues
A) I/O Speed.
Many audio file support architectures call costly operating system routines for every single user disk I/O request. At the relatively high bandwidth of CD audio (44.1 kHz), the calls to the operating system become inefficient, and significantly decrease overall processing speed. The problem is most pronounced in the cases where a programmer wishes to hear the processing of an existing soundfile in real-time, as the output is being produced.
To alleviate this problem, an intermediate
buffer was introduced into the disk read/disk write procedures,
so that only relatively large blocks of samples are read from
or written to disk at a time; intermediate read/write requests
are either taken from or written to the intermediate buffer. This
way, the operating system only needs to be called a few times
for disk I/O. Buffering significantly increases performance and
makes real-time processing possible. However, the implementation
of the random access functions discussed below is more complex
in design due to this internal buffering.
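The buffering idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual Aiff_file implementation: the class name BufferedSampleReader and its members read_sample and refills are hypothetical, the source is a plain std::istream of raw 16-bit samples in host byte order (no AIFF header parsing), and random access is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: single-sample requests are served from a large
// in-memory block, so the underlying stream is read once per block.
class BufferedSampleReader {
public:
    BufferedSampleReader(std::istream& in, std::size_t block_samples)
        : in_(in), block_(block_samples), pos_(0), valid_(0), refills_(0) {}

    // Copy the next sample into `out`; return false when no data is left.
    bool read_sample(std::int16_t& out) {
        if (pos_ == valid_ && !refill()) return false;
        out = block_[pos_++];
        return true;
    }

    // Number of block reads that actually fetched data from the stream.
    std::size_t refills() const { return refills_; }

private:
    // Fetch the next block; a short final block is allowed.
    bool refill() {
        in_.read(reinterpret_cast<char*>(block_.data()),
                 static_cast<std::streamsize>(block_.size() * sizeof(std::int16_t)));
        valid_ = static_cast<std::size_t>(in_.gcount()) / sizeof(std::int16_t);
        pos_ = 0;
        if (valid_ == 0) return false;
        ++refills_;
        return true;
    }

    std::istream& in_;
    std::vector<std::int16_t> block_;
    std::size_t pos_, valid_, refills_;
};
```

With a block of 256 samples, reading 1000 samples one at a time touches the stream only four times; the cost is that any seek must also account for samples already held in the block, which is the added complexity the text mentions.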
B) Ease of audio sample handling.
Many applications require sample values to be of different types (int, long, float, double, etc.). For example, a microprocessor that handles floating point data well will probably execute signal processing code faster if presented with floating point samples. However, some applications run faster when the audio samples are presented as long integers. Therefore, in reading and writing these samples from and to disk, it would simplify programming if there were only one function, with one name, that handled all of these different types.
The C++ programming language provides a solution
to this common problem with the concept of function overloading.
Overloaded functions are functions with the same names that take
different types of arguments. Consequently, several different
versions of a procedure can be written, all with the same name,
thus simplifying the naming schemes in a large software project.
A particular "version" of the procedure is called for
a particular combination of parameter types. In class Aiff_file,
therefore, several read_frames and write_frames functions have
been written, one for each type of parameter data. Since the AIFF
native type is two's complement integer, all of the read and write
procedures automatically convert from two's complement 16-bit
integer to the parameter type, without the user having to perform the
conversion manually. For example, the following code reads the
first 500 frames (one frame of audio contains one sample per channel
per sample period - a stereo frame is two samples, a mono frame
is one sample, etc.) of a 44.1kHz, 16-bit stereo audio file named
"in.aiff" into buffer1, reads the next 500 frames of
audio into buffer2, and the next 500 frames of audio into buffer3.
Note that although the three buffers have different types, the
same overloaded function (read_frames) is used to read the data
from the file. Also, all sample values in the buffers after successful
completion of the read_frames operations will contain converted
values (integers, floats, and normalized values between -1 and
1, respectively).
#include "ac.h"
...
// Open a 44.1 kHz, 16-bit stereo file for reading.
Aiff_file input_file("in.aiff",44100,STEREO,
AC_WIDTH_16_BIT_SAMPLE,AC_READ_ONLY);
...
width_16_bit_sample buffer1[STEREO*500];
float_sample buffer2[STEREO*500];
norm_float_sample buffer3[STEREO*500];
...
// The same overloaded name handles all three buffer types.
input_file.read_frames(buffer1,500);
input_file.read_frames(buffer2,500);
input_file.read_frames(buffer3,500);
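To make the overloading mechanism concrete, here is a minimal stand-alone sketch of how one function name can convert 16-bit source samples into three destination types. It does not reproduce the code in ac.cc: the function name convert_samples is hypothetical, the typedefs are stand-ins for the types declared in ac.h, and the normalization divisor of 32768 is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the sample types named in the text.
typedef std::int16_t width_16_bit_sample;
typedef float        float_sample;       // integer-valued range, stored as float
typedef double       norm_float_sample;  // assumed normalized by dividing by 32768

// One name, three overloads: the compiler selects the version that matches
// the element type of the destination buffer.
void convert_samples(const std::int16_t* src, width_16_bit_sample* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
}

void convert_samples(const std::int16_t* src, float_sample* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = static_cast<float>(src[i]);
}

void convert_samples(const std::int16_t* src, norm_float_sample* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i] / 32768.0;
}
```

The caller writes the same call for every buffer; only the declared type of the buffer decides which conversion runs.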
C) Error-checking.
A major issue in all software design is how to test and debug code. Many architectures for audio manipulation do not provide good error handling mechanisms, or the error codes returned by audio procedures are often cryptic. class Aiff_file tries to alleviate this problem by providing a public "status" data member that continuously provides the status of the soundfile. Typically, the user can check the status variable for each instantiation of the Aiff_file object to see if a previous operation, such as a file open, read, or write was accomplished successfully.
The type of the status variable is significant.
In most error-handling schemes, an error code is
some kind of flag or integer value, which a user must cross-reference
in a header file. In class Aiff_file, the status member is of
type char*; all errors are actually global string variables which
can simply be sent to the standard output or standard error for
easy identification. In order for this scheme to work, one must
compile a file which 1) includes the error code declarations (contained
in the header file ac.h), and 2) links with the compiled error
code definitions (defined in ac.error.cc). For example, the following
code tries to open a 22050 Hz, mono 16-bit file, and upon failure,
prints the error to standard error. The error will be a string
with a human-readable text message.
Aiff_file output_file("out.aiff",22050,MONO,
AC_WIDTH_16_BIT_SAMPLE,AC_WRITE_ONLY);
if (output_file.get_status() != AC_FILE_OPEN) {
    cerr << "output file error: " <<
    output_file.get_status();
    exit(1);
}
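The string-valued status scheme can be illustrated in isolation. The sketch below is an assumption about how such codes might be laid out, not a copy of ac.h or ac.error.cc: the names status_code, STATUS_OPEN, STATUS_NOT_FOUND, Demo_file, and check are all hypothetical.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical string-valued status codes. In a larger project the
// declarations would sit in a header and the definitions in one .cc file.
typedef const char* status_code;

status_code const STATUS_OPEN      = "file open";
status_code const STATUS_NOT_FOUND = "file not found";

// Stand-in for a file object that records the outcome of its last operation.
class Demo_file {
public:
    explicit Demo_file(bool exists)
        : status_(exists ? STATUS_OPEN : STATUS_NOT_FOUND) {}
    status_code get_status() const { return status_; }
private:
    status_code status_;
};

// Comparing pointers identifies the code; streaming the pointer prints the
// message. Returns true when the file is usable.
bool check(const Demo_file& f, std::ostream& err) {
    if (f.get_status() == STATUS_OPEN) return true;
    err << "file error: " << f.get_status() << '\n';
    return false;
}
```

Because each code is a pointer to a single global string, a comparison costs no more than an integer compare, while the failure path needs no lookup table to produce readable output.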
III. General Usage
class Aiff_file is designed to be as simple
as possible; the best way to learn its usage, therefore, is to
refer to the included examples. The user must #include "ac.h"
to declare class Aiff_file and its member functions and data.
All example programs in this section can be compiled with the
following command. The audio libraries libaudio.a and libaudiofile.a,
available in standard releases of the Irix operating system, are
required for compilation:
g++ -o output_binary <filename.cc> ac.cc ac.error.cc -laudio -laudiofile
ac.cc provides definitions for class Aiff_file's
member functions, and ac.error.cc provides definitions for the
error codes referenced in Part II.C, above. The output will be
output_binary, which can be executed directly at the command line.
A) Construction and destruction of Aiff_file objects.
Soundfiles are either opened for input or output
by creating the object with certain sets of parameters. Upon successful
completion, the status variable (obtained by calling member function
get_status() ) is set to AC_FILE_OPEN.
The destructors are called when the object goes out of scope.
However, it is important that the destructor is called either
directly or indirectly for write-only files, as the internal output
buffer is flushed by the destructor. Please note that there is
no Makefile for any of these programs; they need to be compiled
individually.
The legal constructions of Aiff_files are as
follows.
1) open a file for reading with a pre-determined
set of characteristics
Aiff_file::Aiff_file(char* _filename, unsigned _sr,
unsigned _num_channels, unsigned _sample_width, AC_READ_ONLY)
2) open a file for reading with unknown characteristics
Aiff_file::Aiff_file(char* _filename, int _mode)
3) open a file for writing with a pre-determined
set of characteristics
Aiff_file::Aiff_file(char* _filename, unsigned _sr,
unsigned _num_channels, unsigned _sample_width, AC_WRITE_ONLY)
B) Getting soundfile information and status.
A soundfile that is opened with constructor A.2 above must be queried to find what sample rate, number of channels, and sample width the file has, so that appropriate buffers can be set up for reading and writing sample data. The information routines listed below are available to query for this and other information. The status variable is not altered by these functions.
Get the sample rate.
float
get_sr() { return sr; }
Get the number of channels.
unsigned
get_num_channels() { return num_channels; }
Get the sample width, in bits.
unsigned
get_sample_width() { return sample_width; }
Get the filename.
char*
get_filename() { return filename; }
Get the status of the file. Recall that ac_error is actually just char*, so the output can be printed out directly for debugging purposes.
ac_error
get_status() { return status; }
Return which mode the file is in: read-only or write-only.
int
get_mode() { return mode; }
Return the current position, in frames.
unsigned long
get_now_in_frames() { return now; }
Return the current position, in seconds.
float
get_now_in_seconds(){return ((float)now/(float)sr);}
Return the soundfile's total length in frames.
unsigned long
get_length_in_frames() { return length; }
Return the soundfile's total length in seconds.
float
get_length_in_seconds()
{return((float)length/(float)sr);}
Return the current position as a fraction of the way through the file. Note that the value returned is now/length; it is not scaled to 100.
float
get_now_in_percent()
{return(((float)now)/((float)length));}
Return the frame size, in bytes.
unsigned
get_frame_size() { return frame_size; }
For a write-only file, return how many samples have been clipped because they were out of range for the soundfile's bitwidth. Typically, this value is non-zero if the user is writing normalized samples that have a magnitude greater than 1.
unsigned long
get_samples_out_of_range(){return samples_out_of_range;}
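The clipping count can be pictured with a short sketch of a normalized-to-16-bit conversion. This is an assumption about the general technique rather than the code in ac.cc: the function name to_16_bit_clipped, the scale factor of 32767, and truncation toward zero are all illustrative choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: convert normalized samples to 16-bit integers.
// Samples with magnitude greater than 1 are clamped to +/-1 first, and
// the function returns how many samples had to be clamped.
std::size_t to_16_bit_clipped(const double* src, std::int16_t* dst, std::size_t n) {
    std::size_t clipped = 0;
    for (std::size_t i = 0; i < n; ++i) {
        double x = src[i];
        if (x > 1.0)       { x = 1.0;  ++clipped; }
        else if (x < -1.0) { x = -1.0; ++clipped; }
        dst[i] = static_cast<std::int16_t>(x * 32767.0);  // truncates toward zero
    }
    return clipped;
}
```

A non-zero return value tells the caller that the output was distorted and that the signal should be scaled down before writing.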
C) Reading and writing sample data.
As mentioned above in II.B, all read and write functions are overloaded for several different types. Therefore, in the function prototypes listed below, the <type> can be replaced with the following types: width_6_bit_sample, width_8_bit_sample, width_12_bit_sample, width_16_bit_sample, width_20_bit_sample, width_24_bit_sample, width_30_bit_sample, width_32_bit_sample, float_sample, and norm_float_sample. Note that the width* types are two's complement integers, while the float_sample is actually just a float. The norm_float_sample is a double; however, all values contained by a norm_float_sample are intended to be normalized between -1.0 and +1.0. Conversion to the destination (passed) type is done automatically by the overloaded function.
Also note that one frame of audio is equivalent to x samples of audio, where x is the number of channels of audio in the file. Therefore, in allocating memory that is passed into and out of these functions, allocate (num_channels * desired_size) elements to the buffer you intend to pass to the function. Frames of audio are passed and returned sequentially, with each channel of audio being presented sequentially in each frame of audio (e.g. for a stereo buffer using floating point sample values, the buffer will contain these samples, in order: 0L, 0R, 1L, 1R, 2L, 2R, etc., where 0,1,2 represent the frame number, and L, R represent the channel number).
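The interleaved layout described above reduces to simple index arithmetic. The helpers below are a hypothetical sketch (sample_index and extract_channel are not part of class Aiff_file), and they use zero-based frame and channel numbers, as in the 0L, 0R, 1L, 1R example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Position of one channel's sample inside an interleaved buffer.
// Frame and channel numbers are zero-based.
inline std::size_t sample_index(std::size_t frame, std::size_t channel,
                                std::size_t num_channels) {
    return frame * num_channels + channel;
}

// Copy a single channel out of an interleaved buffer.
std::vector<float> extract_channel(const std::vector<float>& interleaved,
                                   std::size_t num_channels, std::size_t channel) {
    std::vector<float> out;
    const std::size_t frames = interleaved.size() / num_channels;
    out.reserve(frames);
    for (std::size_t f = 0; f < frames; ++f)
        out.push_back(interleaved[sample_index(f, channel, num_channels)]);
    return out;
}
```

For a stereo buffer of 500 frames, the buffer holds 1000 elements, and the right-channel sample of frame 2 sits at index 2 * 2 + 1 = 5.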
All of these functions return the actual amount
of sound data read, in appropriate units: frames of audio read/written
or seconds of audio read/written.
Read multiple frames of audio into buffer.
unsigned long
read_frames(<type>* buffer, unsigned long frames);
Read one frame of audio into buffer.
unsigned long
read_frame(<type>* buffer);
Read a designated amount of time in seconds into buffer.
float
read_time(<type>* buffer, float time);
D) Transport controls.
These controls are designed to model a standard
tape deck's transport controls. The user should call get_status() after
these calls to see if the transport was successful, went past
the beginning or the end of the file, etc.
Go to a designated frame number.
void
goto_frame(unsigned long target_frame);
Go to a specific time in seconds.
void
goto_time(float target_time);
Go forward or backward a specified number of frames. Use a negative value for backwards, use a positive value for forwards.
void
shuttle_frames(long long delta_frames);
Go forward or backward a specified number of seconds. Use a negative value for backwards, use a positive value for forwards.
void
shuttle_time(double delta_time);
Go to the beginning of the soundfile.
void
goto_start() { goto_frame(1); }
Go to the end of the soundfile.
void
goto_end() { goto_frame(length + 1); }
Go to a normalized time value, where 0 represents the beginning of the file, and 1 represents the end of the file.
void
goto_normalized_time(double target_percentage);
Set the current transport direction, FORWARD, or REVERSE.
void
set_direction(int _direction)
{ direction = _direction; }
Get the current transport direction, FORWARD or REVERSE.
int
get_direction() { return direction; }
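The position arithmetic behind these controls can be sketched without any file I/O. The class below is hypothetical and makes several assumptions: positions are 1-based with length + 1 meaning "just past the end" (inferred from goto_start() and goto_end() above), out-of-range targets are clamped, and a boolean flag stands in for the string status. Whether the real class clamps in this way is not stated in the text.

```cpp
#include <cassert>

// Hypothetical sketch of tape-deck style positioning over `length` frames.
class Transport {
public:
    explicit Transport(unsigned long length)
        : length_(length), now_(1), in_range_(true) {}

    // Move to an absolute frame, clamping to [1, length + 1].
    void goto_frame(unsigned long target) {
        in_range_ = (target >= 1 && target <= length_ + 1);
        if (target < 1) target = 1;
        if (target > length_ + 1) target = length_ + 1;
        now_ = target;
    }

    // Move relative to the current frame; negative values move backward.
    void shuttle_frames(long long delta) {
        const long long target = static_cast<long long>(now_) + delta;
        if (target < 1) { now_ = 1; in_range_ = false; }
        else goto_frame(static_cast<unsigned long>(target));
    }

    // Map 0 to the first frame and 1 to the end of the file.
    void goto_normalized_time(double fraction) {
        if (fraction < 0.0) { now_ = 1; in_range_ = false; return; }
        if (fraction > 1.0) { now_ = length_ + 1; in_range_ = false; return; }
        goto_frame(1 + static_cast<unsigned long>(fraction * static_cast<double>(length_)));
    }

    unsigned long now() const { return now_; }
    bool in_range() const { return in_range_; }  // false if the last move was clamped

private:
    unsigned long length_;
    unsigned long now_;
    bool in_range_;
};
```

In a buffered implementation, each of these moves would also have to decide whether the target frame is still inside the in-memory block or whether the block must be reloaded from disk.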
IV. Examples
The following table lists some examples that
make exclusive use of class Aiff_file.
V. Future directions
An improvement to this code would be an append mode,
in which a user could both read
from and write to an AIFF file. As the Silicon Graphics audio libraries
currently stand, no facilities exist that can alter
an existing soundfile in place.