Defines a ring buffer and utility functions that prepare input audio samples for a model.
It maintains a ring buffer that holds the input audio data. Clients feed input audio data via the `load` methods and access the aggregated audio samples via the `getTensorBuffer` method.
Note that this class can only handle input audio in Float (in AudioFormat.ENCODING_PCM_FLOAT) or Short (in AudioFormat.ENCODING_PCM_16BIT). Internally it converts and stores all the audio samples in PCM Float encoding.
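The conversion from 16-bit PCM to PCM Float can be pictured with the following minimal sketch. It is not the library's code: the class name `Pcm16ToFloatSketch`, the method `toFloat`, and the scaling divisor of 32768 are assumptions chosen so that every result lands in [-1, 1]; the divisor the library actually uses is not stated on this page.

```java
import java.util.Arrays;

// Hypothetical helper, not part of TensorAudio.
final class Pcm16ToFloatSketch {
    // Assumed scaling factor: 2^15, so Short.MIN_VALUE maps to exactly -1.0f.
    private static final float SCALE = 32768f;

    // Converts 16-bit PCM samples to floats in the range [-1, 1].
    static float[] toFloat(short[] src) {
        float[] out = new float[src.length];
        for (int i = 0; i < src.length; i++) {
            out[i] = src[i] / SCALE;
        }
        return out;
    }

    public static void main(String[] args) {
        short[] pcm = {0, 16384, Short.MIN_VALUE, Short.MAX_VALUE};
        System.out.println(Arrays.toString(toFloat(pcm)));
    }
}
```

Input that is already in PCM Float encoding needs no such step and would be stored as-is.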
Typical usage in Kotlin
    val tensor = TensorAudio.create(format, modelInputLength)
    tensor.load(newData)
    interpreter.run(tensor.getTensorBuffer(), outputBuffer)
Another sample usage with AudioRecord
    val tensor = TensorAudio.create(format, modelInputLength)
    Timer().scheduleAtFixedRate(delay, period) {
        tensor.load(audioRecord)
        interpreter.run(tensor.getTensorBuffer(), outputBuffer)
    }
Nested Classes
class | TensorAudio.TensorAudioFormat | Wraps a few constants describing the format of the incoming audio samples, namely number of channels and the sample rate. |
Public Methods
| Return type | Method | Description |
|---|---|---|
| static TensorAudio | create(AudioFormat format, int sampleCounts) | Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannelCount(). |
| static TensorAudio | create(TensorAudio.TensorAudioFormat format, int sampleCounts) | Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannels(). |
| TensorAudio.TensorAudioFormat | | |
| TensorBuffer | getTensorBuffer() | Returns a float TensorBuffer holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT, i.e. values are in the range of [-1, 1]. |
| void | load(short[] src) | Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer. |
| void | load(float[] src, int offsetInFloat, int sizeInFloat) | Stores the input audio samples src in the ring buffer. |
| void | load(short[] src, int offsetInShort, int sizeInShort) | Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer. |
| int | load(AudioRecord record) | Loads the latest data from the AudioRecord in a non-blocking way. |
| void | load(float[] src) | Stores the input audio samples src in the ring buffer. |
Public Methods
public static TensorAudio create (AudioFormat format, int sampleCounts)
Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannelCount().
Parameters
| Parameter | Description |
|---|---|
| format | the AudioFormat required by the TFLite model. It defines the number of channels and sample rate. |
| sampleCounts | the number of samples to be fed into the model |
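To illustrate how a ring buffer of size sampleCounts * channel count behaves as data is loaded, here is a minimal sketch. It is not the library's implementation: the class `FloatRingBufferSketch` and its methods `capacity`, `load`, and `snapshot` are hypothetical names, and the choice to start zero-filled and return values oldest-to-newest is an assumption.

```java
import java.util.Arrays;

// Hypothetical illustration of a fixed-size float ring buffer.
final class FloatRingBufferSketch {
    private final float[] data;
    private int next = 0; // index where the next value will be written

    FloatRingBufferSketch(int sampleCounts, int channels) {
        if (sampleCounts <= 0 || channels <= 0) {
            throw new IllegalArgumentException("sampleCounts and channels must be positive");
        }
        // One slot per sample per channel, as in the size formula above.
        data = new float[sampleCounts * channels];
    }

    int capacity() {
        return data.length;
    }

    // Appends values, overwriting the oldest ones once the buffer is full.
    void load(float[] src) {
        for (float value : src) {
            data[next] = value;
            next = (next + 1) % data.length;
        }
    }

    // Returns a copy of the contents ordered from oldest to newest.
    float[] snapshot() {
        float[] out = new float[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[(next + i) % data.length];
        }
        return out;
    }

    public static void main(String[] args) {
        FloatRingBufferSketch buffer = new FloatRingBufferSketch(2, 2);
        buffer.load(new float[] {1f, 2f, 3f});
        buffer.load(new float[] {4f, 5f});
        System.out.println(Arrays.toString(buffer.snapshot()));
    }
}
```

With this layout, a caller that keeps loading small chunks always sees the most recent sampleCounts * channels values, which is the property a fixed-length model input needs.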
public static TensorAudio create (TensorAudio.TensorAudioFormat format, int sampleCounts)
Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannels().
Parameters
| Parameter | Description |
|---|---|
| format | the expected TensorAudio.TensorAudioFormat of audio data loaded into this class. |
| sampleCounts | the number of samples to be fed into the model |
public TensorBuffer getTensorBuffer ()
Returns a float TensorBuffer holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT, i.e. values are in the range of [-1, 1].
public void load (short[] src)
Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer.
Parameters
| Parameter | Description |
|---|---|
| src | input audio samples in AudioFormat.ENCODING_PCM_16BIT. For multi-channel input, the array is interleaved. |
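As an illustration of what an interleaved multi-channel array looks like, the sketch below builds a stereo array from two per-channel arrays. The class `InterleaveSketch` and the method `interleave` are hypothetical helpers, not part of TensorAudio; the left-then-right channel order is an assumption for the example.

```java
import java.util.Arrays;

// Hypothetical helper showing the interleaved layout for two channels.
final class InterleaveSketch {
    // Produces [L0, R0, L1, R1, ...] from two equally long channel arrays.
    static short[] interleave(short[] left, short[] right) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("channels must have the same length");
        }
        short[] out = new short[left.length * 2];
        for (int i = 0; i < left.length; i++) {
            out[2 * i] = left[i];
            out[2 * i + 1] = right[i];
        }
        return out;
    }

    public static void main(String[] args) {
        short[] stereo = interleave(new short[] {1, 2}, new short[] {10, 20});
        System.out.println(Arrays.toString(stereo));
    }
}
```

An array laid out this way is the kind of input the src parameter describes for multi-channel audio.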
public void load (float[] src, int offsetInFloat, int sizeInFloat)
Stores the input audio samples src in the ring buffer.
Parameters
| Parameter | Description |
|---|---|
| src | input audio samples in AudioFormat.ENCODING_PCM_FLOAT. For multi-channel input, the array is interleaved. |
| offsetInFloat | starting position in the src array |
| sizeInFloat | the number of float values to be copied |

Throws
| Exception | Condition |
|---|---|
| IllegalArgumentException | for incompatible audio format or incorrect input size |
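The offset and size arguments select a sub-range of src. A minimal sketch of that selection, with a range check that raises IllegalArgumentException, might look like the following. The class `LoadRangeSketch` and the method `checkedSlice` are hypothetical, and the exact checks the library performs are an assumption; this page only states that an incorrect input size is rejected.

```java
import java.util.Arrays;

// Hypothetical helper, not part of TensorAudio.
final class LoadRangeSketch {
    // Returns the sizeInFloat values of src starting at offsetInFloat.
    static float[] checkedSlice(float[] src, int offsetInFloat, int sizeInFloat) {
        if (src == null || offsetInFloat < 0 || sizeInFloat < 0
                || offsetInFloat > src.length - sizeInFloat) {
            throw new IllegalArgumentException("offset/size out of range for src");
        }
        return Arrays.copyOfRange(src, offsetInFloat, offsetInFloat + sizeInFloat);
    }

    public static void main(String[] args) {
        float[] src = {0.1f, 0.2f, 0.3f, 0.4f};
        System.out.println(Arrays.toString(checkedSlice(src, 1, 2)));
    }
}
```

Writing the bound as `offsetInFloat > src.length - sizeInFloat` avoids the integer overflow that `offsetInFloat + sizeInFloat > src.length` could hit for very large arguments.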
public void load (short[] src, int offsetInShort, int sizeInShort)
Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer.
Parameters
| Parameter | Description |
|---|---|
| src | input audio samples in AudioFormat.ENCODING_PCM_16BIT. For multi-channel input, the array is interleaved. |
| offsetInShort | starting position in the src array |
| sizeInShort | the number of short values to be copied |

Throws
| Exception | Condition |
|---|---|
| IllegalArgumentException | if the source array can't be copied |
public int load (AudioRecord record)
Loads the latest data from the AudioRecord in a non-blocking way. Only ENCODING_PCM_16BIT and ENCODING_PCM_FLOAT are supported.
Parameters
| Parameter | Description |
|---|---|
| record | an instance of AudioRecord |

Returns
- the number of captured audio values, whose size is channelCount * sampleCount. If there was no new data in the AudioRecord or an error occurred, this method returns 0.

Throws
| Exception | Condition |
|---|---|
| IllegalArgumentException | for unsupported audio encoding format |
| IllegalStateException | if reading from AudioRecord failed |
public void load (float[] src)
Stores the input audio samples src in the ring buffer.
Parameters
| Parameter | Description |
|---|---|
| src | input audio samples in AudioFormat.ENCODING_PCM_FLOAT. For multi-channel input, the array is interleaved. |