Defines a ring buffer and some utility functions to prepare the input audio samples.
It maintains a ring buffer to hold input audio data. Clients can feed input audio data via the `load` methods and access the aggregated audio samples via the `getTensorBuffer` method.
Note that this class can only handle input audio in Short (AudioFormat.ENCODING_PCM_16BIT) or Float (AudioFormat.ENCODING_PCM_FLOAT). Internally it converts and stores all the audio samples in PCM Float encoding.
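To make the ring-buffer behaviour concrete, here is a minimal Java sketch of a fixed-capacity float buffer that keeps only the most recent samples. It is not the library's implementation: the class name `FloatRingSketch`, the `snapshot` method, and the choice to zero-pad the front of a partially filled buffer are all assumptions made for illustration.

```java
/** Hypothetical fixed-capacity ring buffer; illustrative only, not TensorAudio itself. */
final class FloatRingSketch {
    private final float[] data;
    private int next = 0;    // index where the next sample will be written
    private int filled = 0;  // number of valid samples, capped at the capacity

    /** Capacity mirrors the documented size: sampleCounts * channels. */
    FloatRingSketch(int sampleCounts, int channels) {
        data = new float[sampleCounts * channels];
    }

    /** Appends samples, overwriting the oldest ones once the buffer is full. */
    void load(float[] src) {
        for (float v : src) {
            data[next] = v;
            next = (next + 1) % data.length;
            if (filled < data.length) {
                filled++;
            }
        }
    }

    /**
     * Returns a copy of the contents, oldest sample first.
     * Assumption of this sketch: slots that were never written stay 0
     * and are placed at the front of the result.
     */
    float[] snapshot() {
        float[] out = new float[data.length];
        int start = (next - filled + data.length) % data.length;
        int pad = data.length - filled;
        for (int i = 0; i < filled; i++) {
            out[pad + i] = data[(start + i) % data.length];
        }
        return out;
    }
}
```

With a capacity of four values, loading `{1, 2, 3}` and then `{4, 5}` leaves the four most recent values `{2, 3, 4, 5}` in the buffer; the oldest value is dropped.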
Typical usage in Kotlin
val tensor = TensorAudio.create(format, modelInputLength)
tensor.load(newData)
interpreter.run(tensor.getTensorBuffer(), outputBuffer)
Another sample usage with AudioRecord
val tensor = TensorAudio.create(format, modelInputLength)
Timer().scheduleAtFixedRate(delay, period) {
    tensor.load(audioRecord)
    interpreter.run(tensor.getTensorBuffer(), outputBuffer)
}
Nested Classes
| class | TensorAudio.TensorAudioFormat | Wraps a few constants describing the format of the incoming audio samples, namely the number of channels and the sample rate. |
Public Methods
| static TensorAudio | create(AudioFormat format, int sampleCounts) | Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannelCount(). |
| static TensorAudio | create(TensorAudio.TensorAudioFormat format, int sampleCounts) | Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannels(). |
| TensorAudio.TensorAudioFormat | |
| TensorBuffer | getTensorBuffer() | Returns a float TensorBuffer holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT, i.e. values are in the range of [-1, 1]. |
| void | load(short[] src) | Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer. |
| void | load(float[] src, int offsetInFloat, int sizeInFloat) | Stores the input audio samples src in the ring buffer. |
| void | load(short[] src, int offsetInShort, int sizeInShort) | Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores them in the ring buffer. |
| int | load(AudioRecord record) | Loads latest data from the AudioRecord in a non-blocking way. |
| void | load(float[] src) | Stores the input audio samples src in the ring buffer. |
Public Methods
public static TensorAudio create (AudioFormat format, int sampleCounts)
Creates a TensorAudio instance with a ring buffer whose size is sampleCounts *
format.getChannelCount().
Parameters
| format | the AudioFormat required by the TFLite model. It defines
the number of channels and sample rate. |
|---|---|
| sampleCounts | the number of samples to be fed into the model |
public static TensorAudio create (TensorAudio.TensorAudioFormat format, int sampleCounts)
Creates a TensorAudio instance with a ring buffer whose size is sampleCounts * format.getChannels().
Parameters
| format | the expected TensorAudio.TensorAudioFormat of audio data loaded into this class. |
|---|---|
| sampleCounts | the number of samples to be fed into the model |
public TensorBuffer getTensorBuffer ()
Returns a float TensorBuffer holding all the available audio samples in AudioFormat.ENCODING_PCM_FLOAT i.e. values are in the range of [-1, 1].
public void load (short[] src)
Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores it in the ring
buffer.
Parameters
| src | input audio samples in AudioFormat.ENCODING_PCM_16BIT. For multi-channel input, the array is interleaved. |
|---|---|
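As an illustration of what converting 16-bit PCM to PCM Float involves, the following Java sketch scales each `short` sample into the [-1, 1] range. The class name `Pcm16ToFloatSketch` and the divisor 32768 are assumptions of this sketch; the library's actual scaling constant is not stated in this reference and may differ (for example, it could divide by `Short.MAX_VALUE`).

```java
/** Hypothetical helper; illustrative only, not the library's conversion routine. */
final class Pcm16ToFloatSketch {
    /**
     * Scales 16-bit PCM samples to floats in [-1, 1).
     * Assumption: divide by 32768 so that Short.MIN_VALUE maps exactly to -1.0.
     */
    static float[] toFloat(short[] src) {
        float[] out = new float[src.length];
        for (int i = 0; i < src.length; i++) {
            out[i] = src[i] / 32768.0f;
        }
        return out;
    }
}
```

Interleaving is unaffected by this step: each value is scaled independently, so a stereo array laid out as left, right, left, right keeps the same layout after conversion.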
public void load (float[] src, int offsetInFloat, int sizeInFloat)
Stores the input audio samples src in the ring buffer.
Parameters
| src | input audio samples in AudioFormat.ENCODING_PCM_FLOAT. For
multi-channel input, the array is interleaved. |
|---|---|
| offsetInFloat | starting position in the src array |
| sizeInFloat | the number of float values to be copied |
Throws
| IllegalArgumentException | for incompatible audio format or incorrect input size |
|---|---|
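The offset and size parameters select a sub-range of an interleaved array. The Java sketch below shows one plausible way to validate and copy such a range before storing it. The class name `SubRangeLoadSketch`, the explicit `channels` parameter, and the rule that the size must be a multiple of the channel count are assumptions for illustration; this reference only says that an `IllegalArgumentException` is thrown for an incompatible format or incorrect input size.

```java
import java.util.Arrays;

/** Hypothetical helper; illustrative only, not the library's validation logic. */
final class SubRangeLoadSketch {
    /** Returns src[offsetInFloat, offsetInFloat + sizeInFloat) after basic checks. */
    static float[] slice(float[] src, int offsetInFloat, int sizeInFloat, int channels) {
        if (offsetInFloat < 0 || sizeInFloat < 0 || sizeInFloat > src.length - offsetInFloat) {
            throw new IllegalArgumentException("range is outside the source array");
        }
        // Assumed rule: a whole number of frames (one value per channel) is required.
        if (sizeInFloat % channels != 0) {
            throw new IllegalArgumentException("size must be a multiple of the channel count");
        }
        return Arrays.copyOfRange(src, offsetInFloat, offsetInFloat + sizeInFloat);
    }
}
```

For a stereo array holding three frames, an offset of 2 and a size of 4 selects the second and third frames; a size of 3 would split a frame and is rejected in this sketch.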
public void load (short[] src, int offsetInShort, int sizeInShort)
Converts the input audio samples src to ENCODING_PCM_FLOAT, then stores it in the ring
buffer.
Parameters
| src | input audio samples in AudioFormat.ENCODING_PCM_16BIT. For
multi-channel input, the array is interleaved. |
|---|---|
| offsetInShort | starting position in the src array |
| sizeInShort | the number of short values to be copied |
Throws
| IllegalArgumentException | if the source array can't be copied |
|---|---|
public int load (AudioRecord record)
Loads the latest data from the AudioRecord in a non-blocking way. Only
ENCODING_PCM_16BIT and ENCODING_PCM_FLOAT are supported.
Parameters
| record | an instance of AudioRecord |
|---|---|
Returns
- number of captured audio values whose size is
channelCount * sampleCount. If there was no new data in the AudioRecord or an error occurred, this method will return 0.
Throws
| IllegalArgumentException | for unsupported audio encoding format |
|---|---|
| IllegalStateException | if reading from AudioRecord failed |
public void load (float[] src)
Stores the input audio samples src in the ring buffer.
Parameters
| src | input audio samples in AudioFormat.ENCODING_PCM_FLOAT. For multi-channel input, the array is interleaved. |
|---|---|