C API — Mozilla DeepSpeech 0.9.3 documentation

See also Error codes for the list of error codes and a description of each.

int DS_CreateModel(const char *aModelPath, ModelState **retval)

Creates an object providing an interface to a trained DeepSpeech model.

Return

Zero on success, non-zero on failure.

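Parameters

aModelPath: The path to the frozen model graph.

retval: Output parameter; on success it receives a pointer to the new ModelState.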

void DS_FreeModel(ModelState *ctx)

Frees the resources associated with the model and destroys the model object.

int DS_EnableExternalScorer(ModelState *aCtx, const char *aScorerPath)

Enable decoding using an external scorer.

Return

Zero on success, non-zero on failure (invalid arguments).

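Parameters

aCtx: The ModelState pointer for the model being changed.

aScorerPath: The path to the external scorer file.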

int DS_DisableExternalScorer(ModelState *aCtx)

Disable decoding using an external scorer.

Return

Zero on success, non-zero on failure.

Parameters

aCtx: The ModelState pointer for the model being changed.

int DS_AddHotWord(ModelState *aCtx, const char *word, float boost)

Add a hot-word and its boost.

Return

Zero on success, non-zero on failure (invalid arguments).

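Parameters

aCtx: The ModelState pointer for the model being changed.

word: The hot-word.

boost: The boost. A positive value increases, and a negative value reduces, the chance of the word occurring in a transcription.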

int DS_EraseHotWord(ModelState *aCtx, const char *word)

Remove entry for a hot-word from the hot-words map.

Return

Zero on success, non-zero on failure (invalid arguments).

Parameters

aCtx: The ModelState pointer for the model being changed.

word: The hot-word to remove.

int DS_ClearHotWords(ModelState *aCtx)

Removes all elements from the hot-words map.

Return

Zero on success, non-zero on failure (invalid arguments).

Parameters

aCtx: The ModelState pointer for the model being changed.

int DS_SetScorerAlphaBeta(ModelState *aCtx, float aAlpha, float aBeta)

Set hyperparameters alpha and beta of the external scorer.

Return

Zero on success, non-zero on failure.

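Parameters

aCtx: The ModelState pointer for the model being changed.

aAlpha: The alpha hyperparameter of the decoder (language model weight).

aBeta: The beta hyperparameter of the decoder (word insertion weight).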
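To make the roles of alpha and beta concrete, the sketch below assumes the usual Deep Speech decoding objective: a candidate's score is its acoustic log-probability, plus alpha times its language-model log-probability, plus beta times its word count. The function combined_score and its parameter names are hypothetical and not part of the DeepSpeech API; the library applies these weights internally during decoding.

```c
/* Hypothetical helper, not part of the DeepSpeech API. It illustrates how
 * alpha (language-model weight) and beta (word-insertion weight) are
 * assumed to enter the score of one candidate transcript. */
double combined_score(double log_p_acoustic, double log_p_lm,
                      unsigned int word_count, double alpha, double beta)
{
    return log_p_acoustic + alpha * log_p_lm + beta * (double)word_count;
}
```

Under this assumption, a larger alpha makes the decoder lean more on the scorer, and a larger beta favors transcripts with more words.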

int DS_GetModelSampleRate(const ModelState *aCtx)

Return the sample rate expected by a model.

Return

Sample rate expected by the model for its input.

Parameters

aCtx: A ModelState pointer created with DS_CreateModel().
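The sample rate is typically needed to size audio buffers. As a minimal sketch, the hypothetical helper below (not part of the DeepSpeech API) converts a duration in milliseconds into a sample count for a given rate, such as the value returned by DS_GetModelSampleRate().

```c
/* Hypothetical helper: number of samples covering `millis` milliseconds
 * at `sample_rate` Hz. Returns 0 for a non-positive sample rate. */
unsigned int samples_for_duration(int sample_rate, unsigned int millis)
{
    if (sample_rate <= 0) {
        return 0;
    }
    /* Widen before multiplying so long durations do not overflow. */
    return (unsigned int)(((unsigned long long)sample_rate * millis) / 1000ULL);
}
```

For example, a 20 ms chunk at 16000 Hz is 320 samples.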

char *DS_SpeechToText(ModelState *aCtx, const short *aBuffer, unsigned int aBufferSize)

Use the DeepSpeech model to convert speech to text.

Return

The STT result. The user is responsible for freeing the string using DS_FreeString(). Returns NULL on error.

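Parameters

aCtx: The ModelState pointer for the model to use.

aBuffer: A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).

aBufferSize: The number of samples in the audio signal.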
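Audio often arrives as raw bytes rather than as an array of short. The sketch below assumes little-endian signed 16-bit mono PCM input, and that aBufferSize is a count of samples rather than bytes. The helper pcm16le_to_samples is hypothetical and not part of the DeepSpeech API.

```c
#include <stddef.h>

/* Hypothetical helper: decode little-endian signed 16-bit PCM bytes into
 * samples. Writes nbytes / 2 samples to `out` (a trailing odd byte is
 * ignored) and returns that sample count. */
unsigned int pcm16le_to_samples(const unsigned char *bytes, size_t nbytes,
                                short *out)
{
    size_t n = nbytes / 2;
    for (size_t i = 0; i < n; ++i) {
        unsigned int u = (unsigned int)bytes[2 * i]
                       | ((unsigned int)bytes[2 * i + 1] << 8);
        /* Map 0..65535 onto -32768..32767 without relying on
         * implementation-defined narrowing conversions. */
        out[i] = (short)(u >= 0x8000u ? (long)u - 0x10000L : (long)u);
    }
    return (unsigned int)n;
}
```

The returned count, not the byte length, is what would be passed as aBufferSize together with the decoded array.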

Metadata *DS_SpeechToTextWithMetadata(ModelState *aCtx, const short *aBuffer, unsigned int aBufferSize, unsigned int aNumResults)

Use the DeepSpeech model to convert speech to text and output results including metadata.

Return

Metadata struct containing multiple CandidateTranscript structs. Each transcript has per-token metadata including timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.

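Parameters

aCtx: The ModelState pointer for the model to use.

aBuffer: A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).

aBufferSize: The number of samples in the audio signal.

aNumResults: The maximum number of CandidateTranscript structs to return. The number actually returned may be smaller.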

int DS_CreateStream(ModelState *aCtx, StreamingState **retval)

Create a new streaming inference state. The streaming state returned by this function can then be passed to DS_FeedAudioContent() and DS_FinishStream().

Return

Zero on success, non-zero on failure.

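Parameters

aCtx: The ModelState pointer for the model to use.

retval: Output parameter; on success it receives an opaque pointer that represents the streaming state.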

void DS_FeedAudioContent(StreamingState *aSctx, const short *aBuffer, unsigned int aBufferSize)

Feed audio samples to an ongoing streaming inference.

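Parameters

aSctx: A streaming state pointer returned by DS_CreateStream().

aBuffer: An array of 16-bit, mono raw audio samples at the appropriate sample rate (matching what the model was trained on).

aBufferSize: The number of samples in aBuffer.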
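Streaming callers usually feed a long recording in fixed-size pieces. As a sketch of the bookkeeping involved, the hypothetical helper below (not part of the DeepSpeech API) computes how many samples the next call should receive.

```c
/* Hypothetical helper: length of the next chunk when `offset` of `total`
 * samples have already been fed and at most `chunk` samples go into each
 * call. Returns 0 once everything has been fed. */
unsigned int next_chunk_len(unsigned int total, unsigned int offset,
                            unsigned int chunk)
{
    if (offset >= total) {
        return 0;
    }
    unsigned int remaining = total - offset;
    return remaining < chunk ? remaining : chunk;
}
```

A feeding loop would call DS_FeedAudioContent(aSctx, aBuffer + offset, len) with len taken from this helper, then advance offset by len until the helper returns 0.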

char *DS_IntermediateDecode(const StreamingState *aSctx)

Compute the intermediate decoding of an ongoing streaming inference.

Return

The STT intermediate result. The user is responsible for freeing the string using DS_FreeString().

Parameters

aSctx: A streaming state pointer returned by DS_CreateStream().

Metadata *DS_IntermediateDecodeWithMetadata(const StreamingState *aSctx, unsigned int aNumResults)

Compute the intermediate decoding of an ongoing streaming inference, return results including metadata.

Return

Metadata struct containing multiple candidate transcripts. Each transcript has per-token metadata including timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.

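Parameters

aSctx: A streaming state pointer returned by DS_CreateStream().

aNumResults: The number of candidate transcripts to return.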

char *DS_FinishStream(StreamingState *aSctx)

Compute the final decoding of an ongoing streaming inference and return the result. Signals the end of an ongoing streaming inference.

Return

The STT result. The user is responsible for freeing the string using DS_FreeString().

Note

This function frees the state pointer (aSctx); do not use the pointer after the call.

Parameters

aSctx: A streaming state pointer returned by DS_CreateStream().

Metadata *DS_FinishStreamWithMetadata(StreamingState *aSctx, unsigned int aNumResults)

Compute the final decoding of an ongoing streaming inference and return results including metadata. Signals the end of an ongoing streaming inference.

Return

Metadata struct containing multiple candidate transcripts. Each transcript has per-token metadata including timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.

Note

This function frees the state pointer (aSctx); do not use the pointer after the call.

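Parameters

aSctx: A streaming state pointer returned by DS_CreateStream().

aNumResults: The number of candidate transcripts to return.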

void DS_FreeStream(StreamingState *aSctx)

Destroy a streaming state without decoding the computed logits. This can be used if you no longer need the result of an ongoing streaming inference and don’t want to perform a costly decode operation.

Note

This function frees the state pointer (aSctx); do not use the pointer after the call.

Parameters

aSctx: A streaming state pointer returned by DS_CreateStream().

void DS_FreeMetadata(Metadata *m)

Free memory allocated for metadata information.

void DS_FreeString(char *str)

Free a char* string returned by the DeepSpeech API.

char *DS_Version()

Returns the version of this library. The returned version is a semantic version (SemVer 2.0.0). The string returned must be freed with DS_FreeString().

Return

The version string.
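Because the version string follows SemVer, its numeric core can be split into major, minor, and patch components. The sketch below is one way to do that. The struct semver type and the parse_semver function are hypothetical names, not part of the DeepSpeech API, and any pre-release or build suffix is ignored.

```c
#include <stdio.h>

/* Hypothetical result type: the numeric fields are meaningful only
 * when `ok` is non-zero. */
struct semver {
    int major;
    int minor;
    int patch;
    int ok;
};

/* Hypothetical helper: parse the leading "MAJOR.MINOR.PATCH" of a version
 * string. Anything after the patch number (e.g. "-alpha.1") is ignored. */
struct semver parse_semver(const char *version)
{
    struct semver v = {0, 0, 0, 0};
    if (version != NULL &&
        sscanf(version, "%d.%d.%d", &v.major, &v.minor, &v.patch) == 3) {
        v.ok = 1;
    }
    return v;
}
```

A caller would pass the string returned by DS_Version() to the helper and then release that string with DS_FreeString(), since the helper copies the numbers out rather than keeping the pointer.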