src/libpocketsphinx/acmod.h File Reference

Acoustic model structures for PocketSphinx.

#include <stdio.h>
#include <cmd_ln.h>
#include <logmath.h>
#include <fe.h>
#include <feat.h>
#include <bitvec.h>
#include <err.h>
#include "ps_mllr.h"
#include "bin_mdef.h"
#include "tmat.h"
#include "hmm.h"

Data Structures

struct  ps_mllr_s
 Feature space linear transform structure.
struct  ps_mgaufuncs_s
struct  ps_mgau_s
struct  acmod_s
 Acoustic model structure.

Defines

#define ps_mgau_base(mg)   ((ps_mgau_t *)(mg))
#define ps_mgau_frame_eval(mg, senscr, senone_active, n_senone_active, feat, frame, compallsen)
#define ps_mgau_transform(mg, mllr)   (*ps_mgau_base(mg)->vt->transform)(mg, mllr)
#define ps_mgau_free(mg)   (*ps_mgau_base(mg)->vt->free)(mg)

Typedefs

typedef enum acmod_state_e acmod_state_t
 States in utterance processing.
typedef struct ps_mgau_s ps_mgau_t
 Acoustic model parameter structure.
typedef struct ps_mgaufuncs_s ps_mgaufuncs_t
typedef struct acmod_s acmod_t

Enumerations

enum  acmod_state_e { ACMOD_IDLE, ACMOD_STARTED, ACMOD_PROCESSING, ACMOD_ENDED }
 States in utterance processing.

Functions

acmod_t * acmod_init (cmd_ln_t *config, logmath_t *lmath, fe_t *fe, feat_t *fcb)
 Initialize an acoustic model.
ps_mllr_t * acmod_update_mllr (acmod_t *acmod, ps_mllr_t *mllr)
 Adapt acoustic model using a linear transform.
int acmod_set_mfcfh (acmod_t *acmod, FILE *logfh)
 Start logging MFCCs to a filehandle.
int acmod_set_rawfh (acmod_t *acmod, FILE *logfh)
 Start logging raw audio to a filehandle.
void acmod_free (acmod_t *acmod)
 Finalize an acoustic model.
int acmod_start_utt (acmod_t *acmod)
 Mark the start of an utterance.
int acmod_end_utt (acmod_t *acmod)
 Mark the end of an utterance.
int acmod_rewind (acmod_t *acmod)
 Rewind the current utterance, allowing it to be rescored.
int acmod_advance (acmod_t *acmod)
 Advance the frame index.
int acmod_set_grow (acmod_t *acmod, int grow_feat)
 Set memory allocation policy for utterance processing.
int acmod_process_raw (acmod_t *acmod, int16 const **inout_raw, size_t *inout_n_samps, int full_utt)
 Feed raw audio data to the acoustic model for scoring.
int acmod_process_cep (acmod_t *acmod, mfcc_t ***inout_cep, int *inout_n_frames, int full_utt)
 Feed acoustic feature data into the acoustic model for scoring.
int acmod_process_feat (acmod_t *acmod, mfcc_t **feat)
 Feed dynamic feature data into the acoustic model for scoring.
int16 const * acmod_score (acmod_t *acmod, int *inout_frame_idx)
 Score one frame of data.
int acmod_best_score (acmod_t *acmod, int *out_best_senid)
 Get best score and senone index for current frame.
void acmod_clear_active (acmod_t *acmod)
 Clear set of active senones.
void acmod_activate_hmm (acmod_t *acmod, hmm_t *hmm)
 Activate senones associated with an HMM.

Detailed Description

Acoustic model structures for PocketSphinx.

Author:
David Huggins-Daines <dhuggins@cs.cmu.edu>

Definition in file acmod.h.


Define Documentation

#define ps_mgau_frame_eval (   mg,
  senscr,
  senone_active,
  n_senone_active,
  feat,
  frame,
  compallsen 
)
Value:
(*ps_mgau_base(mg)->vt->frame_eval)                                 \
    (mg, senscr, senone_active, n_senone_active, feat, frame, compallsen)

Definition at line 113 of file acmod.h.
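
The ps_mgau_* macros all follow one pattern: cast the object to its base type, then call through the function table vt. The sketch below reproduces that pattern with toy types so it compiles on its own. Every toy_ name is a hypothetical stand-in rather than a PocketSphinx symbol, and the sketch assumes the base structure is the first member of each concrete implementation so that the cast is valid.

```c
/* Toy stand-ins for ps_mgau_t / ps_mgaufuncs_t (hypothetical names). */
typedef struct toy_mgau_s toy_mgau_t;

typedef struct toy_mgaufuncs_s {
    const char *name;
    int (*frame_eval)(toy_mgau_t *mg, int frame);
} toy_mgaufuncs_t;

struct toy_mgau_s {
    toy_mgaufuncs_t *vt;   /* function table */
    int frame_idx;         /* last frame evaluated */
};

/* Same shape as ps_mgau_base() and ps_mgau_frame_eval(). */
#define toy_mgau_base(mg) ((toy_mgau_t *)(mg))
#define toy_mgau_frame_eval(mg, frame) \
    (*toy_mgau_base(mg)->vt->frame_eval)(toy_mgau_base(mg), frame)

/* One concrete implementation; the base must be its first member. */
typedef struct toy_semi_s {
    toy_mgau_t base;
    int bias;
} toy_semi_t;

static int toy_semi_frame_eval(toy_mgau_t *mg, int frame)
{
    toy_semi_t *s = (toy_semi_t *)mg;   /* recover the concrete type */
    mg->frame_idx = frame;
    return frame + s->bias;             /* placeholder "score" */
}

static toy_mgaufuncs_t toy_semi_funcs = { "toy_semi", toy_semi_frame_eval };

static void toy_semi_init(toy_semi_t *s, int bias)
{
    s->base.vt = &toy_semi_funcs;
    s->base.frame_idx = -1;
    s->bias = bias;
}
```

Callers only ever see the macro, so a different implementation can be swapped in by pointing vt at another function table.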


Enumeration Type Documentation

enum acmod_state_e

States in utterance processing.

Enumerator:
ACMOD_IDLE 

Not in an utterance.

ACMOD_STARTED 

Utterance started, no data yet.

ACMOD_PROCESSING 

Utterance in progress.

ACMOD_ENDED 

Utterance ended, still buffering.

Definition at line 66 of file acmod.h.
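
As a rough illustration of how these states might follow one another, the sketch below mirrors the four enumerators in a toy type and steps through them. The transition rules are an assumption inferred from the enumerator descriptions above, not taken from acmod.c, and all toy_/TOY_ names are hypothetical.

```c
/* Toy mirror of acmod_state_e (hypothetical names). */
typedef enum toy_state_e {
    TOY_IDLE,        /* Not in an utterance. */
    TOY_STARTED,     /* Utterance started, no data yet. */
    TOY_PROCESSING,  /* Utterance in progress. */
    TOY_ENDED        /* Utterance ended, still buffering. */
} toy_state_t;

typedef enum toy_event_e {
    TOY_EV_START,    /* e.g. a call like acmod_start_utt() */
    TOY_EV_DATA,     /* e.g. a call to one of the acmod_process_*() functions */
    TOY_EV_END       /* e.g. a call like acmod_end_utt() */
} toy_event_t;

/* Assumed transitions: start -> STARTED, first data -> PROCESSING,
 * end -> ENDED. Data outside a started utterance leaves the state as is. */
static toy_state_t toy_next_state(toy_state_t cur, toy_event_t ev)
{
    switch (ev) {
    case TOY_EV_START:
        return TOY_STARTED;
    case TOY_EV_DATA:
        return (cur == TOY_STARTED) ? TOY_PROCESSING : cur;
    case TOY_EV_END:
        return TOY_ENDED;
    }
    return cur;
}
```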


Function Documentation

int acmod_advance ( acmod_t *  acmod  ) 

Advance the frame index.

This function moves to the next frame of input data. Subsequent calls to acmod_score() will return scores for that frame, until the next call to acmod_advance().

Returns:
New frame index.
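
The score-then-advance contract can be sketched with a toy model: scoring is repeatable for the current frame, and only advancing moves on. The toy_ types and functions below are hypothetical stand-ins for acmod_t, acmod_score() and acmod_advance(); they assume a fixed, already-filled frame buffer and ignore senone activation.

```c
#include <stddef.h>

#define TOY_N_FRAMES  3
#define TOY_N_SENONES 2

/* Toy stand-in for acmod_t: a fixed set of already-buffered frames. */
typedef struct toy_acmod_s {
    short scores[TOY_N_FRAMES][TOY_N_SENONES];
    int n_frames;      /* frames buffered */
    int output_frame;  /* index of the frame scored next */
} toy_acmod_t;

/* Like acmod_score(): scores for the current frame, or NULL if none. */
static const short *toy_score(toy_acmod_t *a, int *out_frame_idx)
{
    if (a->output_frame >= a->n_frames)
        return NULL;
    if (out_frame_idx != NULL)
        *out_frame_idx = a->output_frame;
    return a->scores[a->output_frame];
}

/* Like acmod_advance(): move to the next frame, return the new index. */
static int toy_advance(toy_acmod_t *a)
{
    return ++a->output_frame;
}

/* Search-style loop: score the current frame, use it, then advance. */
static int toy_sum_first_senone(toy_acmod_t *a)
{
    const short *senscr;
    int total = 0;

    while ((senscr = toy_score(a, NULL)) != NULL) {
        total += senscr[0];
        toy_advance(a);
    }
    return total;
}
```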

Definition at line 781 of file acmod.c.

References acmod_s::feat_outidx, ps_mgau_s::frame_idx, acmod_s::mgau, acmod_s::n_feat_alloc, acmod_s::n_feat_frame, and acmod_s::output_frame.

acmod_t* acmod_init ( cmd_ln_t *  config,
logmath_t *  lmath,
fe_t *  fe,
feat_t *  fcb 
)

Initialize an acoustic model.

Parameters:
config a command-line object containing parameters. This pointer is not retained by this object.
lmath global log-math parameters.
fe a previously-initialized acoustic feature module to use, or NULL to create one automatically. If this is supplied and its parameters do not match those in the acoustic model, this function will fail. This pointer is not retained.
fcb a previously-initialized dynamic feature module to use, or NULL to create one automatically. If this is supplied and its parameters do not match those in the acoustic model, this function will fail. This pointer is not retained.
Returns:
a newly initialized acmod_t, or NULL on failure.

Definition at line 225 of file acmod.c.

References acmod_free(), acmod_s::compallsen, acmod_s::config, acmod_s::fcb, acmod_s::fe, acmod_s::feat_buf, acmod_s::lmath, acmod_s::log_zero, acmod_s::mdef, acmod_s::mfc_buf, acmod_s::n_feat_alloc, acmod_s::n_mfc_alloc, acmod_s::senone_active, acmod_s::senone_active_vec, acmod_s::senone_scores, and acmod_s::state.

Referenced by ps_reinit().

int acmod_process_cep ( acmod_t *  acmod,
mfcc_t ***  inout_cep,
int *  inout_n_frames,
int  full_utt 
)

Feed acoustic feature data into the acoustic model for scoring.

Parameters:
inout_cep In: pointer to buffer of features. Out: pointer to next frame to be read.
inout_n_frames In: number of frames available. Out: number of frames remaining.
full_utt If non-zero, this block represents a full utterance and should be processed as such.
Returns:
Number of frames of data processed.

Definition at line 647 of file acmod.c.

References ACMOD_ENDED, ACMOD_STARTED, acmod_s::fcb, acmod_s::feat_buf, acmod_s::feat_outidx, acmod_s::grow_feat, acmod_s::mfcfh, acmod_s::n_feat_alloc, acmod_s::n_feat_frame, and acmod_s::state.

int acmod_process_feat ( acmod_t *  acmod,
mfcc_t **  feat 
)

Feed dynamic feature data into the acoustic model for scoring.

Unlike acmod_process_raw() and acmod_process_cep(), this function accepts a single frame at a time. This is because there is no need to do buffering when using dynamic features as input. However, if the dynamic feature buffer is full, this function will fail, so you should either always check the return value, or always pair a call to it with a call to acmod_score().

Parameters:
feat Pointer to one frame of dynamic features.
Returns:
Number of frames processed (either 0 or 1).
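
The advice to check the return value or pair each call with acmod_score() can be illustrated with a toy fixed-size buffer. This is a sketch under assumptions: the toy_ names, the capacity TOY_CAP, and the use of a single float per frame are all hypothetical, and the real buffer management in acmod.c may differ.

```c
#define TOY_CAP 2   /* hypothetical buffer capacity, in frames */

typedef struct toy_featbuf_s {
    float frames[TOY_CAP];
    int n_frame;   /* frames waiting to be scored */
    int outidx;    /* position of the next frame to score */
} toy_featbuf_t;

/* Like acmod_process_feat(): accept one frame and return 1,
 * or return 0 if the buffer is full. */
static int toy_process_feat(toy_featbuf_t *b, float feat)
{
    if (b->n_frame == TOY_CAP)
        return 0;
    b->frames[(b->outidx + b->n_frame) % TOY_CAP] = feat;
    ++b->n_frame;
    return 1;
}

/* Stand-in for acmod_score() followed by acmod_advance():
 * consume one buffered frame, or return 0 if none is waiting. */
static int toy_score_and_advance(toy_featbuf_t *b, float *out)
{
    if (b->n_frame == 0)
        return 0;
    *out = b->frames[b->outidx];
    b->outidx = (b->outidx + 1) % TOY_CAP;
    --b->n_frame;
    return 1;
}

/* Feed every frame, scoring whenever the buffer refuses new input. */
static int toy_feed_all(toy_featbuf_t *b, const float *feats, int n, float *sum)
{
    int i, n_scored = 0;
    float s;

    *sum = 0.0f;
    for (i = 0; i < n; ++i) {
        while (toy_process_feat(b, feats[i]) == 0) {
            /* Buffer full: drain one frame, then retry this input. */
            if (toy_score_and_advance(b, &s)) {
                *sum += s;
                ++n_scored;
            }
        }
    }
    /* Drain whatever is still buffered. */
    while (toy_score_and_advance(b, &s)) {
        *sum += s;
        ++n_scored;
    }
    return n_scored;
}
```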

Definition at line 739 of file acmod.c.

References acmod_s::fcb, acmod_s::feat_buf, acmod_s::feat_outidx, acmod_s::grow_feat, acmod_s::n_feat_alloc, and acmod_s::n_feat_frame.

int acmod_process_raw ( acmod_t *  acmod,
int16 const **  inout_raw,
size_t *  inout_n_samps,
int  full_utt 
)

Feed raw audio data to the acoustic model for scoring.

Parameters:
inout_raw In: pointer to buffer of raw samples. Out: pointer to next sample to be read.
inout_n_samps In: number of samples available. Out: number of samples remaining.
full_utt If non-zero, this block represents a full utterance and should be processed as such.
Returns:
Number of frames of data processed.
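
The in/out convention of inout_raw and inout_n_samps lends itself to a simple caller loop. The sketch below shows the convention with a toy consumer; toy_process_raw(), TOY_FRAME_SHIFT and the max_frames limit are hypothetical, and unlike a real front end this toy simply leaves a trailing partial frame unread.

```c
#include <stddef.h>

#define TOY_FRAME_SHIFT 160  /* hypothetical samples per frame */

/* Same in/out convention as acmod_process_raw(): on return, *inout_raw
 * points at the first unread sample and *inout_n_samps holds the number
 * of samples left. Returns the number of whole frames consumed, at most
 * max_frames per call (standing in for limited internal buffer space). */
static int toy_process_raw(const short **inout_raw, size_t *inout_n_samps,
                           int max_frames)
{
    int n = 0;

    while (n < max_frames && *inout_n_samps >= TOY_FRAME_SHIFT) {
        *inout_raw += TOY_FRAME_SHIFT;
        *inout_n_samps -= TOY_FRAME_SHIFT;
        ++n;
    }
    return n;
}

/* Caller loop: keep feeding until no further frame can be consumed. */
static int toy_feed_buffer(const short *buf, size_t n_samps,
                           size_t *out_leftover)
{
    const short *ptr = buf;
    int total = 0, n;

    while ((n = toy_process_raw(&ptr, &n_samps, 2)) > 0)
        total += n;
    *out_leftover = n_samps;  /* samples that did not fill a frame */
    return total;
}
```

Because the callee updates both the pointer and the count, the caller never has to compute offsets itself; it only needs to stop when a call makes no progress.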

Definition at line 578 of file acmod.c.

References acmod_s::fe, acmod_s::mfc_buf, acmod_s::mfc_outidx, acmod_s::n_mfc_alloc, acmod_s::n_mfc_frame, and acmod_s::rawfh.

Referenced by ps_process_raw().

int acmod_rewind ( acmod_t *  acmod  ) 

Rewind the current utterance, allowing it to be rescored.

After calling this function, the internal frame index is reset, and acmod_score() will return scores starting at the first frame of the current utterance. Currently, acmod_set_grow() must have been called to enable growing the feature buffer in order for this to work. In the future, senone scores may be cached instead.

Returns:
0 for success, <0 for failure (if the utterance can't be rewound due to no feature or score data available)
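
The dependency between the growing policy and rewinding can be sketched with a toy utterance buffer: frames are only all retained when growing is enabled, so only then can the output position be reset to the first frame. All toy_ names are hypothetical, and the doubling allocation strategy is an assumption made for the example.

```c
#include <stdlib.h>

typedef struct toy_utt_s {
    int *feat;         /* buffered frames */
    int n_alloc;       /* allocated capacity, in frames */
    int n_frame;       /* frames stored */
    int output_frame;  /* next frame to score */
    int grow_feat;     /* non-zero: keep every frame of the utterance */
} toy_utt_t;

/* Like acmod_set_grow(): set the policy, return the previous one. */
static int toy_set_grow(toy_utt_t *u, int grow_feat)
{
    int prev = u->grow_feat;
    u->grow_feat = grow_feat;
    return prev;
}

/* Append a frame. Returns 1 on success, 0 if the buffer is full and
 * growing is disabled (or if allocation fails). */
static int toy_add_frame(toy_utt_t *u, int frame)
{
    if (u->n_frame == u->n_alloc) {
        int n_new;
        int *p;

        if (!u->grow_feat)
            return 0;
        n_new = u->n_alloc ? u->n_alloc * 2 : 4;
        p = realloc(u->feat, (size_t)n_new * sizeof(*p));
        if (p == NULL)
            return 0;
        u->feat = p;
        u->n_alloc = n_new;
    }
    u->feat[u->n_frame++] = frame;
    return 1;
}

/* Like acmod_rewind(): 0 on success, <0 if frames were not retained. */
static int toy_rewind(toy_utt_t *u)
{
    if (!u->grow_feat)
        return -1;
    u->output_frame = 0;
    return 0;
}
```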

Definition at line 762 of file acmod.c.

References acmod_s::feat_outidx, ps_mgau_s::frame_idx, acmod_s::mgau, acmod_s::n_feat_alloc, acmod_s::n_feat_frame, acmod_s::output_frame, and acmod_s::senscr_frame.

int16 const* acmod_score ( acmod_t *  acmod,
int *  inout_frame_idx 
)

Score one frame of data.

Parameters:
inout_frame_idx Input: frame index to score, or -1 or NULL to obtain scores for the most recent frame. Output: frame index corresponding to this set of scores.
Returns:
Array of senone scores for this frame, or NULL if no frame is available for scoring (such as if a frame index is requested that is not yet or no longer available). The data pointed to persists only until the next call to acmod_score() or acmod_advance().

Definition at line 793 of file acmod.c.

References acmod_s::compallsen, acmod_s::feat_buf, acmod_s::feat_outidx, acmod_s::mgau, acmod_s::n_feat_alloc, acmod_s::n_feat_frame, acmod_s::n_senone_active, acmod_s::output_frame, acmod_s::senone_active, acmod_s::senone_scores, and acmod_s::senscr_frame.

Referenced by ngram_fwdflat_search(), and ngram_fwdtree_search().

int acmod_set_grow ( acmod_t *  acmod,
int  grow_feat 
)

Set memory allocation policy for utterance processing.

Parameters:
grow_feat If non-zero, the internal dynamic feature buffer will expand as necessary to encompass any amount of data fed to the model.
Returns:
previous allocation policy.

Definition at line 384 of file acmod.c.

References acmod_s::grow_feat, and acmod_s::n_feat_alloc.

Referenced by ps_process_raw(), and ps_reinit().

int acmod_set_mfcfh ( acmod_t *  acmod,
FILE *  logfh 
)

Start logging MFCCs to a filehandle.

Parameters:
acmod Acoustic model object.
logfh Filehandle to log to.
Returns:
0 for success, <0 on error.

Definition at line 346 of file acmod.c.

References acmod_s::mfcfh.

Referenced by ps_start_utt().

int acmod_set_rawfh ( acmod_t *  acmod,
FILE *  logfh 
)

Start logging raw audio to a filehandle.

Parameters:
acmod Acoustic model object.
logfh Filehandle to log to.
Returns:
0 for success, <0 on error.

Definition at line 358 of file acmod.c.

References acmod_s::rawfh.

Referenced by ps_start_utt().

ps_mllr_t* acmod_update_mllr ( acmod_t *  acmod,
ps_mllr_t *  mllr 
)

Adapt acoustic model using a linear transform.

Parameters:
mllr The new transform to use, or NULL to update the existing transform. The decoder retains ownership of this pointer, so you should not attempt to free it manually. Use ps_mllr_retain() if you wish to reuse it elsewhere.
Returns:
The updated transform object for this decoder, or NULL on failure.
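
The ownership rule for mllr is easiest to see with reference counts. The sketch below uses toy types only: toy_mllr_retain() and toy_mllr_free() play the roles of ps_mllr_retain() and ps_mllr_free(), and toy_update_mllr() is a hypothetical stand-in that assumes the decoder drops its old transform when handed a new one and keeps the existing one when passed NULL.

```c
#include <stdlib.h>

typedef struct toy_mllr_s {
    int refcount;
} toy_mllr_t;

static toy_mllr_t *toy_mllr_new(void)
{
    toy_mllr_t *m = calloc(1, sizeof(*m));
    if (m != NULL)
        m->refcount = 1;   /* the creator holds the first reference */
    return m;
}

/* Plays the role of ps_mllr_retain(): take an extra reference. */
static toy_mllr_t *toy_mllr_retain(toy_mllr_t *m)
{
    ++m->refcount;
    return m;
}

/* Plays the role of ps_mllr_free(): drop one reference and release the
 * object when none remain. Returns the remaining reference count. */
static int toy_mllr_free(toy_mllr_t *m)
{
    if (m == NULL)
        return 0;
    if (--m->refcount > 0)
        return m->refcount;
    free(m);
    return 0;
}

typedef struct toy_decoder_s {
    toy_mllr_t *mllr;
} toy_decoder_t;

/* Stand-in for acmod_update_mllr(): the decoder takes over the
 * reference passed in; NULL means "keep the existing transform". */
static toy_mllr_t *toy_update_mllr(toy_decoder_t *d, toy_mllr_t *mllr)
{
    if (mllr != NULL && mllr != d->mllr) {
        toy_mllr_free(d->mllr);   /* release the previous transform */
        d->mllr = mllr;
    }
    return d->mllr;
}
```

A caller that wants to keep using the transform after handing it over retains it first, then frees its own reference when done; the decoder's reference keeps the object alive in the meantime.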

Definition at line 335 of file acmod.c.

References acmod_s::mgau, acmod_s::mllr, and ps_mllr_free().

Referenced by ps_update_mllr().