utt2dur generation succeeds when training a TDNN from scratch, but fails when fine-tuning a TDNN from a pre-trained model #3297

Closed
@ben-8878


@danpovey
When I train a 'chain' model from scratch, everything goes smoothly.
##########################training info##########################################
LOG (copy-transition-model[5.5]:main():copy-transition-model.cc:62) Copied transition model.
2019-04-24 09:20:33,566 [steps/nnet3/chain/train.py:337 - train - INFO ] Initializing a basic network for estimating preconditioning matrix
2019-04-24 09:20:33,731 [steps/nnet3/chain/train.py:359 - train - INFO ] Generating egs
steps/nnet3/chain/get_egs.sh --frames-overlap-per-eg 0 --cmd run.pl --cmvn-opts --norm-means=false --norm-vars=false --online-ivector-dir --left-context 23 --right-context 23 --left-context-initial -1 --right-context-final -1 --left-tolerance 5 --right-tolerance 5 --frame-subsampling-factor 3 --alignment-subsampling-factor 3 --stage -10 --frames-per-iter 1500000 --frames-per-eg 150,110,90 --srand 0 data/train_hires exp/chain/tdnn_1a_sp exp/tri4_sp_lats exp/chain/tdnn_1a_sp/egs
steps/nnet3/chain/get_egs.sh: creating egs. To ensure they are not deleted later you can do: touch exp/chain/tdnn_1a_sp/egs/.nodelete
steps/nnet3/chain/get_egs.sh: feature type is raw
tree-info exp/chain/tdnn_1a_sp/tree
steps/nnet3/chain/get_egs.sh: working out number of frames of training data
steps/nnet3/chain/get_egs.sh: working out feature dim
steps/nnet3/chain/get_egs.sh: creating 236 archives, each with 16600 egs, with
steps/nnet3/chain/get_egs.sh: 150,110,90 labels per example, and (left,right) context = (23,23)
steps/nnet3/chain/get_egs.sh: Getting validation and training subset examples in background.
steps/nnet3/chain/get_egs.sh: Generating training examples on disk
... Getting subsets of validation examples for diagnostics and combination.
steps/nnet3/chain/get_egs.sh: recombining and shuffling order of archives on disk
steps/nnet3/chain/get_egs.sh: removing temporary archives
steps/nnet3/chain/get_egs.sh: removing temporary alignments, lattices and transforms
steps/nnet3/chain/get_egs.sh: Finished preparing training examples
2019-04-24 09:52:51,544 [steps/nnet3/chain/train.py:408 - train - INFO ] Copying the properties from exp/chain/tdnn_1a_sp/egs to exp/chain/tdnn_1a_sp
2019-04-24 09:52:51,578 [steps/nnet3/chain/train.py:422 - train - INFO ] Computing the preconditioning matrix for input features
2019-04-24 09:53:23,626 [steps/nnet3/chain/train.py:431 - train - INFO ] Preparing the initial acoustic model.
2019-04-24 09:53:24,830 [steps/nnet3/chain/train.py:465 - train - INFO ] Training will run for 4.0 epochs = 944 iterations
2019-04-24 09:53:24,831 [steps/nnet3/chain/train.py:507 - train - INFO ] Iter: 0/943 Epoch: 0.00/4.0 (0.0% complete) lr: 0.002000
2019-04-24 09:54:51,813 [steps/nnet3/chain/train.py:507 - train - INFO ] Iter: 1/943 Epoch: 0.00/4.0 (0.1% complete) lr: 0.001997
2019-04-24 09:56:09,426 [steps/nnet3/chain/train.py:507 - train - INFO ] Iter: 2/943 Epoch: 0.01/4.0 (0.1% complete) lr: 0.001994
##########################training info##########################################
But when I fine-tune from a pre-trained GMM model on new speech data, using the script "egs/rm/s5/local/chain/tuning/run_tdnn_wsj_rm_1a.sh", I get the following errors. The reason is that generation of the 'utt2dur' file failed.
##########################errors################################################
LOG (copy-transition-model[5.5]:main():copy-transition-model.cc:62) Copied transition model.
2019-04-30 19:58:58,871 [steps/nnet3/chain/train.py:359 - train - INFO ] Generating egs
steps/nnet3/chain/get_egs.sh --frames-overlap-per-eg 0 --cmd run.pl --cmvn-opts --norm-means=false --norm-vars=false --online-ivector-dir --left-context 23 --right-context 23 --left-context-initial -1 --right-context-final -1 --left-tolerance 5 --right-tolerance 5 --frame-subsampling-factor 3 --alignment-subsampling-factor 3 --stage -10 --frames-per-iter 1000000 --frames-per-eg 150 --srand 0 data/train_hires exp/chain/tdnn_1a exp/tri3_lats exp/chain/tdnn_1a/egs
utils/data/get_frame_shift.sh: data/train_hires/utt2dur does not exist: creating it
utils/data/get_utt2dur.sh: segments file does not exist so getting durations from wave files
cat: write error: Broken pipe
utils/data/get_utt2dur.sh: could not get utterance lengths from sphere-file headers, using wav-to-duration
steps/nnet3/chain/train.py --stage -10 --cmd run.pl --use-gpu=wait --trainer.input-model exp/chain/tdnn_1a/input.raw --feat.online-ivector-dir --chain.xent-regularize 0.1 --feat.cmvn-opts --norm-means=false --norm-vars=false --chain.xent-regularize 0.1 --chain.leaky-hmm-coefficient 0.1 --chain.l2-regularize 0.00005 --chain.apply-deriv-weights false --chain.lm-opts=--num-extra-lm-states=200 --egs.dir --egs.stage -10 --egs.opts --frames-overlap-per-eg 0 --egs.chunk-width 150 --trainer.num-chunk-per-minibatch=128 --trainer.frames-per-iter 1000000 --trainer.num-epochs 2 --trainer.optimization.num-jobs-initial=2 --trainer.optimization.num-jobs-final=4 --trainer.optimization.initial-effective-lrate=0.005 --trainer.optimization.final-effective-lrate=0.0005 --trainer.max-param-change 2.0 --cleanup.remove-egs true --feat-dir data/train_hires --tree-dir exp/chain/tri4_5a_tree --lat-dir exp/tri3_lats --dir exp/chain/tdnn_1a
['steps/nnet3/chain/train.py', '--stage', '-10', '--cmd', 'run.pl', '--use-gpu=wait', '--trainer.input-model', 'exp/chain/tdnn_1a/input.raw', '--feat.online-ivector-dir', '', '--chain.xent-regularize', '0.1', '--feat.cmvn-opts', '--norm-means=false --norm-vars=false', '--chain.xent-regularize', '0.1', '--chain.leaky-hmm-coefficient', '0.1', '--chain.l2-regularize', '0.00005', '--chain.apply-deriv-weights', 'false', '--chain.lm-opts=--num-extra-lm-states=200', '--egs.dir', '', '--egs.stage', '-10', '--egs.opts', '--frames-overlap-per-eg 0', '--egs.chunk-width', '150', '--trainer.num-chunk-per-minibatch=128', '--trainer.frames-per-iter', '1000000', '--trainer.num-epochs', '2', '--trainer.optimization.num-jobs-initial=2', '--trainer.optimization.num-jobs-final=4', '--trainer.optimization.initial-effective-lrate=0.005', '--trainer.optimization.final-effective-lrate=0.0005', '--trainer.max-param-change', '2.0', '--cleanup.remove-egs', 'true', '--feat-dir', 'data/train_hires', '--tree-dir', 'exp/chain/tri4_5a_tree', '--lat-dir', 'exp/tri3_lats', '--dir', 'exp/chain/tdnn_1a']
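
The log above stops right after get_utt2dur.sh falls back to wav-to-duration, so the actual cause is not visible here. One thing that may be worth ruling out is that data/train_hires/wav.scp points at audio files that are missing on this machine. Below is a minimal sketch of such a check, assuming the usual wav.scp layout of one "<utt-id> <path>" or "<utt-id> <command> |" entry per line. The helper name check_wav_scp is hypothetical and not part of Kaldi.

```python
import os
from typing import Iterable, List, Tuple


def check_wav_scp(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Classify wav.scp entries (hypothetical helper, not a Kaldi tool).

    Returns (missing, piped):
      missing: utterance IDs whose entry is a plain path that is not an
               existing file, or whose entry has no path at all
      piped:   utterance IDs whose entry is a command ending in '|';
               these are not checked here
    """
    missing: List[str] = []
    piped: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)
        utt_id = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        if rest.endswith("|"):
            piped.append(utt_id)
        elif not rest or not os.path.isfile(rest):
            missing.append(utt_id)
    return missing, piped
```

To use it, open data/train_hires/wav.scp and pass the file object to check_wav_scp. If `missing` comes back non-empty, fixing those paths is a reasonable first step before re-running the script. Entries reported in `piped` need to be checked by running their command by hand.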
