
Editing a Kaldi nnet3 chain model: adding a softmax layer on top of the chain output

I had one more task to do: edit a trained Kaldi nnet3 chain model and add a softmax layer on top of its chain output. The reason for this is to get "probability"-like output directly from the chain model.
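For context, the chain model's output node emits unnormalized scores, and applying a log-softmax is what turns one frame's scores into log-probabilities. Here is a minimal numpy sketch of what a LogSoftmaxComponent computes per frame (my own illustration, not Kaldi code; the toy 3-dimensional frame stands in for the model's real output rows):

```python
import numpy as np

def log_softmax(scores):
    """Numerically stable log-softmax over one frame's scores."""
    shifted = scores - np.max(scores)      # guard against overflow in exp
    return shifted - np.log(np.sum(np.exp(shifted)))

frame = np.array([3.2, -1.0, 0.5])         # toy unnormalized outputs
log_probs = log_softmax(frame)
print(np.exp(log_probs).sum())             # ~1.0: a proper distribution
```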

First, let's look at the nnet structure:

nnet3-am-info final.mdl
input-dim: 20
ivector-dim: -1
num-pdfs: 6105
prior-dimension: 0
# Nnet info follows.
left-context: 15
right-context: 15
num-parameters: 15499085
modulus: 1
input-node name=input dim=20
component-node name=L0_fixaffine component=L0_fixaffine input=Append(Offset(input, -1), input, Offset(input, 1)) input-dim=60 output-dim=60
component-node name=Tdnn_0_affine component=Tdnn_0_affine input=L0_fixaffine input-dim=60 output-dim=625
component-node name=Tdnn_0_relu component=Tdnn_0_relu input=Tdnn_0_affine input-dim=625 output-dim=625
component-node name=Tdnn_0_renorm component=Tdnn_0_renorm input=Tdnn_0_relu input-dim=625 output-dim=625
component-node name=Tdnn_1_affine component=Tdnn_1_affine input=Append(Offset(Tdnn_0_renorm, -1), Tdnn_0_renorm, Offset(Tdnn_0_renorm, 1)) input-dim=1875 output-dim=625
component-node name=Tdnn_1_relu component=Tdnn_1_relu input=Tdnn_1_affine input-dim=625 output-dim=625
component-node name=Tdnn_1_renorm component=Tdnn_1_renorm input=Tdnn_1_relu input-dim=625 output-dim=625
component-node name=Tdnn_2_affine component=Tdnn_2_affine input=Append(Offset(Tdnn_1_renorm, -1), Tdnn_1_renorm, Offset(Tdnn_1_renorm, 1)) input-dim=1875 output-dim=625
component-node name=Tdnn_2_relu component=Tdnn_2_relu input=Tdnn_2_affine input-dim=625 output-dim=625
component-node name=Tdnn_2_renorm component=Tdnn_2_renorm input=Tdnn_2_relu input-dim=625 output-dim=625
component-node name=Tdnn_3_affine component=Tdnn_3_affine input=Append(Offset(Tdnn_2_renorm, -3), Tdnn_2_renorm, Offset(Tdnn_2_renorm, 3)) input-dim=1875 output-dim=625
component-node name=Tdnn_3_relu component=Tdnn_3_relu input=Tdnn_3_affine input-dim=625 output-dim=625
component-node name=Tdnn_3_renorm component=Tdnn_3_renorm input=Tdnn_3_relu input-dim=625 output-dim=625
component-node name=Tdnn_4_affine component=Tdnn_4_affine input=Append(Offset(Tdnn_3_renorm, -3), Tdnn_3_renorm, Offset(Tdnn_3_renorm, 3)) input-dim=1875 output-dim=625
component-node name=Tdnn_4_relu component=Tdnn_4_relu input=Tdnn_4_affine input-dim=625 output-dim=625
component-node name=Tdnn_4_renorm component=Tdnn_4_renorm input=Tdnn_4_relu input-dim=625 output-dim=625
component-node name=Tdnn_5_affine component=Tdnn_5_affine input=Append(Offset(Tdnn_4_renorm, -3), Tdnn_4_renorm, Offset(Tdnn_4_renorm, 3)) input-dim=1875 output-dim=625
component-node name=Tdnn_5_relu component=Tdnn_5_relu input=Tdnn_5_affine input-dim=625 output-dim=625
component-node name=Tdnn_5_renorm component=Tdnn_5_renorm input=Tdnn_5_relu input-dim=625 output-dim=625
component-node name=Tdnn_6_affine component=Tdnn_6_affine input=Append(Offset(Tdnn_5_renorm, -3), Tdnn_5_renorm, Offset(Tdnn_5_renorm, 3)) input-dim=1875 output-dim=625
component-node name=Tdnn_6_relu component=Tdnn_6_relu input=Tdnn_6_affine input-dim=625 output-dim=625
component-node name=Tdnn_6_renorm component=Tdnn_6_renorm input=Tdnn_6_relu input-dim=625 output-dim=625
component-node name=Tdnn_pre_final_chain_affine component=Tdnn_pre_final_chain_affine input=Tdnn_6_renorm input-dim=625 output-dim=625
component-node name=Tdnn_pre_final_chain_relu component=Tdnn_pre_final_chain_relu input=Tdnn_pre_final_chain_affine input-dim=625 output-dim=625
component-node name=Tdnn_pre_final_chain_renorm component=Tdnn_pre_final_chain_renorm input=Tdnn_pre_final_chain_relu input-dim=625 output-dim=625
component-node name=Tdnn_pre_final_xent_affine component=Tdnn_pre_final_xent_affine input=Tdnn_6_renorm input-dim=625 output-dim=625
component-node name=Tdnn_pre_final_xent_relu component=Tdnn_pre_final_xent_relu input=Tdnn_pre_final_xent_affine input-dim=625 output-dim=625
component-node name=Tdnn_pre_final_xent_renorm component=Tdnn_pre_final_xent_renorm input=Tdnn_pre_final_xent_relu input-dim=625 output-dim=625
component-node name=Final_affine component=Final_affine input=Tdnn_pre_final_chain_renorm input-dim=625 output-dim=6105
output-node name=output input=Final_affine dim=6105 objective=linear
component-node name=Final-xent_affine component=Final-xent_affine input=Tdnn_pre_final_xent_renorm input-dim=625 output-dim=6105
component-node name=Final-xent_log_softmax component=Final-xent_log_softmax input=Final-xent_affine input-dim=6105 output-dim=6105
output-node name=output-xent input=Final-xent_log_softmax dim=6105 objective=linear
...


The solution is a bit more complicated than the editing task in the previous blog post.
First, I have to add a log-softmax component and a new output node to the network; then I remove the original output node and rename the new one to take its place. This is done by creating an nnet config and then applying the edits supported by nnet3-am-copy:

echo "component-node name=Final_log_softmax component=Final_log_softmax input=Final_affine
output-node name=output-softmax input=Final_log_softmax objective=linear
component name=Final_log_softmax type=LogSoftmaxComponent dim=6105" > add.chain.softmax.cfg

cat final.mdl | nnet3-am-copy --nnet-config=add.chain.softmax.cfg - - | nnet3-am-copy --edits='remove-output-nodes name=output;rename-node old-name=output-softmax new-name=output;remove-orphans' - final.chain.softmax.mdl


LOG (nnet3-am-copy:main():nnet3-am-copy.cc:156) Copied neural net from - to -
LOG (nnet3-am-copy:ReadEditConfig():nnet-utils.cc:687) Removing 1 output nodes.
LOG (nnet3-am-copy:RemoveSomeNodes():nnet-nnet.cc:885) Removed 1 orphan nodes.
LOG (nnet3-am-copy:RemoveOrphanComponents():nnet-nnet.cc:810) Removing 0 orphan components.
LOG (nnet3-am-copy:main():nnet3-am-copy.cc:156) Copied neural net from - to final.chain.softmax.mdl

Finally, let's look at the modified structure:

nnet3-am-info final.chain.softmax.mdl
input-dim: 20 
ivector-dim: -1
num-pdfs: 6105
prior-dimension: 0
# Nnet info follows.
left-context: 15
right-context: 15
num-parameters: 15499085
modulus: 1
input-node name=input dim=20
... [layers L0_fixaffine through Tdnn_pre_final_xent_renorm are unchanged; see the listing above] ...
component-node name=Final_affine component=Final_affine input=Tdnn_pre_final_chain_renorm input-dim=625 output-dim=6105
component-node name=Final-xent_affine component=Final-xent_affine input=Tdnn_pre_final_xent_renorm input-dim=625 output-dim=6105
component-node name=Final-xent_log_softmax component=Final-xent_log_softmax input=Final-xent_affine input-dim=6105 output-dim=6105
output-node name=output-xent input=Final-xent_log_softmax dim=6105 objective=linear
component-node name=Final_log_softmax component=Final_log_softmax input=Final_affine input-dim=6105 output-dim=6105
output-node name=output input=Final_log_softmax dim=6105 objective=linear

...

component name=Final_log_softmax type=LogSoftmaxComponent, dim=6105
...

Cool, it worked!
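With the edited model, downstream code can exponentiate the new output's log-probabilities or take a per-frame argmax to pick the most likely pdf-id. A toy numpy sketch of that post-processing (my own illustration; the small matrix stands in for a few frames of the model's real 6105-dimensional output):

```python
import numpy as np

# Toy stand-in for the edited model's output: 2 frames x 4 "pdfs"
# (the real model emits 6105-dim rows of log-probabilities per frame).
log_probs = np.log(np.array([[0.7, 0.1, 0.1, 0.1],
                             [0.2, 0.5, 0.2, 0.1]]))

probs = np.exp(log_probs)                # back to probabilities
best_pdf = np.argmax(log_probs, axis=1)  # most likely pdf-id per frame

print(probs.sum(axis=1))   # each row sums to 1
print(best_pdf)            # [0 1]
```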
