The Gradient

Language is not just words.

Fork me on GitHub

Dynamic Programming (DP) is ubiquitous in NLP, such as Minimum Edit Distance, Viterbi Decoding, forward/backward algorithm, CKY algorithm, etc.

Read more »

The main aim of conv op is to extract useful features for downstream tasks. And different filters could intuitionally extract different aspect of features via backprop during training. Afterward, all the extracted features are combined to make decisions.

Read more »