

All Posts (424)
[Math] Groups, Vector Spaces and Vector Subspaces My personal opinion: In linear algebra, finding a solution x of Ax = 0 is very important, and the set of possible solutions x forms a vector subspace of the vector space R^n. Before defining the solution space, we will look into what groups, vector spaces, and vector subspaces are. Simply put, a group guarantees that the result of the operation still lives in the same group (has more speci..
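The closure property mentioned above can be checked numerically. A minimal NumPy sketch (not from the post; the matrix A and the solutions x1, x2 are assumed for illustration) showing that linear combinations of solutions of Ax = 0 are again solutions, i.e. the solution set is a subspace of R^3:

```python
import numpy as np

# Assumed rank-1 example matrix, so Ax = 0 has nontrivial solutions.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# Two solutions of Ax = 0, found by inspection for this A.
x1 = np.array([-2.0, 1.0, 0.0])
x2 = np.array([-3.0, 0.0, 1.0])
assert np.allclose(A @ x1, 0) and np.allclose(A @ x2, 0)

# Closure under linear combination: still a solution,
# so the solution set forms a vector subspace of R^3.
a, b = 2.5, -1.3
assert np.allclose(A @ (a * x1 + b * x2), 0)
```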
[Math] Finding Solutions of a System of Linear Equations How do we get solutions x from a system of linear equations, Ax = b? If we have a particular solution x_p such that Ax_p = b, and a homogeneous solution x_h such that Ax_h = 0, then for any scalar t, A(x_p + t*x_h) = b, so x = x_p + t*x_h. Thus, adding any multiple of a homogeneous solution to a particular solution still satisfies the original equation. Notice that neither the general nor the particular solut..
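The identity above can be verified directly. A small NumPy sketch (the matrix A, vector b, and the particular/homogeneous solutions are assumed examples, not from the post):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
b = np.array([6.0, 12.0])

x_p = np.array([1.0, 1.0, 1.0])   # particular solution: A @ x_p == b
x_h = np.array([-2.0, 1.0, 0.0])  # homogeneous solution: A @ x_h == 0

# A(x_p + t*x_h) = A x_p + t * A x_h = b + 0 = b, for any scalar t.
for t in [0.0, 1.0, -3.7, 100.0]:
    assert np.allclose(A @ (x_p + t * x_h), b)
```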
[PyTorch] Modify Gradient while Backward Propagation Using Hook Hooks in PyTorch allow you to execute custom functions at specific points during the forward or backward passes of your neural network. 1. Understanding the Gradient Flow in PyTorch When you call loss.backward(), the gradients are computed and stored in the grad attribute of each parameter. Forward Pass: You pass your input through the network to get the output. input tensor X, model parameter tensor..
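A minimal sketch of the mechanism described above, using `Tensor.register_hook` (the doubling hook is a made-up example, not the post's code): the hook runs during `backward()` and its return value replaces the gradient.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Hypothetical hook: scale the incoming gradient by 2 during backward.
x.register_hook(lambda grad: grad * 2.0)

loss = (x ** 2).sum()  # d(loss)/dx = 2x
loss.backward()

print(x.grad)  # 2x, doubled by the hook: tensor([ 4.,  8., 12.])
```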
[Math] Viewing Deep Learning From Maximum Likelihood Estimation Perspective We can view finding a deep learning model's parameters from the maximum likelihood estimation perspective. 1. Normal distribution 1.1 Setting Let's say we have a model μ = wx+b and a dataset {(x₁, y₁), ..., (x₁, y₅), (x₂, y₆), ... , (x₂, y₁₁), (x₃, y₁₂), ...} consisting of n samples. 1.2 Assume a probability distribution Let's assume the observed value y, given x, follows a normal distribution with mean μ=w..
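Under the Gaussian assumption sketched above, maximizing the likelihood of y given x is equivalent to minimizing the sum of squared errors, so the MLE of (w, b) is the ordinary least-squares fit. A NumPy sketch with assumed synthetic data (true w=2, b=1; not from the post):

```python
import numpy as np

# Synthetic data: y = 2x + 1 + Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, 200)

# With y ~ N(wx + b, sigma^2), the negative log-likelihood is, up to
# constants, sum_i (y_i - (w x_i + b))^2 / (2 sigma^2), so the MLE of
# (w, b) is the least-squares solution.
X = np.column_stack([x, np.ones_like(x)])
w_hat, b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
print(w_hat, b_hat)  # close to the true w=2, b=1
```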
[Paper Review] Robust Speech Recognition via Large-Scale Weak Supervision We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual and multitask supervision, the resulting models generalize well to standard.. (arxiv.org) 0. Abstract Suggests a large-scale, weakly-supervised speech processing mode..
[Paper Review] Conformer: Convolution-augmented Transformer for Speech Recognition Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs). Transformer models are good at capturing content-based global interac.. (arxiv.org) 0. Abstract Transformer models are good at capturing content-based ..
[Paper Review] Sequence Transduction with Recurrent Neural Networks https://arxiv.org/pdf/1211.3711 0. Abstract Many machine learning tasks can be expressed as the transformation, or transduction, of input sequences into output sequences: speech recognition, machine translation and so on. One of the key challenges in sequence transduction is learning to represent both the input and output sequences in a way that is invariant to sequential distortions such as shrinki..
[Paper Review] Neural RRT*: Learning-Based Optimal Path Planning Rapidly random-exploring tree (RRT) and its variants are very popular due to their ability to quickly and efficiently explore the state space. However, they suffer sensitivity to the initial solution and slow convergence to the optimal solution, which mean.. (ieeexplore.ieee.org) 0. Abstract Rapidly random-exploring tree (RRT) is a popular path planning al..
