The need for sampling temperature and the differences between Whisper, GPT-3, and probabilistic models' temperature

Introduction When I published my new work OverFlow on arXiv and added the model to the Coqui-TTS framework, someone on the Discord channel asked me about a parameter our system OverFlow had, called ...

Jan 27, 2023 machine-learning

Deriving categorical cross entropy and softmax

Introduction Recently, on the PyTorch discussion forum, someone asked a question about the derivation of categorical cross entropy and softmax. So I thought it would be a good idea to write a bl...

Jan 10, 2023 machine-learning, math

Welcome to the blog

Hello and welcome to the blog (again)! Again, because I used to post on shivammehta.me/blog, but I decided to move to a new platform. I will try to migrate my old posts here to keep everything...

Dec 28, 2022

Universal approximation theorem - The intuition

(Migrated from old blog) Recently, I attended a course on Deep Learning and found this very nice intuition for how the Universal Approximation Theorem works. What is Universal Approximation Theor...

Jun 10, 2021 deep-learning, math

PyTorch - Computation graph

(Migrated from old blog) I started with deep learning when PyTorch was already a big name with extensive community support, which is growing every day. The more I use it, the more I fall in...

Dec 6, 2020 deep-learning, pytorch

Data structure Trie - Prefix trees, spell checkers

(Migrated from old blog) Ever wonder how Microsoft Word checks whether the spelling you wrote is correct? Various language models can be used, but one of the most maj...

Feb 13, 2020 programming

An idea to test programming solutions

New Edit: I have moved from Emacs to VSCode + Vim key bindings as my IDE, but the idea is still the same and still useful. (Migrated from my old blog) Alright! Today I just had a cool idea while trying ...

Jan 31, 2020 programming

Your own mini google search - inverted indexes and boolean retrieval

(Migrated from my old blog) Ever wondered how Google gets relevant documents for your query within milliseconds despite the huge amount of information it contains? Recently, I was looking i...

Jan 8, 2020 machine-learning, programming

Paper summary - Distractor generation for multiple choice question using learning to rank

(Migrated from old blog) A paper by Chen Liang, Xiao Yang, Neisarg Dave, Drew Wham, Bart Pursel, C. Lee Giles from Pennsylvania State University. Recently, I was starting with such topics in Fiel...

Dec 13, 2019 deep-learning, paper-summary

Kadane's algorithm - Maximum subarray problem

(Migrated from my old blog) Okay, so I was brushing up on my algorithm skills at CodeSignal and I found this maxSubArrayProblem: https://app.codesignal.com/challenge/LrAwpTnYZR6NMCbfs So we will sol...

Dec 9, 2019 programming
