Hi everyone,
I’m trying to understand how to add theoretical justification to an AI/ML paper.
My background is mostly in empirical modeling, so I’m comfortable with experiments, results, and analysis. But I often see papers that include formal elements like theorems, lemmas, and proofs, and I’m not sure how to approach that side.
For example, I’m exploring an idea about measuring uncertainty in the attention mechanism by looking at the outputs of different attention heads. Intuitively it makes sense to me, but I don’t know how to justify it theoretically or frame it in a rigorous way.
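To make the intuition concrete (purely as a sketch, since I haven't formalized anything): one hypothetical version of "uncertainty from attention heads" would be to treat disagreement between per-head outputs as an uncertainty proxy. The function name, shapes, and the variance-based score below are all my own assumptions, not an established method.

```python
import numpy as np

def head_disagreement(head_outputs):
    """Toy uncertainty proxy: variance of per-head outputs.

    head_outputs: array of shape (num_heads, seq_len, d_model),
    e.g. each attention head's output mapped into a common space.
    Returns a per-token score of shape (seq_len,): variance across
    heads, averaged over features. Higher = heads disagree more.
    """
    # Variance across the head axis -> (seq_len, d_model)
    var = head_outputs.var(axis=0)
    # Average over the feature dimension -> (seq_len,)
    return var.mean(axis=-1)

# Toy example: 8 heads, 5 tokens, 16-dim outputs.
rng = np.random.default_rng(0)
outs = rng.normal(size=(8, 5, 16))
scores = head_disagreement(outs)
print(scores.shape)  # (5,)
```

Even for a sketch like this, I'm unsure what a theoretical justification would look like, e.g. whether this quantity provably relates to any standard notion of predictive uncertainty.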
I’ve also noticed that some papers reference existing theorems or build on theory that I haven’t studied in my postgraduate courses, which makes them harder to follow.
So my questions are:
- How do you go from an intuitive idea to a theoretical justification?
- Do you need a strong math background to do this, or can it be learned along the way?
- Any tips, resources, or examples for bridging empirical work with theory?
Appreciate any guidance!