freshman in ML: how do you identify actually open research problems? [D]

Reddit r/MachineLearning / 4/27/2026


Key Points

  • An ML freshman asks seasoned researchers how to distinguish genuinely open research problems from topics that merely appear open or are already solved, noting a lack of “middle ground” in what they see.
  • They describe uncertainty about whether a “future work includes X” claim means X is truly open, potentially already done privately, still unpublished, infeasible, or dependent on equipment they lack.
  • They also mention the challenge of duplicate naming across communities, where searching the wrong terminology can make a problem seem open when it is not.
  • Beyond discovery, they ask how researchers deal with the feeling that ideas are either already done or not good enough without becoming paralyzed.
  • Their motivation is to make AI-for-science faster and cheaper, and they list hardware-aligned ML ideas they came up with that turned out to have been done already.

Hi, I am a freshman who is trying to break into research.

I got into a well-known university research lab in my country for the upcoming summer, and the prof said I am "better positioned than numerous others" for hardware-aligned machine learning topics. I am facing a couple of problems, and I would like to know how seasoned researchers deal with them:

  1. How do you develop the intuition for what's open vs. what just looks open? When I look at a research space, everything either looks already solved or impossibly vague. There's no middle ground visible to me yet, and this bothers me.

  2. How do you handle the feeling that every idea is either already done or not good enough, without it paralyzing you?

Ideas that I have "thought of" but that have already been done: PQCache, async KVCache prefetching, roofline modeling for the GQA decode phase, etc.

A paper saying "future work includes X" is not the same as X being open, right? Someone may have done X last month and not published yet, or X may be open but intractable, or X may be open but require equipment that I don't have. I would have no way to know which. Moreover, the thing I want to work on might exist under three different names across three different communities, and if you search the wrong name you conclude it's open when it isn't. (LLMs with web search seem to help a bit.)


Reddit threads that I have already looked into:

  1. https://www.reddit.com/r/MachineLearning/comments/1sayptq/d_physicistturnedmlengineer_looking_to_get_into/
  2. https://www.reddit.com/r/MachineLearning/comments/1nsvdqk/d_machine_learning_research_no_longer_feels/
  3. https://www.reddit.com/r/MachineLearning/comments/kw9xk7/d_has_anyone_else_lost_interest_in_ml_research/

My motivation for working in this field is to speed up AI-for-science initiatives while making them more affordable.

submitted by /u/Shonku_