Please I really need your help on this guys [D]

Reddit r/MachineLearning / 4/27/2026

💬 Opinion | Ideas & Deep Analysis | Models & Research


Key Points

A student describes solving a machine learning time-series classification task by first achieving a public leaderboard score of 0.85, then finding the exact dataset used in the competition externally and obtaining a perfect 1.00 score.
They ask whether it’s possible to recreate the same submission predictions (ID-to-label mapping) using only the provided train and test datasets, without relying on the externally found dataset.
The student wants guidance on how to “learn” or reverse-engineer the submission output strictly from the original files, ideally using proper machine learning methods rather than external data.
They clarify that for their successful submission they had access to the full feature set (not just IDs and labels), and they’re willing to share train/test or the submission file if needed.
The post is essentially a feasibility and methodology question about dataset leakage, reproducibility, and generating identical competition outputs from the given splits.

My teacher gave us a machine learning time series classification problem.

At first, I tried solving it normally and got a public score of 0.85. But then I searched for the dataset used in the competition and managed to find it. Using that dataset, I generated a submission file that scored 1.00.

Now my question is:

Is it possible to recreate the submission file using only the provided train and test datasets, without relying on the external dataset I found?

In other words, I want to understand if there is a way to learn or reverse-engineer how to produce the same submission output (ID → label mapping) using only the original train/test files. I’m not sure if “reverse engineering the submission” is the correct term, but I want to figure out how to get the same result properly using machine learning rather than external data.
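A minimal sketch of how the comparison could be measured, assuming both submissions are CSV files with hypothetical `id` and `label` columns (the real column names are not given in this post). It computes the fraction of IDs on which a submission produced from the train/test files alone agrees with the 1.00 submission, and lists the IDs that still differ. The inline CSV strings are toy stand-ins, not real data.

```python
import csv
import io


def load_submission(text, id_col="id", label_col="label"):
    """Parse submission CSV text into an {id: label} dict.

    The column names are assumptions; adjust them to the real files.
    """
    reader = csv.DictReader(io.StringIO(text))
    return {row[id_col]: row[label_col] for row in reader}


def agreement(reference, candidate):
    """Fraction of reference IDs whose label matches in the candidate."""
    if not reference:
        raise ValueError("reference submission is empty")
    matches = sum(1 for k, v in reference.items() if candidate.get(k) == v)
    return matches / len(reference)


def disagreements(reference, candidate):
    """IDs where the candidate is missing or differs, sorted for stable output."""
    return sorted(k for k, v in reference.items() if candidate.get(k) != v)


# Toy data standing in for the two submission files (hypothetical values).
reference_csv = "id,label\n1,0\n2,1\n3,1\n4,0\n"
model_csv = "id,label\n1,0\n2,1\n3,0\n4,0\n"

reference = load_submission(reference_csv)
model = load_submission(model_csv)

print(agreement(reference, model))      # 0.75
print(disagreements(reference, model))  # ['3']
```

With real files, the text would come from something like `open(path).read()`. An agreement below 1.0 points to the specific IDs where the model trained on the provided data still differs from the reference submission.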

Also, I want to clarify that for the submission I made, I had access to the full feature set, not just the IDs and labels, meaning the other features of the rows in the submission file as well.

I would really appreciate any help or guidance. If needed, I can share the train/test files or the submission file that achieved the 1.00 score.

Thanks in advance!

submitted by /u/Djistino
