This weekend I was looking for a dataset of major air crashes (I like planes) that includes the text of their final reports. Surprisingly, I was unable to find even a single open-source dataset matching these criteria. Anyway, I started collecting a few reports and was extracting the text and finalising the cleaning pipeline when I realized that I don't really have a clear idea of what to do with this data. Perhaps build a RAG, but what benefit would that have? Has anyone worked with such reports?
