How does it work?
A background job pulls comments from the r/adventofcode solution megathreads on Reddit and tries to classify the programming language of each comment. The heuristic combines several signals: the format of the comment, its code blocks, and the links it contains. To detect the language from a filename and extension, we use GitHub's Linguist through dayvonjersen/linguist, which makes it usable from Go.
By comment format
Check whether the first word of the comment is a programming language, for example:
    Python
    # Python
    **Python**
    [Python](https://...)
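The first-word check can be sketched in Go roughly as follows. This is a minimal illustration, not the project's actual code: `firstWordLanguage`, `knownLanguages`, and `mdLink` are hypothetical names, and the tiny language table stands in for a much larger list.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// knownLanguages is a small hypothetical allow-list; a real classifier
// would use a much larger table (for example, one derived from Linguist).
var knownLanguages = map[string]string{
	"python": "Python",
	"go":     "Go",
	"rust":   "Rust",
	"kotlin": "Kotlin",
}

// mdLink matches a leading Markdown link such as [Python](https://...).
var mdLink = regexp.MustCompile(`^\[([^\]]+)\]\([^)]*\)`)

// firstWordLanguage returns the language named by the first word of a
// comment, or "" if the first word is not a known language.
func firstWordLanguage(comment string) string {
	// Drop leading whitespace, heading markers and emphasis markers.
	s := strings.TrimLeft(comment, "#*_ \t\n")
	// Replace a leading Markdown link with its label.
	if m := mdLink.FindStringSubmatch(s); m != nil {
		s = m[1] + s[len(m[0]):]
	}
	fields := strings.Fields(s)
	if len(fields) == 0 {
		return ""
	}
	// Strip emphasis markers and punctuation around the word.
	word := strings.Trim(fields[0], "*_`:,.")
	return knownLanguages[strings.ToLower(word)]
}

func main() {
	comments := []string{
		"Python",
		"# Python",
		"**Python**",
		"[Python](https://...)",
		"My solution for today",
	}
	for _, c := range comments {
		fmt.Printf("%q -> %q\n", c, firstWordLanguage(c))
	}
}
```

Matching only the first word keeps the check cheap, at the cost of false positives when a comment happens to start with an English word that is also a language name (such as "Go").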
Code blocks
The job looks at the code blocks in the comment and tries to guess the language from their content. Detection with this method is not very reliable.
    # with 4 spaces
        import java.net.HttpURLConnection
        import java.net.URL

    # with markdown triple backticks (```)
    ```
    import java.net.HttpURLConnection
    import java.net.URL
    ```
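Pulling both kinds of code block out of a comment might look like the sketch below. `extractCodeBlocks` is a hypothetical helper written for illustration; it is simplified (a blank line ends an indented block) and does not claim to follow the full Markdown rules. The extracted text would then be handed to a content-based language detector, which is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// extractCodeBlocks returns the code found in a Markdown comment, covering
// both fenced blocks (```) and blocks indented by four spaces or a tab.
func extractCodeBlocks(comment string) []string {
	var blocks []string
	var cur []string
	inFence := false

	// flush stores the lines collected so far as one block.
	flush := func() {
		if len(cur) > 0 {
			blocks = append(blocks, strings.Join(cur, "\n"))
			cur = nil
		}
	}

	for _, line := range strings.Split(comment, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, "```"):
			// A fence line opens or closes a block; either way the
			// lines collected so far form a complete block.
			flush()
			inFence = !inFence
		case inFence:
			cur = append(cur, line)
		case strings.HasPrefix(line, "    "):
			cur = append(cur, line[4:])
		case strings.HasPrefix(line, "\t"):
			cur = append(cur, line[1:])
		default:
			// Prose or a blank line ends an indented block.
			flush()
		}
	}
	flush()
	return blocks
}

func main() {
	comment := "Kotlin solution\n\n" +
		"    import java.net.URL\n\n" +
		"and the fenced variant:\n" +
		"```\nimport java.net.HttpURLConnection\nimport java.net.URL\n```\n"
	for i, b := range extractCodeBlocks(comment) {
		fmt.Printf("block %d:\n%s\n", i+1, b)
	}
}
```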
Link detection
- GitHub & GitLab links - detect the file extension
- nopaste.ml - detect the file extension or the content (source code)
- topaz.github.io - detect the file extension or the content (source code)
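For the GitHub and GitLab case, the extension check could be sketched as below, using only the Go standard library. The names `languageFromLinks`, `linkPattern`, `repoHosts`, and `extLanguages` are hypothetical, and the small extension table is an assumption standing in for Linguist's data. Paste services such as nopaste.ml and topaz.github.io need content-based detection and are not covered by this sketch.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"regexp"
	"strings"
)

// linkPattern is a deliberately loose URL matcher for illustration.
var linkPattern = regexp.MustCompile(`https?://[^\s)\]]+`)

// repoHosts lists the hosts whose links are expected to point at source files.
var repoHosts = map[string]bool{"github.com": true, "gitlab.com": true}

// extLanguages is a tiny hypothetical stand-in for Linguist's extension table.
var extLanguages = map[string]string{
	".py": "Python",
	".go": "Go",
	".rs": "Rust",
	".kt": "Kotlin",
}

// languageFromLinks returns the language implied by the first GitHub or
// GitLab link whose path ends in a known file extension, or "" if none does.
func languageFromLinks(comment string) string {
	for _, raw := range linkPattern.FindAllString(comment, -1) {
		// Drop sentence punctuation that the loose pattern may capture.
		raw = strings.TrimRight(raw, ".,;")
		u, err := url.Parse(raw)
		if err != nil {
			continue
		}
		host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
		if !repoHosts[host] {
			continue
		}
		// u.Path excludes the query and fragment (such as #L10).
		ext := strings.ToLower(path.Ext(u.Path))
		if lang, ok := extLanguages[ext]; ok {
			return lang
		}
	}
	return ""
}

func main() {
	comment := "[Python](https://github.com/user/aoc/blob/main/day01.py)"
	fmt.Println(languageFromLinks(comment))
}
```

Restricting the check to known hosts avoids guessing a language from unrelated links that happen to end in something that looks like an extension.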