How does it work?

A background job pulls comments from the solution megathreads on Reddit's r/adventofcode and tries to classify each comment's programming language. The heuristic combines several methods: it inspects the comment's format, its code blocks, and its links, then guesses the programming language from what it finds. To detect languages from filenames and extensions, we use GitHub's linguist through dayvonjersen/linguist, which makes it usable from Go.

By comment format
Check whether the first word is a programming language, for example:

    Python

    # Python

    **Python**

    [Python](https://...)
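
The first-word check can be sketched in Go roughly as follows. This is a minimal illustration and not the project's actual code: the function name `languageFromFirstWord` and the small `knownLanguages` table are assumptions made for the example, and a real classifier would need a far larger language list.

```go
package main

import (
	"regexp"
	"strings"
)

// knownLanguages is a tiny, hypothetical lookup table used only for this
// sketch; it maps lowercase spellings to a canonical language name.
var knownLanguages = map[string]string{
	"python": "Python",
	"go":     "Go",
	"golang": "Go",
	"rust":   "Rust",
	"kotlin": "Kotlin",
}

// mdLink matches a Markdown link at the start of the text and captures its label.
var mdLink = regexp.MustCompile(`^\[([^\]]+)\]\([^)]*\)`)

// languageFromFirstWord returns the language named by the first word of a
// comment, or "" when the first word is not in knownLanguages.
func languageFromFirstWord(comment string) string {
	text := strings.TrimSpace(comment)
	// Drop a leading heading marker such as "# " or "## ".
	text = strings.TrimSpace(strings.TrimLeft(text, "#"))
	// Replace a leading [label](url) link with its label.
	if m := mdLink.FindStringSubmatch(text); m != nil {
		text = m[1] + text[len(m[0]):]
	}
	fields := strings.Fields(text)
	if len(fields) == 0 {
		return ""
	}
	// Strip emphasis markers and surrounding punctuation from the word.
	word := strings.Trim(fields[0], "*_`:,.()[]")
	return knownLanguages[strings.ToLower(word)]
}
```

Under these assumptions, all four comment formats listed above resolve to `Python`, while a comment whose first word is not in the table yields an empty string.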
    

Code blocks
It looks at the code blocks in the comment. Detection using this method is not very reliable.

    # indented with 4 spaces

        import java.net.HttpURLConnection
        import java.net.URL


    # with Markdown triple backticks (```)
    ```
        import java.net.HttpURLConnection
        import java.net.URL
    ```
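
Before a code block can be classified, it has to be pulled out of the comment. The sketch below shows one possible way to collect both block styles in Go. The function name `extractCodeBlocks` is hypothetical, and the parsing is deliberately simplified (it does not follow the full Markdown rules for nested lists or fence lengths).

```go
package main

import "strings"

// extractCodeBlocks returns the code found in a Markdown comment, covering
// fenced (```) blocks and blocks indented by four spaces or a tab.
func extractCodeBlocks(comment string) []string {
	var blocks []string
	var current []string
	inFence := false

	// flush stores the lines collected so far as one block.
	flush := func() {
		// Drop trailing blank lines so a block never ends with padding.
		for len(current) > 0 && strings.TrimSpace(current[len(current)-1]) == "" {
			current = current[:len(current)-1]
		}
		if len(current) > 0 {
			blocks = append(blocks, strings.Join(current, "\n"))
		}
		current = nil
	}

	for _, line := range strings.Split(comment, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, "```"):
			// A fence line opens or closes a block and is not code itself.
			flush()
			inFence = !inFence
		case inFence:
			current = append(current, line)
		case strings.HasPrefix(line, "    "):
			current = append(current, line[4:])
		case strings.HasPrefix(line, "\t"):
			current = append(current, line[1:])
		case trimmed == "" && len(current) > 0:
			// Blank lines may sit inside an indented block.
			current = append(current, "")
		default:
			flush()
		}
	}
	flush()
	return blocks
}
```

Each returned string could then be handed to a content-based language guesser. Short snippets often look alike across languages, which is one reason this method is less reliable than the others.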
    

Link detection
  • GitHub & GitLab links - detect the file extension
  • nopaste.ml - detect the file extension or content (source code)
  • topaz.github.io - detect the file extension or content (source code)
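
For the GitHub and GitLab case, extension detection might look like the Go sketch below. It is an illustration under stated assumptions, not the project's implementation: the function name `extensionsFromLinks` is hypothetical, only links whose path contains `/blob/` are treated as file links, and paste services such as nopaste.ml and topaz.github.io are left out because they need their content decoded rather than their path inspected.

```go
package main

import (
	"net/url"
	"path"
	"regexp"
	"strings"
)

// urlPattern finds http(s) URLs and stops at whitespace or at the
// characters that usually close a Markdown link.
var urlPattern = regexp.MustCompile(`https?://[^\s)\]>]+`)

// repoHosts lists the hosts this sketch treats as code hosting sites.
var repoHosts = map[string]bool{"github.com": true, "gitlab.com": true}

// extensionsFromLinks returns the lowercased file extensions of GitHub and
// GitLab file links found in a comment, in order of appearance.
func extensionsFromLinks(comment string) []string {
	var exts []string
	for _, raw := range urlPattern.FindAllString(comment, -1) {
		u, err := url.Parse(raw)
		if err != nil {
			continue
		}
		host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
		// Assumption: file views contain "/blob/" in their path; links to a
		// repository root or a directory are skipped.
		if !repoHosts[host] || !strings.Contains(u.Path, "/blob/") {
			continue
		}
		if ext := strings.ToLower(path.Ext(u.Path)); ext != "" {
			exts = append(exts, ext)
		}
	}
	return exts
}
```

The resulting extensions (for example `.kt` or `.rs`) would then be mapped to a language name by a filename-based lookup such as the one linguist provides.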