Dave Thomas (author of The Pragmatic Programmer) on software archaeology

General point: read for fun and education, not just to fix bugs. E.g., when you notice your software has an interesting feature, look at the source code to see how that feature is implemented.

Static software reading

  • Use tree . or tree -d . to show the directory structure (-d lists directories only).

  • Initially, open all of the files in the codebase in a text editor with a tiny font, in order to see the high level structure of the code (without actually reading it). Scroll through this a few times.

  • Find the application code, and start reading in more detail there. It usually lives in the largest file; e.g., a file with a 400-line function is probably controlling the application.

      - Use grep -c to count the lines that mention a domain-knowledge word
        (e.g., "postage" in a shipping program); grep -n instead prefixes
        each match with its line number.
    
      - Use grep | grep -v (invert match) to filter an avalanche of grep
        matches.
    
      - Learn awk. Learn command line tools for source code parsing in
        general.
    
  • Put the code in a VCS, and add annotations (in the form of comments) to the code as you work through it.

  • If the project has build scripts, read them to get an idea of how the source is pieced together into the binary.
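
The "find the largest file" and "count domain words, then filter" steps above can also be scripted. The following is a minimal Python sketch, not something from the talk: the function names (source_files, files_by_length, count_matches) and the sample files are made up for illustration, and it assumes the sources are readable as text.

```python
import tempfile
from collections import Counter
from pathlib import Path


def source_files(root):
    """Return all regular files under root, sorted for stable output."""
    return sorted(p for p in Path(root).rglob("*") if p.is_file())


def files_by_length(root):
    """List (line_count, relative_path) pairs, longest file first."""
    sized = []
    for path in source_files(root):
        text = path.read_text(encoding="utf-8", errors="replace")
        sized.append((len(text.splitlines()), path.relative_to(root).as_posix()))
    return sorted(sized, reverse=True)


def count_matches(root, word, exclude=()):
    """Count lines per file that contain word, similar to grep -c.

    Lines containing any string in exclude are dropped first, which
    mirrors piping the matches through grep -v.
    """
    counts = Counter()
    for path in source_files(root):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            if word in line and not any(skip in line for skip in exclude):
                counts[path.relative_to(root).as_posix()] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical two-file codebase, created only for this demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "shipping.py").write_text(
            "def postage(weight):\n"
            "    # postage table lookup\n"
            "    return weight * 2\n"
        )
        (root / "util.py").write_text("def helper():\n    return 1\n")
        print(files_by_length(root))
        print(count_matches(root, "postage", exclude=["#"]))
```

Sorting by line count is only a rough proxy for "where the application logic lives", and the substring match is deliberately as crude as a plain grep; both are starting points for reading, not conclusions.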

Dynamic software reading

  • If the project doesn't have proper build scripts, write a build script as you figure out how to get it running.

  • Use a debugger or printf statements to isolate specific known problems.

  • For widespread coverage, set exceptions (asserts) at interesting points and look at the backtrace.

  • Use an execution profiler to see program flow.

  • If unit tests exist, run those, and read them to see how they work. Debug failures (within reason).

  • Write your own tests for the code. As you read, come up with hypotheses about how the code works. Write unit tests to test those hypotheses, and of course check those tests into the VCS.
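
Two of the dynamic techniques above, looking at backtraces at interesting points and testing hypotheses with unit tests, might look roughly like this in Python. Everything here is an assumption for illustration: postage_cents is a hypothetical stand-in for a legacy function under study, and the pricing rule it implements is invented. Note one deliberate swap: instead of raising an exception to get a backtrace, the sketch records the call stack and lets the program continue.

```python
import traceback
import unittest

# Backtraces captured each time the point of interest is reached.
CALL_SITES = []


def postage_cents(weight_grams):
    """Hypothetical stand-in for a legacy function being studied."""
    # Record the callers' frames (everything except this one) rather than
    # raising, so the program keeps running while we collect call paths.
    CALL_SITES.append(traceback.extract_stack()[:-1])
    if weight_grams <= 0:
        raise ValueError("weight must be positive")
    return 50 + 10 * ((weight_grams - 1) // 100)


class PostageHypotheses(unittest.TestCase):
    """Each test writes down one guess about how the code behaves."""

    def test_minimum_charge_is_50_cents(self):
        self.assertEqual(postage_cents(1), 50)

    def test_price_steps_every_100_grams(self):
        self.assertEqual(postage_cents(100), 50)
        self.assertEqual(postage_cents(101), 60)

    def test_non_positive_weight_is_rejected(self):
        with self.assertRaises(ValueError):
            postage_cents(0)
```

Saved as a test module, this can be run with python -m unittest. A failing test is useful too: it means the hypothesis was wrong, and the corrected test then documents what the code really does once it is checked into the VCS.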