With the growing use of continuous integration, and of static analysis tools hooked into it, one very useful analysis tool is rarely mentioned: phploc, the PHP Lines of Code tool. It has been a feature of PHPUnit for some time and has since been released as a separate project on the PHPUnit PEAR channel. The nature of PHPUnit means that many of these statistics can be collected while the tests are running, which is why the feature was added to that tool in the first instance.
The phploc project does more than just count lines of code: it measures a whole selection of features of a codebase and presents them as a report. As an example, I ran the tool over a WordPress installation with this command:
phploc /www/var/wordpress
This gives the following output:
phploc 1.3.2 by Sebastian Bergmann.

Directories: 29
Files: 295

Lines of Code (LOC): 138661
  Cyclomatic Complexity / Lines of Code: 0.19
Comment Lines of Code (CLOC): 43498
Non-Comment Lines of Code (NCLOC): 95163

Interfaces: 0
Classes: 168
  Abstract: 0 (0.00%)
  Concrete: 168 (100.00%)
  Lines of Code / Number of Classes: 377
Methods: 1973
  Scope:
    Non-Static: 1972 (99.95%)
    Static: 1 (0.05%)
  Visibility:
    Public: 1964 (99.54%)
    Non-Public: 9 (0.46%)
  Lines of Code / Number of Methods: 32
  Cyclomatic Complexity / Number of Methods: 5.44

Functions: 1599

Constants: 272
  Global constants: 272
  Class constants: 0

Straight away we can start to form some impressions about this code. It uses OOP, since there are classes. But look closer, and we see that it's not super-theoretical OOP with lots of complicated inheritance, since neither abstract classes nor interfaces are declared (although the non-public declarations show that some PHP 5 features are in use). I was particularly impressed by the averages the report includes, which give me a feel for, say, how big the methods are (32 lines on average here). It is also useful to see how many lines of comment there are in comparison to the total number of lines; at 43498 out of 138661 it seems quite generous on this project, and that's definitely a positive feature of a publicly released codebase. There's also a complexity measure: the number as it stands means very little to me, but I'm sure if this tool were used against a few familiar projects, I'd soon get a feel for what the various values indicate.
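To put a number on that comment-to-code comparison, the report can be parsed and a ratio derived from it. What follows is only a rough Python sketch, not something phploc provides: the helper names parse_report and comment_ratio are invented for illustration, and it assumes the "Label: value" layout of the output above. The "decision points" figure at the end also assumes the common convention that a method with no branches has a complexity of 1.

```python
import re

# A few lines copied from the phploc report shown above.
REPORT = """\
Lines of Code (LOC): 138661
Cyclomatic Complexity / Lines of Code: 0.19
Comment Lines of Code (CLOC): 43498
Non-Comment Lines of Code (NCLOC): 95163
Methods: 1973
Cyclomatic Complexity / Number of Methods: 5.44
"""


def parse_report(text):
    """Map each 'Label: number' line to a float, ignoring trailing percentages."""
    metrics = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s+([0-9.]+)", line)
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics


def comment_ratio(metrics):
    """Share of all lines that are comment lines."""
    return metrics["Comment Lines of Code (CLOC)"] / metrics["Lines of Code (LOC)"]


metrics = parse_report(REPORT)
print(f"Comment share of all lines: {comment_ratio(metrics):.1%}")

# Assumption: a branch-free method scores 1, so the remainder is the
# average number of decision points (if, loop, case, ...) per method.
per_method = metrics["Cyclomatic Complexity / Number of Methods"]
print(f"Average decision points per method: {per_method - 1:.2f}")
```

On these figures roughly a third of all lines are comments, which backs up the "quite generous" impression.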
Using static analysis tools like these can tell us a lot about the topology of a software project, and it can be interesting to watch how the numbers change over time, which is what makes them such useful inclusions in regularly run batches such as a continuous integration setup. Having an idea of what your project looks like, and what that means, will help you to understand the project as it moves forward.
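Watching the numbers change over time could be as simple as comparing the metrics from two builds. The Python sketch below is one hedged way to do that: metric_deltas is a made-up helper, the baseline values come from the report above, and the "later" snapshot is invented purely to show the shape of the output.

```python
def metric_deltas(previous, current):
    """Return {name: (old, new, change)} for metrics present in both snapshots."""
    deltas = {}
    for name, old in previous.items():
        if name in current:
            new = current[name]
            deltas[name] = (old, new, new - old)
    return deltas


# Baseline: values taken from the phploc report above.
baseline = {"Lines of Code (LOC)": 138661, "Classes": 168, "Methods": 1973}

# Hypothetical later build; these numbers are invented for illustration only.
later = {"Lines of Code (LOC)": 139200, "Classes": 170, "Methods": 1990}

for name, (old, new, change) in metric_deltas(baseline, later).items():
    print(f"{name}: {old} -> {new} ({change:+d})")
```

In a continuous integration setup, the snapshot for each build would be stored somewhere and the comparison run against the previous one.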

9 Responses
Would CLOC include both docblocks and inline comments? IMHO docblocks are important, but inline comments even more so from a maintenance point of view. It would be nice to see this as well (plus maybe the code/comment ratio).
If I understand correctly, “Cyclomatic Complexity / Number of Methods” essentially tells you the minimum number of tests (on average) a method in this project needs for sufficient code coverage.