Updated: 12/23/2016 Applies To: Collaborator 10

Techniques & Best Practices

Metrics: Definitions

Collaborator collects a variety of raw metrics automatically. This section defines these metrics; a later section discusses what these metrics can tell us.

Lines of Code

The most obvious raw metric is "number of lines of source code". Here, a "line" means a line in a text file. The metric is often abbreviated "LOC".

Collaborator does not distinguish between different kinds of lines. For example, it does not separately track source lines versus comment lines versus whitespace lines.

For code review metrics, you usually want to use general lines of code and not break the count down by type. Often the code comments are just as much a part of the review as the code itself -- reviewers check them for consistency and make sure that other developers will be able to understand what is happening and why.

The lines of code metrics (LOC metrics) are calculated only for source code and other text-based files, and only if the files are stored in a version control system. For other types of review materials (Word, Excel, PDF or image files), or when the files were uploaded from the Web Client, the metrics are not calculated.

The LOC metrics displayed on the Review Summary page include added lines, changed lines and removed lines. If the Overlay view is selected (the default), the LOC metrics are calculated by comparing the latest uploaded revision of a file against the baseline revision of that file. Here, the baseline revision is the revision at the moment the review was created. If the Separate view is selected, the LOC metrics are calculated by comparing each individual file revision against its previous revision.

The Customizable Review Reports can provide more line-related metrics. In addition to added, changed and removed lines, they can display the total number of lines in all uploaded files (LOC Uploaded), the number of reworked lines, that is, the sum of added, changed and removed lines (LOC Reworked), and the difference between the number of added and removed lines (LOC Delta).
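
As a minimal sketch of how the two derived metrics relate to the raw counts, the following Python snippet computes LOC Reworked and LOC Delta from added, changed and removed line counts. The class name, field names and sample numbers are illustrative assumptions; they are not part of Collaborator or its report format.

```python
from dataclasses import dataclass


@dataclass
class LineCounts:
    """Hypothetical container for the raw per-review line counts."""
    added: int
    changed: int
    removed: int

    @property
    def reworked(self) -> int:
        # LOC Reworked: sum of added, changed and removed lines.
        return self.added + self.changed + self.removed

    @property
    def delta(self) -> int:
        # LOC Delta: added lines minus removed lines.
        return self.added - self.removed


# Made-up sample numbers, for illustration only.
counts = LineCounts(added=40, changed=12, removed=15)
print(counts.reworked)
print(counts.delta)
```

Note that LOC Delta can be negative when a review removes more lines than it adds, while LOC Reworked never is.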

Keep in mind that "Ignore Whitespace", "Ignore Sequence Number" and other Diff Viewer settings do not affect how line metrics are calculated. They only affect how line differences are displayed.

Time in Review

How much time (person-hours) did each person spend doing the review? Collaborator computes this automatically. This raw metric is useful in several other contexts, usually when compared to the amount of file content reviewed.

Developers (rightly) hate using stopwatches to track their activity, but how can Collaborator -- a web server -- automatically compute this number properly?

Our technique for accurately computing person-hours came from an empirical study we did at a mid-sized customer site. The goal was to create a heuristic for predicting on-task person-hours from detailed web logs alone.

We gave all review authors and reviewers physical stopwatches and had them carefully time their use of the tool. They started the stopwatch when they began a review and paused it whenever they broke off for any reason -- email, bathroom, instant messenger. The times were recorded with each review and brought together in a spreadsheet.

At the same time, we collected detailed logs of web server activity. Who accessed which pages, when, and so forth. Log data could easily be correlated with reviews and people so we could "line up" this amalgamation of server data with the empirical stopwatch times.

Then we sat down to see if we could make a heuristic. We determined two interesting things:

First, a formula did appear. It goes along these lines: If a person hits a web page, then 7 seconds later hits another page, it is clear that the person was on-task on the review for the whole 7 seconds. If a person hits a web page, then 4 hours later hits another page, it is clear that the person was not doing the review for the vast majority of that time. By playing with various threshold values for timings, we created a formula that worked very well -- error on the order of 15%.
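
The idea behind such a heuristic can be sketched as follows. This is not Collaborator's actual formula, which is not given here; it is a simplified illustration that assumes a single cutoff. The function name and the 5-minute threshold are hypothetical: gaps between consecutive page hits that are at most the threshold count as on-task time, and longer gaps are discarded entirely.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff; the real threshold values are not published here.
IDLE_THRESHOLD = timedelta(minutes=5)


def estimate_on_task_time(hits, threshold=IDLE_THRESHOLD):
    """Sum the gaps between consecutive page hits that do not exceed the threshold."""
    ordered = sorted(hits)
    total = timedelta()
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap <= threshold:
            # Short gap: assume the person was on-task for all of it.
            total += gap
        # Long gap: assume the person was away and count none of it.
    return total


# Made-up page hits by one person: a 7-second gap, a 4-hour gap, a 30-second gap.
start = datetime(2016, 12, 23, 9, 0, 0)
hits = [
    start,
    start + timedelta(seconds=7),
    start + timedelta(hours=4, seconds=7),
    start + timedelta(hours=4, seconds=37),
]
estimated = estimate_on_task_time(hits)
print(estimated)
```

With these sample hits, only the 7-second and 30-second gaps are counted; the 4-hour gap contributes nothing. A production heuristic would likely treat long gaps more subtly, for example by crediting part of the gap.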

Second, it turns out that humans are awful at collecting timing metrics. The stopwatch numbers were all over the map. People constantly forgot to start them and to stop them. Then they would make up numbers that "felt right," but it was clear upon close inspection that their guesses were wrong. Some people intentionally submitted different numbers, thinking this would make them look good (that is, "Look how fast I am at reviewing!").

So the bottom line is: Our automated technique is not only accurate, it is more accurate than actually having reviewers use stopwatches. The intrinsic error of the prediction heuristic is less than the error humans introduce when asked to do this themselves.

Total Person-Time

Total Person-Time is the total of all recorded time that users spent looking at the review. It is an aggregate value for all users taking part in a review, while Time in Review is counted for each user separately.

Reviewer Time and Author Time are subsets of Total Person-Time, limited to the time that was spent in the reviewer and author roles, respectively.
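
The relationship between these values can be illustrated with a small aggregation sketch. The record layout, user names, durations and the "observer" entry are assumptions made for illustration; they do not reflect Collaborator's data model.

```python
from collections import defaultdict

# Hypothetical per-user records: (user, role, seconds spent in the review).
records = [
    ("alice", "author", 600),
    ("bob", "reviewer", 1500),
    ("carol", "reviewer", 900),
    ("dave", "observer", 120),
]


def summarize(records):
    """Return per-user time, per-role time and the overall total, in seconds."""
    per_user = defaultdict(int)
    per_role = defaultdict(int)
    for user, role, seconds in records:
        per_user[user] += seconds   # Time in Review, per user
        per_role[role] += seconds   # e.g. Reviewer Time, Author Time
    total = sum(per_user.values())  # Total Person-Time
    return dict(per_user), dict(per_role), total


per_user, per_role, total_person_time = summarize(records)
print(total_person_time)
print(per_role["reviewer"], per_role["author"])
```

In this sample, Reviewer Time and Author Time together are less than Total Person-Time, because time spent in other roles also counts toward the total.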

Defect Count

How many defects did we find during this review? Because reviewers explicitly create defects during reviews, it is easy for the server to maintain a count of how many defects were found.

Furthermore, the system administrator can establish any number of custom fields for each defect, usually in the form of a drop-down list. This can be used to subdivide defects by severity, type, phase-injected, and so on.

File Count

How many files did we review? Usually the LOC metric is a better measure of "how much did we review," but sometimes it is helpful to have both LOC and the number of files.

For example, a review of 100 files, each with a one-line change, is quite different from a review of one file with 100 lines changed. In the former case, this might be a relatively simple refactoring; with tool support, this might require only a brief scan by a human. In the latter case, several methods might have been added or rewritten; this would require much more attention from a reviewer.

Wall-Clock Time, Review Wall-Clock Duration

How much time passed between the moment the review was created and the moment it was completed (or now, if the review is still in progress)? This is a useful metric if you want to make sure all reviews are completed in a timely manner.
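
A minimal sketch of this definition, assuming the creation and completion timestamps are available as Python datetime values. The function and parameter names are hypothetical, and the sample dates are made up.

```python
from datetime import datetime, timedelta


def wall_clock_duration(created, completed=None, now=None):
    """Elapsed time from creation to completion, or to 'now' if still in progress."""
    if completed is not None:
        end = completed
    else:
        # Review still in progress: measure up to the current moment.
        end = now if now is not None else datetime.now()
    return end - created


created = datetime(2016, 12, 20, 9, 0)

# Completed review.
finished = wall_clock_duration(created, completed=datetime(2016, 12, 22, 15, 30))

# Review still in progress, with "now" passed in so the result is reproducible.
in_progress = wall_clock_duration(created, now=datetime(2016, 12, 21, 9, 0))

print(finished)
print(in_progress)
```

Unlike Time in Review, this value includes nights, weekends and any other time when nobody was working on the review.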




© 2019 SmartBear Software. All rights reserved.