Modeling metrics for UML diagrams 

 September 1, 2010

UML Design Quantity Metrics

Quantity metrics are counts of the diagram and model types contained in the UML model. The model types are further subdivided into counts of entities and counts of relationships [1].

The 9 UML diagram types are:

Number of...

  • Use case diagrams
  • Activity diagrams
  • Class diagrams
  • Sequence Diagrams
  • Interaction diagrams
  • State diagrams
  • Component diagrams
  • Deployment diagrams
  • Design diagrams

The 17 entity types are:

Number of...

  • Sub-Systems
  • Use cases
  • Actors
  • Components
  • Interfaces
  • Classes
  • Base/Super Classes
  • Methods
  • Parameters
  • Attributes
  • Activities
  • Objects
  • States
  • Rules
  • Stereotypes
  • Design Entities
  • Design entities referenced

The 10 relationship types are:

Number of...

  • Usages
  • Associations
  • Generalizations
  • Interactions
  • Class hierarchy levels
  • Method Invocations
  • Activity Flows
  • State transitions
  • Test cases
  • Design relationships

These quantities or element counts have been selected on the basis of their relation to the goals of object-oriented system design in accordance with the goal-question-metric method of Basili and Rombach [2].

UML Complexity Metrics

Complexity metrics are calculations to determine selected complexities [3]. Complexity is defined here as the relation between entities and the relationships among them. The size of a set is determined by the number of elements in that set. The complexity of a set is a matter of the number of relationships between the elements in that set. The more connections or dependencies there are in relation to the number of elements, the greater the complexity [4]. The complexity of a single entity is determined by the number of its subentities relative to the number of relationships between those subentities. The overall complexity of the design can be simply stated as follows:

Generic formula
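One plausible reading of this generic formula, assuming the ratio is oriented so that more relationships per entity yield a higher value, is:

    \text{Complexity} = \frac{\#\text{relationships}}{\#\text{entities}}

The sketches given for the individual complexity metrics below apply this pattern to particular kinds of entities and relationships.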

Bearing this in mind, the following complexity types have been defined for UML models.

Object Interaction Complexity

Formula Object Interaction Complexity

The more interactions between objects and the more associations between classes there are, the higher the complexity. In this way both the abstract level of the classes and the physical level of the objects are taken into consideration. This measure is an inverse coupling metric. It is based on empirical evidence that systems with many dependencies among their parts are difficult to maintain [5].
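Following the generic pattern above, this metric can be sketched as the ratio of interactions and associations to the objects and classes involved (an assumed form):

    \text{Object Interaction Complexity} = \frac{\#\text{object interactions} + \#\text{class associations}}{\#\text{objects} + \#\text{classes}}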

Complexity of the class hierarchy

Formula Class Hierarchical Complexity

The more hierarchical levels there are in the class hierarchies, the more dependent the lower-level classes are on the higher-level ones. Deep inheritance has often been criticized for leading to increased complexity. This metric corresponds to the depth of tree metric from Chidamber and Kemerer [6]. It is based on empirical evidence that object-oriented systems with deep inheritance trees (e.g. deeper than 3 levels) are more error prone than others.
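An assumed form relates the depth of the inheritance hierarchies to the number of classes they contain:

    \text{Class Hierarchical Complexity} = \frac{\#\text{class hierarchy levels}}{\#\text{classes}}

A deep, narrow hierarchy thus scores higher than a shallow, broad one.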

Complexity of the class data

Formula Class Data Complexity

The more data attributes a class has, the higher its complexity. This corresponds to the class attribute metric in the MOOD metrics [7]. The design goal is to have many classes, each with a few data attributes, as opposed to having a few classes, each with many attributes. This goal is based on the assumption that it is easier to test and maintain smaller sets of data.
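In line with the stated design goal of few attributes per class, a plausible form is:

    \text{Class Data Complexity} = \frac{\#\text{attributes}}{\#\text{classes}}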

Complexity of the class functions

Formula Class Functional Complexity

The more methods, i.e. functions, a class has, the higher its complexity, whereby it is assumed that each class has at least two implicit functions – a constructor and a destructor. This corresponds to the Number of Methods metric of Chidamber and Kemerer. The design goal is to have many classes, each with a minimum number of functions, as opposed to having a few classes, each with many methods. This goal is based on the assumption that it is easier to maintain and test a system which is broken down into many small chunks of functionality.
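A corresponding sketch, counting at least the two implicit methods per class, is:

    \text{Class Functional Complexity} = \frac{\#\text{methods}}{\#\text{classes}}, \quad \#\text{methods} \geq 2 \times \#\text{classes}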

Complexity of the object state

Formula Object State Complexity

Objects are instances of a class. Objects have states; the more states they have, the more complex they are. A simple class is a singleton with one object that has a static state. A complex class is one with multiple objects, each with several potential states. Neither the CK nor the MOOD metrics consider state complexity, even though it is a principal driver of test effort, together with the cyclomatic complexity of the methods. The design goal is to have as few object states as possible, but this is determined by the application. If an object such as an account has many states, e.g. opened, balanced, overdrawn, suspended, closed, etc., they all have to be created and tested.
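A plausible form relates the number of object states to the number of objects:

    \text{Object State Complexity} = \frac{\#\text{object states}}{\#\text{objects}}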

State Transition Complexity

Formula State Transition Complexity

The connection lines of a state diagram represent the transitions from one state to another. A given state can have any number of successor states. The more there are, the higher the complexity of the state transition graph. As with the McCabe cyclomatic complexity measure, we are actually measuring here the relation of edges to nodes in a graph [8]. Only here the nodes are not statements but states and the edges are not branches but transitions. The design goal is to have as few transitions as possible, since every state transition has to be tested at least once and that drives the test costs up.
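By analogy with edges over nodes, an assumed form is:

    \text{State Transition Complexity} = \frac{\#\text{state transitions}}{\#\text{states}}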

Activity Control Flow Complexity

Formula Activity Control Flow Complexity

The connection lines of an activity diagram represent the flow of control from one activity to another. They can be conditional or non-conditional. Conditional flows add to the complexity of the process being modeled. An activity can have any number of successor activities. The more there are and the more conditional ones there are, the higher the complexity of the process.
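A sketch consistent with this description, in which conditional flows are counted a second time to reflect their extra weight (the exact weighting is an assumption), is:

    \text{Activity Control Flow Complexity} = \frac{\#\text{control flows} + \#\text{conditional flows}}{\#\text{activities}}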

Use Case Complexity

Formula Use Case Complexity

Use cases, as coined by Ivar Jacobson, are instances of system usage [9]. A user or system actor invokes a use case; this is a case of usage. The relationships between use cases may have different meanings: they can denote usage, extension, inclusion, or inheritance. The more relations there are, the higher the usage complexity. The design goal is to reduce complexity by restricting the number of dependencies between use cases. On the other hand, if the application requires them, they have to be included; otherwise the complexity is only pushed off to another layer.
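An assumed form sets the use case relationships of all kinds against the use cases themselves:

    \text{Use Case Complexity} = \frac{\#\text{use case relationships}}{\#\text{use cases}}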

Complexity of the actor interaction

Formula Actor Interaction Complexity

System actors trigger the use cases. Any one actor can start one or more use cases. From the viewpoint of an actor, a system is complex if he has to deal with many use cases; the more use cases there are per actor, the more complex the relation between the actors and the system. A system which has only one use case per actor is simple because it is partitioned in accordance with the actors. The design goal is to restrict the number of use cases per actor. Of course, by having more actors, the size of the system in use case points increases.
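This can be sketched as use cases (or actor-use case interactions) per actor:

    \text{Actor Interaction Complexity} = \frac{\#\text{use cases}}{\#\text{actors}}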

Overall Design Complexity

Formula Overall Design Complexity

The overall design complexity is computed as the relation between the sum of all design entities and the sum of all design relationships.

A design in which each entity has only a few relationships can be considered less complex than a system design in which the number of relationships per entity is high. This reflects complexity as the relation of the number of relationships between elements of a set and the number of elements in a set. The more elements there are the larger the size of the set. The more relationships there are, the higher the complexity of the set. The design goal is to minimize the number of relationships between design entities.
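Consistent with the paragraph above, this can be written as:

    \text{Overall Design Complexity} = \frac{\sum \#\text{design relationships}}{\sum \#\text{design entities}}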

UML Quality Metrics

The design quality metrics are computations for calculating selected qualities. Quality is defined here as the relation of the state the model is in to the state it should be in [10]. Quality measurement presupposes a standard for the UML model. The actual state of the model is then compared with that standard. The closer the model comes to fulfilling that standard, the higher its quality. In general, the overall design quality can be simply expressed by the ratio:

Formula UML Design Quality Metrics
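A plausible form of this ratio, with fulfilment capped at 1, is:

    \text{Quality} = \min\left(\frac{\text{ACTUAL value}}{\text{TARGET value}},\ 1\right)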

The upper limit of the metric is 1. If the ACTUAL value exceeds the TARGET value, the quality target has been exceeded. A quotient of 0.5 indicates average quality. It should be remembered that quality is relative. Taken by itself, the quotient may not mean much [11]. However, when compared with the quotient derived from another design in exactly the same way, it indicates that one design is of better or worse quality than the other, at least with respect to the quality characteristic being measured. Since there is no absolute quality scale, the quality of one system design can only be judged in relation to the quality of another [12]. The following quality characteristics were selected for assessing the quality of a UML model.

Degree of Class Coupling

Class Coupling is the inverse of Interaction Complexity. It is computed by the equation:

Formula Degree of Class Coupling
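Taking the inverse of the interaction complexity sketched earlier, one possible form (read as a quality degree, so that fewer interactions and associations per class yield a value closer to 1) is:

    \text{Degree of Class Coupling} = \frac{\#\text{objects} + \#\text{classes}}{\#\text{object interactions} + \#\text{class associations}}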

The more interactions and associations there are between objects and classes, the greater the interdependence of these objects and classes. This interdependence is called coupling. Classes with a high coupling have a larger range of effect. If they are changed, the other classes are more likely to be affected. The design goal is to have as few interdependencies as possible, i.e., the coupling should be low. This quality attribute is based on the empirical finding that high coupling is associated with a larger impact domain, with a higher error rate, and with a higher maintenance cost [13].

Degree of Class Cohesion

Class Cohesion is measured in terms of the number of data attributes in a class relative to the number of class methods. It is computed by the equation:

Formula Degree of Class Cohesion
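An assumed form that rises as the number of attributes per method falls is:

    \text{Degree of Class Cohesion} = \frac{\#\text{methods}}{\#\text{methods} + \#\text{attributes}}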

The term cohesion refers to the degree to which the functions of a module belong together [14]. Functions belong together if they process the same data. This can be called data coupling. So the less data used by the same functions, the better. Classes with high cohesion have many methods and few attributes. Classes with many attributes and few methods have lower cohesion. The design goal is to have as few shared attributes as possible for the same methods. Underlying this quality attribute is the hypothesis that high cohesion is associated with high maintainability. This hypothesis has never really been proven.

Degree of Modularity

Modularity is a measure of decomposition. It expresses the degree to which a large system has been decomposed into many small parts. The theory is that it is easier to deal with smaller units of code [15]. The modularity of classes is determined by the number of attributes and methods that a class has. It is expressed by the equation:

Formula Degree of Modularity

There is a prevailing belief, supported by numerous field experiments, that many smaller units of code are easier to change than fewer larger ones. The old Roman principle of “divide et impera” applies to software as well. It has not been proven that smaller modules will necessarily be more error free. Therefore, the justification for modularity is based on the ease of change. In measuring code, modularity can be determined by comparing the actual size of the code units in statements to some predefined maximum size. In an object-oriented design, the elementary units are the methods. The number of methods per class should not exceed a defined limit. In measuring the modularity of UML it is recommended here to compare the total number of methods with the minimum number of methods per class multiplied by the total number of classes. The design goal here is to have as few methods as possible per class so as to encourage the designer to create more and smaller classes.
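Following the comparison described above, the metric can be sketched as:

    \text{Degree of Modularity} = \frac{\text{minimum methods per class} \times \#\text{classes}}{\#\text{methods}}

A value close to 1 means the classes are, on average, close to the minimum size; a low value indicates a few large classes with many methods.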

Degree of Portability

Portability at the design level is a measure of the ease with which the architecture can be ported to another environment. It is influenced by the way the design is packaged. Many small packages can be more easily ported than a few large ones. Therefore it is important to keep the size of the packages as small as possible. The package size is a question of the number of classes per package. At the same time, packages should have only a few dependencies on their environment. The fewer interfaces each package has, the better. The portability of a system is expressed in the equation:

Formula Degree of Portability

The justification of this quality attribute goes along the same line as that of modularity. The number of classes per package should not exceed a given limit, nor should a package have more than a given number of interfaces with its environment, since interfaces bind a package with its environment. The design goal is to create packages with a minimum number of classes and interfaces.
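Using the same comparison-with-a-limit pattern as for modularity, one possible sketch (the limits C_max for classes and I_max for interfaces per package are assumptions) is:

    \text{Degree of Portability} = \min\left(\frac{\#\text{packages} \times (C_{max} + I_{max})}{\#\text{classes} + \#\text{interfaces}},\ 1\right)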

Degree of Reusability

Reusability is a measure of the ease with which code or design units can be taken out of their original environment and transplanted into another environment. This means that there should be a minimum of dependencies between design units [16]. Dependencies are expressed in UML as generalizations, associations, and interactions. Therefore, the equation for measuring the degree of dependency is:

Formula Degree of Reusability

The more generalizations, associations and interactions there are, the more difficult it is to take out individual classes and methods from the current architecture and to reuse them in another. As with plants, if their roots are entangled with the roots of neighboring plants, it is difficult to transplant them. The entangled roots have to be severed. This applies to software as well. The degree of dependency should be as low as possible. Inheritance and interaction with other classes raises the level of dependency and lowers the degree of reusability. The design goal here is to have as few dependencies as possible.
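An assumed form sets the design entities against the dependencies that bind them:

    \text{Degree of Reusability} = \frac{\#\text{design entities}}{\#\text{design entities} + \#\text{generalizations} + \#\text{associations} + \#\text{interactions}}

The more dependencies there are per entity, the lower the value.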

Degree of Testability

Testability is a measure of the effort required to test a system relative to the size of the system [17]. The less effort required, the higher the degree of testability. Test effort is determined by the number of test cases to be tested and the width of the interfaces, where this width is expressed by the number of parameters per interface. The equation for calculating testability is:

Formula Degree of Testability

The number of test cases required is computed based on the number of possible paths through the system architecture. To test an interface, the parameters of that interface have to be set to different combinations. The more parameters it contains, the more combinations have to be tested. Field experience has shown that it is easier to test several narrow interfaces, i.e. interfaces with few parameters, than a few wide interfaces, i.e. interfaces with many parameters. Thus, not only the number of test cases but also the width of the interfaces affects the test effort. The design goal here is to design an architecture which can be tested with the least possible effort. This can be achieved by minimizing the possible paths through the system and by modularizing the interfaces.
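Based on the factors named above, testability can be sketched as the design size relative to the expected test effort (the normalization is an assumption):

    \text{test effort} \sim \#\text{test cases} \times \text{average parameters per interface}
    \text{Degree of Testability} = \frac{\text{design size}}{\text{test effort}}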

Degree of Conformity

Conformity is a measure of the extent to which design rules are adhered to. Every software project should have a convention for naming entities. There should be prescribed names for data attributes and interfaces as well as for classes and methods. It is the responsibility of the project management to see that these naming conventions are made available. It is the responsibility of quality assurance to ensure that they are adhered to. The equation for conformity is very simple:

Formula Degree of Conformity
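A plausible reading of this simple equation is:

    \text{Degree of Conformity} = \frac{\#\text{names that satisfy the naming convention}}{\#\text{names defined in the model}}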

Incomprehensible names are the greatest barrier to code comprehension. No matter how well the code is structured it will remain incomprehensible as long as the code content is blurred by inadequate data and procedure names. The names assigned in the UML diagrams will be carried over into the code. Therefore, they should be selected with great care and conform to a rigid naming convention. The design goal here is to get the designers to use meaningful, standardized names in their design documentation.

Degree of Consistency

Consistency in design implies that the design documents agree with one another. One should not refer to a class or method in a sequence diagram that is not also contained in a class diagram; to do so is to be inconsistent. The same applies to the methods in the activity diagrams: they should correspond to the methods in the sequence and class diagrams. The parameters passed in the sequence diagrams should also be the parameters assigned to the methods in the class diagrams. Thus, the class diagrams are the base diagrams; all of the other diagrams should agree with them. If not, there is a consistency problem. The equation for computing consistency is:

Formula Degree of Consistency
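A plausible reading, in line with the consistency checks described later for UMLAudit, is:

    \text{Degree of Consistency} = \frac{\#\text{referenced entities that are defined in the class diagrams}}{\#\text{referenced entities}}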

When we measure the degree of consistency, we encounter one of the major weaknesses of the UML design language: it is inherently inconsistent. This is because it has been glued together from many different design diagram types, all of which have their own origins. State diagrams, activity diagrams, and collaboration diagrams existed long before UML was born; they were inherited from structured design. The foundation of object-oriented design is Grady Booch's class diagram [18]. Use case and sequence diagrams were added later by Ivar Jacobson. So there has never been a unified design of the UML language. The designer can create the different diagram types completely independently of one another. If the UML design tool does not check this, it will lead to inconsistent naming. The design goal here is to force designers to use a common namespace for all diagrams and to ensure that the referenced methods, parameters, and attributes are defined in the class diagrams.

Degree of Completeness

Completeness of a design could mean that all of the requirements and use cases specified in the requirement document are covered by the design documentation. To check that would require a link with the requirement repository and to require that the same names are used for the same entities in the design as are used in the requirement text. Unfortunately the state of information technology is far removed from this ideal. Hardly any IT projects have a common name space for all of their documents let alone a common repository. Therefore, what is measured here is only formal completeness, i.e. that all of the diagrams required are also present. Degree of completeness is a simple relation of finished documents to required documents.

Formula Degree of Completeness
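Written out, this is:

    \text{Degree of Completeness} = \frac{\#\text{design documents delivered}}{\#\text{design documents required}}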

The design goal here is to ensure that all UML diagram types required for the project are actually available. As witnessed in all of the UML projects ever examined by this author, the design is never completed. The pressure to start coding is too great, and once coding has started the design becomes obsolete.

Degree of Compliance

The ultimate quality of a system design is that it meets the requirements. Not everything that is measured is important, and much of what is important is not measurable [19]. That is certainly true here. The only way to determine whether the user's requirements are really met is to test the final product against the requirements. The most that can be done at the design level is to compare the actors and use cases in the design with those in the requirements. Each functional requirement should be associated with a use case in the requirements document; if that is the case, the use cases in the requirements document cover all functional requirements. If the number of use cases in the design matches the number of use cases in the requirements, we can consider the design to conform to the requirements, at least formally. This can be expressed by the coefficient:

Formula Degree of Compliance
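Based on the comparison described above, the coefficient can be written as:

    \text{Degree of Compliance} = \frac{\#\text{use cases in the design}}{\#\text{use cases in the requirements}}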

If there are more use cases designed than were required, this only shows that the solution is greater than the problem. If there are fewer use cases in the design, then the design is obviously not compliant. The design goal here is to design a system which covers all requirements, at least at the use case level.

UML Design Size Metrics

The design size metrics are computed values for representing the size of a system. Of course, what is being measured here is not the system itself, but a model of the system. The system itself will only be measurable when it is finished. One needs size measures at an early stage in order to predict the effort that will be required to produce and test a system. Those size measures can be derived from the requirements by analyzing the requirement texts, or at design time by analyzing the design diagrams. Both measurements can, of course, only be as good as the requirements and/or the design being measured. Since the design is more detailed and more likely to be complete, the design size metrics will lead to a more reliable estimate. However, the design is completed much later than the requirements. That means the original cost estimate has to be based on the requirements. If the design-based estimate surpasses the original one, it will be necessary to delete functionality, i.e. to leave out less important use cases and objects. If the design-based estimate varies significantly from the original one, it will be necessary to stop the project and to renegotiate the proposed time and costs. In any case, the project should be recalculated when the design is finished.

There are several methods for estimating software project costs [20]. Each is based on a different size metric. When estimating a project, one should always estimate using at least three different methods. For this reason, five size metrics are taken to give the estimator a choice. The five size measures used are:

  • Data points
  • Function points
  • Object Points
  • Use Case Points
  • Test cases

Data points

The data point is a size measure originally published by Sneed in 1990 [21]. It is intended to measure the size of a system based solely on its data model, but including the user interfaces. It is a product of 4th generation software development, where applications are built around the existing data model. The data model in UML is expressed in the class diagrams. The user interfaces can be identified in the use case diagrams. This leads to the following calculation of data points:

Formula Data Points

Function points

The function point is a size measure originally introduced by Albrecht at IBM in 1979 [22]. It is intended to measure the size of a system based on its inputs and outputs along with its data files and interfaces. Inputs are weighted from 3 to 6, outputs from 4 to 7, data files from 7 to 15, and system interfaces from 5 to 10. This method of system sizing is based on the structured system analysis and design technique. It has evolved over the years, but the basic counting scheme has remained unchanged [23]. It was never intended for object-oriented systems, but it can be adapted. In a UML design, classes are the closest thing to logical files. The closest things to user inputs and outputs are the interactions between actors and use cases. The interfaces between classes can be interpreted as system interfaces. With this rough approximation, we arrive at the following calculation of function points:

Formula Function-Points
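Using the UML mapping and the weight ranges given above, the count can be sketched as follows; the choice of a concrete weight within each range is left open:

    \text{Function Points} = \sum_{\text{classes}} w_{file} + \sum_{\text{actor-use case interactions}} w_{io} + \sum_{\text{class interfaces}} w_{if}

    \text{with } w_{file} \in [7, 15],\ w_{io} \in [3, 7],\ w_{if} \in [5, 10]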

Object Points

Object points were designed specifically for measuring the size of object-oriented systems by Sneed in 1996. The idea was to find a size measure which could readily be taken from an object design. As such, it fits perfectly to a UML design. Object points are obviously the best size measure of an object model. Classes weigh 4 points, methods 3 points, interfaces 2 points, and attributes/parameters 1 point each. Thus, object points are computed as:

Formula Object Points
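With the weights just given, this is:

    \text{Object Points} = 4 \times \#\text{classes} + 3 \times \#\text{methods} + 2 \times \#\text{interfaces} + 1 \times (\#\text{attributes} + \#\text{parameters})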

Use Case Points

Use case points were introduced in 1993 by G. Karner, a Swedish student working at Ericsson. The idea was to estimate the size of a software system based on the number of actors and the number and complexity of the use cases. Both actors and use cases are divided into three levels: simple, medium and difficult. The actors are rated on a scale of 1 to 3, the use cases on a scale of 5 to 15 [24]. The counts are multiplied by their weights to obtain the unadjusted use case points. This method is also suitable for measuring the size of a UML design, provided the use cases and actors are all specified. Here the median values are used to classify all actors and use cases, extended by the number of interactions between actors and use cases.

Formula UseCase Points
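In Karner's scheme the unadjusted use case points are obtained by weighting and summing the classified actors and use cases; a sketch of this basic count is:

    \text{Unadjusted UCP} = \sum_{c \in \{\text{simple, medium, difficult}\}} \left( \#\text{actors}_c \times w_{a,c} + \#\text{use cases}_c \times w_{uc,c} \right) \quad (w_a \in \{1,2,3\},\ w_{uc} \in \{5,10,15\})

The extension by the number of actor-use case interactions mentioned above comes on top of this basic count; how it is combined with the weighted sums is an assumption left open here.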

Test cases

Test cases were first used by Sneed in 1978 as a measure of size to estimate the test effort for the Siemens Integrated Transport System (ITS). The motivation behind this was to calculate the module test effort on the basis of test cases. A test case was defined as the equivalent of a path through the test object. Much later, the method was revived to estimate the cost of testing systems [25]. When testing systems, a test case is equivalent to a path through the system. It starts at the interaction between an actor and the system and either follows a path through the activity diagrams or traverses the sequence diagrams via interactions between classes. There should be one test case for each path through the interaction diagrams and for each object state specified in the state diagrams. Thus, the number of test cases is the product of the use case interactions, the class interactions, and the object states. It is calculated as follows:

Formula Test Cases
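Taken literally from the description above:

    \#\text{Test Cases} = \#\text{use case interactions} \times \#\text{class interactions} \times \#\text{object states}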

Automated Analysis of UML Designs with UMLAudit

The UMLAudit tool was developed to measure UML designs. UMLAudit is a member of the SoftAudit toolset for automated quality assurance. This toolset also includes analysis tools for English and German requirements texts, as well as for all leading programming languages, the most popular database schema languages, and several languages for defining user and system interfaces. UMLAudit includes an XML parser that parses the XML files generated by the UML modeling tool to represent the diagrams [26]. The diagram and model instance types, names, and relationships are included as attributes that can easily be identified by their model types and names. The measurement object is the XML schema of the UML-2 model with its model types as specified by the OMG [27].

The first step of UMLAudit is to collect the design types and names from the XML files and store them in tables. The second step is to go through the tables and count them. The third step is to check the names against the naming convention templates. The final step is to check the referential consistency by comparing the entities referenced with the entities defined. As a result, two outputs are produced:

  • a UML deficiency report and
  • a UML metric report.

The UML Deficiency Report is a log of rule violations and discrepancies listed by diagram. At present there are only two types of deficiencies:

  • Inconsistent references and
  • Name rule violations.

If a diagram such as a state, activity or sequence diagram references a class, method or parameter that is not defined in the class diagram, an inconsistent reference is reported. If the name of an entity deviates from the naming rules for that entity type, a naming violation is reported. These deficiencies are summed up and compared with the number of model types and type names to give the degree of design conformance.
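To illustrate how such counting and name checking can be implemented on top of an XMI export, here is a minimal Python sketch. The namespace URI, the reliance on xmi:type and name attributes, and the sample naming rules are assumptions about a typical export; the actual UMLAudit implementation may work quite differently.

    # Hypothetical sketch of the collection, counting and name-checking steps.
    # It assumes a UML-2 XMI export in which model elements carry an xmi:type
    # attribute (e.g. "uml:Class") and a name attribute; real tool exports differ.
    import re
    import xml.etree.ElementTree as ET
    from collections import Counter

    XMI_NS = "http://schema.omg.org/spec/XMI/2.1"  # namespace URI varies by tool and version

    def collect_elements(xmi_file):
        """Step 1: collect model element types and names from the XMI file."""
        tree = ET.parse(xmi_file)
        elements = []
        for node in tree.iter():
            xmi_type = node.get(f"{{{XMI_NS}}}type")
            name = node.get("name")
            if xmi_type and name:
                elements.append((xmi_type, name))
        return elements

    def count_types(elements):
        """Step 2: count how many elements of each model type occur."""
        return Counter(xmi_type for xmi_type, _ in elements)

    def check_names(elements, rules):
        """Step 3: report names that violate the naming rule for their type."""
        return [(xmi_type, name) for xmi_type, name in elements
                if xmi_type in rules and not re.match(rules[xmi_type], name)]

    if __name__ == "__main__":
        # Example rules: classes in UpperCamelCase, attributes in lowerCamelCase.
        rules = {"uml:Class": r"[A-Z][A-Za-z0-9]*$",
                 "uml:Property": r"[a-z][A-Za-z0-9]*$"}
        elements = collect_elements("model.xmi")
        print(count_types(elements))
        for violation in check_names(elements, rules):
            print("Name rule violation:", violation)

The referential consistency check (the final step) would work on the same element table, comparing the names referenced in the behavioral diagrams against the names defined in the class diagrams.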

The UML Metric Report lists the quantity, complexity and quality metrics at the file and system level. The quantity metrics are further subdivided into diagram quantities, structural quantities, relationship quantities and size metrics.

Conclusion

Judging a software system by its design is like judging a book by its table of contents. If the table of contents is very fine-grained, you can judge the structure and layout of the book and make assumptions about its content. The same is true for UML. If the UML design is fine-grained down to a detailed level, it is possible to make an assessment and estimate costs based on the design [28]. If it is only coarse-grained, the assessment of the system will be superficial and the estimate unreliable. Measuring the size, complexity and quality of anything can only be as accurate as the thing being measured. UML is just a model, and models do not necessarily reflect reality [29]. In fact, they rarely do. UML models are often incomplete and inconsistent in practice, making it difficult to base a test on them. This remains the biggest obstacle to model-based testing.

  1. Briand, L./Morasca, S./Basili, V.: "Property-based Software Engineering Measurement", IEEE Trans. on S.E., Vol. 22, No. 1, Jan. 1996, p. 68
  2. Briand, L./Morasca,S./Basili,V.: "An Operational Process for Goal-Driven Definition of Measures", IEEE Trans. On S.E. Vol. 28, No. 12, Dec. 2002, p. 1106
  3. Hausen, H.-L./Müllerburg, M.: "Über das Prüfen, Messen und Bewerten von Software", Informatik Spektrum, No. 10, 1987, p. 123
  4. McCabe, T./Butler, C.: "Design Complexity Measurement and Testing", Comm. Of ACM, Vol.32, No. 12, Dec. 1989, p. 1415
  5. Booch, G.: "Measuring Architectural Complexity", IEEE Software, July 2008, p. 14
  6. Chidamber, S./Kemerer, C.: "A Metrics Suite for Object-Oriented Design", IEEE Trans. on S.E., Vol. 20, No. 6, 1994, p. 476
  7. Harrison,R./Counsel,S./Reuben, V.: "An Evaluation of the MOOD Set of Object-oriented Software Metrics", IEEE Trans. On S.E., Vol. 24, No. 6, June 1998, p. 491
  8. McCabe, T.: "A Complexity Measure," IEEE Trans S.E., Vol. 2, No. 6, 1976, p.308.
  9. Jacobson, I. et al.: Object-Oriented Software Engineering - A Use Case Driven Approach, Addison-Wesley Pub., Wokingham, G.B., 1993, p. 153.
  10. Card, D./Glass,R.: Measuring Software Design Quality, Prentice-Hall, Englewood Cliffs, 1990, p. 42
  11. Erdogmus, H.: "The infamous Ratio Measure", IEEE Software, May 2008, p. 4
  12. Rombach, H.-D.; "Design Measurement - Lessons learned", IEEE Software, March 1990, p. 17
  13. Gyimothy, T: "Metrics to measure Software Design Quality", Proc. of CSMR2008, IEEE Computer Society Press, Kaiserslautern, March 2009, p. 3.
  14. Bieman, J./Ott, L.: "Measuring Functional Cohesion", IEEE Trans on S.E., Vol. 20, No. 8, August 1994, p. 644.
  15. Sarkar, S./Rama, G./Kak, A.: "Information-theoretic Metrics for Measuring the Quality of Software Modularization", IEEE Trans. on S.E., Vol. 33, No. 1, Jan. 2007, p. 14
  16. Sneed, H.M.: "Metrics for software systems reusability", in Informatikspektrum, Vol. 6, pp. 18-20 (1997).
  17. Sneed, H./ Jungmayr, S.: "Product and Process Metrics for Software Testing", Informatikspektrum, Vol. 29, No. 1, p. 23, (2006).
  18. Booch, G.: "Object-oriented Development" IEEE Trans. On S.E., Vol. 12, No. 2, March 1986, p. 211
  19. Ebert, C./Dumke, R.: Software Measurement, Springer Verlag, Berlin, 2007, p. 1
  20. Sneed, H.: "Estimating the Development Costs of Object-Oriented Software", Informatikspektrum, Vol. 19, No. 3, June 1996, p. 133.
  21. Sneed, H.: "The Data-Point Method", Online, Journal of Data Processing, No. 5, May 1990, p. 48.
  22. Albrecht, A.: "Measuring Application Development Productivity", Proc. of Joint SHARE, GUIDE and IBM Symposium, Philadelphia, Oct. 1979, p. 83.
  23. Garmus, D.; Herron, D.: Function-Point Analysis: Measurement Process for successful Software Projects, Addison-Wesley, Reading MA., December 15, 2000.
  24. Ribu, K.: Estimating Object-Oriented Projects with Use Cases, Masters Thesis, University of Oslo, Norway, November 2001.
  25. Sneed, H./Baumgartner, M./Seidl, R.: Der Systemtest, Hanser Verlag, München/Wien, 2008, p. 59
  26. Bock, C.: "UML without Pictures", IEEE Software, Sept. 2003, p. 35
  27. Object Management Group: "UML 2 XML Schema", Version 2.1, April 2003, www.omg.org/cgi-bin/doc?ad/03-04-02
  28. Miller, J.: "What UML should be", Comm. Of ACM, Vol. 45, No. 11, Nov. 2002, p. 67
  29. Selic, B.: "The Pragmatics of Model-Driven Development", IEEE Software, Sept. 2003, p. 19.