Journal of Systems Architecture 160 (2025) 103343
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
Component-based architectural regression test selection for modularized
software systems
Mohammed Al-Refai a,*, Mahmoud M. Hammad b
a Computer Science, Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid, 22110, Jordan
b Software Engineering, Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid, 22110, Jordan
ARTICLE INFO

Keywords: Regression test selection; Static analysis; Component-based architecture; Java platform module system; Software architecture

ABSTRACT

Regression testing is an essential part of software development, but it can be costly and require significant computational resources. Regression Test Selection (RTS) improves regression testing efficiency by re-executing only the tests that have been affected by code changes. Recently, dynamic and static RTS techniques for Java projects showed that selecting tests at a coarser granularity (class level) is more effective than selecting tests at a finer granularity (method or statement level). However, prior techniques mainly consider Java object-oriented projects, not modularized Java projects. Despite the explicit support for architectural constructs introduced by the Java Platform Module System (JPMS) in the ninth edition of Java, these research efforts are not customized for component-based Java projects. To that end, we propose two static component-based RTS approaches, CORTS and its variant C2RTS, tailored for component-based Java software systems. CORTS leverages architectural information, such as components and ports, specified in the module descriptor files to construct a module-level dependency graph and identify relevant tests. The variant, C2RTS, is a hybrid approach that integrates analysis at both the module and class levels, employing module descriptor files and compile-time information to construct the dependency graph and identify relevant tests. We evaluated CORTS and C2RTS on 1200 revisions of 12 real-world open source software systems, and compared the results with those of class-level dynamic (Ekstazi) and static (STARTS) RTS approaches. The results showed that CORTS and C2RTS outperformed the static class-level RTS in terms of safety violation, which measures to what extent an RTS technique misses test cases that should be selected. Using Ekstazi as the baseline, the average safety violation was 1.14% for CORTS, 2.21% for C2RTS, and 3.19% for STARTS. On the other hand, the results showed that CORTS and C2RTS selected more test cases than Ekstazi and STARTS. The average reduction in test suite size was 22.78% for CORTS and 43.47% for C2RTS, compared with 68.48% for STARTS and 84.21% for Ekstazi. For all the studied subjects, CORTS and C2RTS reduced the size of the static dependency graphs compared to those generated by static class-level RTS, leading to faster graph construction and analysis for test case selection. Additionally, CORTS and C2RTS achieved reductions in overall end-to-end regression testing time compared to the retest-all strategy.
* Corresponding author.
E-mail addresses: mnalrefai@just.edu.jo (M. Al-Refai), m-hammad@just.edu.jo (M.M. Hammad).
https://doi.org/10.1016/j.sysarc.2025.103343
Received 30 May 2024; Received in revised form 12 January 2025; Accepted 12 January 2025
Available online 18 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

1. Introduction

Regression testing is the process of running the existing test cases on a new version of a software system to ensure that the performed modifications do not introduce new faults to previously tested code [1–3]. Regression testing is one of the most expensive activities performed during the lifecycle of a software system, with some studies [4–10] estimating that it can take up to 80% of the testing budget and up to 50% of the software maintenance cost. For instance, Google reported that their regression-testing system, TAP [11], experienced a linear growth in both the number of software changes and the average test-suite execution time, which ultimately resulted in a quadratic rise in the overall test-suite execution time. This rapid increase poses a challenge to manage, even for a company with extensive computing resources [12]. Regression test selection (RTS) approaches are used to improve regression testing efficiency [3,12]. RTS is defined as the activity of selecting a subset of test cases from an existing test set to verify that the affected functionality of a program is still correct [3,12,13].

The RTS problem has been studied for over three decades [14,15]. Traditional code-based RTS approaches take four inputs: the two versions (new and old) of a software system, the original test suite, and dependency information of the test cases on the old version. The output is the subset of test cases from an existing test set that must be re-executed on the modified version of the software system [12].

RTS techniques vary in the granularity at which they compute test dependencies from test cases to code statements, basic blocks, methods, or classes. Recently, researchers showed that, for individual projects, class-level RTS can be more efficient and beneficial than identifying changes and computing dependencies at lower granularities, e.g., statement and method levels [12,16,17]. Therefore, the current trend [12,16–20] is to focus on class-level RTS by (1) identifying changes at the class level and (2) computing dependencies from test cases to the classes under test. In addition to supporting class-level RTS, these approaches consider a test class as a test case, and thus select test classes instead of test methods [12,18,19].¹

Class-level RTS can be static or dynamic, analyzing dependencies from test cases to classes under test statically or dynamically. A recent extensive experimental evaluation of static class-level RTS [17,18] showed that it is comparable with the state-of-the-art dynamic class-level RTS approach, called Ekstazi [12]. While such a dynamic RTS approach requires code instrumentation and runtime information to find affected tests, static class-level RTS does not require such information. Instead, it builds a dependency graph of program types based on compile-time information, and selects test cases that can reach changed types in the transitive closure of the dependency graph [17,18]. However, static class-level RTS approaches can be unsafe, which means they might miss selecting test cases that are impacted by code changes. The use of Java reflection is the main cause of unsafety in static RTS approaches when compared with dynamic RTS approaches. Reflection in Java allows for runtime behaviors that can be challenging to predict statically, which means static RTS might miss identifying some dependencies during test selection [17,18].

The previous dynamic and static class-level RTS techniques have primarily focused on Java object-oriented projects, without addressing the unique needs of modularized Java applications. Despite the introduction of the Java Platform Module System (JPMS) [21] in Java 9 and newer versions, existing RTS research approaches have not been adapted to accommodate the architectural constructs of component-based Java projects. To bridge this gap, we propose two static component-based RTS approaches, CORTS and its variant C2RTS, specifically designed for component-based Java software systems that are developed using the JPMS architectural constructs.

JPMS provides explicit implementation-level support for well-known architectural constructs, such as components (called modules) and ports (called module directives). These constructs provide a higher level of abstraction than Java packages and classes. CORTS leverages the architectural construct information, such as components and ports, presented in the module descriptor files, named module-info.java [21], to construct a module-level dependency graph. The variant, C2RTS, is a hybrid technique that integrates module- and class-level analysis, and therefore uses both the module descriptor files and part of the compile-time information to construct the dependency graph. The two approaches, CORTS and C2RTS, find relevant test cases that can reach some changed module/class in the transitive closure of the dependency graph. Similar to recent RTS approaches [12,16–20], CORTS and C2RTS consider each test class as a test case.

CORTS and C2RTS can improve safety over traditional static class-level RTS techniques by capturing runtime module-level dependencies that are related to reflection and dynamic class loading mechanisms. This is possible because such dependencies are explicitly defined inside the module descriptor files using the open and opens to directives [21].

We evaluated CORTS and C2RTS in terms of (1) safety and precision violations, (2) reduction in test suite size, (3) reduction in dependency graph size with respect to static class-level RTS techniques, (4) execution time required to construct and analyze the static dependency graph to select relevant test cases, and (5) reduction in the end-to-end regression testing time compared to the retest-all strategy. We compared the results obtained by CORTS and C2RTS with those of the state-of-the-art class-level dynamic (Ekstazi [12]) and static (STARTS [18]) RTS approaches, using 1200 revisions of 12 real-world Maven-based Java software systems.

This paper is organized as follows. Section 2 provides an illustrative example to explain how our approach works. Section 3 describes the proposed approaches, CORTS and C2RTS. Section 4 presents the evaluation. Section 5 describes the threats to the validity of our approach and results. Related work is summarized in Section 6. Conclusions and plans for future work are outlined in Section 7.

2. Illustrative example

This section presents an illustrative example of a Java 9 Component-Based (CB) application of a university system, which is adapted from the example used in Hammad et al. [22]. We use this example in the following section (i.e., Section 3) to demonstrate how our approaches, CORTS and C2RTS, are used with a CB application.

The university system example is developed according to the Java Platform Module System (JPMS) [21], which is a key feature of Project Jigsaw [23], designed to provide a scalable module system for Java. It enables developers to build applications using modular constructs, i.e., components (modules) and ports (module directives), offering a higher level of abstraction than packages or classes. The modularized Java 9 JRE allows applications to depend on specific modules of the JRE rather than the entire runtime environment. Each module in JPMS includes a descriptor file called module-info.java, which specifies its dependencies and exported services. The JPMS supports various ports that enable a module to export its services or require services from other modules, facilitating clear and maintainable module interactions.

Fig. 1 shows the component-based architecture of the university system. Hammad et al. [22] created this university system by converting its equivalent Java 8 object-oriented (OO) version to the CB version using the OO2CB tool proposed in [22], which converts Java 8 OO apps to equivalent Java 9 CB apps following the least-privilege security principle. A least-privilege architecture is an architecture in which each component is granted only the exact privileges, in terms of inter-component communications as well as the required JRE modules, that it needs to provide its functionality [22,24]. This principle is also important for performing safe and precise regression test selection based on exactly the needed inter-component communications/dependencies.

Before presenting the example details, we note that there are generally two common methods for organizing test cases in CB applications: (1) placing them in separate test-components or (2) placing them alongside core application classes within app-components. The first method, adopted in this paper for illustration, creates distinct components for test classes, aligning with the separation of concerns principle by isolating production and test code. This approach ensures clear boundaries and flexible management of dependencies specific to testing. Both CORTS and C2RTS are compatible with either method. In this paper, we use test-components to refer to the modules containing test classes, and app-components to refer to the modules containing the core application classes, i.e., production code.

As depicted in Fig. 1, the university system consists of four app-components, i.e., modules,² which are location, registration, stuService, and serviceProvider. In addition, the system contains three test-components, which are locationTest, registrationTest, and serviceProviderTest. The java.logging component is also used by the system. The Java classes in the system
interact as follows: The StuSchedule class generates a suggested schedule for a student and logs relevant details to a log file using the java.util.logging.Logger class. Additionally, StuSchedule dynamically loads the ClassRoomManager class and invokes its methods via Java reflection to retrieve classroom information, which it logs in the students' schedules. A student can either be an Undergraduate or a Graduate, with both classes implementing the IStudent interface.

¹ From this point until the end of the paper we use the term test case to refer to a test class.
² In the paper, we use the terms module and component interchangeably.

Fig. 1. Component-Based (CB) application adapted from [22].

The corresponding module-info.java files for the four app-components are presented in Fig. 2(a), while those for the three test-components are shown in Fig. 2(b). In the remainder of this section, we discuss some of the key directives used in these module-info.java files: provides with, exports to, opens to, and uses.

As shown in Fig. 1, the stuService component contains the IStudent interface inside the people package. In order for the Undergraduate and Graduate classes from the serviceProvider component to implement the interface, stuService needs to export the people package using the exports to port that is shown in Line 12 of Fig. 2(a). In addition to the exports to port, the serviceProvider component needs to define two more ports: one port to require the stuService component, as shown in Line 18 of Fig. 2(a), and a provides with port to provide the functionalities of the IStudent interface using the Graduate and Undergraduate implementations, as shown in Lines 19–21 of Fig. 2(a).

The class StuSchedule located in the registration component contains code to dynamically load the class ClassRoomManager and invoke its methods using Java reflection. Therefore, the location component that contains the ClassRoomManager class defines an opens to port to open the package ClassRoom to the registration component, as shown in Line 26 of Fig. 2(a). This port enables the classes of the registration component to load and access all classes of the ClassRoom package using the Java reflection mechanisms.

The test classes TestGraduate and TestUndergraduate, belonging to the serviceProviderTest test-component, use the IStudent interface from the stuService component. As a result, serviceProviderTest declares a requires directive in its module-info.java file to establish this communication, as shown on Line 8 of Fig. 2(b). Simultaneously, the stuService component defines an exports to directive to expose the people package to serviceProviderTest, as illustrated on Line 13 of Fig. 2(a). Additionally, because IStudent is an interface, serviceProviderTest also declares a uses directive, as indicated on Line 9 of Fig. 2(b).

3. Approach

This section describes our proposed component-based RTS approaches, CORTS and C2RTS, which are static analysis tools based on analyzing dependencies from test cases to the components of the software application under test. CORTS and C2RTS assume that the software application is a component-based Java application. It is also worth mentioning that if the app is constructed according to the least-privilege architecture, in which each component is granted only the precise dependencies on components and resources that are needed to provide its functionality, our RTS approaches yield more precise test case selection.

Consistent with the current trend in code-based RTS research [12,18,19], CORTS and C2RTS consider a test class to be a test case. They support both unit and system test cases. The inputs to CORTS and C2RTS are the previous version of the Java application along with its test cases, i.e., the application before modification, and the current (modified) version of the Java application. The output is the set of selected test cases that must be re-executed on the current version of the application.

We present CORTS in Section 3.1 and its variant C2RTS in Section 3.2.

3.1. The CORTS approach

CORTS takes the previous version of a CB Java app along with its test cases, then it parses the module descriptor files (the module-info.java files) of all app-components and test-components. While
parsing the descriptors, CORTS constructs a directed graph, called the Entity Dependency Graph (EDG), where each node represents a component or a test case (test class), and the directed edges among the nodes represent the various types of dependencies among the components, such as requires, uses, and provides with dependencies. After that, CORTS compares the previous version of the CB app with the current version of the app to identify the modified components and flag their corresponding nodes in the EDG. Then, CORTS finds and returns the set of affected test cases that directly or transitively reach a modified component in the EDG. The detailed process of CORTS consists of three steps:

1. Building the EDG from the component-based application (Section 3.1.1).
2. Identifying the modified components in the EDG (Section 3.1.2).
3. Selecting the affected test cases (Section 3.1.3).

We demonstrate these steps in light of the illustrative example shown in Fig. 1.

Fig. 2. module-info.java files.

3.1.1. Building the EDG from the component-based application

In this step, CORTS parses the module-info.java descriptors of the app- and test-components of the previous version of the Java application. While parsing the descriptor files, CORTS builds the EDG, where each node in this directed graph represents a component or a test case, and the directed edges among the nodes represent the various types of dependencies among the components. As an example, Fig. 3 shows the extracted EDG for the CB example shown in Fig. 1.

CORTS distinguishes between descriptor files of app-components and those of test-components. If the module-info.java descriptor is for an app-component A, then a node is added in the EDG for A, and all communication ports that are directed towards or emanating from A, e.g., requires or uses ports, are represented in the EDG as directed edges leading to or originating from the node A. However, if the descriptor is for a test-component T, then CORTS adds a node in the EDG for each individual test class that belongs to T. Subsequently, all communication ports that are directed towards or emanating from T are represented in the EDG as directed edges leading to or originating from every node representing a test class belonging to T. CORTS identifies all test classes associated with a given test-component by navigating the file system directory designated for the test-component and locating the class files contained inside it. In component-based Java applications organized using JPMS features, each component is assigned a distinct OS directory. This directory houses all Java packages, classes, and the module-info.java file pertinent to the component, which facilitates the identification process.

When CORTS scans the module-info.java file of each component in the CB app, it adds directed edges in the EDG according to the following rules. We demonstrate each of these rules using the extracted EDG shown in Fig. 3 for the illustrative CB example depicted in Fig. 1.

Rule 1 (Requires Port). Let 𝑀1 be a component that requires another component 𝑀2, where this communication is represented using the statement "requires 𝑀2" in the module-info.java file of 𝑀1. This requires port means that one or more classes belonging to 𝑀1 depend on or communicate with one or more classes belonging to 𝑀2. According to this dependency, CORTS adds a directed edge from node 𝑀1 to node 𝑀2 in the EDG.

For example, the registration component requires the stuService component as specified in the corresponding module descriptor file shown in Fig. 2, and therefore, a directed edge is added in the EDG from node registration to node stuService as depicted in Fig. 3. Moreover, as shown in Fig. 2, the serviceProviderTest component requires the stuService component. Therefore, a directed edge is added in the EDG from every test class node belonging to serviceProviderTest, i.e., the test classes TestGraduate and TestUndergraduate, to the node stuService, as shown in Fig. 3.

Rule 2 (Provides With and Uses Ports). Let 𝐶1 be a class in module 𝑀1 and 𝐴2 be an abstract class or an interface in module 𝑀2, where
𝐶1 implements or extends 𝐴2. This dependency is represented using the statement "provides 𝐴2 with 𝐶1" in the module-info.java file of 𝑀1. Additionally, let 𝐶3 be a class that belongs to module 𝑀3, where 𝐶3 uses 𝐴2, which is represented using the statement "uses 𝐴2" in the module-info.java file of 𝑀3. Then, the component 𝑀3 can utilize java.util.ServiceLoader from the java.base JDK module to load implementations (i.e., 𝐶1 belonging to 𝑀1) of the service 𝐴2. According to this dependency from component 𝑀3 to component 𝑀1 that contains the concrete class 𝐶1, CORTS adds a directed edge from node 𝑀3 to node 𝑀1 in the EDG.

For example, as depicted in the module configuration files shown in Fig. 2, the serviceProvider component provides the interface IStudent of the component stuService with the concrete classes Graduate and Undergraduate. Additionally, the registration component uses the IStudent interface. Those communication ports enable the component registration to access the component serviceProvider and load the two concrete classes, Graduate and Undergraduate, via the class java.util.ServiceLoader. Therefore, a directed edge is added in the EDG from node registration to node serviceProvider as shown in Fig. 3. Likewise, the test component serviceProviderTest uses the IStudent interface as depicted in Fig. 2, which grants this test component access to the concrete classes Graduate and Undergraduate of the component serviceProvider. Therefore, a directed edge is added in the EDG from each test class node (i.e., the nodes representing TestGraduate and TestUndergraduate that belong to serviceProviderTest) to node serviceProvider, as shown in Fig. 3.

Fig. 3. Entity Dependency Graph (EDG) extracted by CORTS.

Rule 3 (Opens To Port). Let 𝑝1 be a package that belongs to a module 𝑀1, and let this module open 𝑝1 to another module 𝑀2, such that this dependency is represented using the statement "opens 𝑝1 to 𝑀2" in the module-info.java file of 𝑀1. Then, 𝑀2 can communicate with 𝑀1 and load and access classes of the package 𝑝1 via the Java reflection and dynamic class loading mechanisms. According to this dependency from 𝑀2 to 𝑀1, CORTS adds a directed edge from node 𝑀2 to node 𝑀1 in the EDG.

For example, in the module configuration files shown in Fig. 2, the location component opens its package classRoom to the registration component. Therefore, a directed edge is added in the EDG from node registration to node location, as shown in Fig. 3.

3.1.2. Identifying the modified components in the EDG

This step involves identifying the modified components to mark their associated nodes in the EDG as modified. CORTS considers a component modified if any of its classes have undergone changes. There are several methods to determine which classes have been modified. For instance, the Linux diff command can be used to compare the directories of a component across the previous and current versions of the Java application. Should this command highlight a component's directory due to alteration or removal of any class within it, or the addition of new classes to it, CORTS marks the node representing that component in the EDG as modified. Another method involves comparing the smart checksums of the previous and current versions of each compiled Java file (i.e., .class files) to identify changed classes [12]. In environments employing Continuous Integration (CI) for Java application development, like GitHub, the modifications can also be traced through version-control-specific commands, such as git diff, to find the changed classes and components. Currently, CORTS primarily utilizes the Linux diff strategy to pinpoint and mark the modified components within the EDG. However, CORTS can readily be extended to support other strategies.

For example, if the ClassRoomManager class is modified, e.g., some of its source code is changed to add, delete, or modify methods, then the component containing this class, which is location, is marked as modified in the EDG shown in Fig. 3.

3.1.3. Selecting the affected test cases

In this step, mirroring the methodology of firewall static RTS approaches [17,25], CORTS traverses the EDG to identify the nodes of all test cases that reach nodes representing modified components. In particular, CORTS calculates the transitive closure for each test case to find all the components that a test case depends on. Subsequently, the set of impacted test cases whose transitive dependencies include some modified component is returned as the output of CORTS. We used the JGraphT library [26] to construct the EDG and to calculate the transitive closures for the test cases within the EDG.

To complete the demonstration example, if the class ClassRoomManager is modified and its component location is marked in the EDG shown in Fig. 3, then all test cases that transitively reach the location node, which are TestClassRoomMngr and TestStuSchedule, will be selected and returned as the output of CORTS.
3.2. The C2RTS approach

We have developed a hybrid RTS approach, called C2RTS, that combines aspects of both component-level and class-level RTS techniques. This variant of CORTS integrates module- and class-level dependency analyses, striking a balance between safety and precision by adjusting the level of granularity from modules to classes depending on the specific classes where code changes have been made. C2RTS trades off some safety for increased precision compared to CORTS.

While constructing the EDG, C2RTS distinguishes between modified and unmodified app-components within the Java application. Specifically, each unmodified app-component is represented as a single node in the EDG, whereas all classes belonging to a modified app-component are represented as individual nodes.

As an example of an EDG constructed by C2RTS, Fig. 4 shows the constructed EDG given that the app-component serviceProvider is identified as a modified app-component by C2RTS, and thus, all classes belonging to this app-component are represented as individual nodes in the EDG. The remaining app-components are unmodified, and thus, each of them is represented as a single node in the EDG. The subsequent subsections elaborate on the entire process undertaken by C2RTS to construct the EDG and select test cases.

Fig. 4. Entity Dependency Graph (EDG) extracted by C2RTS.

3.2.1. Building the EDG from the component-based application

We explain the steps applied by C2RTS to build the EDG nodes and edges from the component-based application.

3.2.1.1. Mappings from components to nodes in the EDG. This section explains how C2RTS maps the unmodified app-components, modified app-components, and test-components to nodes in the EDG.

Representing modified app-components as nodes in the EDG. Given the previous and current versions of the CB app, if an app-component is modified between the two versions, then instead of representing this app-component as a single node in the EDG, C2RTS represents each class belonging to the app-component as a single node in the EDG.

In our illustrative example, we suppose that the app-component serviceProvider depicted in Fig. 1 is identified as modified by C2RTS. Consequently, all the classes of this component (i.e., the Undergraduate and Graduate classes) are represented as nodes in the EDG, as shown in Fig. 4.

Representing unmodified app-components as nodes in the EDG. The unmodified app-components of the application are handled in the same way as in CORTS: they are represented as nodes in the EDG using the method described previously in Section 3.1.1. For example, the app-component location is identified by C2RTS as unmodified, and therefore is represented as a single node in the EDG.

Representing test-components as nodes in the EDG. Similar to CORTS, the C2RTS approach creates a separate node for each individual test class in the EDG. For example, the EDG shown in Fig. 4 contains a node for each test class, such as the test classes TestStuSchedule and TestGraduate.

Next, we describe how C2RTS (1) extracts dependencies among classes of a modified app-component in Section 3.2.1.2, (2) extracts dependencies among unmodified app-components in Section 3.2.1.3, (3) extracts dependencies between unmodified and modified app-components in Section 3.2.1.4, and (4) extracts dependencies between test-components and app-components in Section 3.2.1.5.

3.2.1.2. Extracting dependencies among modified app-component classes. The dependencies among the classes of a modified app-component are extracted using the Oracle Java Class Dependency Analyzer (jdeps) tool [27].³ These dependencies are represented as directed edges in the EDG between the nodes representing the classes of the modified app-component.

³ jdeps is now part of the standard Java library, and is used to analyze the module-level, package-level, and class-level dependencies of Java class files.

3.2.1.3. Extracting dependencies among unmodified app-components. The dependencies among the unmodified app-components are extracted and represented in the EDG according to Rules 1, 2, and 3 described previously in Section 3.1.1. For example, in the EDG represented in Fig. 4, C2RTS added an edge from node registration to node location according to Rule 3.

3.2.1.4. Extracting dependencies between unmodified and modified app-components. The dependencies between the unmodified components and the classes of the modified components are extracted using: (1) information extracted from the component configuration files, the module-info.java files, and (2) information extracted using the jdeps tool. These extracted dependencies are used to construct the
EDG according to three formally defined rules, Rules 4, 5, and 6. The three rules are described using the following assumptions:

• Let App be a previous version of a component-based Java application that was modified to the current version App'.
• Let module M1 represent an app-component that was identified as modified between App and App', i.e., some classes that belong to M1 were modified.
• Let module M2 represent an app-component that belongs to App and is not modified in App', i.e., an unmodified app-component.

Given these assumptions, C2RTS represents M2 as a single node in the EDG, and instead of representing M1 as a single node, C2RTS represents all the classes/interfaces belonging to M1 as nodes in the EDG. Subsequently, C2RTS extracts the dependencies between the classes of M1 and the module M2 and reflects them as edges in the EDG based on the following rules:

Rule 4 (Provides with and Uses Ports). Let C1 be a class in module M1 and A2 be an abstract class or an interface in module M2, where C1 implements or extends A2. This port is represented using the statement "provides A2 with C1" in the "module-info.java" file for M1. Additionally, let C3 be a class that belongs to an unmodified module M3 (M3 is an app-component), where C3 uses A2, and this port is represented using the statement "uses A2" in the "module-info.java" file for M3. Then, the component M3 can utilize the java.util.ServiceLoader from the java.base JPMS JDK module to load the implementation C1 that belongs to M1. According to this dependency from component M3 to class C1, C2RTS adds a directed edge from node M3 to node C1 in the EDG.

We explain how this rule is applied to the EDG nodes represented in Fig. 4 given the module configuration files shown in Fig. 2. In the configuration files, the serviceProvider component provides the interface IStudent of the stuService component with the concrete classes Undergraduate and Graduate. Additionally, the registration component uses the IStudent interface. These ports enable the component registration to load the two concrete implementations Graduate and Undergraduate. Therefore, two directed edges are added in the EDG from the node registration to the nodes Graduate and Undergraduate as shown in Fig. 4.

Rule 5 (Requires Port from Unmodified to Modified App-Component). Let M2 require M1, where this port is represented using the statement "requires M1" in the "module-info.java" file for M2. Then, according to this dependency from an unmodified module (M2) to a modified module (M1), C2RTS runs jdeps on M2 and M1 to find the set of dependencies from classes belonging to M2 to classes belonging to M1. Suppose the jdeps result includes dependencies from some class(es) of M2 to a class C1 that belongs to M1. Then, C2RTS adds a directed edge from node M2 to node C1 in the EDG.

Rule 6 (Requires Port from Modified to Unmodified App-Component). Let M1 require M2, which is represented using the statement "requires M2" in the "module-info.java" file of M1. According to this requires dependency from a modified module (M1) to an unmodified module (M2), C2RTS runs jdeps on M1 and M2 to find the classes of M1 that depend on classes of M2. Suppose the jdeps result shows that a class C1 belonging to M1 depends on some class(es) belonging to M2. Then, C2RTS adds a directed edge from node C1 to node M2 in the EDG.

For example, in the module configurations shown in Fig. 2, the app-component serviceProvider, which was identified by C2RTS as modified, requires the unmodified app-component stuService. Hence, jdeps is applied to these two components and returns that the classes Undergraduate and Graduate belonging to serviceProvider depend on the interface IStudent that belongs to stuService. Consequently, directed edges are added in the EDG from the nodes representing Undergraduate and Graduate to the node representing stuService, as shown in Fig. 4.

3.2.1.5. Extracting dependencies between test-components and app-components. The C2RTS approach represents each individual test class belonging to a test-component as a single node in the EDG. The dependencies from test classes to unmodified app-components are extracted and represented as edges in the EDG according to Rules 1, 2, and 3 explained previously in Section 3.1.1. On the other hand, dependencies from test classes to classes belonging to modified app-components are extracted according to the following two rules, Rules 7 and 8, which are modified versions of Rules 4 and 5, respectively.

Rule 7 (Provides with and Uses Ports). Let C1 be a class in module M1 and A2 be an abstract class or an interface in module M2, where C1 implements or extends A2. This port is represented using the statement "provides A2 with C1" in the "module-info.java" file for M1. Additionally, let TM3 be a test-module, where some test classes belonging to TM3 use A2, and this dependency is represented using the statement "uses A2" in the "module-info.java" file of TM3. This uses port enables test classes belonging to the test-component TM3 to utilize the java.util.ServiceLoader from the java.base JPMS JDK module to load the implementation C1 that belongs to M1. Subsequently, C2RTS runs jdeps on TM3 and M2 to find the test classes of TM3 that depend on A2. Suppose jdeps reports that a test class TC3 belonging to TM3 depends on A2. Then, C2RTS adds a directed edge from node TC3 to node C1 in the EDG because TC3 can load C1 via the class java.util.ServiceLoader.

For example, in the module configuration files shown in Fig. 2, the serviceProvider component provides the interface IStudent of the stuService component with the concrete classes Undergraduate and Graduate. Additionally, the test-component serviceProviderTest uses the IStudent interface as depicted in Fig. 2. Therefore, C2RTS finds, using jdeps, which test classes of serviceProviderTest depend on IStudent, and jdeps returns that the test classes TestGraduate and TestUndergraduate depend on IStudent. Subsequently, directed edges are added in the EDG from each of these test classes to the concrete classes Graduate and Undergraduate as depicted in Fig. 4.

Rule 8 (Requires Port from Test-Component to Modified App-Component). Let TM1 be a test-component that requires the modified app-component M1, such that this dependency is represented using the statement "requires M1" in the "module-info.java" file for TM1. Then, based on this dependency, C2RTS runs jdeps on TM1 and M1 to find the set of dependencies from test classes belonging to TM1 to classes belonging to M1. From these dependencies, C2RTS extracts the names of the source and target classes and connects their corresponding nodes in the EDG with the proper directed edges.

3.2.2. Mark modified nodes and select affected test cases
To mark the modified classes and compute the set of selected test cases, C2RTS applies the same steps explained previously in Sections 3.1.2 and 3.1.3 with one difference: instead of marking nodes representing modified components in the EDG, C2RTS marks the nodes that represent modified classes. Then, C2RTS computes the transitive closure of each test case to find all components and classes that each test depends on. Thereafter, the set of impacted test cases, i.e., those whose transitive dependencies in the EDG include some changed type, is returned by C2RTS as the output. For example, if the class Undergraduate is modified and marked in the EDG shown in Fig. 4, then the test cases TestUndergraduate, TestGraduate, and TestStuSchedule will be selected and returned as the output of C2RTS.
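The selection step of Section 3.2.2 can be illustrated with a minimal Java sketch. It is not the C2RTS implementation: the class and method names are hypothetical, the graph is a plain adjacency map, and the edges are only those named in the running example. Two edges are assumptions made so that the example is complete, since this section does not state them: TestStuSchedule depending on registration, and a hypothetical TestLocation depending on location.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative EDG: a node is a component name or a class name; an edge means "depends on". */
class EdgSelectionSketch {
    private final Map<String, Set<String>> edges = new LinkedHashMap<>();

    void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new LinkedHashSet<>()).add(to);
        edges.computeIfAbsent(to, k -> new LinkedHashSet<>());
    }

    /** Transitive closure of one node: every node reachable from it, including itself. */
    Set<String> reachableFrom(String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String node = work.pop();
            if (seen.add(node)) {
                for (String next : edges.getOrDefault(node, Set.of())) {
                    work.push(next);
                }
            }
        }
        return seen;
    }

    /** A test is selected when its transitive dependencies contain a marked (modified) node. */
    Set<String> selectTests(Collection<String> testNodes, Set<String> modifiedNodes) {
        Set<String> selected = new TreeSet<>();
        for (String test : testNodes) {
            if (!Collections.disjoint(reachableFrom(test), modifiedNodes)) {
                selected.add(test);
            }
        }
        return selected;
    }

    /** Edges of the running example; serviceProvider is modified, so its classes are nodes. */
    static EdgSelectionSketch runningExample() {
        EdgSelectionSketch g = new EdgSelectionSketch();
        g.addEdge("registration", "location");            // Rule 3
        g.addEdge("registration", "Graduate");            // Rule 4
        g.addEdge("registration", "Undergraduate");       // Rule 4
        g.addEdge("Undergraduate", "stuService");         // Rule 6
        g.addEdge("Graduate", "stuService");              // Rule 6
        g.addEdge("TestGraduate", "Graduate");            // Rule 7
        g.addEdge("TestGraduate", "Undergraduate");       // Rule 7
        g.addEdge("TestUndergraduate", "Graduate");       // Rule 7
        g.addEdge("TestUndergraduate", "Undergraduate");  // Rule 7
        g.addEdge("TestStuSchedule", "registration");     // assumption, not stated in the text
        g.addEdge("TestLocation", "location");            // hypothetical unaffected test
        return g;
    }

    public static void main(String[] args) {
        List<String> tests =
            List.of("TestUndergraduate", "TestGraduate", "TestStuSchedule", "TestLocation");
        System.out.println(runningExample().selectTests(tests, Set.of("Undergraduate")));
    }
}
```

Under these assumptions, marking Undergraduate selects TestGraduate, TestStuSchedule, and TestUndergraduate, while the hypothetical TestLocation is skipped because it only reaches the unmodified location node.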
4. Experimental evaluation

The goal of the evaluation is to compare CORTS and C2RTS with the state-of-the-art class-level RTS tools in terms of (1) safety violation, (2) precision violation, (3) test suite reduction, (4) size of the dependency graph that represents static dependencies from test cases to code entities, and (5) reduction in test selection and execution times. An RTS technique is safe if it does not miss any modification-traversing test cases that should be selected for regression testing. An RTS technique is precise if it does not select non-modification-traversing test cases. A test case is considered modification-traversing if, during its execution, it exercises modified, new, or previously removed code statements. Only modification-traversing test cases can reveal faults in the modified version of a software system, and hence, they must be selected for regression testing.

We compared CORTS and C2RTS with two RTS tools, Ekstazi [12] and STARTS [18]. They are both state-of-the-art for class-level RTS and have been widely evaluated on a large number of revisions of real-world projects [17]. The class-level RTS process identifies changes at the class level, instead of the method or statement level, and selects every test class that traverses or depends on any changed class. Ekstazi uses dynamic analysis and STARTS uses static analysis of compiled Java code. We compared CORTS and C2RTS with these class-level RTS approaches because we aimed to investigate (1) how the safety can be improved by raising the RTS granularity from class-level to component-level, (2) how increasing the RTS granularity from class-level to component-level impacts the precision and test suite reduction, and (3) how increasing the RTS granularity reduces the size of the static component-level dependency graph compared to the static class-level dependency graph.

In order to evaluate the safety and precision of CORTS and C2RTS, we computed their safety violations and precision violations w.r.t. Ekstazi [12]. Ekstazi is a code-based RTS approach known to be safe in terms of selecting all the modification-traversing test classes, widely evaluated on a large number of revisions, and adopted by several popular open-source projects; as such, it can be considered the state-of-the-art for class-level dynamic RTS tools. Assume that a program P, which has an original test suite T, was modified to a new version P'. Furthermore, assume that two RTS approaches, RTS1 and Ekstazi, were applied to select test cases from T based on the code modifications that move the program from P to P', such that RTS1 selected the set of test cases T_RTS1 and Ekstazi selected the set of test cases T_Ekstazi. Then, the safety violation of RTS1 w.r.t. Ekstazi, the precision violation of RTS1 w.r.t. Ekstazi, and the reduction in test suite size obtained by RTS1 are defined as follows:

\[ \mathit{SafetyViolation}\ \text{w.r.t. Ekstazi} = \frac{|T_{\mathrm{Ekstazi}} \setminus T_{\mathrm{RTS1}}|}{|T_{\mathrm{Ekstazi}} \cup T_{\mathrm{RTS1}}|} \]

\[ \mathit{PrecisionViolation}\ \text{w.r.t. Ekstazi} = \frac{|T_{\mathrm{RTS1}} \setminus T_{\mathrm{Ekstazi}}|}{|T_{\mathrm{Ekstazi}} \cup T_{\mathrm{RTS1}}|} \]

\[ \text{Test suite reduction obtained by RTS1} = \frac{|T| - |T_{\mathrm{RTS1}}|}{|T|} \]

The safety violation, precision violation, and test suite reduction are multiplied by 100 to make them percentages. Lower percentages for safety violation and precision violation, and higher percentages for test suite reduction, are better [17,28]. The size of the static dependency graph is computed in terms of the number of nodes and edges of the graph.

4.1. Research questions

In this research, we try to answer the following Research Questions (RQ):

• RQ1: What is the safety violation w.r.t. Ekstazi of the proposed static component-level RTS approaches CORTS and C2RTS? Furthermore, do CORTS and C2RTS reduce the safety violation w.r.t. Ekstazi (i.e., improve safety) compared to the static class-level RTS approach STARTS?
• RQ2: What is the precision violation w.r.t. Ekstazi of the proposed approaches CORTS and C2RTS? Furthermore, how does this precision violation compare to the precision violation w.r.t. Ekstazi achieved by the static class-level RTS approach STARTS?
• RQ3: What is the reduction in test suite size achieved by CORTS and C2RTS?
• RQ4: How does the size of the static dependency graphs extracted by CORTS and C2RTS compare to the size of the static dependency graph extracted by STARTS?
• RQ5: What is the time taken by CORTS and C2RTS to construct and analyze the static dependency graph to select relevant tests, and what is their overall end-to-end testing time?

4.2. Subjects

We evaluated CORTS and C2RTS using the 12 subjects listed in Table 1. These are open-source real-world Java projects, which are known to be compatible with Ekstazi and STARTS since they were widely used in their evaluation [12,17,18]. Table 1 shows, for each subject, the latest revision (i.e., the most recent revision of the project) on which our experiments started (SHA), the number of source classes (CLASSES) of the latest revision, i.e., classes of the core program without counting test classes, the number of test classes (TESTS) of the latest revision, the number of recovered components of the latest revision (COMPS), the number of ports between the recovered components (PORTS), and the number of revisions used (REVS).

Converting the projects to equivalent component-based projects. It was not possible to evaluate CORTS and C2RTS using existing open-source component-based Java applications, i.e., multi-module Java applications developed using the JPMS capabilities. There are two main reasons for that. First, the great majority of existing open-source Object Oriented (OO) Java applications have not been converted to equivalent component-based applications using JPMS. For example, Hammad et al. [22] analyzed more than 1300 open-source Java projects and found that only 33 utilize JPMS capabilities. This finding comports with the results reported in prior work [29] as well. Second, even for the 33 existing component-based projects that utilize JPMS capabilities, the modules of each project are open to the entire system, leading to a situation in which components (i.e., modules) are granted more access than they need to function, which violates the least-privilege architecture principle. Additionally, these projects are relatively small in size, significantly smaller than those listed in Table 1, and were created for educational purposes, meaning they are not real-world component-based Java applications. Therefore, we could not use these component-based projects to evaluate CORTS and C2RTS.

In order to overcome this challenge, we converted the OO Java projects listed in Table 1 to equivalent component-based Java projects. To do that, we leveraged OO2CB [22], which utilizes the JPMS capabilities and converts an OO Java application to an equivalent component-based Java application according to the least-privilege security principle. OO2CB uses a component recovery framework implemented by Garcia et al. [30], called ARCADE, to automatically determine the components of an OO application. The ARCADE framework utilizes several well-known component recovery tools such as Architecture Recovery using Concerns (ARC) [31], Bunch [32], and Algorithm for Comprehension-Driven Clustering (ACDC) [33]. OO2CB [22] takes as inputs the suggested components provided by the ACDC tool and the binary code of the OO Java application, and outputs the equivalent component-based Java application that utilizes the JPMS features along with all the module descriptors, i.e., the "module-info.java" files, generated according to the least-privilege security principle.

Selecting revisions. We downloaded the revisions of every subject among the 12 subjects listed in Table 1 using the methodology in Legunsen et al. [17]. First, we found the latest revision (specified by SHA
Table 1
The Java projects used in our study.
Subject SHA CLASSES TESTS COMPS PORTS REVS
commons-math 96f2b16 864 485 59 1116 100
commons-configuration 5de7c48 261 171 18 190 100
commons-compress a189697 201 105 22 190 100
commons-collections f9f99cc 351 230 23 251 100
commons-dbcp 23f6717 60 54 4 12 100
commons-io 8d1b994 128 106 9 53 100
commons-lang 82fd251 154 153 13 83 100
commons-validator e2edf6a 64 76 3 6 100
commons-pool fde71c6 48 26 5 11 100
JFreeChart 86abdc8 638 344 33 443 100
jankotek.mapdb a333530 87 61 11 43 100
OpenTripPlanner 45c1a9f 1099 285 147 2724 100
Table 2
Average and median safety violation w.r.t. Ekstazi.
Subject A-SV_CORTS % M-SV_CORTS % A-SV_C2RTS % M-SV_C2RTS % A-SV_STARTS % M-SV_STARTS %
commons-math 0.0 0.0 0.04 0.0 0.58 0.0
commons-configuration 11.53 0.0 19.26 0.0 22.04 1.89
commons-compress 0.0 0.0 0.0 0.0 0.0 0.0
commons-collections 0.0 0.0 0.0 0.0 0.0 0.0
commons-dbcp 0.0 0.0 0.11 0.0 0.56 0.0
commons-io 0.0 0.0 0.0 0.0 0.0 0.0
commons-lang 0.0 0.0 0.0 0.0 0.0 0.0
commons-validator 0.14 0.0 1.57 0.0 6.19 0.0
commons-pool 0.79 0.0 1.29 0.0 1.65 0.0
JFreeChart 0.0 0.0 0.0 0.0 0.0 0.0
jankotek.mapdb 0.0 0.0 3.04 2.12 3.32 3.22
OpenTripPlanner 1.12 0.0 1.21 0.0 3.97 1.36
A-SV_i or M-SV_i is the average or median (per subject) safety violation of a tool i (i.e., CORTS, C2RTS, or STARTS) with respect to Ekstazi.
in Table 1) that satisfied these conditions: (1) it has no build or compile errors, (2) it has no test case failures, and (3) it runs successfully with STARTS and Ekstazi. Second, among all the revisions preceding SHA, we selected up to a hundred revisions (including the SHA revision) that met these conditions. The total number of selected revisions, for the 12 subjects, was 1200. These revisions met the prerequisites for Ekstazi and STARTS: (1) Maven version 3.2.5 or above, (2) Surefire version 2.14 or above, (3) JUnit version 3 or above, and (4) Java version 1.8 or above. We used OO2CB [22] to convert each of the 1200 revisions to an equivalent component-based version. Table 1 shows, for each subject, the numbers of recovered components (COMPS) and ports (PORTS) among the components of the latest revision of the subject used in our study.

For each subject, starting from the oldest of the hundred revisions up to the most recent revision specified by SHA, we ran Ekstazi and STARTS on the successive pairs of revisions, and ran CORTS and C2RTS on the corresponding component-based versions of these revisions. To identify changed classes between the previous and current revisions of a pair, Ekstazi compares the smart checksums of the previous and current versions of each compiled Java file (i.e., .class files). STARTS reuses part of the Ekstazi source code to compute smart checksums and identify changed classes in the same way. In order to ensure equitable comparisons with both Ekstazi and STARTS, we adhere to the same methodology of comparing smart checksums to identify changed classes, subsequently marking the components housing these classes as modified. In particular, the list of changed classes in STARTS can be generated from the command line with starts:diff (footnote 4), and we used this command in our experiments to generate the list of classes that are identified as changed through smart checksum comparisons.

Footnote 4: STARTS provides the command-line option to list types identified as changed via smart checksum computation.

The experimental dataset, which comprises the ACDC-recovered architectures for all 1200 revisions of the 12 Java projects, is publicly available at https://github.com/mohammedrefai/RTS_ComponentLevel. Each revision's files are labeled with their corresponding SHA identifier, enabling users to retrieve the exact source code revision directly from the respective project's GitHub repository. The following subsections present and discuss the RTS results of all our experiments.

4.3. RQ1: Safety violation

Table 2 shows, for each subject, the median and average safety violation w.r.t. Ekstazi achieved by CORTS, C2RTS, and STARTS. The median and average values are computed per subject over all the revisions of the subject. As shown in Table 2, CORTS and C2RTS achieved better results for safety violation compared to STARTS.

The median safety violation obtained by CORTS was zero for all the 12 subjects, while C2RTS had a value greater than zero for only one subject. For STARTS, the median safety violation was higher than zero for three subjects, i.e., 1.89% for commons-configuration, 3.22% for jankotek.mapdb, and 1.36% for OpenTripPlanner. The average safety violation values of CORTS and C2RTS were smaller than those for STARTS for 7 out of the 12 subjects, while all the RTS approaches achieved an average safety violation of zero for the remaining subjects. As can be seen in Table 2, CORTS reduced the average safety violation almost by half, from 22.04% to 11.53%, for commons-configuration.

The proposed approaches, CORTS and C2RTS, outperformed STARTS in terms of safety violation because they compute dependencies from test cases to code entities at a higher level of granularity (i.e., component-level) than STARTS. This component-level dependency analysis over-estimates test dependencies more than class-level analysis does, so that more impacted (i.e., modification-traversing) test cases are selected. In particular, the static analysis of test dependencies at the component (or module) level rather than at the class-level can lead to the identification of a broader set of potentially impacted test cases. This is due to the module-level analysis treating all classes within a module as a single entity. By considering the module as a unified unit, this approach inherently accounts for inter-class interactions within the module, including dynamic dependencies that involve reflection, even without explicitly tracking such dynamic dependencies. This holistic view increases the likelihood of capturing
Table 3
Average and median precision violation w.r.t. Ekstazi.
Subject A-PV_CORTS % M-PV_CORTS % A-PV_C2RTS % M-PV_C2RTS % A-PV_STARTS % M-PV_STARTS %
commons-math 83.47 96.1 50.27 53.33 33.12 25.0
commons-configuration 59.68 64.57 45.64 59.19 24.54 22.14
commons-compress 75.41 86.36 62.54 79.09 53.37 64.0
commons-collections 58.54 95.21 18.89 0.0 7.15 0.0
commons-dbcp 57.41 61.29 30.05 31.81 18.18 3.03
commons-io 53.09 80.19 28.24 0.0 16.21 0.0
commons-lang 63.88 84.07 56.97 76.35 48.16 63.84
commons-validator 60.11 92.11 40.25 11.21 17.45 10.71
commons-pool 55.01 54.54 53.71 52.17 35.11 26.66
JFreeChart 49.88 73.17 42.21 0.0 32.33 0.0
jankotek.mapdb 70.29 77.38 31.89 20.41 29.02 17.74
OpenTripPlanner 87.69 91.42 87.67 91.41 73.26 75.0
A-PV_i or M-PV_i is the average or median (per subject) precision violation of a tool i (i.e., CORTS, C2RTS, or STARTS) with respect to Ekstazi.
Table 4
Reduction in test suite size.
Subject A-R_CORTS % A-R_C2RTS % A-R_STARTS % A-R_Ekstazi %
commons-math 10.98 48.57 76.27 89.94
commons-configuration 18.03 32.51 66.46 77.03
commons-compress 11.01 24.07 56.42 86.42
commons-collections 36.96 77.96 92.87 95.51
commons-dbcp 10.32 41.74 55.98 67.72
commons-io 36.64 61.66 82.05 89.41
commons-lang 26.19 35.86 53.32 90.07
commons-validator 31.52 54.17 90.45 91.55
commons-pool 19.77 21.41 50.03 69.58
JFreeChart 43.83 51.12 84.64 93.32
jankotek.mapdb 8.12 52.56 55.61 70.29
OpenTripPlanner 20.04 20.06 57.67 89.67
A-R_X is the average reduction (per subject) in test suite size achieved by an RTS approach X.
dependencies that might be overlooked when analyzing at the finer granularity of individual classes.

Moreover, the average safety violation values of CORTS are smaller than those of C2RTS. This is because C2RTS tracks dependencies between modules and, for the modified modules, also within them at the class-level. At the class-level, inter-class dynamic dependencies that are related to reflection are missed by C2RTS, resulting in missing impacted test cases that are captured by CORTS.

It is essential to acknowledge that the component recovery tools utilized, namely ACDC and OO2CB, are based on static analysis and do not detect the dynamic class dependencies or communications associated with dynamic class loading and reflection. Consequently, the resultant component-based applications in our experimentation lack opens directives within the generated module-info.java files. As a result, CORTS and C2RTS overlooked impacted test cases, resulting in safety violation values higher than zero for some of the subjects as seen in Table 2. We anticipate that CORTS and C2RTS will yield diminished safety violation values, potentially zero or near-zero, provided that reflection-related dependencies are comprehensively captured and represented within the module-info.java files of the evaluation subjects. This would entail modifications to ACDC and OO2CB to accurately capture and represent reflection-related dependencies within the recovered component-based applications. We plan to investigate this direction in the future.

4.4. RQ2: Precision violation

Table 3 shows, for each subject, the median and average precision violation w.r.t. Ekstazi achieved by CORTS, C2RTS, and STARTS. The median and average values are computed per subject over all the revisions of the subject.

The average and median precision violations of CORTS and C2RTS are higher than those of STARTS. This is because CORTS and C2RTS compute test dependencies at a higher level of granularity than STARTS and over-estimate the impacted test cases more than STARTS does. On the other hand, the average/median precision violation values of C2RTS are smaller than those yielded by CORTS, with a substantial gap for most of the subjects. For example, the average and median precision violation values of CORTS are 58.54% and 95.21%, respectively, for the subject commons-collections. These values are reduced by C2RTS to 18.89% and 0.0%, respectively.

C2RTS selected more irrelevant test cases than STARTS, but the precision violation yielded by C2RTS was not too far from that of STARTS. According to Table 3, for 8 out of the 12 subjects, the average precision violation of C2RTS exceeded that of STARTS by less than 15 percentage points. For the remaining four subjects, the difference reached up to about 23 percentage points. Interestingly, in 3 out of the 12 subjects, C2RTS had a median precision violation of 0%.

4.5. RQ3: Test suite reduction

Table 4 shows, for each subject, the average reduction in test suite size achieved by CORTS, C2RTS, STARTS, and Ekstazi. The average values are computed per subject over all the revisions of the subject.

All four RTS approaches achieved a reduction in test suite size. The average reduction in test suite size over all 12 subjects was 22.78% for CORTS, 43.47% for C2RTS, 68.48% for STARTS, and 84.21% for Ekstazi. The highest reduction was yielded by Ekstazi since it is a dynamic approach.

It is evident that (1) both CORTS and C2RTS achieved a reduction for all the subjects even though they perform RTS at a higher level of granularity than STARTS, and (2) C2RTS increased the reduction compared to CORTS from 22.78% to 43.47% on average since it tracks dependencies within the modified components at the class-level. Moreover, C2RTS achieved a reduction of more than 50% on average for 5 out of the 12 subjects, and a reduction of more than 40% on average for 2 other subjects. Furthermore, the comparison with STARTS shows that C2RTS remains competitive: the difference in average test suite reduction between C2RTS and STARTS does not exceed 21 percentage points for 5 subjects and remains below 38 percentage points across all the 12 subjects.
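The percentages reported in Tables 2 to 4 follow the set-based definitions given at the start of Section 4. A minimal Java sketch of those three formulas is shown below for one pair of revisions. The class and method names are hypothetical, the test names are made up, and returning 0 when the union of the two selections is empty is an assumption of this sketch rather than something the paper specifies.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative computation of the three evaluation metrics for one revision pair. */
class RtsMetricsSketch {

    /** |T_Ekstazi \ T_RTS1| / |T_Ekstazi U T_RTS1|, as a percentage. */
    static double safetyViolation(Set<String> ekstazi, Set<String> rts1) {
        return ratioPercent(minus(ekstazi, rts1).size(), union(ekstazi, rts1).size());
    }

    /** |T_RTS1 \ T_Ekstazi| / |T_Ekstazi U T_RTS1|, as a percentage. */
    static double precisionViolation(Set<String> ekstazi, Set<String> rts1) {
        return ratioPercent(minus(rts1, ekstazi).size(), union(ekstazi, rts1).size());
    }

    /** (|T| - |T_RTS1|) / |T|, as a percentage. */
    static double testSuiteReduction(int totalTests, int selectedTests) {
        return ratioPercent(totalTests - selectedTests, totalTests);
    }

    private static Set<String> minus(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    private static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Assumption of this sketch: an empty denominator counts as 0% rather than an error.
    private static double ratioPercent(int numerator, int denominator) {
        return denominator == 0 ? 0.0 : 100.0 * numerator / denominator;
    }

    public static void main(String[] args) {
        // Made-up selections out of a suite of 10 tests.
        Set<String> ekstazi = Set.of("t1", "t2", "t3");
        Set<String> rts1 = Set.of("t1", "t2", "t4", "t5");
        System.out.println("safety violation %:    " + safetyViolation(ekstazi, rts1));
        System.out.println("precision violation %: " + precisionViolation(ekstazi, rts1));
        System.out.println("reduction %:           " + testSuiteReduction(10, rts1.size()));
    }
}
```

In this made-up case RTS1 misses one test that Ekstazi selects (t3) and selects two that Ekstazi does not (t4, t5), out of a union of five tests, so the safety violation is 20%, the precision violation is 40%, and the reduction is 60%.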
Table 5
Dependency graph size.
Subject NODES_CORTS EDGES_CORTS NODES_C2RTS EDGES_C2RTS NODES_STARTS EDGES_STARTS
commons-math 503 4391 567 5090 2099 12689
commons-configuration 190 1532 284 2470 827 4743
commons-compress 147 933 219 1713 547 2299
commons-collections 202 1396 236 1763 907 3536
commons-dbcp 36 147 126 684 178 711
commons-io 109 554 154 881 336 1017
commons-lang 167 982 227 1537 746 2252
commons-validator 77 208 142 581 179 592
commons-pool 27 108 124 574 208 748
JFreeChart 373 2594 468 4298 1033 7092
jankotek.mapdb 197 876 494 4600 1281 7342
OpenTripPlanner 432 5335 698 8910 2884 15479
NODES_X or EDGES_X is the average number of nodes or edges in the dependency graph constructed by an RTS approach X.
Table 6
Dependency graph size reduction ratios of CORTS and C2RTS with respect to STARTS.
Subject R_NODES_CORTS R_EDGES_CORTS R_NODES_C2RTS R_EDGES_C2RTS
commons-math 4.17 2.89 3.70 2.49
commons-configuration 4.35 3.10 2.91 1.92
commons-compress 3.72 2.46 2.50 1.34
commons-collections 4.49 2.53 3.84 2.01
commons-dbcp 4.94 4.84 1.41 1.04
commons-io 3.08 1.84 2.18 1.15
commons-lang 4.47 2.29 3.29 1.47
commons-validator 2.32 2.59 1.26 1.02
commons-pool 7.56 6.91 1.66 1.31
JFreeChart 2.76 2.73 2.21 1.65
jankotek.mapdb 6.48 8.37 2.59 1.58
OpenTripPlanner 6.67 2.91 4.12 1.73
R_NODES_X / R_EDGES_X is the average reduction ratio of nodes/edges of the class-level dependency graph achieved by an RTS approach X.
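As a worked illustration of how the ratios in Table 6 relate to the counts in Table 5, the reduction ratio of an approach X (CORTS or C2RTS) can be written as below. Table 6 averages the ratio over the revisions of a subject, so dividing the averaged counts of Table 5 is only an approximation, although it reproduces the reported values for commons-math.

```latex
\[
R\_NODES_{X} = \frac{NODES_{\mathrm{STARTS}}}{NODES_{X}},
\qquad
R\_EDGES_{X} = \frac{EDGES_{\mathrm{STARTS}}}{EDGES_{X}}
\]
% commons-math, X = CORTS, using the averaged counts of Table 5:
\[
R\_NODES_{\mathrm{CORTS}} \approx \frac{2099}{503} \approx 4.17,
\qquad
R\_EDGES_{\mathrm{CORTS}} \approx \frac{12689}{4391} \approx 2.89
\]
```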
4.6. RQ4: Reduction in dependency graph size

Table 5 shows, for each subject, the average number of nodes and edges of the static dependency graphs extracted by CORTS, C2RTS, and STARTS. The average values are computed per subject over all the revisions of the subject. It is evident that CORTS and C2RTS generated smaller dependency graphs than STARTS.

Table 6 shows, for each subject, the average size reduction ratio of the dependency graphs extracted by CORTS and C2RTS with respect to the dependency graph extracted by STARTS. The size reduction ratio is computed separately for nodes and edges as follows. For a specific revision of a subject, the size reduction ratio for nodes/edges achieved by CORTS/C2RTS is computed as the number of nodes/edges of the graph extracted by STARTS divided by the number of nodes/edges of the graph extracted by CORTS/C2RTS.

Referring to the data in Table 6, CORTS reduced the STARTS dependency graph node count by an average factor between 4 and 7.6 for 8 of the subjects, and by a factor between 2.3 and 3.7 for the remaining subjects. On the other hand, C2RTS reduced the STARTS dependency graph node count by an average factor higher than 2 (i.e., ranging from 2.18 to 4.12) for 9 out of the 12 subjects. Furthermore, CORTS reduced the STARTS dependency graph edge count by average factors ranging approximately from 2 up to 8 for 11 subjects, while C2RTS reduced the edge count by average factors ranging from 1.02 up to 2.49.

The results presented in Table 6 are encouraging and indicate that CORTS and C2RTS are effective in reducing the static dependency graph size compared to class-level RTS techniques. This capability matters for several reasons. First, the reduced complexity of dependency graphs makes our RTS approaches more scalable to very large applications such as enterprise-level applications with extensive codebases. Second, smaller graphs require fewer computational resources for analysis and consume less memory. Furthermore, this efficiency in graph size is particularly beneficial in cloud-based Continuous Integration (CI) environments, where resource and memory consumption directly influences costs, suggesting that such optimizations can bring economic advantages.

4.7. RQ5: Selection phase and end-to-end testing times

The end-to-end execution time of an RTS approach includes two main phases: (1) the selection phase that determines which test cases to select, and (2) the execution phase that runs the selected test cases. For static RTS approaches, the selection phase time consists of the time taken to construct the static dependency graph, read the changed classes and mark them in the graph, and analyze (i.e., traverse) the graph to select relevant test cases. Table 7 reports the selection phase time for CORTS (SELECT_CORTS), C2RTS (SELECT_C2RTS), and static class-level RTS (SELECT_STARTS-like), as well as the end-to-end time for CORTS (E2E_CORTS), C2RTS (E2E_C2RTS), and STARTS (E2E_STARTS). Table 7 also presents TESTAll, which is a strategy that just runs all test cases without performing any RTS analysis. We use the TESTAll time as the baseline and compare the end-to-end times of the RTS approaches with it. Table 7 displays, per subject, the overall cumulative time for all the 100 revisions of the subject.

It is important to mention that for static class-level RTS, we did not separately measure the selection phase time (i.e., the SELECT_STARTS-like time in Table 7) using STARTS. Instead, we developed a STARTS-like tool that functions similarly to STARTS by using jdeps to extract class dependencies and building a class-level dependency graph. However, the STARTS-like tool utilizes JGraphT for graph construction and analysis, whereas STARTS uses the custom, faster yasgl library [34]. To ensure a fair comparison, we compared the selection phase time of CORTS and C2RTS with STARTS-like, since all three use JGraphT for graph operations. Additionally, STARTS does not provide specific commands to report the exact selection phase time separately from
Table 7
Selection phase time and end-to-end testing time in seconds.
Subject SELECTCORTS SELECTC2RTS SELECTSTARTS-like TESTAll E2ECORTS E2EC2RTS E2ESTARTS
commons-math 4.153 11.859 23.209 11,042.579 9572.399 6023.054 3664.108
commons-configuration 1.445 6.283 10.329 2579.310 2132.351 1826.158 1592.784
commons-compress 0.871 1.819 4.263 819.969 730.118 621.127 572.821
commons-collections 1.206 2.586 8.957 1287.695 809.505 259.606 124.392
commons-dbcp 0.273 0.526 1.237 8252.194 7466.198 5032.529 4653.431
commons-io 0.436 1.119 2.275 4975.982 3106.881 2123.687 1828.807
commons-lang 0.852 1.583 3.781 1621.582 1131.397 1067.314 776.306
commons-validator 0.251 0.569 0.908 168.435 117.999 82.979 38.188
commons-pool 0.252 0.456 0.809 31,183.825 26,125.138 26,026.082 24,691.189
JFreeChart 2.235 10.406 38.091 538.722 305.523 274.989 149.916
jankotek.mapdb 1.195 9.279 13.849 56,929.985 54,353.748 40,567.913 38,828.116
OpenTripPlanner 8.167 20.198 45.879 17,672.936 17,318.823 17,330.737 14,778.681
SELECTX is the summation (per subject) of the overall execution time taken by an RTS approach (X) for the test selection process (i.e., constructing and analyzing the dependency graph to select tests).
E2EX is the summation (per subject) of the overall end-to-end execution time taken by an RTS approach (X).
TESTAll is the summation (per subject) of the overall time taken to just run all test cases.
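As an illustrative sketch only (it is not part of the CORTS, C2RTS, or STARTS tooling), the per-approach averages and the reduction relative to TESTAll can be recomputed from the end-to-end columns of Table 7. The class name Table7Summary and its helper methods are hypothetical names introduced here; the numbers are copied from Table 7.

```java
import java.util.Arrays;

// Hypothetical helper: summarizes the end-to-end columns of Table 7.
class Table7Summary {
    // Column order: TESTAll, E2E CORTS, E2E C2RTS, E2E STARTS.
    static final String[] COLUMNS = {"TESTAll", "E2E_CORTS", "E2E_C2RTS", "E2E_STARTS"};

    // Seconds per subject, summed over the 100 revisions (values from Table 7).
    static final double[][] E2E_SECONDS = {
        {11042.579,  9572.399,  6023.054,  3664.108}, // commons-math
        { 2579.310,  2132.351,  1826.158,  1592.784}, // commons-configuration
        {  819.969,   730.118,   621.127,   572.821}, // commons-compress
        { 1287.695,   809.505,   259.606,   124.392}, // commons-collections
        { 8252.194,  7466.198,  5032.529,  4653.431}, // commons-dbcp
        { 4975.982,  3106.881,  2123.687,  1828.807}, // commons-io
        { 1621.582,  1131.397,  1067.314,   776.306}, // commons-lang
        {  168.435,   117.999,    82.979,    38.188}, // commons-validator
        {31183.825, 26125.138, 26026.082, 24691.189}, // commons-pool
        {  538.722,   305.523,   274.989,   149.916}, // JFreeChart
        {56929.985, 54353.748, 40567.913, 38828.116}, // jankotek.mapdb
        {17672.936, 17318.823, 17330.737, 14778.681}  // OpenTripPlanner
    };

    // Arithmetic mean of one column across the 12 subjects.
    static double average(int column) {
        return Arrays.stream(E2E_SECONDS)
                     .mapToDouble(row -> row[column])
                     .average()
                     .orElse(Double.NaN);
    }

    // Percentage reduction of a column's average relative to the TESTAll average.
    static double reductionPercent(int column) {
        double baseline = average(0);
        return 100.0 * (baseline - average(column)) / baseline;
    }

    // Number of subjects for which column a is strictly faster than column b.
    static long countFaster(int a, int b) {
        return Arrays.stream(E2E_SECONDS)
                     .filter(row -> row[a] < row[b])
                     .count();
    }

    public static void main(String[] args) {
        for (int c = 0; c < COLUMNS.length; c++) {
            System.out.printf("%-10s avg = %10.2f s, reduction vs TESTAll = %5.2f%%%n",
                    COLUMNS[c], average(c), reductionPercent(c));
        }
        System.out.println("Subjects where C2RTS beats CORTS end-to-end: " + countFaster(2, 1));
    }
}
```

Calling countFaster with other column pairs gives a quick per-subject check of any ordering claim made about the approaches, rather than relying on the averages alone.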
other RTS phases and operations, e.g., computing and storing smart checksums. For the end-to-end time, we reported the time taken by STARTS (i.e., E2ESTARTS time in Table 7) instead of the STARTS-like tool.
CORTS had the shortest selection phase time across all the 12 subjects, followed by C2RTS, with STARTS-like taking the longest. For instance, in the JFreeChart project, CORTS took 2.235 s, C2RTS took 10.4 s, and class-level RTS took 38.09 s. By averaging the selection phase time across the 12 subjects, we found that the overall average selection phase time was 1.77 s for CORTS, 5.56 s for C2RTS, and 12.79 s for the STARTS-like tool. This is because the dependency graphs for CORTS and C2RTS are smaller compared to the static class-level dependency graph. The reported selection phase times suggest that CORTS and C2RTS scale better for larger graphs, requiring less time to construct and analyze dependency graphs for test case selection.
All three RTS approaches, i.e., CORTS, C2RTS, and STARTS, reduced the overall end-to-end testing time compared to the TESTAll baseline across all 12 subjects. For example, in the commons-pool project, which comes with long-running JUnit test cases, TESTAll took 31,183 s to run all test cases of the subject, which is the total time summed up over all the 100 revisions of this subject, while the overall end-to-end testing time was 26,125 s for CORTS, 26,026 s for C2RTS, and 24,691 s for STARTS. For all the subjects, STARTS had the shortest end-to-end time due to its highest reduction in the test suite size. C2RTS came second and CORTS took the longest time in every subject except OpenTripPlanner, where C2RTS (17,330.737 s) was marginally slower than CORTS (17,318.823 s). By averaging the end-to-end testing time across the 12 subjects, we found that the overall average time was 11,422.76 s for TESTAll, 10,264.17 s for CORTS, 8436.34 s for C2RTS, and 7641.56 s for STARTS. Although CORTS had the longest time, it still showed a reduction in the end-to-end testing time compared to the TESTAll strategy, indicating its practical value in regression testing. C2RTS achieved better results than CORTS in reducing the end-to-end time, showing that such a hybrid RTS technique can provide a balance between component- and class-level, where it achieves reduction in the regression testing time while still maintaining high safety and scalability, making it suitable for large JPMS-based programs where balance between performance and safety is critical.
4.8. Results discussion
Balancing metrics across RTS approaches. The evaluation results highlight a trade-off between key regression test selection (RTS) metrics: safety violation, precision violation, test suite reduction, and end-to-end testing time. Specifically, STARTS outperforms CORTS and C2RTS in terms of precision violation, test suite reduction, and test execution time, while CORTS and C2RTS achieve lower safety violation rates.
CORTS and C2RTS emphasize safety by employing a coarser granularity (component-level) in dependency analysis. This strategy ensures that fewer impacted test cases are missed, reducing safety violations. However, this comes at the expense of increased test suite size and higher precision violations, as non-impacted test cases may also be selected. On the other hand, STARTS operates at the class level, and thus achieves higher test suite reduction and precision, minimizing the selection of unnecessary test cases. However, this finer granularity can lead to higher safety violations compared to component-level RTS. This is because static class-level RTS may miss dynamic dependencies related to reflection and dynamic class loading. In contrast, CORTS and C2RTS can account for such dependencies as they are explicitly declared in the module-info.java files.
RTS execution time. The dependency graph construction and analysis time for CORTS and C2RTS is significantly shorter than for STARTS. This improvement is due to the smaller size of component-level dependency graphs. However, the time spent on dependency graph processing constitutes a minor fraction of the overall end-to-end testing time, which is dominated by test execution. Consequently, STARTS significantly outperformed CORTS and C2RTS in terms of the overall end-to-end testing time, as it obtained higher reduction in test suite size and smaller precision violations.
While dependency graph efficiency does not drastically impact total end-to-end testing time, it plays a crucial role in continuous integration (CI) environments by enabling faster feedback cycles for developers. Rapid test selection allows for immediate identification of impacted tests, reducing delays in iterative development workflows.
Scenarios for Class- versus Component-level RTS. The choice of RTS approach depends on the application's context and requirements. For example, class-level RTS is preferable in resource-constrained environments where test execution cost and time are critical, e.g., mobile app development pipelines. It is also preferable for applications with frequent but small changes where the likelihood of missing impacted test cases is minimal, such as utility libraries or microservices with isolated functionality. On the other hand, component-level RTS (e.g., CORTS and C2RTS) can be preferable in safety-critical domains where ensuring comprehensive test coverage outweighs reducing regression testing execution time, such as in component-based adaptive systems with fault tolerance mechanisms [35], aircraft [36], aerospace or other safety-critical systems [37,38]. Additionally, component-level RTS can be more appropriate for large-scale, monolithic enterprise systems with complex interdependencies across components, and we plan to investigate this direction in the future.
5. Threats to validity
External validity. External validity affects the generalizability of our results. One external validity threat is the use of 12 Java projects which might not be representative, so our results may not generalize. However, the selected subjects are widely used to evaluate RTS approaches [12,17–19,39], vary in size, application domain, and number
of test classes, which reduces this threat. Additionally, the results could differ for larger Enterprise Resource Planning (ERP) systems, but we anticipate that component-level RTS would be even more scalable and reusable in such cases compared to class-level RTS approaches. We plan to investigate this direction in the future.
Another threat to the external validity of our experimental results is the use of the OO2CB tool, which uses the ACDC tool to determine the components of OO Java projects. Therefore, OO2CB inherits the shortcomings of ACDC in determining components of OO Java projects. Using other component recovery tools, such as Architecture Recovery using Concerns (ARC) [31], Bunch [32], or Weighted Combined Algorithm (WCA) [40], could change the RTS results. To reduce this threat, we leveraged the best-performing component recovery tool, i.e., ACDC [33], as concluded in prior comparative studies of recovery techniques [30].
Internal validity. An internal factor that can affect the outcome is the possible errors in the implementations of CORTS and C2RTS. To mitigate this threat, we built our implementation on mature tools (i.e., JGraphT [41] and jdeps [27]) and tested it thoroughly.
Another threat to internal validity is the use of the OO2CB tool [22] to generate the (JPMS) component-based projects from the subjects according to the least-privilege security principle. The threat is related to the false positives caused by the static analysis of class dependencies. Static analysis results may overestimate the communications between components, which might lead to granting a component more privileges than it needs in the resulting CB app. Consequently, this could impact the results of our experiments, especially the precision and reduction in test suite size. However, OO2CB uses the BCEL tool [42], a library widely used in the industry to analyze Java apps, which mitigates this threat. Furthermore, OO2CB defines all port types in the component-based applications except the ones responsible for Java reflection and dynamic class loading techniques, i.e., the open and opens with ports. This limitation impacted the safety violation results yielded by CORTS and C2RTS. We expect that these results could be even better if CORTS and C2RTS are applied to component-based applications that utilize the opens with port to denote dependencies deriving from reflection and dynamic class loading among components. This area will be a focus of our future research endeavors.
Construct validity. In general, we could have used other metrics (e.g., test coverage and fault detection ability) to evaluate the effectiveness of CORTS and C2RTS. However, we used the most common metrics in the research literature: safety violation, precision violation, reduction in test suite size, and reduction in end-to-end test execution time. We also used the reduction in dependency graph size as a measure to evaluate the scalability of CORTS and C2RTS to large subjects.
Another threat to construct validity is that we chose Ekstazi as the ground truth against which to evaluate the static RTS techniques, e.g., we computed the safety and precision violations with respect to Ekstazi. Although Ekstazi is state-of-the-art and is recognized as a leading and accessible dynamic class-level RTS tool, it might not encompass the entirety of benchmarks for all RTS scenarios.
Conclusion validity. We only used 12 subjects to evaluate CORTS and C2RTS. The use of additional subjects could affect the conclusions of the evaluation. To reduce this threat, we used large real-world Java projects that have been used in other experiments for fair comparison.
6. Related work
RTS can reduce regression testing efforts and has been studied for over three decades [14,15]. Below we summarize the existing dynamic and static RTS approaches.
Dynamic RTS. Many graph-walk approaches address the problem of RTS. Rothermel and Harrold [2] propose a safe approach for RTS for procedural programs. The algorithm uses control-flow graphs (CFG) to represent each procedure in a program P and its modified version P′. Each node in a CFG represents a simple or conditional statement, and each edge represents the flow of control between statements. Entities affected by modifications are selected by traversing in parallel the CFGs of P and P′, and when the target entities of like-labeled CFG edges in P and P′ differ, the edge is added to the set of affected entities. Thereafter, Rothermel and Harrold extended the CFG-based algorithm for C++ using Inter-procedural Control-Flow Graphs (ICFG) [43]. Harrold et al. [44] further extended the CFG approach for Java software using the Java Inter-class Graph (JIG) to handle Java features and incomplete programs.
Vokolos and Frankl [45] consider RTS based on text differencing using the Unix diff command. The approach compares the original code version with the modified version to identify the modified statements, and selects test cases that exercise code blocks containing these statements.
To improve the efficiency of dynamic RTS, a number of techniques at coarser granularity (e.g., method- or class-level) rather than the finer granularity CFG level (e.g., statement-level) were proposed. Ren et al. [46] and Zhang et al. [47] applied change-impact analysis at the method level, based on call-graph techniques, to improve RTS. Recent RTS approaches [12,18,19] were proposed to make RTS more cost-effective in modern software systems by focusing on class-level RTS that (1) identifies changes at the class level and (2) computes dependencies from test cases to the classes under test. Additionally, these approaches consider a test class as a test case, and thus select test classes instead of test methods. Gligoric et al. [12] proposed Ekstazi, an approach that tracks dynamic dependencies of test cases at the class level and selects test cases that traverse modified classes. Ekstazi is a safe RTS approach, and its safety is based on the formally proven safety of the change-based RTS approach [48]. Zhang [19] proposed HyRTS, which is a dynamic and hybrid approach that supports analyzing the adapted classes at multiple granularity levels (i.e., method and class levels) to improve the precision and selection time. Running HyRTS using the class-level mode produces the same RTS results as Ekstazi [19]. Thus, HyRTS was not considered in our experimental evaluation.
While these dynamic RTS techniques can be safe, they require dynamic test coverage information which may be absent, costly to collect, or require prohibitive instrumentation (e.g., for non-deterministic or real-time code). On the other hand, our proposed approaches, CORTS and C2RTS, are static, but still can capture runtime information represented in the module descriptors (i.e., module-info.java file) such as dependencies related to dynamic class loading and reflection represented using the opens with directive.
Static RTS. Kung et al. [49,50], Hsia et al. [51], and White and Abdullah [25] proposed firewall-based approaches. The firewall contains the changed classes and their dependent classes, where the dependent classes are identified based on static analysis. Test cases that traverse classes in the firewall are selected. Jang et al. [52] apply firewall-based RTS at the method level to C++ software. They identify firewalls around all the methods affected by a change and select all the test cases exercising these methods for regression testing. Ryder and Tip [53] proposed a call-graph-based static change-impact analysis technique and evaluated only one call-graph analysis on 20 revisions of one project [54]. Skoglund and Runeson [48] proposed a change-based approach that only selects those test cases that exercise the changed classes. ChEOPSJ [55,56] is a static change-based approach that uses the FAMIX model to represent software entities including test cases and to build dependencies between them. These approaches use fine-grained information such as constructor calls and method invocation statements to build dependencies between software entities.
Legunsen et al. [17,18] proposed STARTS, which is a static RTS approach that is based on the idea of the class-level firewall. STARTS builds a dependency graph of program types based on compile-time information, and selects test cases that can reach changed types in the transitive closure of the dependency graph. Yu et al. [16] evaluated
method-level and class-level static RTS in continuous integration environments. Class-level RTS was determined to be more practical and time-saving than method-level RTS.
Gyori et al. [20] compared variants of dynamic and static class-level RTS with project-level RTS in the Maven Central open source ecosystem. An ecosystem may contain a large number of interconnected projects, where client projects transitively depend on library projects. Project-level RTS identifies changes at the project level and computes dependencies from test cases to projects. When a library changes, then all test cases in the library and all test cases in all the library's transitive clients are selected. Class-level RTS was found to be less costly than project-level RTS in terms of reduction in test suite size.
Shi et al. [8] focused on optimizing RTS in continuous integration (CI) environments. They compared module- and class-level RTS techniques in the Travis cloud-based CI environment, and developed a hybrid RTS technique, called GIBstazi, that combines aspects of the module- and class-level RTS techniques. Their work focuses on Maven modules (i.e., build-system modules) utilizing techniques like the Git Inferred Build (GIB) to optimize test selection based on module dependencies determined by the build system (i.e., focusing on build-time dependencies). While the work of Shi et al. [8] is more aligned with multi-module Java applications structured with the Maven build system, our approaches, CORTS and C2RTS, utilize JPMS modules that emphasize dependencies (including runtime dependencies specified using the opens with directive) and encapsulation according to the least privilege concept. Although we did not empirically compare the precision of JPMS-module level RTS to build-system module level RTS, we anticipate that the latter may be less precise in detecting affected tests due to the broader scope of build-time dependencies. On the other hand, JPMS-based RTS could potentially offer more precise and safer test selection due to the explicit module dependencies and encapsulation provided by JPMS. As future work, we plan to transform multi-module Maven-based Java applications into their JPMS-utilizing counterparts to further evaluate the efficacy of CORTS and C2RTS in such environments.
Overall, CORTS and C2RTS are similar to the described static RTS approaches in terms of applying the firewall impact analysis technique, but at the module level rather than the class and method levels. However, unlike the existing static RTS techniques, our proposed approaches can capture runtime information that is explicitly included in the module descriptor files.
7. Conclusions and future work
As software systems become increasingly complex and large, especially with the implementation of the Java Platform Module System (JPMS), traditional regression test selection (RTS) techniques at the method and class levels often face challenges in efficiency and resource management. This research was driven by the desire to refine RTS for Java applications modularized with JPMS. This research leverages component-level granularity and provides a substantial foundation for advancing RTS practices tailored to modern Java applications, presenting a strong case for the adoption of component-level analysis in professional and large-scale development environments.
We introduced two novel static component-based RTS approaches, CORTS and its variant C2RTS, tailored for component-based Java software systems modularized with JPMS. CORTS constructs a module-level dependency graph using architectural metadata from module descriptor files to determine the impact of changes and select relevant test cases. C2RTS extends this by incorporating class-level analysis for modified modules, offering a hybrid approach that balances granularity to improve precision while maintaining safety. Our evaluation of CORTS and C2RTS on real-world software systems demonstrated improvements, in terms of safety, over static class-level RTS paradigms. Additionally, both CORTS and C2RTS reduced the dependency graph size compared to static class-level RTS, thus providing evidence that component-level RTS can be more scalable for large-scale projects. Additionally, C2RTS demonstrated better precision than CORTS, thus balancing between safety and precision while still reducing the size of the static dependency graph compared to static class-level RTS. Both CORTS and C2RTS reduced the end-to-end testing time in comparison to running all test cases without performing RTS.
We plan to extend the application of CORTS and C2RTS to large-scale enterprise Java systems. Furthermore, it is critical to acknowledge that the component recovery tools used in our experimental evaluations, such as ACDC, do not possess the capability to detect dynamic dependencies, such as those involving reflection, and consequently, do not incorporate these dependencies into the module descriptor files. Moving forward, we plan to explore and experiment with alternative component recovery tools that can better capture dynamic dependencies and reflection. Additionally, we plan to explore the application of our RTS approaches in the context of Java 9 modular applications when treating the modularized Java Runtime Environment (JRE) and third-party libraries, along with their dependencies, as part of the CB application. This investigation aims to evaluate the impact of reduced runtime size, e.g., only including the required modules of the JRE and third-party libraries, on the RTS performance.
CRediT authorship contribution statement
Mohammed Al-Refai: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Mahmoud M. Hammad: Writing – review & editing, Visualization, Validation, Resources, Formal analysis.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
[1] A. Bertolino, Software testing research: Achievements, challenges, dreams, in: 2007 Future of Software Engineering, IEEE Computer Society, 2007, pp. 85–103.
[2] G. Rothermel, M.J. Harrold, A safe, efficient regression test selection technique, ACM Trans. Softw. Eng. Methodol. 6 (2) (1997) 173–210.
[3] M.J. Harrold, Testing evolving software, J. Syst. Softw. 47 (2–3) (1999) 173–181.
[4] H.K.N. Leung, L.J. White, Insights into regression testing, in: Proceedings of Conference on Software Maintenance, IEEE, Miami, FL, USA, 1989, pp. 60–69.
[5] P.K. Chittimalli, M.J. Harrold, Recomputing coverage information to assist regression testing, IEEE Trans. Softw. Eng. 35 (4) (2009) 452–469.
[6] E. Engström, P. Runeson, A qualitative survey of regression testing practices, in: International Conference on Product Focused Software Process Improvement, Springer, 2010, pp. 3–16.
[7] R. Greca, B. Miranda, A. Bertolino, State of practical applicability of regression testing research: A live systematic literature review, ACM Comput. Surv. 55 (13s) (2023) 1–36.
[8] A. Shi, P. Zhao, D. Marinov, Understanding and improving regression test selection in continuous integration, in: 2019 IEEE 30th International Symposium on Software Reliability Engineering, ISSRE, IEEE, 2019, pp. 228–238.
[9] W. Sun, X. Xue, Y. Lu, J. Zhao, M. Sun, Hashc: Making deep learning coverage testing finer and faster, J. Syst. Archit. 144 (2023) 102999.
[10] Y. Lu, K. Shao, J. Zhao, W. Sun, M. Sun, Mutation testing of unsupervised learning systems, J. Syst. Archit. 146 (2024) 103050.
[11] Testing at the speed and scale of Google, 2011, http://google-engtools.blogspot.com/2011/06/testing-at-speed-and-scale-of-google.html.
[12] M. Gligoric, L. Eloussi, D. Marinov, Practical regression test selection with dynamic file dependencies, in: Proceedings of the 2015 International Symposium on Software Testing and Analysis, ISSTA'15, ACM, Baltimore, MD, USA, 2015, pp. 211–222.
[13] L.C. Briand, Y. Labiche, S. He, Automating regression test selection based on UML designs, J. Inf. Softw. Technol. 51 (1) (2009) 16–30.
[14] E. Engström, P. Runeson, M. Skoglund, A systematic review on regression test [40] O. Maqbool, H. Babri, Hierarchical clustering for software architecture recovery,
selection techniques, Inf. Softw. Technol. 52 (1) (2010) 1430. IEEE Trans. Softw. Eng. 33 (11) (2007) 759780.
[15] S. Yoo, M. Harman, Regression testing minimization, selection and prioritization: [41] B. Naveh, J.V. Sichi, JGraphT a free Java graph library, 2011.
A survey, J. Softw. Test. Verif. Reliab. 22 (2) (2012) 67120. [42] BCEL documentation available at http://jakarta.apache.org/bcel/.
[16] T. Yu, T. Wang, A study of regression test selection in continuous integration en- [43] G. Rothermel, M.J. Harrold, J. Dedhia, Regression test selection for C++
vironments, in: S. Ghosh, R. Natella (Eds.), Proceedings of the 29th International software, Softw. Test. Verif. Reliab. 10 (2) (2000) 77109.
Symposium on Software Reliability Engineering, ISSRE18, IEEE, Memphis, TN, [44] M.J. Harrold, J.A. Jones, T. Li, D. Liang, A. Orso, M. Pennings, S. Sinha, S.A.
USA, 2018, pp. 135143. Spoon, A. Gujarathi, Regression test selection for Java software, in: J. Vlissides
[17] O. Legunsen, F. Hariri, A. Shi, Y. Lu, L. Zhang, D. Marinov, An extensive study (Ed.), Proceedings of the 16th Conference on Object-Oriented Programming,
of static regression test selection in modern software evolution, in: J. Cleland- Systems, Languages, and Applications, OOPSLA01, ACM, Tampa, FL, USA, 2001,
Huang, Z. Su (Eds.), Proceedings of the 2016 24th ACM SIGSOFT International pp. 312326.
Symposium on Foundations of Software Engineering, FSE16, ACM, Seattle, WA, [45] F. Vokolos, P.G. Frankl, Empirical evaluation of the textual differencing re-
USA, 2016, pp. 583594. gression testing technique, in: Proceedings of the International Conference on
[18] O. Legunsen, A. Shi, D. Marinov, STARTS: Static regression test selection, Software Maintenance, SM98, Bethesda, MD, USA, 1998, pp. 4453.
in: M. Di Penta, T.N. Nguyen (Eds.), Proceedings of the 32nd IEEE/ACM [46] X. Ren, F. Shah, F. Tip, B.G. Ryder, O. Chesley, Chianti: a tool for change
International Conference on Automated Software Engineering, ASE17, IEEE impact analysis of java programs, in: Proceedings of the 19th Annual ACM
Press, Urbana-Champaign, IL, USA, 2017, pp. 949954. SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and
[19] L. Zhang, Hybrid regression test selection, in: M. Chechik, M. Harman (Eds.), Applications, 2004, pp. 432448.
Proceedings of the 40th International Conference on Software Engineering, [47] L. Zhang, M. Kim, S. Khurshid, Faulttracer: a change impact and regression fault
ICSE18, IEEE, Gotheburg, Sweden, 2018, pp. 199209. analysis tool for evolving java programs, in: Proceedings of the ACM SIGSOFT
[20] A. Gyori, O. Legunsen, F. Hariri, D. Marinov, Evaluating regression test selection 20th International Symposium on the Foundations of Software Engineering, 2012,
opportunities in a very large open-source ecosystem, in: S. Ghosh, R. Natella pp. 14.
(Eds.), Proceedings of the 29th International Symposium on Software Reliability [48] M. Skoglund, P. Runeson, Improving class firewall regression test selection by
Engineering, ISSRE18, IEEE, Memphis, TN, USA, 2018, pp. 112122. removing the class firewall, Int. J. Softw. Eng. Knowl. Eng. 17 (3) (2007)
[21] JPMS. http://openjdk.java.net/projects/jigsaw/spec/. 359378.
[22] M.M. Hammad, I. Abueisa, S. Malek, Tool-assisted componentization of Java ap- [49] D.C. Kung, J. Gao, P. Hsia, J. Lin, Y. Toyoshima, Class firewall, test order, and
plications, in: 2022 IEEE 19th International Conference on Software Architecture, regression testing of object-oriented programs, J. Occup. Organ. Psychol. 8 (2)
ICSA, 2022, pp. 3646, http://dx.doi.org/10.1109/ICSA53651.2022.00012. (1995) 5165.
[23] OpenJDK: Jigsaw project. https://openjdk.java.net/projects/jigsaw/. [50] D.C. Kung, J. Gao, P. Hsia, Y. Toyoshima, C. Chen, On regression testing of
[24] R.N. Taylor, N. Medvidovic, E.M. Dashofy, Software architecture: foundations, object-oriented programs, J. Syst. Softw. 32 (1) (1996) 2140.
theory, and practice, Google Sch. Google Sch. Digit. Libr. Digit. Libr. (2009) [51] P. Hsia, X. Li, D.C.-H. Kung, C.-T. Hsu, L. Li, Y. Toyoshima, C. Chen, A technique
(2009). for the selective revalidation of OO software, J. Software: Evol. Process. 9 (4)
[25] L.J. White, K. Abdullah, A firewall approach for regression testing of object- (1997) 217233.
oriented software, in: Proceedings of the 10th International Software Quality [52] Y.K. Jang, M. Munro, Y.R. Kwon, An improved method of selecting regression
Week, QW97, San Francisco, CA, USA, 1997. tests for C++ programs, J. Softw. Maint. Evol. 13 (5) (2011) 331350.
[26] D. Michail, J. Kinable, B. Naveh, J.V. Sichi, JGraphT—A Java library for graph [53] B.G. Ryder, F. Tip, Change impact analysis for object-oriented programs, in:
data structures and algorithms, ACM Trans. Math. Software 46 (2) (2020). Proceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis
[27] jdeps: The Java class dependency analyzer. Available from Oracle: https://docs. for Software Tools and Engineering, 2001, pp. 4653.
oracle.com/javase/8/docs/technotes/tools/unix/jdeps.html. [54] X. Ren, F. Shah, F. Tip, B.G. Ryder, O. Chesley, J. Dolby, Chianti: A Prototype
[28] A. Shi, A. Gyori, M. Gligoric, A. Zaytsev, D. Marinov, Balancing trade-offs in
Change Impact Analysis Tool for Java, Tech. Rep., Rutgers University, 2003.
test-suite reduction, in: A. Orso, M.-A. Storey (Eds.), Proceedings of the 22nd
[55] Q.D. Soetens, S. Demeyer, A. Zaidman, Change-based test selection in the
International Symposium on Foundations of Software Engineering, FSE14, ACM,
presence of developer tests, in: A. Cleve, F. Ricca (Eds.), Proceedings of the 17th
Hong Kong, China, 2014, pp. 246256.
European Conference on Software Maintenance and Reengineering, CSMR13,
[29] N. Ghorbani, J. Garcia, S. Malek, Detection and repair of architectural inconsis-
IEEE, Genoa, Italy, 2013, pp. 101110.
tencies in Java, in: 2019 IEEE/ACM 41st International Conference on Software
[56] Q.D. Soetens, S. Demeyer, A. Zaidman, J. Pérez, Change-based test selection: An
Engineering, ICSE, 2019, pp. 560571, http://dx.doi.org/10.1109/ICSE.2019.
empirical evaluation, Empir. Softw. Eng. (2015) 143.
00067.
Dr. Mohammed Al-Refai is an Assistant Professor in the Computer Science Department within the Computer and Information Technology School at the Jordan University of Science and Technology (JUST). Al-Refai's research focuses on various areas within software engineering, including model-driven development, model-based testing, software architecture, software testing, regression test selection and prioritization, software security, and the integration of fuzzy logic and machine learning in software engineering applications. Al-Refai earned his Ph.D. in Computer Science from Colorado State University, Fort Collins, Colorado, under the supervision of Prof. Sudipto Ghosh. He also holds M.S. and B.S. in Computer Science from Jordan University of Science and Technology. Al-Refai is a member of the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE).

Dr. Mahmoud Hammad is an Associate Professor in the Software Engineering Department within the Computer and Information Technology School at the Jordan University of Science and Technology (JUST). He is also the director of the Center for E-Learning and Open Educational Resources. Hammad's research interests are in the field of software engineering, specifically in the area of software architecture, self-adaptive software systems, mobile computing, software analysis, software security, natural language processing and machine learning. Hammad received his Ph.D. in Software Engineering from the University of California, Irvine (UCI) under the supervision of Prof. Sam Malek. During his Ph.D., Hammad developed a self-protecting Android software system, an Android software system that can monitor itself and adapt (change) its behavior at runtime to keep the system secure and protected from Inter-Component Communication attacks at all times. Hammad received his M.S. in Software Engineering from George Mason University, VA, USA and B.S. in Computer Science from Yarmouk University, Jordan. Hammad is a member of the Association for Computing Machinery (ACM), ACM Special Interest Group on Software Engineering (SIGSOFT), and the Institute of Electrical and Electronics Engineers (IEEE). https://hammadmahmoud.github.io/