Scalable Semantic Web Service Discovery for Goal-driven Service-Oriented Architectures


March 17th, 2008
Faculty of Mathematics, Computer Science and Physics
Leopold-Franzens University Innsbruck, Austria


Univ.-Prof. Dr. Dieter Fensel
Director Semantic Technology Institute (STI Innsbruck)
Leopold-Franzens University Innsbruck, Austria


Prof. Dr. John Domingue
Deputy Director Knowledge Media Institute (KMI)
Open University, UK




The concept of Service-Oriented Architectures (SOA) is the latest design paradigm for IT systems. The idea is to use Web services, which provide programmatic access to computational facilities over the Internet, as the basic building blocks. The aim is to exploit the potential of the World Wide Web as an infrastructure for computation, to help reduce the development and maintenance costs of IT systems, and to tackle the integration problem within and between collaborating organizations.

The realization of sophisticated SOA technologies is a massive challenge. The initial Web service technology stack around WSDL, SOAP, and UDDI facilitates the technical provision and usage of Web services. However, it leaves the detection and usability analysis of suitable Web services for a particular client request to manual inspection. To overcome these deficiencies, the emerging concept of Semantic Web services (SWS) develops inference-based techniques for the automated discovery, composition, and execution of Web services on the basis of exhaustive semantic annotations. This also addresses the problem of semantic interoperability by using ontologies as the underlying data model and by applying reasoning techniques developed in the context of the Semantic Web.

One of the central operations in SOA environments is the detection of suitable Web services for solving a given request, commonly referred to as Web service discovery. This is usually performed as the first processing step, which finds potential candidates among the available Web services. Most techniques - in particular in the area of SWS - primarily consider functional aspects for the discovery task. The usability of the discovered Web services is then further inspected in subsequent processing steps that consider non-functional aspects such as quality of service or behavioral compatibility. Beyond automating the discovery task, recent approaches for advanced SWS technologies envision integrating automated Web service discovery engines as a heavily used software component. Examples of this are approaches for dynamic Web service composition or for semantically enabled business process management, wherein the actual Web services are detected dynamically at runtime in order to achieve higher flexibility and better maintainability.

From this application purpose, two central requirements arise for automated Web service discovery engines: (1) a high retrieval accuracy in order to perform the discovery task with an appropriate quality, and (2) a high computational performance in order to serve as an operationally reliable software component, in particular within the larger search spaces of available Web services that can be expected in real-world applications. The former can most adequately be achieved by semantic matchmaking techniques that work on sufficiently rich descriptions, which can achieve a better accuracy than other techniques. For the latter, it is necessary to reduce the time and the computational costs of the discovery task.

The present work addresses this challenge by developing a goal-based, semantically enabled Web service discovery technique along with a caching mechanism for enhancing the computational performance. In consequence, the thesis consists of three consecutive parts and aims at advancing the state-of-the-art in respective SWS technology developments:

  1. a refined goal model for Semantic Web services for facilitating problem-oriented Web service usage in SOA systems: the client merely specifies the objective to be achieved as a goal that abstracts from technical details, and the system automatically discovers, composes, and executes suitable Web services for solving this;
  2. a formally defined approach for semantically enabled Web service discovery that separates design time and runtime operations and ensures high retrieval performance by matchmaking of ontology-based functional descriptions of goals and Web services;
  3. a caching mechanism for Web service discovery that captures the relevant knowledge of design time discovery runs and effectively exploits this in order to enhance the computational performance of Web service discovery at runtime.

The goal model defines goals as formal descriptions of the objectives that clients want to achieve by using Web services, and specifies how these are used and processed within SWS environments. A central aspect is the distinction between goal templates, which are generic and reusable objective descriptions stored in the system, and goal instances, which describe concrete client requests and are defined by instantiating a goal template with concrete input values. This allows us to separate design time and runtime operations for the discovery task. The suitable Web services for goal templates are discovered at design time. The result is captured in a specialized knowledge structure which serves as the heart of the caching technique. At runtime, a client - either a human or a machine - formulates the concrete objective to be achieved in terms of a goal instance. As the time-critical and presumably most frequent operation in real-world SOA applications, the discovery of suitable Web services for goal instances at runtime is optimized by exploiting the captured knowledge.
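The separation of design time and runtime discovery described above can be illustrated with a minimal sketch. It is not the model defined in this work: the class names, the `relevant` and `accepts` predicates (simple stand-ins for semantic matchmaking), and the shipment services are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Inputs = Dict[str, object]


@dataclass
class WebService:
    name: str
    # Hypothetical stand-in for a semantic match against concrete inputs.
    accepts: Callable[[Inputs], bool]


@dataclass
class GoalTemplate:
    name: str
    input_names: List[str]
    # Filled at design time; acts as the cached discovery result.
    usable_services: List[WebService] = field(default_factory=list)

    def instantiate(self, **values: object) -> "GoalInstance":
        missing = set(self.input_names) - set(values)
        if missing:
            raise ValueError(f"missing inputs: {sorted(missing)}")
        return GoalInstance(self, values)


@dataclass
class GoalInstance:
    template: GoalTemplate
    values: Inputs


def discover_at_design_time(template, services, relevant):
    """Check every available service once and store the result on the template."""
    template.usable_services = [s for s in services if relevant(template, s)]


def discover_at_runtime(instance):
    """Inspect only the services cached for the instance's goal template."""
    cached = instance.template.usable_services
    return [s.name for s in cached if s.accepts(instance.values)]


# Assumed example data, not taken from the thesis.
ship = GoalTemplate("ship_package", ["destination", "weight_kg"])
services = [
    WebService("EuroShip", lambda v: v["destination"] in {"AT", "DE"}),
    WebService("HeavyFreight", lambda v: v["weight_kg"] >= 50),
    WebService("WeatherInfo", lambda v: True),
]
discover_at_design_time(ship, services, lambda t, s: s.name != "WeatherInfo")

goal = ship.instantiate(destination="AT", weight_kg=3)
result = discover_at_runtime(goal)
```

In this sketch the runtime step never looks at `WeatherInfo`, because the design time run has already excluded it for the template; only the two cached candidates are checked against the concrete input values.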

The Web service discovery techniques developed in this work perform semantic matchmaking of sufficiently rich functional descriptions. For this, we define functional descriptions that precisely describe the start and end states of the possible executions of Web services, and of the possible solutions of goals, in terms of preconditions and effects. These are specified on the basis of ontologies, and we define their formal semantics in a first-order logic framework. On this basis, we specify the necessary semantic matchmaking techniques for Web service discovery at both design time and runtime. We shall show that these achieve a higher precision and recall than most existing techniques for semantically enabled Web service discovery.

The caching mechanism is based on a directed acyclic graph that organizes the goal templates in a subsumption hierarchy with respect to the requested functionalities and captures the relevant knowledge on the usability of the available Web services from the design time discovery results. This provides a formally defined index for the efficient search of goals and Web services, and we specify the necessary algorithms for automatically generating and properly maintaining this graph whenever a goal template or a Web service is added, removed, or modified. The optimized discovery algorithms exploit this to enhance the computational performance by minimizing the relevant search space and the number of necessary matchmaking operations. This is a novel approach in the field of Semantic Web services that can achieve a better performance increase than existing optimization techniques and maintain a high retrieval accuracy for automated Web service discovery.
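To make the idea of a subsumption-based index more concrete, the following toy sketch shows how such a graph can reduce the number of matchmaking operations. It rests on strong simplifying assumptions that are not part of this work: requested and provided functionalities are modeled as plain sets, a Web service counts as usable when its set intersects that of the goal template, and the class `DiscoveryCache` and all example names are hypothetical. The pruning rule is that a service which is not usable for a subsuming template cannot be usable for a subsumed one, so it need not be checked again.

```python
from typing import Dict, FrozenSet, Iterable, Set


class DiscoveryCache:
    """Toy index: goal templates ordered by subsumption, with cached results."""

    def __init__(self, services: Dict[str, Iterable[str]]):
        self.services = {name: frozenset(s) for name, s in services.items()}
        self.templates: Dict[str, FrozenSet[str]] = {}
        # Edges point from a template to every template that subsumes it.
        # Strict set inclusion keeps the graph acyclic.
        self.parents: Dict[str, Set[str]] = {}
        self.usable: Dict[str, Set[str]] = {}
        self.match_calls = 0

    def _matches(self, goal: FrozenSet[str], service: str) -> bool:
        # Stand-in for one (expensive) semantic matchmaking operation.
        self.match_calls += 1
        return bool(goal & self.services[service])

    def add_template(self, name: str, solutions: Iterable[str]) -> Set[str]:
        goal = frozenset(solutions)
        parents = {t for t, s in self.templates.items() if goal < s}
        # Only services usable for every subsuming template remain candidates.
        candidates = set(self.services)
        for parent in parents:
            candidates &= self.usable[parent]
        # Existing templates that the new one subsumes gain it as a parent.
        for t, s in self.templates.items():
            if s < goal:
                self.parents[t].add(name)
        self.templates[name] = goal
        self.parents[name] = parents
        self.usable[name] = {w for w in sorted(candidates) if self._matches(goal, w)}
        return self.usable[name]


# Assumed example data, not taken from the thesis.
cache = DiscoveryCache({
    "EuroShip": {"AT", "DE"},
    "AsiaShip": {"JP", "CN"},
    "WorldShip": {"AT", "JP", "US"},
})
cache.add_template("ship_anywhere", {"AT", "DE", "JP", "CN", "US"})
cache.add_template("ship_europe", {"AT", "DE"})
cache.add_template("ship_austria", {"AT"})
```

With three templates and three services, a naive approach would need nine matchmaking operations. Here the third template inherits the knowledge that `AsiaShip` is unusable for `ship_europe`, so only eight are performed. The saving grows with the depth of the hierarchy and the number of services excluded near its top. Removal and modification of templates or services, which the graph described above must also handle, are omitted from this sketch.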

To evaluate the achievable performance increase, we compare our caching-enabled Web service discovery with other engines that apply no or only less elaborate optimization techniques. For the shipment scenario that has been defined in the Semantic Web Services Challenge - a widely recognized initiative for the demonstration and comparison of SWS techniques - our prototype implementation shows significant improvements in the efficiency, scalability, and stability of the discovery task. To also assess the practical relevance of the developed technology, we examine its applicability in one of the largest existing SOA systems, maintained by the US-based telecommunication provider Verizon, as well as in prominent application scenarios for Semantic Web services. This reveals that the goal-based approach and semantically enabled discovery techniques can greatly increase the quality of SOA technology, and that optimization techniques appear to be necessary in order to ensure the operational reliability of automated discovery engines in real-world applications, where larger numbers of available Web services can be expected.

The specifications throughout this work are deliberately kept on a generic level. We only formally define the central aspects, using classical first-order logic (FOL) as the specification language. The aim is to support the adaptation of the developed techniques to several SWS frameworks as well as to other SOA technologies that are not semantically enabled.