COLLABORATIVE REUSE OF SERVICES IN DESIGNING COMMUNICATION PROTOCOLS

Bahram Khalili
Fidelity Investments
400 Las Colinas Blvd. E.
Irving, TX 75039-5579
ABSTRACT
This paper outlines a new approach for designing data communication protocols. The underlying methodology is to design communication protocols as dynamically re-definable organizations with processing intelligence. Two fundamental system requirements provide the basis for the proposed methodology: Data Communication Bandwidth and Dynamic Infrastructure. Data Communication Bandwidth requires the protocol to provide two-way communication of potentially massive volumes of data located on multiple computing platforms (distributed) of possibly different types (heterogeneous) with near-real-time response. Dynamic Infrastructure requires the protocol to provide a means for dynamic customization within various distributed Client-Server environments in order to prevent protocol obsolescence. The outlined approach can be conceptually viewed as a set of dynamically generated objects that provide a framework for servicing client requests. It is modeled so that no one centralized component is responsible for all activities; instead, there is a collaborative reuse of a number of independent services. Collaborative reuse means that, whenever possible, new services are built from the elements of existing services or objects. Collaborative reuse yields a logically consistent system that is dynamically re-definable and can adapt, with minimum effort, to various Client-Server environments.
1. INTRODUCTION
In order to establish the necessary baseline and motivation for the proposed approach, it is worthwhile to present a brief review of an existing data communication protocol, Remote Procedure Call (Yen-Min, 1994), as a basis of comparison. Remote Procedure Call (RPC) is a pre-defined procedure-invocation mechanism within a Client-Server distributed system. The operation is an extension of the standard procedure call mechanism (Gray and Reuter, 1993) that is provided in most high-level programming languages. In the standard procedure call mechanism a procedure is implemented in one process and may be called by other processes that share the same memory address space. The restriction of sharing the same address space is eliminated in the remote procedure call mechanism. In spite of many restrictions (Khalili, 1995), RPC is widely used in many Client-Server distributed systems due to its conceptual and practical simplicity. RPC is a message-based protocol that provides two-way synchronous communication between client processes and servers. Figure 1 presents the basic components of an RPC block.
The word “static” best characterizes the behavior of the RPC protocol. The RPC block receives a request from the client process and communicates the request to the server while forcing the caller into a wait-state (synchronous mode). The server returns the results to the client after processing. In other words, RPC is an intermittent state that is entered at time t0 and terminated at a later time t1, with limited processing intelligence. The fundamental problem with the RPC protocol (Kim and Purtilo, 1995) is that it was initially designed for interprocess communication within centralized computing systems and was later extended to serve in distributed environments. Distributed systems involve multiple heterogeneous platforms (Finke et al., 1993) with non-deterministic behavior, and therefore require new protocol design approaches that address distributed needs.
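To make the blocking behavior concrete, the following is a minimal illustrative sketch of the synchronous RPC pattern described above. The names (RpcBlock, call_remote, fake_transport) and the in-process transport are assumptions made for exposition, not part of any particular RPC implementation.

```python
# Minimal sketch of the synchronous RPC pattern; all names are illustrative.
import time


def fake_transport(procedure: str, args: tuple):
    """Stand-in for the network round trip to the server process."""
    time.sleep(0.1)                        # simulated network + server latency
    if procedure == "add":
        return sum(args)
    raise ValueError(f"unknown procedure: {procedure}")


class RpcBlock:
    """Synchronous RPC: the caller is blocked until the server replies."""

    def call_remote(self, procedure: str, *args):
        request = (procedure, args)        # marshal the call into a message
        result = fake_transport(*request)  # caller waits here (wait-state)
        return result                      # unmarshalled server response


if __name__ == "__main__":
    client = RpcBlock()
    print(client.call_remote("add", 2, 3))   # prints 5 after the round trip
```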
This paper outlines a new approach for designing data communication protocols within distributed systems. The model, similar to the organization of a computer, consists of a logical memory unit for caching remote data locally, a synchronization unit to maximize execution concurrency (Ravindran and Thenmozhi, 1993), and a rule-based control unit that utilizes a Knowledge Base Subsystem (Eick and Raupp, 1991) and serves as the intelligence center of the model. The proposed architecture is constructed through an object model which inherits the basic functionality of the RPC protocol. The provided services can be classified into four conceptual units, each comprising one or more logical subsystems that perform a designated function, as follows (a skeletal sketch of these units appears after the list):
a) Synchronization Unit
The Synchronization unit maximizes processing concurrency (Raynal et al., 1992) by providing an asynchronous communication mode while supporting the existing synchronous mode.
b) Storage Unit
The Storage unit provides a caching capability to locally store all server data upon request. This minimizes the need for continuous remote communication, which in turn maximizes system throughput.
c) Control Unit
The Control unit is a rule-based (Croker and Dhar, 1993) component that serves as the intelligence center of the model. This unit is in charge of distributed communication with servers through dedicated network media, interfacing with client processes, and communicating with the Synchronization, Storage, and Knowledge Base subsystems. It also provides functionality to monitor the control flow, measure performance, maintain messaging order, select data resolution levels, evaluate bottlenecks, establish priority of execution, and detect and recover from faulty situations.
d) Knowledge Base
The Knowledge Base is a repository of rules that the Client may access directly to define and modify rules; requested activities are then validated against these established rules. An abstract view of the model is presented in Figure 2.
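The following is a hedged, skeletal sketch of how these four units might collaborate. All class and method names (SynchronizationUnit, StorageUnit, KnowledgeBase, ControlUnit, service) are illustrative assumptions that mirror the roles described above, not a prescribed interface.

```python
# Skeletal sketch of the four conceptual units; names are illustrative only.
class SynchronizationUnit:
    def run(self, work, mode="synchronous"):
        # A full design would dispatch to synchronous/asynchronous/pseudo-synchronous paths.
        return work()


class StorageUnit:
    def __init__(self):
        self._cache = {}                     # logical cache of server responses

    def put(self, key, value):
        self._cache[key] = value

    def get(self, key):
        return self._cache.get(key)


class KnowledgeBase:
    def __init__(self, rules=None):
        self._rules = list(rules or [])      # predicates over requested activities

    def validate(self, request):
        return all(rule(request) for rule in self._rules)


class ControlUnit:
    """Rule-based coordinator: validates against the Knowledge Base, reuses the
    cache where possible, and runs the remote call via the synchronization unit."""

    def __init__(self, sync, storage, kb, remote_call):
        self.sync, self.storage, self.kb = sync, storage, kb
        self.remote_call = remote_call

    def service(self, request):
        if not self.kb.validate(request):
            raise PermissionError("request rejected by Knowledge Base rules")
        cached = self.storage.get(request)
        if cached is not None:
            return cached                    # no remote round trip needed
        result = self.sync.run(lambda: self.remote_call(request))
        self.storage.put(request, result)
        return result


control = ControlUnit(SynchronizationUnit(), StorageUnit(), KnowledgeBase(),
                      remote_call=lambda req: f"response:{req}")
print(control.service("quote/IBM"))          # first call goes "remote", then is cached
print(control.service("quote/IBM"))          # second call is served from the cache
```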
The remainder of this paper is organized in three sections: Section 2 presents a conceptual overview of the approach. Section 3 outlines the fundamental system requirements and necessary design principles. Section 4 presents an abstract view of the approach from a subsystem perspective.
2. CONCEPTUAL VIEW
The proposed architecture can be conceptually viewed as a set of dynamically generated objects (Kifer et al., 1995) which provide a framework for servicing client requests. The architecture is modeled using the object-oriented paradigm (Meyer, 1993), where no one centralized component is responsible for all activities. Instead, there is a collaboration of a number of independent components. At the center of the model is the Core object, which is created as soon as a request is initiated by the client. The Core object, in turn, creates other objects which provide specific services such as accessing remote servers or interfacing with the Knowledge Base in order to validate certain activities or features. These objects are automatically destroyed after they have performed their required services as they pertain to the current request. The Core object itself is destroyed after the client request is completely serviced and is no longer pending. Unless otherwise specified by the client, the results of each request are automatically stored in a logical cache for potential reuse in subsequent requests.
The Storage unit is responsible for maintaining the logical cache, and it provides services such as storing data in or removing data from the cache (Afek et al., 1989), replacing the contents of the cache using dedicated replacement algorithms, and automatic backup services to prevent loss of data. The Core object also facilitates the validation of the requested activities by providing an interface to the Knowledge Base, where policy rules are stored.
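The per-request lifecycle of the Core object might look roughly like the following sketch. Core, RemoteAccess, and service_request are hypothetical names introduced only to illustrate creation, reuse of cached results, and destruction.

```python
# Illustrative sketch of the Core object's per-request lifecycle; names are hypothetical.
class RemoteAccess:
    """Short-lived helper object created by the Core to reach a remote server."""

    def fetch(self, request):
        return f"server-data-for:{request}"  # placeholder for a real server response


class Core:
    """Exists only while a single client request is pending."""

    def __init__(self, request, cache):
        self.request, self.cache = request, cache

    def service(self):
        if self.request in self.cache:       # reuse result of an earlier request
            return self.cache[self.request]
        helper = RemoteAccess()              # created solely for this request
        result = helper.fetch(self.request)
        del helper                           # helper destroyed once its service is done
        self.cache[self.request] = result    # stored in the logical cache for later reuse
        return result


def service_request(request, cache):
    core = Core(request, cache)              # Core created when the request is initiated
    try:
        return core.service()
    finally:
        del core                             # Core destroyed once the request is no longer pending


cache = {}
print(service_request("quote/IBM", cache))   # remote fetch
print(service_request("quote/IBM", cache))   # served from the logical cache
```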
The Knowledge Base is a repository of rules that constrain (Croker and Dhar, 1993) the various activities initiated through client requests and carried out by the Core object. The rules are established either by the clients using a Policy Map or internally by the Core object using a Knowledge Base Map. Each rule is categorically defined as either static or dynamic. The static rules control the internal operation of the object and are not accessible externally. These rules enforce priority orders among internal components and prevent any resource deadlocks that might arise during processing. The dynamic rules allow clients to continuously redefine the behavior of the protocol to best fit their data communication requirements. The ability to dynamically redefine the protocol's behavior considerably enhances the flexibility of data communication.
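As an illustration of the static/dynamic split, the sketch below keeps the two rule sets separate and exposes only the dynamic ones to clients. The RuleRepository class and its add_dynamic_rule and validate methods are assumptions chosen for exposition.

```python
# Illustration of the static/dynamic rule split; names are hypothetical.
class RuleRepository:
    def __init__(self, static_rules):
        self._static = list(static_rules)    # internal rules, not exposed to clients
        self._dynamic = []                   # client-redefinable rules (Policy Map path)

    def add_dynamic_rule(self, rule):
        self._dynamic.append(rule)           # client redefines protocol behavior at run time

    def validate(self, activity):
        return all(rule(activity) for rule in self._static + self._dynamic)


kb = RuleRepository(static_rules=[lambda a: a.get("priority", 0) >= 0])
kb.add_dynamic_rule(lambda a: a.get("size_mb", 0) <= 64)      # client-imposed transfer limit
print(kb.validate({"priority": 1, "size_mb": 10}))            # True
print(kb.validate({"priority": 1, "size_mb": 128}))           # False: dynamic rule rejects it
```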
The Core object allows client processes to specify the preferred synchronization mode for carrying out their requests through the synchronization unit. The synchronization unit supports three modes of communication: Synchronous, Asynchronous (Tsitsiklis and Stamoulis, 1995), and Pseudo-synchronous (Comer and Stevens, 1991). In Synchronous mode the client is entirely blocked until all response data has been returned from the server. In Asynchronous mode full parallelism is achieved between the client and, ultimately, the server. The Pseudo-synchronous mode is a hybrid of these two: a transaction is carried out asynchronously, but the client process must check the progress of the request manually (polling). A sketch of the three modes follows the component list below. The Core object also invokes a number of functional components to perform a set of specific functions. Each component is responsible for one of the following activities:
· Synchronization Access - provides internal access to the synchronization unit and maximizes communication concurrency,
· Priority Scheduling - establishes and enforces scheduling priorities,
· Communication Flow Control - controls the distributed flow of information and provides Client-Server load balancing (Lin and Raghavendra, 1991). It also measures the performance of various components in order to discover potential system bottlenecks (Chandy et al., 1983),
· Error Detection and Recovery - selects the optimum distributed routes (Kim and Purtilo, 1995) and detects and recovers from system failures,
· Dynamic Data Resolution - dynamically changes the resolution of data,
· Data Management - manages all incoming and outgoing data. Also establishes connection with the storage unit for local caching of data, and
· Message Management - communicates system messages to client processes and establishes connection with the storage unit for local caching of messages.
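Using standard thread-pool futures, the three synchronization modes might be sketched as follows. The helper names (submit_sync, submit_async, submit_polling) and the simulated remote_call are assumptions made for illustration and are not part of the proposed protocol's interface.

```python
# Sketch of the three synchronization modes using a thread pool; names are illustrative.
import time
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)


def remote_call(request):
    time.sleep(0.2)                          # simulated server work
    return f"response:{request}"


def submit_sync(request):
    """Synchronous: the caller blocks until the response is returned."""
    return pool.submit(remote_call, request).result()


def submit_async(request, on_done):
    """Asynchronous: the caller continues; a callback fires when results arrive."""
    future = pool.submit(remote_call, request)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


def submit_polling(request):
    """Pseudo-synchronous: work proceeds in the background, but the caller
    must poll the returned handle for completion."""
    return pool.submit(remote_call, request)


if __name__ == "__main__":
    print(submit_sync("A"))                  # blocks for about 0.2 s
    submit_async("B", on_done=print)         # prints when B completes
    handle = submit_polling("C")
    while not handle.done():                 # manual progress check (polling)
        time.sleep(0.05)
    print(handle.result())
    pool.shutdown(wait=True)
```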
3. DESIGN PRINCIPLES AND REQUIREMENTS
The system requirements for designing a dynamic communication protocol fall into three major categories:
· Data Communication Bandwidth - allow for communication of potentially massive volumes of data located on multiple computing platforms (distributed) of different types (heterogeneous) with near-real-time response,
· Dynamic Infrastructure - allow for dynamic customization within various distributed Client-Server environments, thereby avoiding the need for continuous replacement of the data communication protocol as system demands increase or technology changes (prevent protocol obsolescence), and
· Heterogeneous Distributed Systems - support heterogeneous systems where all software, network, or host processors may be of different types.
The technical path adopted here for designing a data communication protocol which meets the above requirements is to rely on an object-oriented architecture. That is, while constructing the model, every object must be capable of both managing and manipulating its state and its behavior (Meyer, 1993). While the responsibility for doing so ultimately rests with each object itself, it is usually convenient, if not essential, to implement a set of services which supplies all objects with a set of core capabilities designed to automatically manage many common aspects of state and behavior.
As with traditional systems (Gray and Reuter, 1993), the architecture of an object-based system (Kifer et al., 1995) should not be confused with its implementation. Here, the term infrastructure denotes the implementation of the architecture. The design principles necessary to achieve the fundamental requirements of the system are outlined below. To better distinguish the architecture from the infrastructure of the design, these principles have been divided into architecture (object) principles and infrastructure (implementation) principles.
Object Principles:
· Objects encapsulate state (characteristics) and behavior (capabilities),
· Object state is always private to each object’s implementation. This provision ensures that the object interface is separated from its implementation,
· Object behavior is the only means available for manipulating an object,
· The architecture supports both maintenance of state and invocation of behaviors, and
· Capabilities and services are expected to reuse each other extensively.
Infrastructure Principles:
· Access to an object is achieved through the invocation of a request,
· Access to an object can be shared by more than one client process at a time,
· Clients of an object need not reside in the same process as the object’s implementation,
· The published state of an object comprises more than just a value; other properties describe whether its current value is NULL (not set), or whether other classifications and constraints exist on its current value (based on dynamic rules stored in the Knowledge Base),
· System behavior is asynchronous (Tsitsiklis and Stamoulis, 1995); that is, objects can be affected by more than one process. Consequently, clients interested in changes occurring in other processes must register interest in those changes in order to be notified when the changes occur (a minimal sketch of this registration principle follows the list).
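The register-interest principle can be sketched as follows, assuming a hypothetical ObservableState class; a real infrastructure would deliver notifications across process boundaries rather than through in-process callbacks.

```python
# Minimal sketch of registering interest in state changes; names are hypothetical.
class ObservableState:
    """Holds a value and notifies registered clients when it changes."""

    def __init__(self, value=None):
        self._value = value
        self._watchers = []                  # callbacks registered by interested clients

    def register_interest(self, callback):
        self._watchers.append(callback)

    def set(self, value):
        self._value = value
        for notify in self._watchers:        # notify every registered client
            notify(value)


state = ObservableState()
state.register_interest(lambda v: print("client A sees new value:", v))
state.set(42)                                # the change triggers the notification
```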
The above outlines the fundamental requirements and design principles considered while constructing the model. The next section presents the model from a subsystem perspective. The subsystem perspective defines the independent components and their interrelations (mappings) that construct the model.
4. SUBSYSTEM PERSPECTIVE
The proposed model may be viewed as a set of related subsystems, each of which logically aggregates a set of closely collaborating objects. During execution, the system is interested in maintaining a set of active mappings among the various subsystems. Each mapping connects entities in one subsystem with collaborating entities in another. The subsystems and their respective mapping mechanisms are presented in Figure 3.
The diagram presented in Figure 3 illustrates how each of the subsystems communicates with the Control Subsystem, the intelligence center of the model, via a mapping mechanism. The mapping mechanisms do not follow a general form but are created according to the unique requirements of each subsystem. For instance, the Knowledge Base Subsystem uses a mapping mechanism, namely the Knowledge Base Map, which allows the Control Subsystem to validate various run-time activities against a set of rules that relate to these activities. While these subsystems all use mapping mechanisms, the form and implementation of the mechanisms have little in common.
The general scheme of subsystems and mapping mechanisms allows virtually any common facility to be integrated with (or removed from) the object model (Kifer et al., 1995) with minimum disruption to other subsystems.
4.1 Subsystems
A subsystem is any set of related objects responsible for implementing some readily identifiable domain or functionality. The central subsystem is the Control Subsystem. It is responsible for communicating with other subsystems, including the client processes, via their provided mappings. Other subsystems are:
a) Knowledge Base Subsystem
The Knowledge Base Subsystem is a repository of static and dynamic rules (Eick and Raupp, 1991) utilized by the client for customizing the behavior of the protocol. The Knowledge Base Subsystem may be accessed directly by the client via the Policy Map or by the Control Subsystem via the Knowledge Base Map.
b) External Subsystem
The External Subsystem is responsible for accessing remote servers through various network communication protocols such as TCP/IP (Gray and Reuter, 1993). This is a Request/Response-based mechanism (Comer and Stevens, 1991) that provides specific packaging instructions for Client requests and Server responses in dedicated data Views (Request and Response Views).
c) Synchronization Subsystem
The Synchronization Subsystem facilitates three modes of operation for client requests (transactions): Synchronous, Asynchronous, and Pseudo-synchronous.
d) Storage Subsystem
The Storage Subsystem provides logical and physical operations for storing all server response data and messages throughout the system.
e) Persistence Subsystem
The Persistence Subsystem (Fu and Dasgupta, 1993) utilizes a logical caching mechanism controlled by the Storage Subsystem for storing remote data locally even after the client transaction is terminated. This will minimize the need for data communication on subsequent remote server requests.
4.2 Mapping Mechanisms
Each subsystem implements a mapping mechanism to map the name-space of its service to the name-space of the Core objects. The mapping mechanisms are listed below; a sketch of one such mapping, the Storage Map, follows the list.
· Policy Map - provides a mechanism for clients to directly modify (add, delete, or change) dynamic rules in the client knowledge bases,
· Knowledge Base Map - allows access to Knowledge Base rules for run time verification,
· External Map - creates Input/Output data views for client requests through the Remote Data Access object. Additionally, it performs data type conversion if requested by the client processes at the invocation time,
· Client Interface Map - provides a two-way communication between client and Core object,
· Synchronization Map - provides a mechanism for the Core object to establish transaction concurrency, as instructed by the client,
· Storage Map - provides a mechanism for manipulating the storage area (load, save, refresh, and clear operations), and
· Persistence Map - allows enabling, disabling, and manipulating the cache operations.
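As an example, the Storage Map might expose the load, save, refresh, and clear operations roughly as sketched below. The StorageMap class name and the refresh_source parameter are hypothetical and merely stand in for the Storage Subsystem's actual interface.

```python
# Hedged sketch of one mapping mechanism, the Storage Map; names are hypothetical.
class StorageMap:
    """Maps names used by the Core object onto the storage area managed by the
    Storage Subsystem, exposing load, save, refresh, and clear operations."""

    def __init__(self, refresh_source):
        self._area = {}                      # logical storage area
        self._refresh_source = refresh_source

    def save(self, name, data):
        self._area[name] = data

    def load(self, name):
        return self._area.get(name)

    def refresh(self, name):
        self._area[name] = self._refresh_source(name)   # re-fetch from the server side
        return self._area[name]

    def clear(self):
        self._area.clear()


storage = StorageMap(refresh_source=lambda name: f"fresh:{name}")
storage.save("quote/IBM", "stale:quote/IBM")
print(storage.load("quote/IBM"))             # value currently in the storage area
print(storage.refresh("quote/IBM"))          # value re-fetched via the refresh source
storage.clear()
```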
The subsystem perspective outlined in this section categorized services as logical subsystems with an abstract interrelation or mapping among them.
5. CONCLUSION
Existing interprocess communication protocols impose numerous restrictions when operating within distributed computing environments. The fundamental problem common to these protocols is that they were initially designed to operate within centralized systems and were later extended to address the data communication requirements of distributed environments. This paper outlined an object-based model for designing data communication protocols. The underlying methodology is to design communication protocols as dynamically re-definable organizations with processing intelligence. The model, similar to the organization of a computer, consists of a logical memory unit for caching remote data locally, a synchronization unit to maximize execution concurrency, and a rule-based control unit that utilizes a Knowledge Base subsystem and serves as the intelligence center of the model. The concepts presented in this paper are intended to point toward a new approach for designing such protocols; the actual design and implementation of these concepts require extensive future work.
REFERENCES
Gray, J., Reuter, A., 1993, “Transaction Processing: Concepts and Techniques”, Morgan Kaufmann Publishers Inc., San Mateo.
Meyer, B., 1993, “Systematic Concurrent Object-Oriented Programming”, Communications of the ACM, Vol. 36, No. 9, pp. 56-81.
Yen-Min, H., 1994, “Designing an Agent Synthesis System for Cross-RPC Communication”, IEEE Transactions on Software Engineering, Vol. 20, No. 3, pp. 188-199.
Ravindran, K., Thenmozhi, A., 1993, “Extraction of Logical Concurrency in Distributed Applications”, IEEE 13th International Conference on Distributed Computing Systems, pp. 66-73.
Finke, S., Jahn, P., Langmack, O., Lohr, K., Piens, I., Wolf, T. H., 1993, “Distribution and Inheritance in the HERON Approach to Heterogeneous Computing”, IEEE 13th International Conference on Distributed Computing Systems, pp. 399-408.
Raynal, M., Mizuno, M., Neilsen, M., 1992, “Synchronization and Concurrency Measures for Distributed Computations”, IEEE 12th Int. Conference on Distributed Computing Systems, pp. 700-707.
Fu, M., Dasgupta, P., 1993, “A Concurrent Programming Environment For Memory-Mapped Persistent Objects”, the 17th International Computer Software and Application Conference, COMPASS.
Chandy, K. M., Misra, J., Hass, L. M., 1983, “Distributed Deadlock Detection”, ACM Trans. Computer Syst., Vol. 1, pp. 144-156.
Afek, Y., Brown, G., Merritt, M., 1989, “Lazy Caching”, ACM Trans. on Programming Languages and Systems, Vol. 15, No. 1, pp. 182-205.
Kim, T-H., Purtilo, J. M., 1995, “Configuration-Level Optimization of RPC-Based Distributed Programs”, IEEE 15th International Conference on Distributed Computing Systems, pp. 307-315.
Tsitsiklis, J. N., Stamoulis, G. D., 1995, “On the Average Communication Complexity of Asynchronous Distributed Algorithms”, Journal of the ACM, Vol. 42, No. 2, pp. 382-400.
Kifer, M., Lausen, G., Wu, J., 1995, “Logical Foundation of Object-Oriented and Frame-Based Languages”, Journal of the ACM, Vol. 42, No. 4, pp. 741-843.
Comer, D.E., Stevens, D. L., 1991, “Internetworking With TCP/IP”, Prentice Hall Inc., Englewood Cliffs, New Jersey.
Eick, C. F., Raupp, T., 1991, “Towards a Formal Semantics and Inference Rules for Conceptual Data Models”, Data Knowledge Engineering, Vol. 6, pp. 297-311.
Croker, A., Dhar, V., 1993, “A Knowledge Representation for Constraint Satisfaction Problems”, IEEE Transactions on Knowledge and Data Engineering, Vol. 5, No. 5, pp. 740-752.
Lin, H., Raghavendra, C., 1991, “A Dynamic Load Balancing Policy With a Central Job Dispatcher (LBC)”, IEEE 11th International Conference on Distributed Computing Systems, pp. 264-271.
Khalili, B., 1995, “Issues in using RPCs for Distributed Communication”, First World Conference on Integrated Design and Process Technology, IDPT-Vol. 1, Austin, Texas, pp. 369-374.
FIGURES
Fig. 1 RPC block diagram
Fig. 2 Abstract view of the proposed model
Fig. 3 Subsystem perspective