[Java] Java paper translation, please help
Java paper translation, please help everyone. It's for my final report... Thanks!!
CCJ: object-based message passing and collective communication in Java

CCJ is a communication library that adds MPI-like message passing and collective operations to
Java. Rather than trying to adhere to the precise MPI syntax, CCJ aims at a clean integration of
communication into Java's object-oriented framework. For example, CCJ uses thread groups to support Java's multithreading model and it allows any data structure (not just arrays) to be communicated. CCJ is implemented entirely in Java, on top of RMI, so it can be used with any Java virtual machine. The paper discusses three parallel Java applications that use collective communication. It compares the performance (on top of a Myrinet cluster) of CCJ, RMI and mpiJava versions of these applications and also compares their code complexity. A detailed performance comparison between CCJ and mpiJava is given using the Java Grande Forum MPJ benchmark suite. The results show that neither CCJ's object-oriented design nor its implementation on top of RMI impose a performance penalty on applications compared to their mpiJava counterparts. The source of CCJ is available from our Web site. Copyright © 2003 John Wiley & Sons, Ltd.

KEY WORDS: parallel programming; collective communication
1. INTRODUCTION
Recent improvements in compilers and communication mechanisms make Java a viable platform for
high-performance computing. Java's support for multithreading and remote method invocation
(RMI) is a suitable basis for writing parallel programs. RMI uses a familiar abstraction (object
invocation), integrated in a clean way in Java’s object-oriented programming model. For example,
almost any data structure can be passed as an argument or return value in an RMI. Also, RMI can be
implemented efficiently and it can be extended seamlessly with support for object replication.

A disadvantage of RMI, however, is that it only supports communication between two parties,
a client and a server. Experience with other parallel languages has shown that many applications
also require communication between multiple processes. The MPI message-passing standard defines
collective communication operations for this purpose. Several projects have proposed to extend
Java with MPI-like collective operations. For example, MPJ proposes MPI language bindings
to Java, but it does not integrate MPI's notions of processes and messages into Java's object-oriented framework. Unlike RMI, the MPI primitives are biased towards array-based data structures, so collective operations that exchange other data structures are often awkward to implement. Some existing Java systems already support MPI's collective operations, but they invoke a C library from Java using the Java native interface (JNI), which has a large runtime overhead.
In this paper we present the CCJ (Collective Communication in Java) library which adds the core
of MPI’s message passing and collective communication operations to Java’s object model. CCJ
maintains thread groups, the members of which can communicate by exchanging arbitrary object data structures. For example, if one thread needs to distribute a list data structure among other threads, it can invoke an MPI-like scatter primitive to do so. CCJ is implemented entirely in Java, on top of RMI. It therefore does not suffer from JNI overhead and it can be used with any Java virtual machine. We study CCJ's performance on top of a fast RMI system (Manta) that runs over a Myrinet network. Performance measurements for CCJ's collective operations show that its runtime overhead is almost negligible compared to the time spent in the underlying (efficient) RMI mechanism. We also discuss CCJ applications and their performance. CCJ's support for arbitrary data structures is useful, for example, in implementing sparse matrices. We also compare CCJ's performance to mpiJava in detail using the Java Grande Forum MPJ benchmark suite.
The rest of the paper is structured as follows. In Sections 2 and 3, we present CCJ’s design and
implementation, respectively. In Section 4, we discuss code complexity and performance of three
application programs using CCJ, mpiJava and plain RMI. In Section 5, we present the results from
the Java Grande Forum benchmarks. Section 6 presents related work and Section 7 concludes.

2. OBJECT-BASED MESSAGE PASSING AND COLLECTIVE COMMUNICATION
With Java’s multithreading support, individual threads can be coordinated to operate under mutual
exclusion. However, with collective communication, groups of threads cooperate to perform a given
operation collectively. This form of cooperation, instead of mere concurrency, is used frequently in
parallel applications and enables efficient implementation of the collective operations.
In this section, we present and discuss the approach taken in our CCJ library to integrate message
passing and collective communication, as inspired by the MPI standard, into Java's object-based model.
CCJ integrates MPI-like operations in a clean way in Java, but without trying to be compatible with the precise MPI syntax. CCJ translates MPI processes into active objects (threads) and thus preserves MPI's implicit group synchronization properties. In a previous work, we discussed the alternative approach of using groups of passive objects.

2.1. Thread groups
With the MPI standard, processes perform point-to-point and collective communication within the
context of a communicator object. The communicator defines the group of participating processes
which are ordered by their rank. Each process can retrieve its rank and the size of the process
group from the communicator object. MPI communicators cannot be changed at runtime, but new
communicators can be derived from existing ones.
In MPI, immutable process groups (enforced via immutable communicator objects) are vital for
defining sound semantics of collective operations. For example, a barrier operation performed on an
immutable group clearly defines which processes are synchronized; for a broadcast operation, the set of receivers can be clearly identified. The ranking of processes is also necessary to define operations like scatter/gather data redistributions, where the data sent or received by each individual process is determined by its rank. Unlike MPI, the PVM message-passing system allows mutable process groups, trading clear semantics for flexibility.
The MPI process group model, however, does not easily map onto Java’s multithreading model.
The units of execution in Java are dynamically created threads rather than heavy-weight processes.
Also, the RMI mechanism blurs the boundaries between individual Java virtual machines (JVMs).
Having more than one thread per JVM participating in collective communication can be useful, for
example, for application structuring or for exploiting multiple CPUs of a shared-memory machine.
Although the MPI standard requires implementations to be thread-safe, dynamically created threads
cannot be addressed by MPI messages, excluding their proper use in collective communication.
CCJ maps MPI’s immutable process groups onto Java’s multithreading model by defining a model
of thread groups that constructs immutable groups from dynamically created threads. CCJ uses a two-phase creation mechanism. In the first phase, a group is inactive and can be constructed by threads willing to join. After construction is completed, the group becomes immutable (called active) and can be used for collective communication. For convenience, inactive copies of active groups can be created and subsequently modified. Group management in CCJ uses the following three classes.

• ColGroup. Objects of this class define the thread groups to be used for collective operations.
ColGroup provides methods for retrieving the rank of a given ColMember object and the size
of the group.

• ColMember. Objects of this class can become members of a given group. Applications
implement subclasses of ColMember, the instances of which will be associated with their own
thread of control.

• ColGroupMaster. Each participating JVM has to initialize one object of this class, which acts as a
central group manager. The group master also encapsulates communication establishment, such as
the interaction with the RMI registry.

For implementing the two-phase group creation, ColGroupMaster provides the following interface.
Groups are identified by String objects with symbolic identifications.

• void addMember(String groupName, ColMember member). Adds a member to a group.
If the group does not yet exist, the group will be created. Otherwise, the group must still be
inactive; the getGroup operation for this group must not yet have completed.

• ColGroup getGroup(String groupName, int numberOfMembers). Activates a group.
The operation waits until the specified number of members have been added to the group. Finally,
the activated group is returned. All members of a group have to call this operation prior to any
collective communication.
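The two-phase protocol can be illustrated with a minimal, single-JVM sketch. This is not CCJ's actual code: the real ColGroupMaster also interacts with the RMI registry and manages ColMember objects, both omitted here; the class bodies below are our own illustration of the described behaviour.

```java
import java.util.*;

// Minimal single-JVM sketch of CCJ's two-phase group creation.
// Phase 1: threads join an inactive group via addMember.
// Phase 2: getGroup blocks until the group is complete, then activates it.
class Group {
    private final List<Object> members = new ArrayList<>();
    private boolean active = false;

    synchronized void add(Object member) {
        if (active) throw new IllegalStateException("group already active");
        members.add(member);
        notifyAll();  // wake threads waiting in activate()
    }

    // Blocks until the expected number of members have joined, then activates.
    synchronized Group activate(int expected) throws InterruptedException {
        while (members.size() < expected) wait();
        active = true;  // from now on the membership is immutable
        return this;
    }

    synchronized int size() { return members.size(); }
    synchronized int rank(Object member) { return members.indexOf(member); }
}

public class GroupMaster {
    private final Map<String, Group> groups = new HashMap<>();

    // Phase 1: join an inactive group, creating it on first use.
    public void addMember(String groupName, Object member) {
        Group g;
        synchronized (this) {
            g = groups.computeIfAbsent(groupName, n -> new Group());
        }
        g.add(member);
    }

    // Phase 2: wait for all members, then return the active (immutable) group.
    public Group getGroup(String groupName, int numberOfMembers)
            throws InterruptedException {
        Group g;
        synchronized (this) {
            g = groups.computeIfAbsent(groupName, n -> new Group());
        }
        return g.activate(numberOfMembers);
    }
}
```

Once getGroup returns, the membership is frozen, which is what gives the collective operations their well-defined set of participants.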

2.2. Message passing
For some applications, simple message exchange between two group members can be beneficial.
Inspired by the MPI standard, we added the following operations for synchronous and asynchronous
message sending, for receiving and for a combined send–receive. We also added a rendezvous message
exchange, which is equivalent to two nodes performing send–receive operations with each other.
This rendezvous can be implemented very efficiently by a single RMI request/reply pair of messages.

• void send_sync(ColGroup group, Serializable object, int destination). Sends object to the
member destination of the group. Waits until the object has been received using the receive
operation.

• void send_async(ColGroup group, Serializable object, int destination). Same as
send_sync, but only delivers the object at the receiving member's node, without waiting for
the receiver to call the receive operation.
• Serializable receive(ColGroup group, int source). Receives and returns an object from the
group’s member source. Waits until the object is available.

• Serializable send_receive(ColGroup send_group, Serializable send_object, ColGroup
receive_group, Serializable receive_object). Simultaneously performs a send_async and an
unrelated receive operation.
• Serializable rendezvous(ColGroup group, Serializable object, int peer). Sends object to
the group’s member peer and returns an object sent by that member.
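The blocking behaviour of send_sync and receive can be mimicked in a single JVM with a rendezvous queue. The Mailbox class below is our own stand-in for the RMI transport, not CCJ code; it pairs each (source, destination) combination with a SynchronousQueue, whose put/take calls rendezvous exactly like a synchronous send meeting its receive.

```java
import java.util.concurrent.*;

// Single-JVM sketch of CCJ's synchronous send/receive semantics.
// In CCJ these calls travel over RMI between group members; here a
// SynchronousQueue per (source, destination) pair stands in for the wire.
public class Mailbox {
    private final SynchronousQueue<Object>[] slots;
    private final int size;

    @SuppressWarnings("unchecked")
    public Mailbox(int groupSize) {
        size = groupSize;
        slots = new SynchronousQueue[groupSize * groupSize];
        for (int i = 0; i < slots.length; i++)
            slots[i] = new SynchronousQueue<>();
    }

    // send_sync analogue: blocks until the destination has called receive.
    public void sendSync(int source, int destination, Object obj)
            throws InterruptedException {
        slots[source * size + destination].put(obj);
    }

    // receive analogue: blocks until a matching sendSync delivers an object.
    public Object receive(int source, int destination)
            throws InterruptedException {
        return slots[source * size + destination].take();
    }
}
```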

2.3. Collective communication
As described above, CCJ's group management alleviates the restrictions of MPI's static, communicator-based group model. For defining an object-based framework, the collective communication operations themselves also have to be adapted. MPI defines a large set of collective operations, inspired by parallel application codes written in more traditional languages such as Fortran or C. Basically, MPI messages consist of arrays of data items of given data types. Although important for many scientific codes, arrays cannot serve as general-purpose data structures in Java's object model. Instead, collective operations should deal with serializable objects in the most general case.
The implementation of the collective operations could either be part of the group or of the members.
For CCJ, we decided on the latter option, as this is closer to the original MPI specification and more
intuitive with the communication context (the group) becoming a parameter of the operation.
From MPI’s original set of collective operations, CCJ currently implements the most important
ones, leaving out those operations that are either rarely used or strongly biased towards arrays
as the parameter data structure. CCJ currently implements Barrier, Broadcast, Scatter, Gather,
Allgather, Reduce and Allreduce. We now present the interface of these operations in detail. For the reduce operations, we also present the use of function objects implementing the reduction operators themselves. For scatter and gather, we present the DividableDataObjectInterface imposing a notion of indexing for the elements of general (non-array) objects. CCJ uses Java’s exception handling mechanism for catching error conditions returned by the various primitives. For brevity, however, we do not show the exceptions in the primitives discussed below. Like MPI, CCJ requires all members of a group to call collective operations in the same order and with mutually consistent parameter objects.

• void barrier(ColGroup group). Waits until all members of the specified group have called the
method.
• Serializable broadcast(ColGroup group, Serializable obj, int root). One member of the
group, the one whose rank equals root, provides an object obj to be broadcast to the group.
All members (except the root) return a copy of the object. To the root member, a reference to obj is returned.
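The broadcast return semantics (a reference at the root, a copy everywhere else) fall out naturally from RMI-style serialization and can be sketched locally. BroadcastDemo and deepCopy are illustrative names of our own, not CCJ API; deepCopy mimics what a real RMI transfer does to the argument object.

```java
import java.io.*;

// Sketch of broadcast's return-value semantics: the root gets back a
// reference to its own object, every other member a serialized copy.
public class BroadcastDemo {
    // Serialize and deserialize, as an RMI transfer would.
    static Serializable deepCopy(Serializable obj) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bos);
        out.writeObject(obj);
        out.flush();
        ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
        return (Serializable) in.readObject();
    }

    // Local stand-in for what broadcast returns at a member with rank myRank.
    static Serializable broadcast(Serializable obj, int root, int myRank)
            throws Exception {
        return (myRank == root) ? obj : deepCopy(obj);
    }
}
```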

MPI defines a group of operations that perform global reductions such as summation or maximum on
data items distributed across a communicator's process group. MPI identifies the reduction operators either via predefined constants like 'MPI_MAX', or by user-implemented functions. However, object-oriented reduction operations have to process objects of application-specific classes; implementations of reduction operators have to handle the correct object classes.
One implementation would be to let application classes implement a reduce method that can
be called from within the collective reduction operations. However, this approach restricts a class
to exactly one reduction operation and excludes the basic (numeric) data types from being used in
reduction operations.

As a consequence, the reduction operators have to be implemented outside the objects to be
reduced. Unfortunately, unlike in C, functions (or methods) cannot be used as first-class entities
in Java. Alternatively, Java’s reflection mechanism could be used to identify methods by their names
and defining class (specified by String objects). Unfortunately, this approach is unsuitable, because
reflection is done at runtime, causing prohibitive costs for use in parallel applications. Removing
reflection from object serialization is one of the essential optimizations of our fast RMI implementation
in the Manta system.
CCJ thus uses a different approach for implementing reduction operators: function objects.
CCJ’s function objects implement the specific ReductionObjectInterface containing a single method
Serializable reduce(Serializable o1, Serializable o2). With this approach, all application-specific
classes and the standard data types can be used for data reduction. The reduction operator itself can be flexibly chosen on a per-operation basis. Operations implementing this interface are supposed to be associative and commutative. CCJ provides a set of function objects for the most important reduction operators on numerical data. This leads to the following interface for CCJ’s reduction operations in the ColMember class.

• Serializable reduce(ColGroup group, Serializable dataObject, ReductionObjectInterface
reductionObject, int root). Performs a reduction operation on the dataObjects provided by the
members of the group. The operation itself is determined by the reductionObject; each member
has to provide a reductionObject of the same class. reduce returns an object with the reduction
result to the member identified as root. All other members get a null reference.

• Serializable allReduce(ColGroup group, Serializable dataObject, ReductionObjectInterface
reductionObject). Like reduce but returns the resulting object to all members.
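To make the function-object idea concrete, the following sketch declares the ReductionObjectInterface as described in the text and folds member contributions with an example operator. MaxDouble and localReduce are our own illustrations: localReduce stands in for the library's internal combining step, which may fold in any order because the operator is required to be associative and commutative.

```java
import java.io.Serializable;

// Function-object interface for reduction operators, as described by CCJ.
interface ReductionObjectInterface {
    Serializable reduce(Serializable o1, Serializable o2);
}

// Example operator: maximum of two Doubles.
class MaxDouble implements ReductionObjectInterface {
    public Serializable reduce(Serializable o1, Serializable o2) {
        return Math.max((Double) o1, (Double) o2);
    }
}

public class ReduceDemo {
    // Folds the contributions of all members with the given operator,
    // mimicking what the library does across the group.
    static Serializable localReduce(Serializable[] contributions,
                                    ReductionObjectInterface op) {
        Serializable result = contributions[0];
        for (int i = 1; i < contributions.length; i++)
            result = op.reduce(result, contributions[i]);
        return result;
    }
}
```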

The final group of collective operations that have been translated from MPI to CCJ is the one of
scatter/gather data redistributions: MPI's scatter operation takes an array provided by a root process and distributes ('scatters') it across all processes in a communicator's group. MPI's gather operation collects an array from items distributed across a communicator's group and returns it to a root process. MPI's allgather is similar, but returns the gathered array to all participating processes.

Although defined via arrays, these operations are important for many parallel applications.
The problem to solve for CCJ thus is to find a similar notion of indexing for general (non-array)
objects. Similar problems occur for implementing so-called iterators for container objects. Here,
traversing (iterating) an object’s data structure has to be independent of the object’s implementation in order to keep client classes immune to changes of the container object’s implementation. Iterators request the individual items of a complex object sequentially, one after the other. Object serialization, as used by Java RMI, is one example of iterating a complex object structure. Unlike iterators, however, CCJ needs random access to the individual parts of a dividable object based on an index mechanism. For this purpose, objects to be used in scatter/gather operations have to implement the DividableDataObjectInterface with the following two methods:

• Serializable elementAt(int index, int groupSize). Returns the object with the given index in
the range from 0 to groupSize − 1.

• void setElementAt(int index, int groupSize, Serializable object). Conversely, sets the object
at the given index.
Based on this interface, the class ColMember implements the following three collective operations.

• Serializable scatter(ColGroup group, DividableDataObjectInterface rootObject, int root).
The root member provides a dividable object which will be scattered among the members of the
given group. Each member returns the (sub-)object determined by the elementAt method for its
own rank. The parameter rootObject is ignored for all other members.

• DividableDataObjectInterface gather(ColGroup group, DividableDataObjectInterface
rootObject, Serializable dataObject, int root). The root member provides a dividable object
which will be gathered from the dataObjects provided by the members of the group. The actual
order of the gathering is determined by the rootObject’s setElementAt method, according to
the rank of the members. The method returns the gathered object to the root member and a null
reference to all other members.

• DividableDataObjectInterface allGather(ColGroup group, DividableDataObjectInterface
resultObject, Serializable dataObject). Like gather, but the result is returned to all
members, and all members have to provide a resultObject.
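The indexing scheme can be made concrete with a small sketch. DividableDataObjectInterface is declared as described above; DividableList is an illustrative implementation of our own that block-distributes a list over the group members, and scatterAt is a local stand-in showing what each rank would get back from scatter.

```java
import java.io.Serializable;
import java.util.*;

// Interface for objects usable in scatter/gather, as described by CCJ.
interface DividableDataObjectInterface {
    Serializable elementAt(int index, int groupSize);
    void setElementAt(int index, int groupSize, Serializable object);
}

// Example: a list divided into contiguous blocks, one per member.
class DividableList implements DividableDataObjectInterface {
    private final ArrayList<Serializable> items;

    DividableList(ArrayList<Serializable> items) { this.items = items; }

    // The block of elements that member 'index' receives.
    public Serializable elementAt(int index, int groupSize) {
        int per = (items.size() + groupSize - 1) / groupSize;
        int from = Math.min(index * per, items.size());
        int to = Math.min(from + per, items.size());
        return new ArrayList<>(items.subList(from, to));
    }

    // Inverse operation: place member 'index''s block back (used by gather).
    public void setElementAt(int index, int groupSize, Serializable object) {
        int per = (items.size() + groupSize - 1) / groupSize;
        int from = Math.min(index * per, items.size());
        @SuppressWarnings("unchecked")
        ArrayList<Serializable> block = (ArrayList<Serializable>) object;
        for (int i = 0; i < block.size(); i++)
            items.set(from + i, block.get(i));
    }
}

public class ScatterDemo {
    // Local stand-in: the (sub-)object each rank gets back from scatter.
    static Serializable scatterAt(DividableDataObjectInterface root,
                                  int rank, int groupSize) {
        return root.elementAt(rank, groupSize);
    }
}
```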

2.4. Example application code
We will now illustrate how CCJ can be used for application programming. As our example, we show
the code for the all-pairs shortest path (ASP) application, the performance of which will be discussed in Section 4. Figure 1 shows the code of the Asp class that inherits from ColMember. Asp thus constitutes the application-specific member class for the ASP application. Its method do_asp performs the computation itself and uses CCJ's collective broadcast operation. Before doing so, Asp's run method first retrieves the rank and size from the group object. Finally, do_asp calls the done method from the ColMember class in order to de-register the member object. The necessity of the done method is an artifact of Java's thread model in combination with RMI; without any assumptions about the underlying JVMs, there is no fully transparent way of terminating an RMI-based distributed application run. Thus, CCJ's members have to de-register themselves prior to termination to allow the application to terminate gracefully.

Figure 2 shows the MainAsp class, implementing the method main. This method runs on all JVMs participating in the parallel computation. This class establishes the communication context before starting the computation itself. To do so, a ColGroupMaster object is created (on all JVMs).
Then, MainAsp creates an Asp member object, adds it to a group, and finally starts the computation.
Our implementation of the ColGroupMaster also provides the number of available nodes, which is
useful for initializing the application. On other platforms, however, this information could also be
retrieved from different sources.
For comparison, Figure 3 shows some of the code of the mpiJava version of ASP. We will use this mpiJava program in Section 4 for a performance comparison with CCJ. A clear difference between the mpiJava and CCJ versions is that the initialization code of CCJ is more complicated.
The reason is that mpiJava offers a simple model with one group member per processor, using the
MPI.COMM_WORLD communicator. CCJ, on the other hand, is more flexible and allows multiple active objects per machine to join a group, which requires more initialization code. Also, the syntax of mpiJava is more MP ..

Posted by william0430 | From: Taiwan Digital United | Posted: 2006-05-27 22:41
