Regarding the choice of technologies, from Eric Aslakson,
10/10/2002
Hi Haifeng, Julian, Conrad, Edwin,
Remember, you asked! My viewpoint is probably biased by what I've been
working on, so you'll have to take what I say with a grain of salt. I would
vote for priority on the query distributor (and return data amalgamator).
I think that Clarens works well in its Python manifestation, and all the
server code that we need to get to seems to have Python hooks already. We
can do a Web service workaround on Win32 for now and port Clarens to Java
later.
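
To make "query distributor and return data amalgamator" concrete, here is a minimal sketch in Python. It is an illustration only: the in-memory SQLite databases stand in for remote servers, and the names make_site, distribute_query and the events table are made up for the example, not part of Clarens or any existing code.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_site(rows):
    """Build a stand-in for one remote database server (hypothetical)."""
    # check_same_thread=False lets a worker thread use this connection;
    # each connection is still used by only one thread at a time below.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE events (run INTEGER, energy REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()
    return conn


def distribute_query(sites, sql, params=()):
    """Send the same query to every site in parallel, then merge the rows."""
    def run_one(conn):
        return conn.execute(sql, params).fetchall()

    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        per_site = list(pool.map(run_one, sites))  # results keep site order

    merged = []
    for rows in per_site:
        merged.extend(rows)  # the "amalgamator" step: one flat result set
    return merged


sites = [make_site([(1, 10.5), (2, 99.0)]), make_site([(3, 42.0)])]
result = distribute_query(
    sites, "SELECT run, energy FROM events WHERE energy > ?", (20.0,)
)
print(result)
```

A real distributor would also have to deal with site failures, timeouts, and queries that need more than simple concatenation (ordering, aggregates), none of which is attempted here.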
Also a comment about whether to Java or not to Java... Don't get me wrong,
I think Java is a superior language to
C/C++/Fortran/ASP/JavaScript/VB/VBScript. The unfortunate reality is that
most of our server modules are written in C++ (COBRA, ORCA, ROOT) and Fortran
(PAW). And most of our front-end clients are written in C++ (Iguana, ROOT).
I believe we would get further, faster with JAS, but ROOT seems to be the
preferred choice (I suppose maybe because we want to replace Objectivity
with ROOT-IO). But we must answer the question of whether to ship data around
from one JVM to another JVM. Now the data to be shipped around is either
ROOT files (or other binary files) or alternatively SQL data. For ROOT or
binary files, I contend that Clarens itself (hooked into a GridFTP or SRB
layer) is optimal.
In the case of SQL data, there's some preference for JDBC over ODBC. I
contend, however, that adding a libodbc++ (or OTL) layer on top of ODBC
makes it as nice as JDBC. We can certainly support all the backend
databases in as pluggable a manner as JDBC.
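
As a rough picture of what "pluggable" means here, the sketch below uses Python's DB-API as an analogy. It is not libodbc++ or OTL code; the registry, the register_driver and open_backend helpers, and the "scheme://target" URL form are all assumptions invented for the illustration. Only the SQLite driver from the standard library is registered.

```python
import sqlite3

# Registry mapping a scheme name to a function that opens a connection.
# Hypothetical design: any DB-API style connect function can be plugged in.
DRIVERS = {}


def register_driver(name, connect):
    """Make a backend available under the given scheme name."""
    DRIVERS[name] = connect


def open_backend(url):
    """Open a connection from a 'scheme://target' string (made-up format)."""
    scheme, _, target = url.partition("://")
    try:
        connect = DRIVERS[scheme]
    except KeyError:
        raise ValueError(f"no driver registered for {scheme!r}")
    return connect(target)


register_driver("sqlite", sqlite3.connect)

# Calling code only sees the URL, never the driver module itself.
conn = open_backend("sqlite://:memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (7)")
value = conn.execute("SELECT x FROM t").fetchone()[0]
print(value)
```

The point is only that the caller names a backend and gets back a uniform connection object; swapping databases means registering another driver, not changing the query code.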
But even assuming that getting data into a JVM is as easy as getting data
into a binary OTL object, you still have to solve the problem of getting the
data out of the JVM at the client (for non-Java applications like ROOT), and
here is where the problem is. You could have the JVM write the data to a
file on local disk, but this introduces latencies, forces you to keep track
of filenames and perform cleanup, breaks streaming, etc. If you want to
maintain data streaming, then the next possible method is to use JNI (Java
Native Interface) to talk directly to a ROOT process, or maybe write data
into ROOT memory. The problem with this is that it isn't very platform
independent and isn't very easy. Ideally we would like to have messages
and data flow from the JVM out, but also into the JVM. I know how to do
this with COM on Windows, but not on *nix. I suppose you could have the JVM
speak to the outside world through some kind of local web server (or a
socket-based method), but this seems to be complicated too.
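
To show roughly what the socket-based route involves, here is a minimal sketch of streaming records over a localhost socket without touching the disk. Python plays both roles purely for illustration: the producer thread stands in for the JVM and the reader for a non-Java client. The length-prefixed framing and all function names are assumptions made up for the example; the 4-byte big-endian prefix was chosen because that is the byte order Java's DataOutputStream.writeInt produces.

```python
import socket
import struct
import threading


def send_records(conn, records):
    """Producer side: write each record as a 4-byte length plus payload."""
    with conn:
        for rec in records:
            conn.sendall(struct.pack(">I", len(rec)) + rec)


def read_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closed early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


def receive_records(sock):
    """Consumer side: yield payloads one at a time as they arrive."""
    while True:
        header = read_exact(sock, 4)
        if header is None:
            return  # clean end of stream (or truncated header)
        (size,) = struct.unpack(">I", header)
        payload = read_exact(sock, size)
        if payload is None:
            return  # stream was cut off mid-record
        yield payload


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]


def producer():
    conn, _ = server.accept()
    send_records(conn, [b"row-1", b"row-2", b"row-3"])


worker = threading.Thread(target=producer)
worker.start()

with socket.create_connection(("127.0.0.1", port)) as client:
    received = list(receive_records(client))

worker.join()
server.close()
print(received)
```

Even this toy version needs framing, partial-read handling and connection setup on both sides, which gives some feel for why the approach looks complicated once two different runtimes are involved.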
So my opinion is that Java is great in its own world, but is not optimal
when you need to speak out of it or into it.
I think the real answer is that we would like to have both - data
distribution via Clarens/SRB (for files), Clarens/WebServices/OTL for SQL
data for C++ consumers, and ALSO Java distribution for Java clients.
To summarize, I would prefer Haifeng to work with me on the OTL layers,
providing distributed queries. Edwin, I hope you're not mad at me.
That's my 2 cents.
Eric