Commonality in distributed analysis architecture
 | User Interface | Master control server | Secondary control server | Analysis worker
Description | User can have 0-n UIs. Can disconnect/reconnect. | Persistent for session duration. Centralized session management at grid-wide level. | Session management at grid-site level. Started at sites providing session resources. | Processing and local storage resource.
Implementations | ROOT, JAS, Python (e.g. Ganga), web browser UI, CAIGEE web client | PROOF master (sort of), JAS control server, Clarens | PROOF secondary master (future) | PROOF slaves, JAS data servers
Single-user view. Everything is duplicated for each additional analysis user, who has their own sessions and tasks.
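The four tiers and the single-user view can be sketched as a small data model. This is only an illustrative sketch: the class names, the `Session` wrapper, and the sample user, site, and worker names are assumptions made for the example, not part of any of the listed implementations.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class AnalysisWorker:
    """Processing and local storage resource."""
    name: str


@dataclass
class SecondaryControlServer:
    """Session management at grid-site level."""
    site: str
    workers: List[AnalysisWorker] = field(default_factory=list)


@dataclass
class MasterControlServer:
    """Grid-wide session management; persists for the session duration."""
    secondaries: List[SecondaryControlServer] = field(default_factory=list)

    def worker_count(self) -> int:
        return sum(len(s.workers) for s in self.secondaries)


@dataclass
class Session:
    """One user's view; each additional user gets a separate instance."""
    user: str
    master: MasterControlServer = field(default_factory=MasterControlServer)
    attached_uis: Set[str] = field(default_factory=set)

    def connect_ui(self, ui: str) -> None:
        self.attached_uis.add(ui)

    def disconnect_ui(self, ui: str) -> None:
        # Dropping a UI leaves the master and its workers untouched.
        self.attached_uis.discard(ui)


# Hypothetical session: two UIs attach and detach, the workers stay allocated.
session = Session(user="alice")
session.master.secondaries.append(
    SecondaryControlServer("site-a", [AnalysisWorker("w1"), AnalysisWorker("w2")])
)
session.connect_ui("root")
session.connect_ui("web")
session.disconnect_ui("root")
session.disconnect_ui("web")
```

The point of the sketch is that the set of attached UIs (0-n) is independent of the master's lifetime, and that a second user would construct a second `Session` with its own master, secondaries, and workers.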
 
Status:
Missing & needed
Work needed
Essentials exist
Low/future priority

Bold=priority work
   
Services used   Comments, implementations
"setup" login - vo authentication, authorization, rights management, inter-vo authentication, auth, rights, collaborative management of sessions, tasks authentication authentication Lothar: registration, DBs scoped out. Working out sharing functionality. Existing tools + new work
"data mgmt, file movement" rms - replica location, data mover, cost estimator for replica selection rms, data collecting services rms, data collecting services, software installer Data mover - GridFTP, etc.
"physics selection and abstract task definition" vds - dataset catalog service (physics query or name lookup returns dataset(s)), transformation catalog vds  
"concrete task planning" resource cost estimator planner/rb/scheduler - discovery, matchmaking, estimation, reservation, workflow executor, information service   Workflow executor - DAGMan; Info service - R-GMA, MDS, must extend schema for distributed analysis
"schedule and submit" job execution service - abstract job submit (plan + submit), concrete job submit (concretely specified submit), request executor job status, job control, job client interaction   Planner - Sphinx; Request executor - COD, Condor-G, GRAM
"monitoring and control" job status, job control, fault monitor, grid monitors   Grid monitoring framework - MonALISA
control/comm, job capability control/comm control/comm control/comm job capability - do I support capability X, e.g. do you support access to partial results? With what protocols?
"metadata" metadata metadata metadata metadata many kinds, needs definition
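The "job capability" service above (does a component support capability X, and with what protocols?) can be illustrated with a minimal lookup. This is a sketch under stated assumptions: the `CapabilityRegistry` class, the capability name `partial_results`, and the protocol strings are hypothetical and do not correspond to an API of any of the tools listed.

```python
from typing import Dict, List


class CapabilityRegistry:
    """Hypothetical per-component record of supported capabilities."""

    def __init__(self) -> None:
        self._capabilities: Dict[str, List[str]] = {}

    def declare(self, name: str, protocols: List[str]) -> None:
        # Record a capability together with the protocols that expose it.
        self._capabilities[name] = list(protocols)

    def supports(self, name: str) -> bool:
        # "Do I support capability X?"
        return name in self._capabilities

    def protocols_for(self, name: str) -> List[str]:
        # "With what protocols?" Returns an empty list if unsupported.
        return list(self._capabilities.get(name, []))


# Hypothetical worker that exposes partial results over two protocols.
worker = CapabilityRegistry()
worker.declare("partial_results", ["http", "gridftp"])
```

A client would call `worker.supports("partial_results")` before requesting intermediate results, then pick one of the entries returned by `worker.protocols_for("partial_results")`.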
         
Execution flow for basic interactive analysis
  User Interface Master control server Secondary control server Analysis worker
Bold=heading
 
 
  Sign on  
  Authenticate, authorize  
  Inherit rights, policies, environment; adjust  
  Optional - connect to existing session  
 
Select dataset(s)  
  Browse/query datasets, transformations  
  Check grid weather, set user-defined planner constraints  
  Start session  
  Launch master controller Init config, constraints from UI  
  Analysis worker configuration  
  Data placement planning  
  CPU allocation planning  
  Availability, cost evaluation; approval or reconfiguration Reconfigure as needed  
  Perform data placement and CPU allocation  
  Launch slaves Secondary control launched Slaves launched
  Slave communication for config Slave communication for config Slave communication for config
  Establish task  
  Set application, algorithms, configuration  
  Select dataset(s)  
     Redo CPU allocation Redo CPU allocation  
  Initiate task Initiate task Initiate task Initiate task
  Software setup
  Execute
  Monitoring of task jobs, statistics, the grid Feed monitoring, statistics
  See new cost estimate based on execution  
  Control/adjust execution constrained by policy, e.g. suspend/resume, abort Execution control, fault management Execution control Execution control
  Modify CPU allocation; re-strategize Redo CPU allocation Redo CPU allocation Launch new slaves
  Intermediate results Feed intermediate results
  Task completion notification Complete task  
  Save logging, provenance Logging, metadata to catalogs Logging, metadata to catalogs
  Gather results Merging statistical and bulk data Merging statistical and bulk data Upload results for merge, archive
  Archive, share results Data return, archiving, cataloging  
  Task save  
  Interactive analysis  
  Examine, analyze results  
  Scan list of events  
  Define, launch new selection in batch  
  Launch re-reco on sample  
  Save dataset locally  
  Save objects with their provenance  
  LOOP on Establish task  
  Save session config  
  End and tear down session  
  LOOP on Select datasets
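The two LOOP markers in the flow above imply a nested structure: an outer loop over dataset selections, each with its own session, and an inner loop over tasks. The sketch below traces only a few of the headline steps to show that nesting; the function name `run_flow`, the `plan` mapping, and the sample dataset and task names are assumptions made for illustration.

```python
from typing import Dict, List


def run_flow(plan: Dict[str, List[str]]) -> List[str]:
    """Return the ordered step trace for a plan of {dataset: [tasks]}."""
    steps = ["sign on"]
    for dataset, tasks in plan.items():      # LOOP on Select datasets
        steps.append(f"select dataset: {dataset}")
        steps.append("start session")
        for task in tasks:                   # LOOP on Establish task
            steps.append(f"establish task: {task}")
            steps.append(f"initiate task: {task}")
            steps.append(f"gather results: {task}")
            steps.append(f"interactive analysis: {task}")
        steps.append("save session config")
        steps.append("end and tear down session")
    return steps


# Hypothetical plan: two tasks on one dataset selection, one on another.
trace = run_flow({"dataset-A": ["task-1", "task-2"], "dataset-B": ["task-3"]})
```

Sign-on happens once, while session start and teardown bracket each dataset selection, which matches the position of "LOOP on Select datasets" after "End and tear down session" in the table.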