Scalability to Hundreds of Clients in HEP Object Databases

Koen Holtman, Julian Bunn, CERN

The CMS collaboration plans to implement its data storage and processing system using a single large federated object database. The scalability of such a database is an important consideration. In this respect, the main goal for the CMS system is to support hundreds of concurrent database clients at the high aggregate throughputs required for DAQ and reconstruction.

These scalability issues are being studied by CMS as part of the RD45 collaboration and of the GIOD project (a joint project between Caltech, HP, and CERN). In this paper, we report on Objectivity/DB scalability tests made on a 256-processor HP Exemplar machine at Caltech. This leading-edge shared-memory SMP machine allows the study of scaling effects when the number of database clients runs into the hundreds.

Our tests focused on the behaviour of the throughput as a function of the number of database clients, under DAQ and reconstruction style workloads. We present results and conclusions from our study.

Preliminary results from our study show almost ideal scaling up to at least 150 parallel reconstruction clients running in an event farm configuration, and scaling up to at least 64 clients in a DAQ workload with an aggregate throughput of 60 MBytes/sec. We have also successfully tested 210 simultaneous database clients on the system. We plan to extend these figures in the second part of our study. Though there were some surprises, all database operations critical to successful DAQ and reconstruction appear to scale well. Special provisions were necessary to optimise the system when more than a few tens of clients were reading from a single striped disk array. Specifically, a read-ahead optimisation layer was developed to ensure efficient disk I/O. The Objectivity/DB lockserver did not limit scalability for DAQ test rates up to at least 100 MBytes/sec.
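To illustrate the idea behind such a read-ahead layer, the sketch below shows a minimal buffering scheme in Python. It is a hypothetical illustration, not the actual GIOD implementation: many small object reads are served from one large sequentially read buffer, so the striped disk array sees a few big sequential requests instead of many small scattered ones. The class name, chunk size, and interface are assumptions made for this example.

```python
import io

class ReadAheadFile:
    """Hypothetical sketch of a read-ahead optimisation layer.

    Small reads at nearby offsets are served from a large buffer that
    is filled with a single sequential read, reducing the number of
    requests issued to the underlying (striped) storage.
    """

    def __init__(self, fileobj, chunk_size=1 << 20):
        self._f = fileobj
        self._chunk_size = chunk_size   # read-ahead window, e.g. 1 MB (assumed)
        self._buf = b""
        self._buf_start = 0             # file offset of the buffer's first byte

    def read_at(self, offset, size):
        """Return `size` bytes starting at `offset`, refilling the
        buffer with one large sequential read only on a miss."""
        end = offset + size
        if offset < self._buf_start or end > self._buf_start + len(self._buf):
            # Cache miss: issue one large sequential read from `offset`.
            self._f.seek(offset)
            self._buf = self._f.read(max(size, self._chunk_size))
            self._buf_start = offset
        rel = offset - self._buf_start
        return self._buf[rel:rel + size]

# Example: consecutive small reads hit the buffer after one refill.
data = bytes(range(256)) * 64
reader = ReadAheadFile(io.BytesIO(data), chunk_size=4096)
first = reader.read_at(0, 10)    # fills the buffer with 4096 bytes
second = reader.read_at(10, 5)   # served from the buffer, no disk access
```

With dozens of clients sharing one disk array, this kind of layer trades a modest amount of client memory for far fewer, larger I/O requests, which is what keeps a striped array operating near its sequential bandwidth.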