Executive summary: disk I/O on chuma and sam will probably limit us to no more than 2 active ingest collectors.
As we near the end of the transfer of 2005 data from the data warehouse disk to the HSM, we can look at the "steady-state" results. We transfer an average of 2.5TB/day using one collector.
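For comparison with the disk rates quoted in MB/sec elsewhere in these notes, the daily average can be converted to the same units. This is only a rough sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), and the function name is made up for illustration.

```python
# Convert the quoted average transfer rate from TB/day to MB/sec.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
TB = 10**12
MB = 10**6
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day_to_mb_per_sec(tb_per_day):
    """Return the average rate in MB/sec for a given TB/day figure."""
    return tb_per_day * TB / MB / SECONDS_PER_DAY


rate = tb_per_day_to_mb_per_sec(2.5)  # one collector, from the notes
print(f"{rate:.1f} MB/sec")  # about 28.9 MB/sec averaged over a day
```

This is a whole-day average, so it includes any gaps in the transfer and is not directly comparable to a peak rate.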
The network activity isn't overwhelmingly busy--I use gort disk service as a proxy for that. (I may miss some, because the IceCube bootcamp is on this week.)
On klausz we see that the process takes about 30% of the CPU. Memory usage is pretty trivial. The Ethernet usage seems to be driven largely by file size: smaller files mean more overhead, lower transfer rates, and less network usage. The gaps in network usage seem to line up with the times when rsync starts a new copy, so they probably represent periods when rsync is busy building its tables and not copying files.
Sam is fairly busy. CPU usage is about 45% with one collector active. However, some of that is idle time, and the actual CPU busy usage is about 30%. That suggests that, with the current configuration, CPU alone would let us handle no more than 3 active collectors. We can goose this with more and faster CPUs, and possibly more memory. Disk reads and writes both seem to be pegged at about 70MB/sec. Since the maximum rate is about 120MB/sec, that means we can't handle more than two simultaneous collectors.
One thing I was not expecting from earlier studies was the CPU load on chuma: about 20%. Thus chuma can withstand a maximum of 5 times the current rate, or, if I use the non-idle value of 15%, about 6 times the current rate. I do not know if this is the same for sustained access as for ingest. BUT note that the disk read rate is about 100MB/sec and the disk write rate about 70MB/sec, which will limit it sooner, since the max rate is roughly 120MB/sec.
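The headroom estimates for sam and chuma all come from the same ratio, maximum over current load, so they can be tabulated together. The sketch below uses only the figures quoted in these notes. It assumes that load scales linearly with the number of collectors and that the 120MB/sec maximum applies separately to reads and to writes; neither assumption has been checked. The function and dictionary names are made up for illustration.

```python
# Rough headroom estimate: how many times the current one-collector
# load would fit under each resource's maximum.
# Assumptions (not verified): load scales linearly with collectors,
# and the 120 MB/sec disk maximum applies to reads and writes separately.


def headroom(current, maximum):
    """Return maximum / current, i.e. the scale-up factor available."""
    return maximum / current


# (current, maximum) pairs taken from the figures quoted in the notes.
measurements = {
    "sam CPU (busy %)": (30, 100),
    "sam disk read/write (MB/sec)": (70, 120),
    "chuma CPU (total %)": (20, 100),
    "chuma CPU (non-idle %)": (15, 100),
    "chuma disk read (MB/sec)": (100, 120),
    "chuma disk write (MB/sec)": (70, 120),
}

factors = {name: headroom(cur, mx) for name, (cur, mx) in measurements.items()}

for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x")

# The resource with the smallest factor is the first to saturate.
bottleneck = min(factors, key=factors.get)
print("tightest limit:", bottleneck)
```

Under these simplifying assumptions the CPU figures leave a factor of 3 or more, while every disk figure comes out below 2, which is why disk I/O rather than CPU sets the limit.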
So for ingest we are probably limited to 2x the current rate, unless
Ouch. The RamSan 620 5TB configuration quote (budgeting only; discounts may apply) as of 24-Jun was $190,000. This does not look like the answer we want.
Modified 24-June-2011 at 08:45
http://icecube.wisc.edu/~jbellinger/StorHouse/24Jun2011