Archive for the ‘Benchmarks’ Category

Using sysbench to benchmark CPU


30 Aug

Just a quick note about using sysbench to benchmark CPU; keep in mind the numbers only mean something when you compare them against results from other servers.
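
If sysbench isn't installed yet, it's typically one package away; on CentOS/RHEL it usually comes from the EPEL repository, on Debian/Ubuntu it's in the main repos (the package names here are an assumption for your particular distro/version):

yum install epel-release && yum install sysbench     # CentOS/RHEL
apt-get install sysbench                             # Debian/Ubuntu
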
[root@host ~]# sysbench --test=cpu --cpu-max-prime=20000 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000

Test execution summary:
total time: 23.2247s
total number of events: 10000
total time taken by event execution: 23.2217
per-request statistics:
min: 2.12ms
avg: 2.32ms
max: 40.60ms
approx. 95 percentile: 3.00ms

Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 23.2217/0.00

Just keep an eye on total time. This benchmarks a single core: how long it takes to calculate primes up to 20,000. The above was taken on a VMware VM whose physical host has a Xeon X5560 @ 2.80GHz.

This will test multi-core/thread performance:

[root@host ~]# sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000


Test execution summary:
 total time: 5.9896s
 total number of events: 10000
 total time taken by event execution: 23.9490
 per-request statistics:
 min: 2.12ms
 avg: 2.39ms
 max: 10.36ms
 approx. 95 percentile: 3.34ms

Threads fairness:
 events (avg/stddev): 2500.0000/58.60
 execution time (avg/stddev): 5.9873/0.00
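
Side note: the output above is from sysbench 0.4.12, which uses the legacy --test= and --num-threads= options. On sysbench 1.0 and newer the equivalent runs should look roughly like this (double-check against sysbench --help on your version):

sysbench cpu --cpu-max-prime=20000 run
sysbench cpu --cpu-max-prime=20000 --threads=4 run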


Amazon AWS EBS – magnetic vs sc1 (cold storage) vs st1 (throughput optimized)


18 Jul

Amazon AWS EBS magnetic/sc1/st1

So I intend to install Graylog on an AWS EC2 instance (aka virtual server), I prefer Graylog over ELK, and I need a storage mount for the volumes. My root partition, /, is just a 50gb general purpose SSD volume (gp2), but when I went to add a 500gb magnetic volume I was surprised to find other options like ST1 and SC1. In us-east-1 anyway, it’s $0.05 per gb for magnetic, $0.045 for throughput optimized (st1) and $0.025 for cold storage (sc1). So, which one is faster? If you look at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html you’re led to believe that both SC1 and ST1 are faster than magnetic, and since they’re cheaper, why not just use them? However…

What they don’t tell you is that you need a MINIMUM of 500gb to even use ST1 or SC1. You can create a 150gb magnetic EBS volume if you want, and that would work out to be much cheaper.

Additionally, the speeds they show you, which lead you to believe ST1/SC1 is superior, are “Max” speeds, which to me means you have to provision the maximum amount of space. Remember, AWS usually gives you more speed the more storage you add. So if you wanted a 500gb volume, the bare minimum for SC1/ST1, you may not get the “max” speeds they show. Magnetic volumes, meanwhile, range from 1gb to 1000gb, so a 500gb magnetic volume should deliver roughly half of the maximum magnetic performance. So this post is to answer a question that was driving me nuts: is half the magnetic volume performance better than the minimum SC1 or ST1 performance?

So here we go. I’ll be using “sysbench” to benchmark ext4-formatted 500gb magnetic, st1 and sc1 EBS volumes to learn which is “fastest”. Now don’t get me wrong, I know SSD is the fastest, but I just need cheap storage that’s reasonably fast to store and retrieve log files, and according to Amazon ST1 is perfect for this. It’s also worth noting that ST1 and SC1 cannot be boot volumes; only SSD and magnetic can be boot volumes, so if you need an EC2 instance for pure EBS storage you’d need an 8gb SSD volume for the OS itself, and then you can add ST1/SC1/magnetic volumes to it.
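
If you’d rather script the volume creation than click through the console, the AWS CLI can do it; a rough sketch, assuming us-east-1a and the 500gb sizes used in this test (volume-type “standard” is what the API calls magnetic, and the volume/instance IDs are placeholders):

aws ec2 create-volume --volume-type standard --size 500 --availability-zone us-east-1a
aws ec2 create-volume --volume-type st1 --size 500 --availability-zone us-east-1a
aws ec2 create-volume --volume-type sc1 --size 500 --availability-zone us-east-1a
aws ec2 attach-volume --volume-id vol-xxxxxxxx --instance-id i-xxxxxxxx --device /dev/sdf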

Create Test Files

Ok, I’m going to create 10gb of “test” files; since this EC2 instance only has 4gb of memory, that guarantees the benchmark isn’t just hitting the memory cache. I like to wrap the prepare step in “time” to see how long it takes to create 10gb of random test files; it gives me a general idea of what to expect in terms of performance.
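
Each volume was formatted ext4 and mounted before the prepare step; roughly like this, though the device names and mount points below are just examples and will differ on your instance:

mkfs.ext4 /dev/xvdf && mkdir /magnetic && mount /dev/xvdf /magnetic
mkfs.ext4 /dev/xvdg && mkdir /sc1 && mount /dev/xvdg /sc1
mkfs.ext4 /dev/xvdh && mkdir /st1 && mount /dev/xvdh /st1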

Magnetic

[root@host magnetic]# time sysbench --test=fileio --file-total-size=10G prepare
sysbench 0.4.12: multi-threaded system evaluation benchmark
128 files, 81920Kb each, 10240Mb total
Creating files for the test...

real 6m36.685s
user 0m0.029s
sys 0m8.231s

Cold Storage (SC1)

[root@host sc1]# time sysbench --test=fileio --file-total-size=10G prepare
sysbench 0.4.12: multi-threaded system evaluation benchmark

128 files, 81920Kb each, 10240Mb total
Creating files for the test...

real 4m10.453s
user 0m0.039s
sys 0m7.958s

Throughput Optimized (ST1)

[root@host st1]# time sysbench --test=fileio --file-total-size=10G prepare
sysbench 0.4.12: multi-threaded system evaluation benchmark

128 files, 81920Kb each, 10240Mb total
Creating files for the test...

real 2m56.582s
user 0m0.025s
sys 0m8.137s

Initial Conclusion

It looks like ST1 is the fastest, followed by SC1 and then Magnetic. Interesting…

Random Read/Write for 5 minutes

Ok, time to run random reads/writes for 5 minutes. I chose 5 minutes because I wanted the disks to really spin up and settle into a steady state. Amazon sometimes lets volumes burst, so I figured 5 minutes would show overall performance: whether it bursts in the beginning and slows down, or starts slowly and ramps up, I get the 5 minute average. After all, log files could be thrown at this volume constantly. The results won’t mean much unless you compare them with other results; pay closest attention to the “Total transferred” line, that’s the most important one to compare.

Magnetic

[root@host magnetic]# sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw \
> --init-rng=on --max-time=300 --max-requests=0 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Initializing random number generator from timer.


Extra file open flags: 0
128 files, 80Mb each
10Gb total file size
Block size 16Kb
Number of random requests for random IO: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Time limit exceeded, exiting...
Done.

Operations performed: 28252 Read, 18834 Write, 60160 Other = 107246 Total
Read 441.44Mb Written 294.28Mb Total transferred 735.72Mb (2.4523Mb/sec)
 156.95 Requests/sec executed

Test execution summary:
 total time: 300.0130s
 total number of events: 47086
 total time taken by event execution: 264.3029
 per-request statistics:
 min: 0.00ms
 avg: 5.61ms
 max: 224.96ms
 approx. 95 percentile: 17.49ms

Threads fairness:
 events (avg/stddev): 47086.0000/0.00
 execution time (avg/stddev): 264.3029/0.00

Cold Storage

[root@host sc1]# sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Initializing random number generator from timer.


Extra file open flags: 0
128 files, 80Mb each
10Gb total file size
Block size 16Kb
Number of random requests for random IO: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Time limit exceeded, exiting...
Done.

Operations performed: 7419 Read, 4946 Write, 15744 Other = 28109 Total
Read 115.92Mb Written 77.281Mb Total transferred 193.2Mb (659.42Kb/sec)
 41.21 Requests/sec executed

Test execution summary:
 total time: 300.0208s
 total number of events: 12365
 total time taken by event execution: 179.8208
 per-request statistics:
 min: 0.00ms
 avg: 14.54ms
 max: 529.72ms
 approx. 95 percentile: 38.78ms

Threads fairness:
 events (avg/stddev): 12365.0000/0.00
 execution time (avg/stddev): 179.8208/0.00

Throughput Optimized

[root@host st1]# sleep 180 && sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1
Initializing random number generator from timer.


Extra file open flags: 0
128 files, 80Mb each
10Gb total file size
Block size 16Kb
Number of random requests for random IO: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!
Time limit exceeded, exiting...
Done.

Operations performed: 25920 Read, 17280 Write, 55195 Other = 98395 Total
Read 405Mb Written 270Mb Total transferred 675Mb (2.25Mb/sec)
 144.00 Requests/sec executed

Test execution summary:
 total time: 300.0024s
 total number of events: 43200
 total time taken by event execution: 189.2417
 per-request statistics:
 min: 0.00ms
 avg: 4.38ms
 max: 334.78ms
 approx. 95 percentile: 15.43ms

Threads fairness:
 events (avg/stddev): 43200.0000/0.00
 execution time (avg/stddev): 189.2417/0.00

Test Conclusion

Interesting again. I thought the results would mirror the test file creation results, but magnetic actually outperformed both SC1 and ST1. However, ST1 came fairly close to magnetic, maybe close enough that the performance could be considered comparable or identical. Considering ST1 created the test files in half the time, I’m thinking magnetic is the best at reads, ST1 is the best at writes, and SC1 is somewhere in between. So let’s move on.

Read Tests

Ok, I’m going to do 2 read tests now, and only run them for 180 seconds (3 minutes) each: first a sequential read test, then a random read test. Sequential reads are usually the fastest a disk or network connection can go; they’re very easy on the disk drive, so you may find the bottleneck is the disk controller or the network connection and not the disk.
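
Only the transfer lines are shown below; the runs themselves follow the same fileio pattern as the random read/write test above, just swapping the test mode and dropping the time to 180 seconds, roughly:

sysbench --test=fileio --file-total-size=10G --file-test-mode=seqrd \
  --init-rng=on --max-time=180 --max-requests=0 run
sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrd \
  --init-rng=on --max-time=180 --max-requests=0 run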

Magnetic

Sequential Read: Read 10Gb  Written 0b  Total transferred 10Gb  (62.852Mb/sec)
Random Read: Read 461.53Mb  Written 0b  Total transferred 461.53Mb  (2.564Mb/sec)

Cold Storage (SC1)

Sequential Read: Read 7.1055Gb  Written 0b  Total transferred 7.1055Gb  (40.403Mb/sec)
Random Read: Read 155.95Mb  Written 0b  Total transferred 155.95Mb  (887.18Kb/sec)

Throughput Optimized (ST1)

Sequential Read: Read 10Gb  Written 0b  Total transferred 10Gb  (62.059Mb/sec)
Random Read: Read 479.41Mb  Written 0b  Total transferred 479.41Mb  (2.6634Mb/sec)

Test Conclusion

Well, it looks like cold storage read performance sucks. Magnetic and ST1 are neck and neck. I did another 90 second sequential read run on magnetic and it came in at 64.5 Mb/sec, so it’s not “definitive” that ST1 is slightly faster; I’d say they’re pretty much identical. Since magnetic is $0.05 per gigabyte and ST1 is $0.045 per gigabyte, I’m leaning towards ST1, and the test file creation at the beginning indicated ST1 was faster at writes. But let’s do some actual write tests.

Write Tests

Ok, I’m going to do 2 write tests now, and only run them for 180 seconds (3 minutes) each: first a sequential write test, then a random write test. Since I plan to use this storage mostly for writing logs, I’m interested to see which disk handles writes the best.

When people talk about sequential vs random writes to a file, they’re generally drawing a distinction between writing without intermediate seeks (“sequential”) vs. a pattern of seek-write-seek-write-seek-write, etc. (“random”). When you write two blocks that are next to each other on disk, you have a sequential write. When you write two blocks that are located far away from each other on disk, so the magnetic disk has to seek (move the head) to the new location, you have random writes.

Log file writes are generally sequential. If you share the disk with anything else (like a database, backups, the operating system, etc.) then write performance could suffer, since the disk has to seek to the other location to do that write and then seek back to where the log file is. Since this is a secondary EBS volume used mostly for log files, we should be ok.
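
As before, only the transfer lines are shown; the runs follow the same pattern with the write test modes, roughly:

sysbench --test=fileio --file-total-size=10G --file-test-mode=seqwr \
  --init-rng=on --max-time=180 --max-requests=0 run
sysbench --test=fileio --file-total-size=10G --file-test-mode=rndwr \
  --init-rng=on --max-time=180 --max-requests=0 run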

Magnetic

Sequential Write: Read 0b  Written 6.0571Gb  Total transferred 6.0571Gb  (34.457Mb/sec)
Random Write: Read 0b  Written 792.19Mb  Total transferred 792.19Mb  (4.4009Mb/sec)

Cold Storage (SC1)

Sequential Write: Read 0b  Written 8.0362Gb  Total transferred 8.0362Gb  (45.716Mb/sec)
Random Write: Read 0b  Written 92.188Mb  Total transferred 92.188Mb  (524.01Kb/sec)

Throughput Optimized (ST1)

Sequential Write: Read 0b  Written 10Gb  Total transferred 10Gb  (60.554Mb/sec)
Random Write: Read 0b  Written 378.12Mb  Total transferred 378.12Mb  (2.1006Mb/sec)

Test Conclusion

I was surprised SC1 was faster at sequential writes than magnetic, but then I remembered the test file creation times, where SC1 also outperformed magnetic. Considering SC1 is half the price of magnetic, that’s kind of impressive. Its random write performance, however, was horrific. ST1 is king at sequential writes by a wide margin, but that comes at the cost of random write performance (where magnetic is king).
If you need well-rounded write performance, magnetic is your only option. I guess this is why magnetic EBS volumes can be used as boot volumes, unlike SC1 and ST1.

Conclusion

Ok, so it looks like ST1 is the way to go. Even the bottom-of-the-barrel (minimum size) performance of ST1 beats the middle-of-the-road performance of magnetic, which goes along with Amazon saying ST1 is optimized for things like databases and logs. And since it’s a tad cheaper than magnetic, I’d say we found a winner: for logs I’m going with an ST1 EBS volume. However, if you need less than 500gb of storage, magnetic is really your only option, and it’s not a “horrible” option, it’s well rounded. But for 500gb or more, go ST1. Maybe I’ll get bored one day and benchmark a 150gb magnetic volume to see how much performance is actually lost vs a 500gb magnetic volume.

SC1’s read performance sucks all around, and its random write performance is junk as well, but its sequential write performance is surprisingly decent. I guess SC1 would be good for backups and archives, which makes sense since that’s what Amazon advertises it for: infrequently accessed cold storage, i.e. backups and archives. Impressive that Amazon is able to optimize things to that level. As a side note, for backups/cold storage, in terms of performance and cost you have SC1 > S3 > Glacier, so I’m thinking a good AWS backup strategy would be to place your immediate backups onto an SC1 EBS volume, then after 4 weeks move them to an S3 bucket, and after 3 months move them to Glacier.
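
The EBS-to-S3 step of that strategy would have to be scripted (an aws s3 sync from the SC1 volume on a cron job, say), but the S3-to-Glacier step can be handled automatically with a lifecycle rule. A rough sketch; the bucket name, prefix and 90-day cutoff are placeholders:

aws s3api put-bucket-lifecycle-configuration --bucket my-backup-bucket --lifecycle-configuration '{
  "Rules": [{
    "ID": "backups-to-glacier",
    "Status": "Enabled",
    "Filter": { "Prefix": "backups/" },
    "Transitions": [{ "Days": 90, "StorageClass": "GLACIER" }]
  }]
}'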

 

How to do a disk benchmark using sysbench


03 Feb

I’ll clean this up later, but I wanted to throw this down here before I forget. I’m constantly benchmarking various mounts (local magnetic HDD, local SSD, SW RAID vs HW RAID, RAID5 vs RAID6, ZFS RAIDZ1 vs ZFS RAIDZ2, NFS mounts, iSCSI mounts).

I used this as my starting guide: https://www.howtoforge.com/how-to-benchmark-your-system-cpu-file-io-mysql-with-sysbench. It was a bit dated, so I used the commands as a guideline and adjusted the numbers/values as needed.

  1. As a general rule of thumb I pick about 2gb more than the amount of memory the server has, just to be 100% sure that the memory cache isn’t being used instead of the disk. So on an 8gb system you’d want to create 10gb of ‘test files’ for sysbench to work with. Run this command to prepare the system by creating these test files in the exact structure that sysbench requires:
    1. sysbench --test=fileio --file-total-size=10G prepare
  2. This will run a 5 minute (300 second) test doing random reads and writes. It will then spit out total read, total written, total transferred, and most importantly how many MB or GB per second it was able to sustain. That’s what I use to compare servers or storage mediums against each other.
    1. sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw \
       --init-rng=on --max-time=300 --max-requests=0 run
  3. This will run a 3 minute (180 second) test doing sequential reads only. This is usually the fastest a disk or link can perform; it’s very easy on the disk drive, so you may find the bottleneck is the controller or the network connection.
    1.  sysbench --test=fileio --file-total-size=10G --file-test-mode=seqrd \
       --init-rng=on --max-time=180 --max-requests=0 run
  4. When all is said and done, run this (I’m not sure why I couldn’t just delete all the test* files myself, but the docs say run cleanup, so whatever):
    1. sysbench --test=fileio --file-total-size=10G cleanup

These are the valid options for file-test-mode (these are the various file system tests you can perform)

--file-test-mode=seqwr (sequential write test)
--file-test-mode=seqrd (sequential read test)
--file-test-mode=rndwr (random write test)
--file-test-mode=rndrd (random read test)
--file-test-mode=rndrw (random read and write test)

You can play with the time to run (60 seconds, 180 seconds, 300 seconds, etc.) and the file test mode and record the various results.
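
If you’re running the whole battery against several mounts, a small shell loop saves some typing and grabs just the throughput line; a sketch, with the mount point as a placeholder:

cd /mnt/volume-under-test
sysbench --test=fileio --file-total-size=10G prepare
for mode in seqrd rndrd seqwr rndwr rndrw; do
    echo "=== $mode ==="
    sysbench --test=fileio --file-total-size=10G --file-test-mode=$mode \
        --init-rng=on --max-time=180 --max-requests=0 run | grep transferred
done
sysbench --test=fileio --file-total-size=10G cleanup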

—————————————————————————————————————————————————————————–
For my own personal future reference:
POOPEYE = Dell PowerEdge R610 with Dual Xeon Processors, 32gb of memory (and no, I did not name this server)
gitchef = KVM VM running on poopeye with 4 cores,  6gb of memory

This is POOPEYE benchmarking in /root/, which is on his internal RAID5 hard drives: (2 runs)
Read 4.6106Gb  Written 3.0737Gb  Total transferred 7.6843Gb (26.228Mb/sec)
Read 4.5959Gb  Written 3.064Gb  Total transferred 7.6599Gb (26.099Mb/sec)

This is POOPEYE benchmarking in /mnt/pve/nfs-naspool/disktesting, which is NFS to the main ZFS pool on Nas4Free: (2 runs)
Read 365.62Mb  Written 243.75Mb  Total transferred 609.38Mb (2.0311Mb/sec)
Read 334.69Mb  Written 223.12Mb  Total transferred 557.81Mb (1.8593Mb/sec)

This is gitchef running on POOPEYE as a KVM VM. Right now his QCOW2 file is on local storage: (2 runs)
Read 2.2507Gb  Written 1.5004Gb  Total transferred 3.7511Gb (12.804Mb/sec)
Read 1.3498Gb  Written 921.44Mb  Total transferred 2.2496Gb (7.6785Mb/sec)

I was troubleshooting ZFS + NFS performance issues and so I got these results after each change: (Random Read/Write test)

# zpool Dataset with SYNC=DISABLED (across nfs)
Read 4.6976Gb  Written 3.1317Gb  Total transferred 7.8293Gb (26.724Mb/sec)
Read 4.6106Gb  Written 3.0737Gb  Total transferred 7.6843Gb (26.229Mb/sec)

# Direct to the main pool (no dataset) with SYNC=STANDARD (across nfs)
Read 703.12Mb  Written 468.75Mb  Total transferred 1.1444Gb (3.9062Mb/sec)

# Direct to the main pool with SYNC=STANDARD but ZIL going to 20gb SSD partition
Read 796.86Mb  Written 531.23Mb  Total transferred 1.297Gb (4.4269Mb/sec)

# Direct to the main pool with SYNC=STANDARD but cache to 20gb SSD Partition (SSD wasn’t that fast, it was a cheap SSD)
Read 545Mb  Written 363.33Mb  Total transferred 908.33Mb (3.0277Mb/sec)
Read 1.6278Gb  Written 1.0852Gb  Total transferred 2.713Gb (9.2604Mb/sec)

# Dataset with SYNC=disabled but with fast compression
Read 7.4808Gb  Written 4.9872Gb  Total transferred 12.468Gb (42.557Mb/sec)
Read 7.7948Gb  Written 5.1965Gb  Total transferred 12.991Gb (44.344Mb/sec)

So on a personal note, ZFS with a dataset having sync=disabled and compression=lz4 results in wicked fast speeds, faster than local RAID5, heh.
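
For my own future reference, those two properties are set per dataset like so (the pool/dataset name is a placeholder, and remember sync=disabled trades crash safety for speed, since synchronous writes are acknowledged before they hit stable storage):

zfs set sync=disabled naspool/disktesting
zfs set compression=lz4 naspool/disktesting
zfs get sync,compression naspool/disktesting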

===

AWS EBS Magnetic benchmarks: (for personal reference)

Random Read/Write Magnetic: 2.4523Mb/sec
Random Read/Write: Cold Storage (SC1): 659.42Kb/sec
Random Read/Write: Throughput Optimized (ST1): 2.25Mb/sec

Sequential Read Magnetic: 62.852Mb/sec
Random Read Magnetic: 2.564Mb/sec
Sequential Read SC1:  40.403Mb/sec
Random Read SC1: 887.18Kb/sec
Sequential Read ST1: 62.059Mb/sec
Random Read ST1: 2.6634Mb/sec

Sequential Write Magnetic: 34.457Mb/sec
Random Write Magnetic: 4.4009Mb/sec
Sequential Write SC1: 45.716Mb/sec
Random Write SC1: 524.01Kb/sec
Sequential Write ST1: 60.554Mb/sec
Random Write ST1: 2.1006Mb/sec

===

Oh, note to self: I just found this link: https://wiki.mikejung.biz/Sysbench and it looks interesting. I/you/we need to go over it later and integrate some of it into this page (while giving full credit to the author, of course).

 
