This weekend I attempted to figure out how HBase writes perform in multi-threaded environments. To do so, I wrote a Java program that writes records in a HBase table, and which can be configured to run to write N number of records using M Threads.
I used three variants to write records into HBase:
Here are the various results:
I will be adding more results here: https://github.com/srikanthps/hbase-benchmarker/blob/master/Reports.TXT
Conclusions:
So, what conclusions can be drawn from above figures:
1. Write performances seems to be good when lesser threads are used for writing.
2. All 3 variants seems to perform more or less similarly. In our application, we use HTablePool based approach and its nice to see that it performs equally nice. Another thing which I observed during my tests is that pool size may not have much role to play. With pool size of 10 too, the results are more or less similar.
3. As number of threads increase, the write performance deteriorates. A potential reason for this behavior is that HBase opens only one connection per region server from a single JVM. So, will increasing the number of JVMs help in horizontally scaling the writes?
Under heavy multi-threading, the avg. time increases as most of the threads seems to be blocking here:
"pool-1-thread-99" prio=6 tid=0x045a4c00 nid=0x22e4 waiting for monitor entry [0x06c6f000]
java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:471) - waiting to lock <0x291f0088> (a java.io.DataOutputStream)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:713)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:333)
at $Proxy0.put(Unknown Source)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3$1.call(HConnectionManager.java:1242)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3$1.call(HConnectionManager.java:1240)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1035)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3.doCall(HConnectionManager.java:1239)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1161)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1247)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:609)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)
Source code for benchmarking application:
Available in my github repository.
https://github.com/srikanthps/hbase-benchmarker
I used three variants to write records into HBase:
- Use one HTable for every write
- HTable is created using a singleton instance of HBaseConfiguration
- HTable is created using new instance of HBaseConfiguration (Why? - I wanted to check how the behavior changes if connections to servers are not shared. Reference: HTable and HConnectionManager documentation)
- Use HTablePool
Here are the various results:
| HBase 0.20.6 Standalone mode (all figures in milliseconds) | ||||||||
| Scenario | Num Threads | Puts | Mean | Standard Deviation | Max | Min | 80th Percentile | 90th Percentile |
| HTable from HTablePool (pool size: 100) | 1 | 100 | 4.86 | 1.49 | 12.00 | 3.00 | 5.00 | 6.90 |
| HTable with same configuration instance | 1 | 100 | 4.52 | 1.12 | 9.00 | 3.00 | 5.00 | 6.90 |
| HTable with new Hbase configuration instance | 1 | 100 | 4.15 | 0.88 | 8.00 | 3.00 | 4.00 | 5.00 |
| HTable from HTablePool (pool size: 100) | 1 | 1000 | 4.19 | 3.60 | 108.00 | 3.00 | 4.00 | 5.00 |
| HTable with same configuration instance | 1 | 1000 | 4.02 | 1.79 | 47.00 | 3.00 | 4.00 | 5.00 |
| HTable with new Hbase configuration instance | 1 | 1000 | 3.84 | 3.57 | 108.00 | 3.00 | 4.00 | 4.00 |
| HTable from HTablePool (pool size: 100) | 1 | 10000 | 3.95 | 4.13 | 211.00 | 2.00 | 4.00 | 5.00 |
| HTable with same configuration instance | 1 | 10000 | 4.28 | 10.02 | 571.00 | 2.00 | 4.00 | 5.00 |
| HTable with new Hbase configuration instance | 1 | 10000 | 3.80 | 5.46 | 244.00 | 2.00 | 4.00 | 4.00 |
| HTable from HTablePool (pool size: 100) | 10 | 100 | 24.59 | 22.73 | 204.00 | 4.00 | 38.60 | 44.00 |
| HTable with same configuration instance | 10 | 100 | 20.48 | 28.05 | 233.00 | 5.00 | 22.00 | 36.80 |
| HTable with new Hbase configuration instance | 10 | 100 | 21.44 | 28.05 | 241.00 | 5.00 | 26.80 | 50.90 |
| HTable from HTablePool (pool size: 100) | 10 | 1000 | 26.10 | 21.94 | 246.00 | 5.00 | 40.00 | 49.00 |
| HTable with same configuration instance | 10 | 1000 | 23.80 | 25.31 | 238.00 | 3.00 | 35.80 | 44.00 |
| HTable with new Hbase configuration instance | 10 | 1000 | 22.55 | 21.96 | 209.00 | 4.00 | 35.00 | 44.00 |
| HTable from HTablePool (pool size: 100) | 10 | 10000 | 22.06 | 18.81 | 279.00 | 3.00 | 36.00 | 43.00 |
| HTable with same configuration instance | 10 | 10000 | 22.67 | 18.44 | 287.00 | 3.00 | 37.00 | 45.00 |
| HTable with new Hbase configuration instance | 10 | 10000 | 24.43 | 26.67 | 507.00 | 3.00 | 38.00 | 46.00 |
| HTable from HTablePool (pool size: 100) | 100 | 100 | 106.45 | 68.53 | 422.00 | 5.00 | 164.60 | 199.40 |
| HTable with same configuration instance | 100 | 100 | 102.81 | 68.45 | 421.00 | 4.00 | 167.80 | 194.60 |
| HTable with new Hbase configuration instance | 100 | 100 | 25.73 | 20.58 | 209.00 | 4.00 | 29.80 | 35.90 |
| HTable from HTablePool (pool size: 100) | 100 | 1000 | 195.38 | 189.46 | 696.00 | 3.00 | 375.80 | 488.00 |
| HTable with same configuration instance | 100 | 1000 | 189.86 | 193.69 | 688.00 | 3.00 | 384.00 | 492.00 |
| HTable with new Hbase configuration instance | 100 | 1000 | 79.80 | 110.55 | 525.00 | 4.00 | 151.00 | 252.80 |
| HTable from HTablePool (pool size: 100) | 100 | 10000 | 219.65 | 214.10 | 1277.00 | 3.00 | 433.00 | 543.00 |
| HTable with same configuration instance | 100 | 10000 | 277.60 | 473.36 | 4129.00 | 3.00 | 453.80 | 609.00 |
| HTable with new Hbase configuration instance | 100 | 10000 | 214.80 | 230.87 | 1386.00 | 3.00 | 435.00 | 566.00 |
| HBase 0.20.6 Distributed Mode with 3 Region Servers (Hadoop 0.20) (all figures in milliseconds) | ||||||||
| Scenario | Num Threads | Puts | Mean | Standard Deviation | Max | Min | 80th Percentile | 90th Percentile |
| HTable from HTablePool (pool size: 100) | 1 | 100 | 3.25 | 1.96 | 18.00 | 2.00 | 3.00 | 3.90 |
| HTable with same configuration instance | 1 | 100 | 2.98 | 1.18 | 9.00 | 2.00 | 3.00 | 4.00 |
| HTable with new Hbase configuration instance | 1 | 100 | 2.91 | 1.43 | 10.00 | 1.00 | 3.00 | 4.90 |
| HTable from HTablePool (pool size: 100) | 1 | 1000 | 4.56 | 3.77 | 40.00 | 1.00 | 6.00 | 11.00 |
| HTable with same configuration instance | 1 | 1000 | 4.98 | 4.04 | 20.00 | 1.00 | 11.00 | 11.00 |
| HTable with new Hbase configuration instance | 1 | 1000 | 2.28 | 1.20 | 9.00 | 1.00 | 3.00 | 3.00 |
| HTable from HTablePool (pool size: 100) | 1 | 10000 | 2.48 | 1.76 | 41.00 | 1.00 | 3.00 | 3.00 |
| HTable with same configuration instance | 1 | 10000 | 2.43 | 1.78 | 49.00 | 1.00 | 3.00 | 3.00 |
| HTable with new Hbase configuration instance | 1 | 10000 | 2.84 | 2.79 | 52.00 | 1.00 | 3.00 | 5.00 |
| HTable from HTablePool (pool size: 100) | 10 | 100 | 15.88 | 21.62 | 204.00 | 2.00 | 23.80 | 32.90 |
| HTable with same configuration instance | 10 | 100 | 13.09 | 20.50 | 204.00 | 2.00 | 18.00 | 20.00 |
| HTable with new Hbase configuration instance | 10 | 100 | 8.78 | 19.64 | 197.00 | 2.00 | 8.80 | 13.00 |
| HTable from HTablePool (pool size: 100) | 10 | 1000 | 12.36 | 10.76 | 203.00 | 2.00 | 19.00 | 22.00 |
| HTable with same configuration instance | 10 | 1000 | 12.39 | 11.49 | 202.00 | 2.00 | 20.00 | 26.00 |
| HTable with new Hbase configuration instance | 10 | 1000 | 9.13 | 10.79 | 204.00 | 1.00 | 14.00 | 17.90 |
| HTable from HTablePool (pool size: 100) | 10 | 10000 | 32.41 | 59.62 | 1785.00 | 1.00 | 50.00 | 80.00 |
| HTable with same configuration instance | 10 | 10000 | 37.87 | 143.47 | 4911.00 | 2.00 | 51.00 | 84.00 |
| HTable with new Hbase configuration instance | 10 | 10000 | 12.93 | 11.26 | 212.00 | 2.00 | 20.00 | 24.00 |
| HTable from HTablePool (pool size: 100) | 100 | 100 | 296.71 | 132.66 | 710.00 | 47.00 | 404.80 | 467.50 |
| HTable with same configuration instance | 100 | 100 | 263.60 | 135.97 | 738.00 | 16.00 | 389.40 | 411.80 |
| HTable with new Hbase configuration instance | 100 | 100 | 7.11 | 8.33 | 48.00 | 2.00 | 10.00 | 12.00 |
| HTable from HTablePool (pool size: 100) | 100 | 1000 | 108.58 | 105.80 | 419.00 | 2.00 | 215.00 | 268.00 |
| HTable with same configuration instance | 100 | 1000 | 107.31 | 105.14 | 362.00 | 2.00 | 218.00 | 267.00 |
| HTable with new Hbase configuration instance | 100 | 1000 | 10.24 | 13.45 | 203.00 | 1.00 | 11.00 | 20.00 |
| HTable from HTablePool (pool size: 100) | 100 | 10000 | 149.96 | 197.28 | 1527.00 | 1.00 | 267.00 | 333.00 |
| HTable with same configuration instance | 100 | 10000 | 134.47 | 144.55 | 646.00 | 2.00 | 276.00 | 358.00 |
| HTable with new Hbase configuration instance | 100 | 10000 | 247.84 | 692.83 | 11043.00 | 1.00 | 370.00 | 668.00 |
| HBase 0.90.1 CDH3 Standalone installation (all figures in milliseconds) | ||||||||
| Scenario | Num Threads | Puts | Mean | Stdev | Max | Min | 80 Pctl | 90 Pctl |
| HTable from HTablePool (pool size: 100) | 1 | 100 | 6.54 | 4.24 | 38.00 | 4.00 | 7.00 | 8.00 |
| HTable with same configuration instance | 1 | 100 | 5.77 | 1.34 | 10.00 | 4.00 | 7.00 | 8.00 |
| HTable with new Hbase configuration instance | 1 | 100 | 5.23 | 1.33 | 12.00 | 4.00 | 5.00 | 6.90 |
| HTable from HTablePool (pool size: 100) | 1 | 1000 | 5.50 | 1.62 | 26.00 | 4.00 | 6.00 | 7.00 |
| HTable with same configuration instance | 1 | 1000 | 4.95 | 1.35 | 22.00 | 3.00 | 6.00 | 6.00 |
| HTable with new Hbase configuration instance | 1 | 1000 | 4.94 | 1.31 | 16.00 | 3.00 | 6.00 | 6.00 |
| HTable from HTablePool (pool size: 100) | 1 | 10000 | 4.84 | 2.57 | 92.00 | 3.00 | 5.00 | 6.00 |
| HTable with same configuration instance | 1 | 10000 | 4.59 | 4.88 | 229.00 | 3.00 | 5.00 | 6.00 |
| HTable with new Hbase configuration instance | 1 | 10000 | 4.30 | 4.72 | 231.00 | 3.00 | 4.00 | 5.00 |
| HTable from HTablePool (pool size: 100) | 10 | 100 | 26.44 | 27.85 | 205.00 | 4.00 | 34.80 | 39.00 |
| HTable with same configuration instance | 10 | 100 | 27.50 | 28.68 | 214.00 | 7.00 | 35.00 | 40.00 |
| HTable with new Hbase configuration instance | 10 | 100 | 31.55 | 31.22 | 205.00 | 4.00 | 37.00 | 70.40 |
| HTable from HTablePool (pool size: 100) | 10 | 1000 | 25.80 | 14.84 | 204.00 | 3.00 | 36.00 | 41.00 |
| HTable with same configuration instance | 10 | 1000 | 26.15 | 15.21 | 212.00 | 3.00 | 37.00 | 41.90 |
| HTable with new Hbase configuration instance | 10 | 1000 | 25.72 | 14.31 | 199.00 | 3.00 | 37.00 | 43.00 |
| HTable from HTablePool (pool size: 100) | 10 | 10000 | 26.12 | 16.09 | 280.00 | 3.00 | 36.00 | 41.00 |
| HTable with same configuration instance | 10 | 10000 | 27.45 | 16.93 | 274.00 | 3.00 | 38.00 | 43.00 |
| HTable with new Hbase configuration instance | 10 | 10000 | 25.57 | 13.90 | 263.00 | 3.00 | 36.00 | 40.00 |
| HTable from HTablePool (pool size: 100) | 100 | 100 | 22.55 | 17.28 | 50.00 | 5.00 | 44.80 | 46.90 |
| HTable with same configuration instance | 100 | 100 | 132.36 | 77.87 | 481.00 | 18.00 | 197.80 | 241.30 |
| HTable with new Hbase configuration instance | 100 | 100 | 29.23 | 22.76 | 97.00 | 4.00 | 52.60 | 59.90 |
| HTable from HTablePool (pool size: 100) | 100 | 1000 | 175.42 | 272.73 | 2987.00 | 3.00 | 307.60 | 356.00 |
| HTable with same configuration instance | 100 | 1000 | 19.54 | 47.49 | 248.00 | 3.00 | 9.00 | 11.00 |
| HTable with new Hbase configuration instance | 100 | 1000 | 52.58 | 69.17 | 383.00 | 4.00 | 79.80 | 150.90 |
| HTable from HTablePool (pool size: 100) | 100 | 10000 | 173.08 | 1294.93 | 24606.00 | 4.00 | 117.00 | 148.00 |
| HTable with same configuration instance | 100 | 10000 | 57.95 | 43.75 | 414.00 | 4.00 | 93.00 | 113.00 |
| HTable with new Hbase configuration instance | 100 | 10000 | 180.96 | 163.70 | 937.00 | 4.00 | 334.00 | 424.00 |
I will be adding more results here: https://github.com/srikanthps/hbase-benchmarker/blob/master/Reports.TXT
Conclusions:
So, what conclusions can be drawn from above figures:
1. Write performances seems to be good when lesser threads are used for writing.
2. All 3 variants seems to perform more or less similarly. In our application, we use HTablePool based approach and its nice to see that it performs equally nice. Another thing which I observed during my tests is that pool size may not have much role to play. With pool size of 10 too, the results are more or less similar.
3. As number of threads increase, the write performance deteriorates. A potential reason for this behavior is that HBase opens only one connection per region server from a single JVM. So, will increasing the number of JVMs help in horizontally scaling the writes?
Under heavy multi-threading, the avg. time increases as most of the threads seems to be blocking here:
"pool-1-thread-99" prio=6 tid=0x045a4c00 nid=0x22e4 waiting for monitor entry [0x06c6f000]
java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:471) - waiting to lock <0x291f0088> (a java.io.DataOutputStream)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:713)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:333)
at $Proxy0.put(Unknown Source)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3$1.call(HConnectionManager.java:1242)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3$1.call(HConnectionManager.java:1240)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1035)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3.doCall(HConnectionManager.java:1239)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1161)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1247)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:609)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:474)
Source code for benchmarking application:
Available in my github repository.
https://github.com/srikanthps/hbase-benchmarker


3 comments:
Have you thought about integrating AsyncBase with the benchmark?
Hi, Have you tested the get performance like this ever?
I'm getting some weired results with get.
Is there any additional modifications needed to use multithreaded get/scan (like maximum concurrent connections etc..) to come up with increased performance than sequential read?
Thanks and Regards,
Jimson
I'm afraid that write performance isn't depend on way of HTable creating. If you move getHTable() method invocation to block code that put records in cycle, you'll, perhaps, obtain absolutly different results. Now you use one HTable instance for all puts. Pool can win in case of using new vs pooled instance HTable.
Post a Comment