164 lines
11 KiB
HTML
164 lines
11 KiB
HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
|
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
|
|
|
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
|
|
<head>
|
|
<meta name="generator" content="HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" />
|
|
<title>Hash Skewed Distribution Memory Use Test</title>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
|
|
</head>
|
|
<body>
|
|
<div id="page">
|
|
<h1>Hash-Based Skewed-Distribution Random-Integer <tt>find</tt>
|
|
Find Timing Test</h1>
|
|
<h2><a name="description" id="description">Description</a></h2>
|
|
<p>This test inserts a number of values with a markedly
|
|
non-uniform i.i.d. integer keys into a container, then performs
|
|
a series of finds using <tt>find</tt> . It measures the average
|
|
time for <tt>find</tt> as a function of the number of values in
|
|
the containers. The keys are generated as follows. First, a
|
|
uniform integer is created; it is then shifted left 8 bits.</p>
|
|
<p>(The test was executed with <a href="http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/performance/ext/pb_ds/hash_zlob_random_int_find_timing.cc"><tt>hash_zlob_random_int_find_timing_test</tt></a>
|
|
200 200 2100)</p>
|
|
<h2><a name="purpose" id="purpose">Purpose</a></h2>
|
|
<p>The test checks the effect of different range-hashing
|
|
functions and trigger policies (see <a href="hash_based_containers.html#hash_policies">Design::Associative
|
|
Containers::Hash-Based Containers::Hash Policies</a> and
|
|
<a href="hash_based_containers.html#resize_policies">Design::Associative
|
|
Containers::Hash-Based Containers::Resize Policies</a>).</p>
|
|
<h2><a name="results" id="results">Results</a></h2>
|
|
<p>Figures <a href="#NHG">NHG</a>, <a href="#NHM">NHM</a>, and
|
|
<a href="#NHL">NHL</a> show the results for various hash-based
|
|
associative-containers in <a href="assoc_performance_tests.html#gcc"><u>g++</u></a>, <a href="assoc_performance_tests.html#msvc"><u>MSVC++</u></a>, and
|
|
<a href="assoc_performance_tests.html#local"><u>local</u></a>,
|
|
respectively.</p>
|
|
<div id="NHG_res_div">
|
|
<div id="NHG_gcc">
|
|
<div id="NHG_hash_zlob_random_int_find_timing_test">
|
|
<div id="NHG_assoc">
|
|
<div id="NHG_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHG" id="NHG"><img src="hash_zlob_random_int_find_timing_test_gcc.png" alt="no image" /></a></h6>NHG: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#gcc">g++</a><p>In the above figure, the names in the legends have the following meaning:</p>
|
|
<ol>
|
|
<li>
|
|
cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map-
|
|
<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
|
|
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a>
|
|
, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
|
|
with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>
|
|
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
|
|
with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li>
|
|
<li>
|
|
gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map-
|
|
<a href="gp_hash_table.html"><tt>gp_hash_table</tt></a>
|
|
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
|
|
, <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
|
|
with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
|
|
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
|
|
with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a>
|
|
</li>
|
|
<li>
|
|
n_hash_map_ncah-
|
|
<tt>std::tr1::unordered_map</tt> with <tt>cache_hash_code</tt> = <tt><b>false</b></tt></li>
|
|
<li>
|
|
cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map-
|
|
<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
|
|
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
|
|
, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
|
|
with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
|
|
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
|
|
with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li>
|
|
</ol>
|
|
</div><div style="width: 100%; height: 20px"></div></div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
<div id="NHM_res_div">
|
|
<div id="NHM_msvc">
|
|
<div id="NHM_hash_zlob_random_int_find_timing_test">
|
|
<div id="NHM_assoc">
|
|
<div id="NHM_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHM" id="NHM"><img src="hash_zlob_random_int_find_timing_test_msvc.png" alt="no image" /></a></h6>NHM: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#msvc">msvc++</a><p>In the above figure, the names in the legends have the following meaning:</p>
|
|
<ol>
|
|
<li>
|
|
n_hash_map_ncah-
|
|
<tt>stdext::hash_map</tt></li>
|
|
<li>
|
|
cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map-
|
|
<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
|
|
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a>
|
|
, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
|
|
with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>
|
|
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
|
|
with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li>
|
|
<li>
|
|
gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map-
|
|
<a href="gp_hash_table.html"><tt>gp_hash_table</tt></a>
|
|
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
|
|
, <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
|
|
with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
|
|
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
|
|
with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a>
|
|
</li>
|
|
<li>
|
|
cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map-
|
|
<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
|
|
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
|
|
, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
|
|
with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
|
|
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
|
|
with <i>α<sub>min</sub></i> = <i>1/8</i> and <i>α<sub>max</sub></i> = <i>1/1</i></li>
|
|
</ol>
|
|
</div><div style="width: 100%; height: 20px"></div></div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
<div id="NHL_res_div">
|
|
<div id="NHL_local">
|
|
<div id="NHL_hash_zlob_random_int_find_timing_test">
|
|
<div id="NHL_assoc">
|
|
<div id="NHL_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style = "border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHL" id= "NHL"><img src="hash_zlob_random_int_find_timing_test_local.png" alt="no image" /></a></h6>NHL: Skewed-distribution random int find timing test using <tt>find</tt> - <a href = "assoc_performance_tests.html#local">local</a></div><div style = "width: 100%; height: 20px"></div></div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
<h2><a name="observations" id="observations">Observations</a></h2>
|
|
<p>In this setting, the keys' distribution is so skewed that
|
|
the unerlying hash-table type affects performance marginally.
|
|
(This is in contrast with <a href="hash_text_find_find_timing_test.html">Hash-Based Text
|
|
<tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_find_find_timing_test.html">Hash-Based
|
|
Random-Integer <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_subscript_find_timing_test.html">Hash-Based
|
|
Random-Integer Subscript Find Timing Test</a> and <a href="hash_random_int_subscript_insert_timing_test.html">Hash-Based
|
|
Random-Integer Subscript Insert Timing Test</a> .)</p>
|
|
<p>The range-hashing scheme affects performance dramatically. A
|
|
mask-based range-hashing scheme effectively maps all values
|
|
into the same bucket. Access degenerates into a search within
|
|
an unordered linked-list. In Figures <a href="#NHG">NHG</a> and
|
|
<a href="#NHM">NHM</a> , it should be noted that
|
|
<tt>std::tr1::unordered_map</tt> and <tt>stdext::hash_map</tt>
|
|
are hard-wired currently to mod-based and mask-based schemes,
|
|
respectively.</p>
|
|
<p>When observing the settings of this test, it is apparent
|
|
that the keys' distribution is far from natural. One might ask
|
|
if the test is not contrived to show that, in some cases,
|
|
mod-based range hashing does better than mask-based range
|
|
hashing. This is, in fact just the case. We did not encounter a
|
|
more natural case in which mod-based range hashing is better.
|
|
In our opnion, real-life key distributions are handled better
|
|
with an appropriate hash function and a mask-based
|
|
range-hashing function. (<a href="http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/ext/pb_ds/example/hash_shift_mask.cc"><tt>shift_mask.cc</tt></a>
|
|
shows an example of handling this a-priori known skewed
|
|
distribution with a mask-based range-hashing function). If hash
|
|
performance is bad, a <i>Χ<sup>2</sup></i> test can be used
|
|
to check how to transform it into a more uniform
|
|
distribution.</p>
|
|
<p>For this reason, <tt>pb_ds</tt>'s default range-hashing
|
|
function is mask-based.</p>
|
|
<p><a href="assoc_performance_tests.html#hash_based_types">Observations::Hash-Based
|
|
Container Types</a> summarizes some observations on hash-based
|
|
containers; <a href="assoc_performance_tests.html#hash_based_policies">Observations::Hash-Based
|
|
Containers' Policies</a> summarizes some observations on
|
|
hash-based containers' policies.</p>
|
|
</div>
|
|
</body>
|
|
</html>
|