Muster
 All Classes Namespaces Files Functions Variables Typedefs Macros
random.h
Go to the documentation of this file.
1 //////////////////////////////////////////////////////////////////////////////////////////////////
2 // Copyright (c) 2010, Lawrence Livermore National Security, LLC.
3 // Produced at the Lawrence Livermore National Laboratory
4 // LLNL-CODE-433662
5 // All rights reserved.
6 //
7 // This file is part of Muster. For details, see http://github.com/tgamblin/muster.
8 // Please also read the LICENSE file for further information.
9 //
10 // Redistribution and use in source and binary forms, with or without modification, are
11 // permitted provided that the following conditions are met:
12 //
13 // * Redistributions of source code must retain the above copyright notice, this list of
14 // conditions and the disclaimer below.
15 // * Redistributions in binary form must reproduce the above copyright notice, this list of
16 // conditions and the disclaimer (as noted below) in the documentation and/or other materials
17 // provided with the distribution.
18 // * Neither the name of the LLNS/LLNL nor the names of its contributors may be used to endorse
19 // or promote products derived from this software without specific prior written permission.
20 //
21 // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS
22 // OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
23 // MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
24 // LAWRENCE LIVERMORE NATIONAL SECURITY, LLC, THE U.S. DEPARTMENT OF ENERGY OR CONTRIBUTORS BE
25 // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
26 // (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
27 // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
28 // WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
29 // ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30 //////////////////////////////////////////////////////////////////////////////////////////////////
31 
32 ///
33 /// @file random.h
34 /// @author Todd Gamblin tgamblin@llnl.gov
35 /// @brief Helper functions for taking random samples and seeding
36 /// RNGs from the system clock.
37 ///
38 #ifndef MUSTER_RANDOM_H
39 #define MUSTER_RANDOM_H
40 
41 #include <sys/time.h>
42 #include <tr1/unordered_map>
43 
44 namespace cluster {
45 
46  ///
47  /// This is Knuth's algorithm R for taking a sample of numElements numbers.
48  /// We sample <code>sample_size</code> indices from [0..numElements) and write them to
49  /// <code>out</code>.
50  ///
51  /// Note that this algorithm scales linaerly with the number of objects sampled from
52  /// (that is, numElements).
53  ///
54  /// @sa fast_sample()
55  ///
56  /// @param numElements total number of elements to select from
57  /// @param sample_size number of elements to select
58  /// @param out destination for selected elements, must model output iterator
59  /// @param random model of STL Random Number Generator.
60  /// must be callable as random(N), returning a random number in [0,N).
61  ///
62  template <class OutputIterator, class Random>
63  void algorithm_r(size_t numElements, size_t sample_size, OutputIterator out, Random& random) {
64  size_t first = 0;
65  size_t remaining = numElements;
66  size_t m = sample_size;
67 
68  while (m > 0) {
69  if ((size_t)random(remaining) < m) {
70  *out = first;
71  ++out;
72  --m;
73  }
74  --remaining;
75  ++first;
76  }
77  }
78 
79 
80  ///
81  /// This is a fast algorithm for random sampling that scales with the number of elements
82  /// sampled (sample_size). Its performance does not depend on the number of elements sampled
83  /// from (numElements), so it will perform better than algorithm R in most cases.
84  ///
85  /// Specifically, this algorithm will perform better when sample_size is small compared to
86  /// numElements. Algorithm R will perform better when sample_size is a large percentage of
87  /// the elements to be sampled. Use this algoritm when you need O(sample_size) performance
88  /// and sample_size is constant in the limit with respect to num_elements.
89  ///
90  /// @param numElements total number of elements to select from
91  /// @param sample_size number of elements to select
92  /// @param out destination for selected elements, must model output iterator.
93  /// @param random model of STL Random Number Generator.
94  /// must be callable as random(N), returning a random number in [0,N).
95  ///
96  template <class OutputIterator, class Random>
97  void fast_sample(size_t numElements, size_t sample_size, OutputIterator out, Random& random) {
98  // Map keeps track of numbers whose identities have been swapped.
99  std::tr1::unordered_map<size_t, size_t> swaps;
100 
101  for (size_t s = 0; s < sample_size; s++) {
102  size_t remaining = numElements - s; // number of elements remaining to be picked
103  size_t pick = random(remaining); // random pick from remaining elts
104 
105  *out = swaps.count(pick) ? swaps[pick] : pick;
106  ++out;
107 
108  // Make as-yet unpicked number eligible to be picked next time
109  size_t last = remaining - 1;
110  swaps[pick] = swaps.count(last) ? swaps[last] : last;
111  }
112  }
113 
114 
115  ///
116  /// Returns a seed for random number generators based on the product
117  /// of sec and usec from gettimeofday().
118  ///
119  inline long get_time_seed() {
120  struct timeval time;
121  gettimeofday(&time, 0);
122  return time.tv_sec * time.tv_usec;
123  }
124 
125 } // namespace cluster
126 
127 #endif // MUSTER_RANDOM_H
long get_time_seed()
Returns a seed for random number generators based on the product of sec and usec from gettimeofday()...
Definition: random.h:119
void algorithm_r(size_t numElements, size_t sample_size, OutputIterator out, Random &random)
This is Knuth&#39;s algorithm R for taking a sample of numElements numbers.
Definition: random.h:63
void fast_sample(size_t numElements, size_t sample_size, OutputIterator out, Random &random)
This is a fast algorithm for random sampling that scales with the number of elements sampled (sample_...
Definition: random.h:97
Muster. Copyright © 2010, Lawrence Livermore National Laboratory, LLNL-CODE-433662.
Distribution of Muster and its documentation is subject to terms of the Muster LICENSE.
Generated on Thu Sep 1 2016 using Doxygen 1.8.5