/*
 * Copyright (c) Meta Platforms, Inc. and affiliates.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
folly/synchronization/Rcu.h
C++ read-copy-update (RCU) functionality in folly.
Read-copy-update (RCU) is a low-overhead synchronization mechanism that provides guaranteed ordering between operations on shared data. In the simplest usage pattern, readers enter a critical section, view some state, and leave the critical section, while writers modify shared state and then defer some cleanup operations.
Proper use of the APIs provided by folly RCU will guarantee that a cleanup operation that is deferred during a reader critical section will not be executed until after that critical section is over.
The main synchronization primitive in folly RCU is a folly::rcu_domain. A folly::rcu_domain is a "universe" of deferred execution. Each domain has an executor on which deferred functions may execute. folly::rcu_domain satisfies the BasicLockable requirements, so readers may enter a read region in a folly::rcu_domain by treating the domain as a mutex type that can be wrapped by C++ locking primitives.

For example, to enter and exit a read region using non-movable RAII semantics, you could use an std::scoped_lock:
{
  std::scoped_lock<folly::rcu_domain> lock(folly::rcu_default_domain());
  protectedData.read();
}
Alternatively, if you need to be able to move the lock, you could use std::unique_lock:
class ProtectedData {
 private:
  std::unique_lock<folly::rcu_domain> lock_;
  void* data_;
};
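For illustration, here is a minimal sketch (the helper names enterReadRegion and useData are hypothetical, and <folly/synchronization/Rcu.h> is assumed to be included) of handing a movable read-region lock from one scope to another:

std::unique_lock<folly::rcu_domain> enterReadRegion() {
  // The read region begins when the lock is constructed and remains open
  // until the returned lock is destroyed (or unlock() is called).
  return std::unique_lock<folly::rcu_domain>(folly::rcu_default_domain());
}

void useData() {
  auto guard = enterReadRegion();
  // ... safely read RCU-protected data here ...
} // The read region ends when guard goes out of scope.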
There is a global, default domain that can be accessed using folly::rcu_default_domain(), as in the example above. If required, you can also create your own domain:
{
  Executor* my_executor = getExecutor();
  folly::rcu_domain domain(my_executor /* or nullptr to use default */);
}
In general, using the default domain is strongly encouraged, as you will likely get better cache locality by sharing a domain that is used by other callers in the process. If, however, you cannot avoid blocking during reader critical sections, you should use your own custom domain to avoid delaying reclamation for other updaters. A custom domain can also be used if you want update callbacks to be invoked on a specific executor.
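As a rough sketch of that last point (myDomain, readerOnMyDomain, and getExecutor() are hypothetical names), readers enter read regions on a custom domain with exactly the same locking idiom used for the default domain:

folly::rcu_domain& myDomain() {
  // A process-lifetime domain backed by a dedicated executor; deferred
  // functions for this domain run on that executor.
  static folly::rcu_domain domain(getExecutor());
  return domain;
}

void readerOnMyDomain() {
  std::scoped_lock<folly::rcu_domain> guard(myDomain());
  // ... read data protected by this domain ...
}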
A typical reader / updater synchronization will look something like this:
void doSomethingWith(IPAddress host);

static std::atomic<ConfigData*> globalConfigData;

void reader() {
  while (true) {
    IPAddress curManagementServer;
    {
      // We're about to do some reads we want to protect; if we read a
      // pointer, we need to make sure that if the writer comes along and
      // updates it, the writer's cleanup operation won't happen until we're
      // done accessing the pointed-to data. We get a Guard on that
      // domain; as long as it exists, no function subsequently passed to
      // invokeEventually will execute.
      std::scoped_lock<folly::rcu_domain> guard(folly::rcu_default_domain());
      ConfigData* configData = globalConfigData.load(std::memory_order_consume);
      // We created a guard before we read globalConfigData; we know that the
      // pointer will remain valid until the guard is destroyed.
      curManagementServer = configData->managementServerIP;
      // RCU domain via the scoped mutex is released here; retired objects
      // may be freed.
    }
    doSomethingWith(curManagementServer);
  }
}
void writer() {
  while (true) {
    std::this_thread::sleep_for(std::chrono::seconds(60));
    ConfigData* oldConfigData = globalConfigData.load(std::memory_order_relaxed);
    ConfigData* newConfigData = loadConfigDataFromRemoteServer();
    globalConfigData.store(newConfigData, std::memory_order_release);
    folly::rcu_retire(oldConfigData);
    // Alternatively, in a blocking manner:
    //   folly::rcu_synchronize();
    //   delete oldConfigData;
  }
}
In the example above, a single writer updates ConfigData* in a loop, and then defers freeing it until all RCU readers have exited their read regions. The writer may use either of the following two APIs to safely defer freeing the old ConfigData*:
- rcu_retire(): To schedule the asynchronous deletion of the oldConfigData pointer when all readers have exited their read regions.
- rcu_synchronize(): To block the calling thread until all readers have exited their read regions, at which point the pointer is safe to be deleted (see the sketch below).

If you expect there to be very long read regions, it may be necessary to use rcu_synchronize() or a periodic rcu_barrier() (described below) to avoid running out of memory due to delayed reclamation.
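For illustration, a minimal sketch of the second, blocking option (the function name blockingWriter is hypothetical; globalConfigData and loadConfigDataFromRemoteServer() are reused from the example above):

void blockingWriter() {
  ConfigData* newConfigData = loadConfigDataFromRemoteServer();
  ConfigData* oldConfigData =
      globalConfigData.exchange(newConfigData, std::memory_order_acq_rel);
  // Block until every reader that might still hold oldConfigData has
  // exited its read region; after that, the object can be deleted directly.
  folly::rcu_synchronize();
  delete oldConfigData;
}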
Another synchronization primitive provided by the folly RCU library is rcu_barrier(). Unlike rcu_synchronize(), which blocks until all outstanding readers have exited their read regions, rcu_barrier() blocks until all outstanding deleters (specified in a call to rcu_retire()) have completed.
As mentioned above, one example of where this may be useful is avoiding out-of-memory errors due to scheduling too many objects whose reclamation is delayed. Taking our example from above, we could avoid OOMs with a periodic invocation of rcu_barrier() as follows:
static std::atomic<ConfigData*> globalConfigData;

void writer() {
  uint32_t retires = 0;
  while (true) {
    std::this_thread::sleep_for(std::chrono::seconds(60));
    ConfigData* oldConfigData = globalConfigData.load(std::memory_order_relaxed);
    ConfigData* newConfigData = loadConfigDataFromRemoteServer();
    globalConfigData.store(newConfigData, std::memory_order_release);
    if (retires++ % 1000 == 0) {
      folly::rcu_barrier();
    }
    folly::rcu_retire(oldConfigData);
  }
}
When invoking folly::rcu_retire(), you may optionally also pass a custom deleter function that is invoked instead of std::default_delete:
#include <folly/logging/xlog.h>

static std::atomic<ConfigData*> globalConfigData;

void writer() {
  while (true) {
    std::this_thread::sleep_for(std::chrono::seconds(60));
    ConfigData* oldConfigData = globalConfigData.load(std::memory_order_relaxed);
    ConfigData* newConfigData = loadConfigDataFromRemoteServer();
    globalConfigData.store(newConfigData, std::memory_order_release);
    folly::rcu_retire(oldConfigData, [](ConfigData* obj) {
      // The lambda has no captures, so it logs via its argument rather than
      // the enclosing oldConfigData pointer.
      XLOG(INFO) << "Deleting retired config " << obj->version;
      delete obj;
    });
  }
}
Exceptions may not be thrown at any point in a retire callback; this includes both the deleter and the object's destructor. Other than this, any operation is safe from within a retired object's destructor, including retiring other objects, or even retiring the same object as long as the custom deleter did not free it.
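For illustration only, a minimal sketch (the Node type is hypothetical) of retiring another object from within a retired object's destructor, which is allowed as noted above:

struct Node {
  Node* next{nullptr};
  ~Node() {
    if (next != nullptr) {
      // Retiring further objects from a destructor that runs as part of a
      // retire callback is safe; just ensure no exception can escape.
      folly::rcu_retire(next);
    }
  }
};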
When using the default domain or the default executor, it is not legal to hold a lock across rcu_retire() if that lock is also acquired by the deleter. This is normally not a problem when using the default deleter delete, which does not acquire any user locks. However, even when using the default deleter, an object with a user-defined destructor that acquires locks held across the corresponding call to rcu_retire() can still deadlock.
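As an illustrative sketch of the pattern to avoid (configMutex and badWriter are hypothetical names; <mutex> is assumed to be included):

std::mutex configMutex;

void badWriter(ConfigData* oldConfigData) {
  std::lock_guard<std::mutex> g(configMutex);
  // BAD: configMutex is held across rcu_retire(), and the deleter acquires
  // the same mutex; with the default domain/executor this can deadlock.
  folly::rcu_retire(oldConfigData, [](ConfigData* obj) {
    std::lock_guard<std::mutex> g2(configMutex);
    delete obj;
  });
}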
Note as well that there is no guarantee of the order in which retire callbacks are invoked. A retire callback is guaranteed to be invoked only after all readers that were present when the callback was scheduled have exited. Otherwise, any ordering of callback invocation may occur.
fork() may not be invoked in a multithreaded program while any thread other than the calling thread is in an RCU read region. Doing so results in undefined behavior, and will likely lead to deadlock. If the forking thread is inside an RCU read region, it must invoke exec() before exiting the read region.
std::scoped_lock<folly::rcu_domain> creation/destruction is on the order of ~5ns. By comparison, folly::SharedMutex::lock_shared() followed by folly::SharedMutex::unlock_shared() is ~26ns.