/** * Copyright (c) Meta Platforms, Inc. and affiliates. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. */
folly/Conv.h
folly/Conv.h
is a one-stop-shop for converting values
across types. Its main features are simplicity of the API (only the
names to
and toAppend
must be memorized),
speed (folly is significantly faster, sometimes by an order of
magnitude, than comparable APIs), and correctness.
All examples below are assume to have included
folly/Conv.h
and issued using namespace folly;
You will need:
// To format as text and append to a string, use toAppend.
;
fbstring str(2.5, &str);
toAppend(str, "2.5");
CHECK_EQ
// Multiple arguments are okay, too. Just put the pointer to string at the end.
(" is ", 2, " point ", 5, &str);
toAppend(str, "2.5 is 2 point 5");
CHECK_EQ
// You don't need to use fbstring (although it's much faster for conversions and in general).
std::string stdStr;
("Pi is about ", 22.0 / 7, &stdStr);
toAppend// In general, just use to<TargetType>(sourceValue). It returns its result by value.
= to<std::string>("Variadic ", "arguments also accepted.");
stdStr
// to<fbstring> is 2.5x faster than to<std::string> for typical workloads.
= to<fbstring>("Variadic ", "arguments also accepted."); str
Using to<Target>(value)
to convert one integral
type to another will behave as follows:
short x;
unsigned short y;
...
auto a = to<int>(x); // zero overhead conversion
auto b = to<int>(y); // zero overhead conversion
to
inserts bounds checks and throws
std::range_error
if the target type cannot accommodate the
source value. Example:short x;
unsigned short y;
long z;
...
= 123;
x auto a = to<unsigned short>(x); // fine
= -1;
x = to<unsigned short>(x); // THROWS
a = 2000000000;
z auto b = to<int>(z); // fine
+= 1000000000;
z = to<int>(z); // THROWS
b auto b = to<unsigned int>(z); // fine
As mentioned, there are two primitives for converting anything to
string: to
and toAppend
. They support the same
set of source types, literally by definition (to
is
implemented in terms of toAppend
for all types). The call
toAppend(value, &str)
formats and appends
value
to str
whereas
to<StringType>(value)
formats value
as a
StringType
and returns the result by value. Currently, the
supported StringType
s are std::string
and
fbstring
Both toAppend
and to
with a string type as
a target support variadic arguments. Each argument is converted in turn.
For toAppend
the last argument in a variadic list must be
the address of a supported string type (no need to specify the string
type as a template argument).
Nothing special here - integrals are converted to strings in decimal format, with a ‘-’ prefix for negative values. Example:
auto a = to<fbstring>(123);
assert(a == "123");
= to<fbstring>(-456);
a assert(a == "-456");
The conversion implementation is aggressively optimized. It converts
two digits at a time assisted by fixed-size tables. Converting a
long
to an fbstring
is 3.6x faster than using
boost::lexical_cast
and 2.5x faster than using
sprintf
even though the latter is used in conjunction with
a stack-allocated constant-size buffer.
Note that converting integral types to fbstring
has a
particular advantage compared to converting to std::string
No integral type (<= 64 bits) has more than 20 decimal digits
including sign. Since fbstring
employs the small string
optimization for up to 23 characters, converting an integral to
fbstring
is guaranteed to not allocate memory, resulting in
significant speed and memory locality gains. Benchmarks reveal a 2x gain
on a typical workload.
char
to string
conversionAlthough char
is technically an integral type, most of
the time you want the string representation of 'a'
to be
"a"
, not 96
That’s why
folly/Conv.h
handles char
as a special case
that does the expected thing. Note that signed char
and
unsigned char
are still considered integral types.
folly/Conv.h
uses V8’s double
conversion routines. They are accurate and fast; on typical
workloads, to<fbstring>(doubleValue)
is 1.9x faster
than sprintf
and 5.5x faster than
boost::lexical_cast
(It is also 1.3x faster than
to<std::string>(doubleValue)
const char*
to
string conversionFor completeness, folly/Conv.h
supports
const char*
including i.e. string literals. The
“conversion” consists, of course, of the string itself. Example:
auto s = to<fbstring>("Hello, world");
assert(s == "Hello, world");
folly/Conv.h
includes three kinds of parsing
routines:
to<Type>(const char* begin, const char* end)
rigidly converts the range [begin, end) to Type
These
routines have drastic restrictions (e.g. allow no leading or trailing
whitespace) and are intended as an efficient back-end for more tolerant
routines.to<Type>(stringy)
converts stringy
to Type
Value stringy
may be of type
const char*
, StringPiece
,
std::string
, or fbstring
(Technically, the
requirement is that stringy
implicitly converts to a
StringPiece
to<Type>(&stringPiece)
parses with progress
information: given stringPiece
of type
StringPiece
it parses as much as possible from it as type
Type
and alters stringPiece
to remove the
munched characters. This is easiest clarified by an example:= " 1234 angels on a pin";
fbstring s (s);
StringPiece pcauto x = to<int>(&pc);
assert(x == 1234);
assert(pc == " angels on a pin";
Note how the routine ate the leading space but not the trailing one.
Parsing integral types is unremarkable - decimal format is expected,
optional '+'
or '-'
sign for signed types, but
no optional '+'
is allowed for unsigned types. The one
remarkable element is speed - parsing typical long
values
is 6x faster than sscanf
. folly/Conv.h
uses
aggressive loop unrolling and table-assisted SIMD-style code arrangement
that avoids integral division (slow) and data dependencies across
operations (ILP-unfriendly). Example:
= " 12345 ";
fbstring str assert(to<int>(str) == 12345);
= " 12345six seven eight";
str (str);
StringPiece pcassert(to<int>(&pc) == 12345);
assert(str == "six seven eight");
folly/Conv.h
uses, again, V8’s
double-conversion routines as back-end. The speed is 3x faster than
sscanf
and 1.7x faster than in-home routines such as
parse<double>
But the more important detail is
accuracy - even if you do code a routine that works faster than
to<double>
chances are it is incorrect and will fail
in a variety of corner cases. Using to<double>
is
strongly recommended.
Note that if the string “NaN” (with any capitalization) is passed to
to<double>
then NaN
is returned, which
can be tested for as follows:
= "nan"; // "NaN", "NAN", etc.
fbstring str double d = to<double>(str);
if (std::isnan(d)) {
// string was a valid representation of the double value NaN
}
Note that passing “-NaN” (with any capitalization) to
to<double>
also returns NaN
.
Note that if the strings “inf” or “infinity” (with any
capitalization) are passed to to<double>
then
infinity
is returned, which can be tested for as
follows:
= "inf"; // "Inf", "INF", "infinity", "Infinity", etc.
fbstring str double d = to<double>(str);
if (std::isinf(d)) {
// string was a valid representation of one of the double values +Infinity
// or -Infinity
}
Note that passing “-inf” or “-infinity” (with any capitalization) to
to<double>
returns -infinity
rather than
+infinity
. The sign of the infinity
can be
tested for as follows:
= "-inf"; // or "inf", "-Infinity", "+Infinity", etc.
fbstring str double d = to<double>(str);
if (d == std::numeric_limits<double>::infinity()) {
// string was a valid representation of the double value +Infinity
} else if (d == -std::numeric_limits<double>::infinity()) {
// string was a valid representation of the double value -Infinity
}
Note that if an unparseable string is passed to
to<double>
then an exception is thrown, rather than
NaN
being returned. This can be tested for as follows:
= "not-a-double"; // Or "1.1.1", "", "$500.00", etc.
fbstring str double d;
try {
= to<double>(str);
d } catch (const std::range_error &) {
// string could not be parsed
}
Note that the empty string (""
) is an unparseable value,
and will cause to<double>
to throw an exception.
tryTo<T>
is the non-throwing variant of
to<T>
. It returns an
Expected<T, ConversionCode>
. You can think of
Expected
as like an Optional<T>
, but if
the conversion failed, Expected
stores an error code
instead of a T
.
tryTo<T>
has similar performance as
to<T>
when the conversion is successful. On the error
path, you can expect tryTo<T>
to be roughly three
orders of magnitude faster than the throwing to<T>
and to completely avoid any lock contention arising from stack
unwinding.
Here is how to use non-throwing conversions:
auto t1 = tryTo<int>(str);
if (t1.hasValue()) {
(t1.value());
use}
Expected
has a composability feature to make the above
pattern simpler.
<int>(str).then([](int i) { use(i); }); tryTo