PERL VARIABLE INTERNALS
As far as Perl code is concerned scalars will present themselves as integers, floats or strings on
demand. Internally scalars are stored in a C structure, called an SV (scalar value), which contains
several slots. The important ones for our purposes are:
IV an integer value
UV an unsigned integer value, only used for ints > MAXINT / 2.
NV a numeric value (ie a float)
PV a pointer value (ie a string)
When a value is created one of those slots will be filled. As various operations are done on a value the
slot's contents may change, and other slots may be filled.
For example:
    my $foo = "4";       # fill $foo's PV slot, as "4" is a string
    my $bar = $foo + 1;  # fill $bar's IV slot, as 4 + 1 is an int,
                         # and fill $foo's IV slot, as we had to figure
                         # out the numeric value of the string
    $foo = "lemon";      # fill $foo's PV slot, as "lemon" is a string
That last operation immediately shows a problem. $foo's IV slot was filled with the integer value 4, but
the assignment of the string "lemon" only filled the PV slot. So what's in the IV slot? There's a handy
tool for that, Devel::Peek, which is distributed with perl. Here's part of Devel::Peek's output:
    $ perl -MDevel::Peek -E 'my $foo = 4; $foo = "lemon"; Dump($foo);'
      IV = 4
      PV = 0x7fe6e6c04c90 "lemon"\0
So how, then, does perl know that even though there's a value in the IV slot it shouldn't be used? It
must know, because once you've assigned "lemon" to the variable you can't get that 4 to show itself ever
again, at least not from pure perl code.
The SV also has a flags field, which I left out above. (I've left out some of the flags here too, and am
only showing you the relevant ones):
    $ perl -MDevel::Peek -E 'my $foo = 4; $foo = "lemon"; Dump($foo);'
      FLAGS = (POK)
      IV = 4
      PV = 0x7fe6e6c04c90 "lemon"\0
The "POK" flag means, as you might have guessed, that the "PV" slot has valid contents - in case you're
wondering, the "PV" slot there contains a pointer to the memory address 0x7fe6e6c04c90, at which can be
found the word "lemon".
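A quick way to confirm that the stale 4 really is unreachable from pure perl is to use the variable as a
number after the assignment. This is only a minimal sketch, and the variable names are just for
illustration:

```perl
use strict;
use warnings;

my $foo = 4;        # IV slot holds 4
$foo = "lemon";     # PV slot filled; whatever is left in the IV slot is stale

my $as_number = do {
    no warnings 'numeric';   # "lemon" isn't numeric, so perl would warn here
    $foo + 0;
};

print "as string: $foo\n";         # prints "as string: lemon"
print "as number: $as_number\n";   # prints "as number: 0"
```

The number we get back is 0, the numeric value of a non-numeric string, and not the 4 that Devel::Peek
showed still sitting in the slot.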
It's possible to have multiple flags set. That's the case after the second line of code in the first
example above. There, a variable contains the string "4", so the "PV" slot is filled and the "POK" flag
is set. We then take the value of that variable, add 1, and assign the result to another variable.
Obviously adding 1 to a string is meaningless, so the string has to first be converted to a number. That
fills the "IV" slot:
    $ perl -MDevel::Peek -E 'my $foo = "4"; my $bar = $foo + 1; Dump($foo);'
      FLAGS = (IOK,POK)
      IV = 4
      PV = 0x7fd6e7d05210 "4"\0
Notice that there are now two flags. "IOK" means that the "IV" slot's contents are valid, and "POK" that
the "PV" slot's contents are valid. Why do we need both slots in this case? Because a non-numeric string
such as "lemon" is treated as the integer 0 if you perform numeric operations on it.
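If you'd rather look at the flags from perl code than read Devel::Peek's output, the core B module can
do it. The sketch below assumes a helper called "ok_flags", which is a made-up name for this
illustration and not part of any module. It only reports the three public flags discussed here:

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list which of the IOK, NOK and POK flags are set
# on the scalar that $ref points to. We take a reference so that we look
# at the original variable and not at a copy of it.
sub ok_flags {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & B::SVf_IOK();
    push @set, 'NOK' if $flags & B::SVf_NOK();
    push @set, 'POK' if $flags & B::SVf_POK();
    return join ',', @set;
}

my $foo = "4";
print 'before numeric use: ', ok_flags(\$foo), "\n";

my $bar = $foo + 1;    # numifies $foo as a side effect
print 'after numeric use:  ', ok_flags(\$foo), "\n";
```

The exact set of flags you see can vary between perl versions, so treat the output as something to
explore and not as a specification.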
All that I have said above about "IV"s also applies to "NV"s, and you will sometimes come across a
variable with both the "IV" and "NV" slots filled, or even all three:
    $ perl -MDevel::Peek -E 'my $foo = 1e2; my $bar = $foo + 0; $bar = $foo . ""; Dump($foo)'
      FLAGS = (IOK,NOK,POK)
      IV = 100
      NV = 100
      PV = 0x7f9ee9d12790 "100"\0
Finally, it's possible to have multiple flags set even though the slots contain what looks (to a human)
like different values:
    $ perl -MDevel::Peek -E 'my $foo = "007"; $foo + 0; Dump($foo)'
      FLAGS = (IOK,POK)
      IV = 7
      PV = 0x7fcf425046c0 "007"\0
That code initialises the variable to the string "007", then uses it in a numeric operation. That causes
the string to be numified, the "IV" slot to be filled, and the "IOK" flag set. It should, of course, be
clear to any fan of classic literature that "007" and 7 are very different things. "007" is not an
integer.
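The difference is easy to see from perl code too. This short sketch (the variable name is arbitrary)
compares the same variable numerically and as a string:

```perl
use strict;
use warnings;

my $agent = "007";

print "numerically equal to 7\n"       if $agent == 7;   # numeric comparison
print "but not the same string as 7\n" if $agent ne 7;   # string comparison
print "round trip through a number: ", $agent + 0, "\n"; # prints 7
```

Both conditions are true, and the round trip loses the leading zeroes, which is why a string like "007"
can't safely be called an integer.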
Booleans
In perl 5.35.7 and later, Boolean values - ie the results of comparisons - have some extra magic. As well
as their value, which is either 1 (true, an integer) or '' (false, an empty string), they have a flag to
indicate their Booleanness. This is exposed via the "builtin::is_bool" perl function so we don't need to
do XS voodoo to interrogate it.
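Here is a small sketch of what that looks like in practice. The "bool_label" helper is a hypothetical
name used only for this illustration. The script checks the perl version first, because
"builtin::is_bool" doesn't exist on older perls:

```perl
use strict;
use warnings;

# Hypothetical helper: say whether a value carries the Boolean flag.
sub bool_label {
    my ($value) = @_;
    no warnings;    # is_bool is reported as experimental on some perl versions
    return builtin::is_bool($value) ? 'boolean' : 'not boolean';
}

if ($] >= 5.036) {
    print 'result of 1 == 1: ', bool_label(1 == 1), "\n";
    print 'result of 1 == 2: ', bool_label(1 == 2), "\n";
    print 'plain integer 1:  ', bool_label(1),      "\n";
    print 'plain string "":  ', bool_label(''),     "\n";
}
else {
    print "builtin::is_bool needs perl 5.36 or later\n";
}
```

Note that the results of both comparisons are Booleans, whether true or false, while a plain 1 or '' is
not, even though the values look the same.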
WHAT Scalar::Type DOES (at least in version 1.0.0)
NB: this section documents an internal function that is not intended for public use. The interface of
"_scalar_type" should be considered to be unstable, not fit for human consumption, and subject to change
without notice. This documentation is correct as of version 1.0.0 but may not be updated for future
versions - its purpose is pedagogical only.
The "is_*" functions are just wrappers around the "type" function. That in turn delegates most of the
work to a few lines of C code which grovel around looking at the contents of the individual slots and
flags. That function isn't exported, but if you really want to call it directly it's called
"_scalar_type" and will return one of three strings, "INTEGER", "NUMBER", or "SCALAR". It will return
"SCALAR" even for a reference or undef, which is why I said that the "type" function only *mostly* wraps
around it :-)
The first thing that "_scalar_type" does is look at the "IOK" flag. If it's set, and the "POK" flag is
not set, then it returns "INTEGER". If both "IOK" and "POK" are set it stringifies the contents of the
"IV" slot, compares the result to the contents of the "PV" slot, and returns "INTEGER" if they are the
same, or "SCALAR" otherwise.
The reason for jumping through those hoops is so that we can correctly divine the type of "007" in the
last example above.
If "IOK" isn't set we then look at "NOK". That follows exactly the same logic, looking also at "POK", and
returning either "NUMBER" or "SCALAR", being careful about strings like "007.5".
If neither "IOK" nor "NOK" is set then we return "SCALAR".
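The decision procedure described above can be approximated in pure perl with the core B module. This is
only a sketch of the logic as described in this section, not the module's actual C code. The name
"sketch_scalar_type" is made up, and the way it stringifies the "NV" slot is an assumption that may not
match the real implementation in every case:

```perl
use strict;
use warnings;
use B ();

# Hypothetical pure-perl approximation of the checks described above.
# Takes a reference so that we inspect the original scalar, not a copy.
sub sketch_scalar_type {
    my ($ref) = @_;
    my $sv    = B::svref_2object($ref);
    my $flags = $sv->FLAGS;
    my $pok   = $flags & B::SVf_POK();

    if ($flags & B::SVf_IOK()) {
        return 'INTEGER' unless $pok;
        # int_value reads the slot as signed or unsigned as appropriate
        return $sv->int_value eq $sv->PV ? 'INTEGER' : 'SCALAR';
    }
    if ($flags & B::SVf_NOK()) {
        return 'NUMBER' unless $pok;
        return $sv->NV eq $sv->PV ? 'NUMBER' : 'SCALAR';
    }
    return 'SCALAR';
}

my $int   = 42;
my $float = 1.5;
my $word  = "lemon";
my $agent = "007";
my $seven = $agent + 0;    # numeric use fills $agent's IV slot

for my $case (['42', \$int], ['1.5', \$float], ['"lemon"', \$word], ['"007"', \$agent]) {
    my ($label, $ref) = @$case;
    printf "%-8s => %s\n", $label, sketch_scalar_type($ref);
}
```

The interesting case is "007": even after it has been used as a number, the stringified integer 7
doesn't match the string in the "PV" slot, so it is reported as a plain "SCALAR".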
And what about "UV"s? They are treated exactly the same as "IV"s, and a variable with a valid "UV" slot
will have the "IOK" flag set. It will also have the "IsUV" flag set, which we use to determine how to
stringify the number.
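To see the "IsUV" flag for yourself, you need an integer too big for a signed "IV". In this sketch I use
~0, which has every bit set and so is the largest unsigned integer your perl can hold. The flag
constants come from the core B module:

```perl
use strict;
use warnings;
use B ();

my $big   = ~0;    # all bits set: too large for a signed IV
my $sv    = B::svref_2object(\$big);
my $flags = $sv->FLAGS;

print 'IOK:  ', ($flags & B::SVf_IOK()    ? 'set' : 'not set'), "\n";
print 'IsUV: ', ($flags & B::SVf_IVisUV() ? 'set' : 'not set'), "\n";

# int_value checks IsUV and reads the slot as unsigned when it is set
print 'value read as unsigned: ', $sv->int_value, "\n";
```

Both flags should be reported as set, and reading the slot as unsigned gives back the original value
instead of a negative number.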