Elixir Basic Types
Understanding Elixir's Basic Data Types
Elixir provides a rich set of fundamental data types that form the building blocks for any program. Understanding these types is crucial for effective Elixir development. This section details the core types, their syntax, and common usage patterns.
Integer Types
Integer literals can be written in decimal (base 10), hexadecimal (base 16, prefix 0x), octal (base 8, prefix 0o), or binary (base 2, prefix 0b). The notation affects only how the literal is written; the resulting value is the same integer whichever base you use. Underscores can be placed between digits for readability in large numbers.
> 1234567 == 1_234_567 # true
> 0xcafe == 0xCAFE # true
> 0b10101 == 21 # true
> 0xCafe == 0b1100101011111110 # true
> 0xCafe == 51_966 # true
> Integer.to_string(51_966, 16) == "CAFE" # true
> Integer.to_string(0xcafe) == "51966" # true
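As a small sketch of moving between these notations at runtime, the snippet below uses Integer.parse/2 and Integer.to_string/2 from the standard library. The variable names are illustrative only.

```elixir
# Integer.parse/2 reads text in the given base and returns {value, remaining_text}.
{from_hex, ""} = Integer.parse("cafe", 16)
{from_bin, ""} = Integer.parse("10101", 2)

# Octal literals use the 0o prefix.
from_octal = 0o777

# Integer.to_string/2 renders an integer in another base.
as_binary = Integer.to_string(from_hex, 2)
as_hex = Integer.to_string(from_bin, 16)

IO.inspect({from_hex, from_bin, from_octal, as_binary, as_hex})
# prints {51966, 21, 511, "1100101011111110", "15"}
```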
Float Types
Elixir uses 64-bit double-precision floating-point numbers. Floats can be specified with an exponent using the e notation. Note that a float literal cannot begin or end with a decimal point.
> 1.2345
> 0.001 == 1.0e-3 # true
> .001 # syntax error!
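Because floats are binary double-precision values, a few arithmetic details tend to surprise newcomers. The following is a minimal sketch of those details using only Kernel functions and Float.round/2; the variable names are made up for illustration.

```elixir
# The / operator always returns a float, even when both operands are integers.
quotient = 10 / 2

# div/2 and rem/2 give integer division and remainder instead.
whole = div(10, 3)
left_over = rem(10, 3)

# 0.1 and 0.2 have no exact binary representation, so the sum is slightly off.
sum = 0.1 + 0.2
exactly_equal? = sum == 0.3

# Rounding to a fixed number of decimal places before comparing is one common workaround.
rounded_equal? = Float.round(sum, 10) == 0.3

IO.inspect({quotient, whole, left_over, exactly_equal?, rounded_equal?})
# prints {5.0, 3, 1, false, true}
```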
Atom Types
Atoms are constants whose name is their value. They are typically used to represent fixed values or states. An atom is written as a colon followed by letters, digits, underscores, and @ signs, and may end with a question mark or exclamation point; the name cannot start with a digit. To use other characters, such as spaces, enclose the name in double quotes after the colon.
Atoms are stored in a global table and are never garbage-collected, so avoid creating atoms programmatically from unbounded input (for example, user-supplied strings), which can exhaust the atom table. Use them for fixed identifiers and keys.
> :something
> :_some_thing
> :allowed?
> :Some@Thing@12345
> :"Üñîçødé and Spaces"
> Atom.to_string(:Yay!) # "Yay!"
> :123 # syntax error!
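One way to follow the advice about not creating atoms from arbitrary input is String.to_existing_atom/1, which raises ArgumentError rather than adding a new entry to the atom table. The sketch below assumes a hypothetical StatusParser module and a made-up list of allowed statuses; neither comes from a real library.

```elixir
defmodule StatusParser do
  # Hypothetical list of the states this program understands.
  @known_statuses [:active, :inactive, :pending]

  # Converts untrusted text to an atom only when that atom already exists
  # and is one of the known statuses. No new atoms are ever created here.
  def parse(text) when is_binary(text) do
    atom = String.to_existing_atom(text)
    if atom in @known_statuses, do: {:ok, atom}, else: :error
  rescue
    ArgumentError -> :error
  end
end

IO.inspect(StatusParser.parse("active"))
IO.inspect(StatusParser.parse("never_seen_before_status_xyz"))
```

The first call returns {:ok, :active}; the second returns :error because no such atom exists and none is created.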
Boolean Types
Elixir's boolean values, true and false, are actually syntactic sugar for the atoms :true and :false, respectively. They are not a distinct type but rather specific atoms.
> true == :true # true
> false == :false # true
> is_boolean(:true) # true
> is_atom(false) # true
> is_boolean(:True) # false!
Nil Type
Similar to booleans, nil in Elixir is syntactic sugar for the atom :nil. It represents the absence of a value and is not a special type but an atom.
> nil == :nil # true
> is_atom(nil) # true
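In conditionals, nil and false are the only values treated as falsy. The following sketch illustrates this with an anonymous function and a hypothetical config map; the names describe, config, and port are assumptions for the example.

```elixir
# Only nil and false are falsy; every other value, including 0 and "", is truthy.
describe = fn value -> if value, do: :truthy, else: :falsy end

results = Enum.map([nil, false, 0, "", [], :nil, :false], describe)
IO.inspect(results)
# prints [:falsy, :falsy, :truthy, :truthy, :truthy, :falsy, :falsy]

# || returns its first truthy operand, which makes it handy for defaults.
config = %{}
port = Map.get(config, :port) || 4000
IO.inspect(port)
# prints 4000
```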
Binary Types
Binaries are sequences of bytes enclosed in << >> and separated by commas. By default, each number within a binary is treated as an 8-bit value. You can explicitly specify the size of each element in bits using ::size(n) or the shorthand ::n, or use type specifiers like ::utf8, ::utf16, ::utf32, or ::float. If the total number of bits is not divisible by 8, the value is a bitstring but not a binary; every binary is also a bitstring. Binaries can be concatenated using the <> operator.
> <<0,1,2,3>>
> <<100>> == <<100::size(8)>> # true
> <<4::float>> == <<64, 16, 0, 0, 0, 0, 0, 0>> # true
> <<65::utf32>> == <<0, 0, 0, 65>> # true
> <<0::2, 1::2>> == <<1::4>> # true
> <<1,2>> <> <<3,4>> == <<1,2,3,4>> # true
> is_binary(<<1,2,3,4>>) # true
> is_binary(<<1::size(4)>>) # false!, number of bits not divisible by 8
> is_bitstring(<<1::size(4)>>) # true
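The same size and type specifiers also work on the left side of a pattern match, which is a common way to take binary data apart. The sketch below parses an invented packet layout; the PacketSketch module and the field layout are hypothetical and do not describe any real protocol.

```elixir
defmodule PacketSketch do
  # Hypothetical layout: 4-bit version, 4-bit flags, 16-bit big-endian length,
  # then `length` bytes of payload, then whatever bytes remain.
  def parse(<<version::4, flags::4, length::16, payload::binary-size(length), rest::binary>>) do
    {:ok, %{version: version, flags: flags, payload: payload}, rest}
  end

  # Anything that does not fit the layout (too short, not byte-aligned) is rejected.
  def parse(_other), do: :error
end

packet = <<1::4, 0b0101::4, 3::16, "abc", "extra">>
IO.inspect(PacketSketch.parse(packet))
# prints {:ok, %{flags: 5, payload: "abc", version: 1}, "extra"}
```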
String Types
Strings in Elixir are UTF-8 encoded binaries, enclosed in double quotes ("). They can span multiple lines and support string interpolation using #{}, which can contain any valid Elixir expression. Like binaries, strings are concatenated using the <> operator.
> "This is a string."
> "☀★☂☻♞☯☭☢€→☎♫♎⇧☮♻⌘⌛☘☊♔♕♖☦♠♣♥♦♂♀" # no problem :)
> "This is an #{ Atom.to_string(:interpolated) } string."
> "Where is " <> "my other half?"
> "multi\nline" == "multi
line" # true
> <<69,108,105,120,105,114>> == "Elixir" # true
> String.length("🎩") # 1
> byte_size("🎩") # 4
> is_binary("any string") # true
> String.valid?("こんにちは") # true
> String.valid?("hello" <> <<255>>) # false!
> String.valid?(<<4>>) # true
> String.printable?(<<4>>) # false! 4 is a valid UTF-8 codepoint, but is not printable.
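Since a string is a UTF-8 binary, its length depends on whether you count bytes, code points, or graphemes. The following sketch contrasts the three using functions from the String module; the sample text is arbitrary.

```elixir
# "h", precomposed "é" (U+00E9), "l", "l", "o"
text = "h\u00E9llo"

# Bytes: the é takes two bytes in UTF-8.
bytes = byte_size(text)

# Graphemes: what a reader perceives as characters.
graphemes = String.length(text)

# A decomposed "é" is two code points (e plus a combining acute accent)
# but still a single grapheme.
decomposed = "e\u0301"
codepoints = decomposed |> String.codepoints() |> length()
visible = decomposed |> String.graphemes() |> length()

# Matching with ::utf8 takes one code point off the front of a string.
<<first::utf8, rest::binary>> = text

IO.inspect({bytes, graphemes, codepoints, visible, first, rest})
```

Here bytes is 6, graphemes is 5, codepoints is 2, visible is 1, and first is the integer code point for "h" (104).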
Escape Sequences in Strings
Elixir strings support various escape sequences for special characters, including whitespace and control sequences.
| Characters | Whitespace | Control Sequences |
|---|---|---|
| \" - double quote | \b - backspace | \a - bell/alert |
| \' - single quote | \f - form feed | \d - delete |
| \\ - single backslash | \n - newline | \e - escape |
| | \s - space | \r - carriage return |
| | \t - tab | \0 - null byte |
| | \v - vertical tab | |
Additionally, \xHH represents a byte by its two-digit hexadecimal value, and \x{...} accepts one or more hexadecimal digits for a code point. Recent Elixir versions deprecate the \x{...} form in favor of \uHHHH and \u{...}.
> "\x3f" == "?" # true
> "\x{266B}" == "♫" # true
> "\x{2660}" == "♠" # true
Regular Expression Types
Regular expressions in Elixir are inherited from Erlang's re module and are Perl-compatible. They are defined using the ~r sigil and can span multiple lines. Various modifiers can be appended to alter their behavior.
Modifiers:
- u: Enables Unicode-specific patterns and makes escapes like \w, \W, and \s match according to Unicode rules.
- i: Ignores case during matching.
- s: Allows the dot (.) to match newline characters.
- m: Makes ^ and $ match the start and end of each line, respectively. Use \A and \z for the start and end of the entire string.
- x: Ignores whitespace characters in the pattern unless escaped, and lets # start a comment.
- f: Forces an unanchored pattern to match before or at the first newline, although the matched text may continue past that newline.
- r: Inverts the "greediness" of the regular expression. Newer Elixir versions spell this modifier U and deprecate r.
To override newline treatment, start the pattern with:
- (*CR): Carriage return
- (*LF): Line feed
- (*CRLF): Carriage return followed by line feed
- (*ANYCRLF): Any of the three above
- (*ANY): All Unicode newline sequences
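> Regex.source(Regex.compile!("caf[eé]")) == Regex.source(~r/caf[eé]/) # true; compare sources, since == on compiled regexes is not guaranteed to hold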
> Regex.match?(~r/caf[eé]/, "café") # true
> Regex.regex?(~r"caf[eé]") # true
> Regex.regex?("caf[eé]") # false! string not compiled regex
> Regex.run(~r/hat: (.*)/, "hat: 🎩", [capture: :all_but_first]) == ["🎩"] # true
# Modifiers
> Regex.match?(~r/mr. bojangles/i, "Mr. Bojangles") # true
> Regex.compile!("mr. bojangles", "sxi") # ~r/mr. bojangles/sxi
# Newline overrides
> ~r/(*ANY)some\npattern/
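To see a few of the modifiers working together, here is a minimal sketch using Regex.scan/3, Regex.named_captures/2, and Regex.run/3. The sample text and variable names are invented for the example.

```elixir
text = "name: Ada\nNAME: Grace\nrole: admin"

# i ignores case; m lets ^ and $ anchor at each line instead of the whole string.
names = Regex.scan(~r/^name: (\w+)$/im, text, capture: :all_but_first)

# Named groups can be read back as a map of group name to matched text.
captures = Regex.named_captures(~r/^(?<key>\w+): (?<value>\w+)/, text)

# x ignores whitespace in the pattern and allows # comments for readability.
date = ~r/
  (\d{4}) - # year
  (\d{2}) - # month
  (\d{2})   # day
/x
parts = Regex.run(date, "released 2024-01-15", capture: :all_but_first)

IO.inspect({names, captures, parts})
```

Here names is [["Ada"], ["Grace"]], captures is %{"key" => "name", "value" => "Ada"}, and parts is ["2024", "01", "15"].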
For more in-depth information on Elixir's data types and their usage, refer to the official Elixir documentation and the Regular-Expressions.info website for comprehensive regex resources.