Elixir Basic Types

Understanding Elixir's Basic Data Types

Elixir provides a rich set of fundamental data types that form the building blocks for any program. Understanding these types is crucial for effective Elixir development. This section details the core types, their syntax, and common usage patterns.

Integer Types

Integer literals in Elixir can be written in several bases: decimal (base 10), hexadecimal (base 16, 0x prefix), octal (base 8, 0o prefix), and binary (base 2, 0b prefix). The base is only a notation for the literal; however a number is written, it evaluates to the same integer value. Underscores can be used between digits to make large numbers easier to read.

1234567 == 1_234_567                    # true
0xcafe  == 0xCAFE                       # true
0b10101 == 21                           # true
0xCafe  == 0b1100101011111110           # true
0xCafe  == 51_966                       # true
Integer.to_string(51_966, 16) == "CAFE" # true
Integer.to_string(0xcafe) == "51966"    # true
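As a small sketch of the point that the base is only a notation, the snippet below formats one value in several bases and parses it back. The variable names are illustrative and not taken from the original text.

```elixir
# The literal is written in hexadecimal, but the value is an ordinary integer.
value = 0xCAFE

decimal = Integer.to_string(value)      # base 10 is the default
hex = Integer.to_string(value, 16)      # second argument selects the base
binary = Integer.to_string(value, 2)

# String.to_integer/2 goes the other way and also takes the base.
parsed = String.to_integer("cafe", 16)

IO.inspect({decimal, hex, binary, parsed})
# {"51966", "CAFE", "1100101011111110", 51966}
```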

Float Types

Elixir uses 64-bit double-precision floating-point numbers. Floats can be specified with an exponent using the e notation. Note that a float literal cannot begin or end with a decimal point.

> 1.2345
> 0.001 == 1.0e-3 # true
> .001            # syntax error!
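Because floats are 64-bit doubles, they show the usual binary rounding behavior. The following is a minimal sketch of what that can look like in practice; the variable names are made up for illustration.

```elixir
# 0.1 and 0.2 have no exact binary representation, so the sum is slightly off.
sum = 0.1 + 0.2
exact? = sum == 0.3                 # false

# Float.round/2 rounds to a given number of decimal places.
rounded = Float.round(sum, 2)       # 0.3

# == compares integers and floats by value; === also requires the same type.
loose = 1 == 1.0                    # true
strict = 1 === 1.0                  # false

# trunc/1 drops the fractional part and returns an integer.
truncated = trunc(3.99)             # 3

IO.inspect({exact?, rounded, loose, strict, truncated})
```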

Atom Types

Atoms are constants whose name is their own value. They are typically used to represent fixed values or states. An unquoted atom is written as a colon followed by letters, digits, underscores, and @, optionally ending in ? or !; it cannot start with a digit. Atoms that contain other characters, such as spaces, must be wrapped in double quotes after the colon.

Atoms are stored in a global table and are never garbage-collected, so avoid creating them dynamically (for example, from user input), because each new atom permanently occupies a slot in that table. Use them for fixed identifiers and keys.

Atom Naming Conventions
To use other characters, you must quote the atom.

> :something
> :_some_thing
> :allowed?
> :Some@Thing@12345
> :"Üñîçødé and Spaces"
> Atom.to_string(:Yay!)  # "Yay!"
> :123                   # syntax error!
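One way to follow the advice about not creating atoms dynamically is to convert strings only to atoms that already exist. The sketch below assumes a hypothetical StatusParser module and a made-up list of states; neither comes from the original text.

```elixir
defmodule StatusParser do
  # Hypothetical set of allowed states, defined at compile time.
  @known ~w(pending active closed)a

  # Converts a string to one of the known atoms without ever creating a new atom.
  def parse(input) when is_binary(input) do
    # String.to_existing_atom/1 raises ArgumentError if the atom does not exist yet.
    atom = String.to_existing_atom(input)
    if atom in @known, do: {:ok, atom}, else: :error
  rescue
    ArgumentError -> :error
  end
end

IO.inspect(StatusParser.parse("active"))   # {:ok, :active}
IO.inspect(StatusParser.parse("true"))     # :error (existing atom, but not an allowed state)
```

The explicit list matters because String.to_existing_atom/1 alone would accept any atom already present in the system, not just the ones this code cares about.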

Boolean Types

Elixir's boolean values, true and false, are actually syntactic sugar for the atoms :true and :false, respectively. They are not a distinct type but rather specific atoms.

> true  == :true     # true
> false == :false    # true
> is_boolean(:true)  # true
> is_atom(false)     # true
> is_boolean(:True)  # false!

Nil Type

Similar to booleans, nil in Elixir is syntactic sugar for the atom :nil. It represents the absence of a value and is not a special type but an atom.

> nil == :nil  # true
> is_atom(nil) # true
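Since true, false, and nil are plain atoms, it can help to see how they behave in conditionals. The sketch below uses an illustrative helper function to show that only false and nil count as falsy.

```elixir
# Returns true for truthy values and false for falsy ones.
truthy? = fn value -> if value, do: true, else: false end

# Only false and nil (and their atom spellings :false and :nil) are falsy;
# 0, the empty string, and the empty list are all truthy.
results = Enum.map([false, nil, 0, "", :false, :nil, []], truthy?)

# || returns the first truthy operand, which makes nil easy to replace with a default.
default = nil || "fallback"

IO.inspect(results)   # [false, false, true, true, false, false, true]
IO.inspect(default)   # "fallback"
```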

Binary Types

Binaries are sequences of bytes enclosed in << >>, with elements separated by commas. By default, each integer element is stored as an 8-bit value. You can set the size of an element explicitly with ::size(n) or the shorthand ::n, or use a type specifier such as ::utf8, ::utf16, ::utf32, or ::float. A value whose total number of bits is not divisible by 8 is a bitstring but not a binary; every binary is also a bitstring. Binaries can be concatenated using the <> operator.

> <<0,1,2,3>>
> <<100>> == <<100::size(8)>>        # true
> <<4::float>> == <<64, 16, 0, 0, 0, 0, 0, 0>>  # true
> <<65::utf32>> == <<0, 0, 0, 65>>   # true
> <<0::2, 1::2>> == <<1::4>>         # true
> <<1,2>> <> <<3,4>> == <<1,2,3,4>>  # true
> is_binary(<<1,2,3,4>>)             # true
> is_binary(<<1::size(4)>>)          # false! number of bits is not divisible by 8
> is_bitstring(<<1::size(4)>>)       # true
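The same size specifiers also work in pattern matches, which is a common way to take a binary apart. The packet layout below is hypothetical and invented purely for this sketch.

```elixir
# Hypothetical layout: 4-bit version, 4-bit flags, 16-bit length, then the payload.
packet = <<1::4, 0b0101::4, 3::16, "abc">>

# Each field is matched with the same size it was built with.
# len is bound first and then reused as the payload size in bytes.
<<version::4, flags::4, len::16, payload::binary-size(len)>> = packet

IO.inspect({version, flags, len, payload})   # {1, 5, 3, "abc"}
IO.inspect(bit_size(packet))                 # 48
```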

String Types

Strings in Elixir are UTF-8 encoded binaries, enclosed in double quotes ("). They can span multiple lines and support string interpolation using #{}, which can contain any valid Elixir expression. Like binaries, strings are concatenated using the <> operator.

> "This is a string."
> "☀★☂☻♞☯☭☢€→☎♫♎⇧☮♻⌘⌛☘☊♔♕♖☦♠♣♥♦♂♀"  # no problem :)
> "This is an #{ Atom.to_string(:interpolated) } string."
> "Where is " <> "my other half?"
> "multi\nline" == "multi
line"                                    # true
> <<69,108,105,120,105,114>> == "Elixir" # true
> String.length("🎩")               # 1
> byte_size("🎩")                   # 4
> is_binary("any string")           # true
> String.valid?("こんにちは")         # true
> String.valid?("hello" <> <<255>>) # false!
> String.valid?(<<4>>)              # true
> String.printable?(<<4>>)          # false! 4 is a valid UTF-8 codepoint, but is not printable.
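Because strings are UTF-8 binaries, character counts and byte counts can differ, and binary pattern matching works on strings too. A minimal sketch, with illustrative variable names:

```elixir
# "\u00E9" is the code point for é, which takes two bytes in UTF-8.
word = "caf\u00E9"

chars = String.length(word)     # 4 characters
bytes = byte_size(word)         # 5 bytes

# <> can be used in a match when the left side is a literal prefix.
"caf" <> rest = word

# ::utf8 extracts one whole code point as an integer.
<<first::utf8, _tail::binary>> = word

# Interpolation accepts any expression.
summary = "#{String.upcase(word)} has #{chars} characters"

IO.inspect({chars, bytes, rest, first, summary})
```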

Escape Sequences in Strings

Elixir strings support various escape sequences for special characters, including whitespace and control sequences.

Characters               Whitespace               Control Sequences
\" - double quote        \b - backspace           \a - bell/alert
\' - single quote        \f - form feed           \d - delete
\\ - single backslash    \n - newline             \e - escape
                         \s - space               \r - carriage return
                         \t - tab                 \0 - null byte
                         \v - vertical tab

Additionally, \xHH represents a byte given by its two-digit hexadecimal value, and \x{...} accepts a hexadecimal code point of one or more digits. Newer Elixir releases favor \uHHHH and \u{...} for Unicode code points.

> "\x3f" == "?"      # true
> "\x{266B}" == "♫" # true
> "\x{2660}" == "♠" # true
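To make the escape sequences concrete, the sketch below maps each single-character escape to the byte it produces. The list and variable names are illustrative only.

```elixir
# Each of these escapes produces a one-byte string.
escapes = ["\a", "\b", "\t", "\n", "\v", "\f", "\r", "\e", "\s", "\d", "\0"]

# :binary.first/1 returns the first byte of a binary as an integer.
bytes = Enum.map(escapes, &:binary.first/1)

# \xHH writes a byte directly by its hexadecimal value.
letter = "\x41"

IO.inspect(bytes)    # [7, 8, 9, 10, 11, 12, 13, 27, 32, 127, 0]
IO.inspect(letter)   # "A"
```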

Regular Expression Types

Regular expressions in Elixir are inherited from Erlang's re module and are Perl-compatible. They are defined using the ~r sigil and can span multiple lines. Various modifiers can be appended to alter their behavior.

Modifiers:
- u: Enables Unicode-specific patterns and makes escapes like \w, \W, and \s Unicode-aware.
- i: Ignores case during matching.
- s: Allows the dot (.) to match newline characters.
- m: Makes ^ and $ match the start and end of each line. Use \A and \z for the start and end of the entire string.
- x: Ignores whitespace characters unless escaped, and lets # start a comment.
- f: Forces an unanchored pattern to match before or at the first newline, though the matched text may continue past it.
- r: Inverts the "greediness" of the regular expression. Newer Elixir releases spell this modifier U and deprecate r.

To override newline treatment, start the pattern with:
- (*CR): Carriage return
- (*LF): Line feed
- (*CRLF): Carriage return followed by line feed
- (*ANYCRLF): Any of the three above
- (*ANY): All Unicode newline sequences

> Regex.compile!("caf[eé]") == ~r/caf[eé]/ # true
> Regex.match?(~r/caf[eé]/, "café")        # true
> Regex.regex?(~r"caf[eé]")                # true
> Regex.regex?("caf[eé]")                  # false! string not compiled regex
> Regex.run(~r/hat: (.*)/, "hat: 🎩", [capture: :all_but_first]) == ["🎩"]  # true
# Modifiers
> Regex.match?(~r/mr. bojangles/i, "Mr. Bojangles") # true
> Regex.compile!("mr. bojangles", "sxi")            # ~r/mr. bojangles/sxi
# Newline overrides
> ~r/(*ANY)some\npattern/
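As a rough illustration of how the m and i modifiers change matching, the sketch below runs a few patterns over a made-up two-line string. The sample text and variable names are assumptions for the example.

```elixir
text = "name: Ada\nrole: engineer"

# With m, ^ and $ match at line boundaries, so the second line can be targeted.
role = Regex.named_captures(~r/^role: (?<role>\w+)$/m, text)

# Regex.scan/3 returns every match; :all_but_first keeps only the capture groups.
keys = Regex.scan(~r/^(\w+): /m, text, capture: :all_but_first)

# i makes the match case-insensitive.
case_insensitive = Regex.match?(~r/ADA/i, text)

# Without m, ^ only matches at the very start of the string.
anchored_without_m = Regex.match?(~r/^role/, text)

IO.inspect({role, keys, case_insensitive, anchored_without_m})
# {%{"role" => "engineer"}, [["name"], ["role"]], true, false}
```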

For more detail on Elixir's data types and their usage, see the official Elixir documentation. For regular expressions, the Regular-Expressions.info website is a comprehensive resource.