readers - Reader Lock Table

Author

       Generated automatically by Doxygen for LMDB from the source code.

                                                      LMDB                                            readers(3)

Data Structure Documentation

Detailed Description

Readers don't acquire any locks for their data access. Instead, they simply record their transaction ID
in the reader table. The reader mutex is needed just to find an empty slot in the reader table. The
slot's address is saved in thread-specific data so that subsequent read transactions started by the same
thread need no further locking to proceed.

If MDB_NOTLS is set, the slot address is not saved in thread-specific data.

No reader table is used if the database is on a read-only filesystem, or if MDB_NOLOCK is set.
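
The slot-claiming scheme described above can be sketched as follows. This is a minimal illustration, not LMDB's actual code: the types rslot_t and rtable_t, the fixed NSLOTS size, and the function claim_slot() are hypothetical names, and a C11 _Thread_local variable stands in for thread-specific data. The mutex is held only while searching for a free slot; later calls from the same thread return the cached slot without locking.

```c
#include <pthread.h>
#include <stddef.h>

#define NSLOTS 4                  /* hypothetical table size for this sketch */

typedef struct {
    unsigned long txnid;          /* snapshot the reader is using */
    int pid;                      /* 0 means the slot is free */
} rslot_t;

typedef struct {
    pthread_mutex_t rmutex;       /* taken only while claiming a slot */
    unsigned numreaders;          /* high-water mark of slots ever used */
    rslot_t slots[NSLOTS];
} rtable_t;

/* Per-thread cache of the claimed slot (stands in for thread-specific data). */
static _Thread_local rslot_t *my_slot;

/* Return this thread's slot, claiming one under the mutex on first use.
 * Returns NULL if every slot is already taken. */
static rslot_t *claim_slot(rtable_t *t, int pid)
{
    if (my_slot != NULL)
        return my_slot;           /* fast path: no locking at all */

    pthread_mutex_lock(&t->rmutex);
    for (unsigned i = 0; i < NSLOTS; i++) {
        if (t->slots[i].pid == 0) {
            t->slots[i].pid = pid;
            if (i + 1 > t->numreaders)
                t->numreaders = i + 1;   /* never decremented */
            my_slot = &t->slots[i];
            break;
        }
    }
    pthread_mutex_unlock(&t->rmutex);
    return my_slot;
}
```

With MDB_NOTLS the per-thread cache would not be used, so the slot would have to be tracked some other way, for example alongside the transaction handle.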

Since the database uses multi-version concurrency control, readers don't actually need any locking. This
table is used to keep track of which readers are using data from which old transactions, so that we'll
know when a particular old transaction is no longer in use. Old transactions that have discarded any data
pages can then have those pages reclaimed for use by a later write transaction.

The lock table is constructed such that reader slots are aligned with the processor's cache line size.
Any slot is only ever used by one thread. This alignment guarantees that there will be no contention or
cache thrashing as threads update their own slot info, and also eliminates any need for locking when
accessing a slot.
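
The padding arithmetic behind this layout can be sketched as below. The names rxbody_t, reader_slot_t and PAD_TO_CACHELINE are hypothetical stand-ins chosen for the example; only the round-up expression mirrors the pad declarations shown in this page. Rounding the slot size up to a multiple of CACHELINE keeps each slot on its own cache line, provided the start of the slot array is itself cache-line aligned.

```c
#include <stddef.h>

#define CACHELINE 64

/* Hypothetical slot payload; stands in for MDB_rxbody. */
typedef struct {
    volatile unsigned long txnid;
    volatile int pid;
    volatile unsigned long tid;
} rxbody_t;

/* Round n up to the next multiple of CACHELINE (which must be a power of two). */
#define PAD_TO_CACHELINE(n) (((n) + CACHELINE - 1) & ~(size_t)(CACHELINE - 1))

/* The union is as large as its biggest member, so the pad array sets the size. */
typedef union {
    rxbody_t body;
    char pad[PAD_TO_CACHELINE(sizeof(rxbody_t))];
} reader_slot_t;
```

For instance, PAD_TO_CACHELINE(65) evaluates to 128, and sizeof(reader_slot_t) is always a multiple of CACHELINE.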

A writer thread will scan every slot in the table to determine the oldest outstanding reader transaction.
Any freed pages older than this will be reclaimed by the writer. The writer doesn't use any locks when
scanning this table. This means that there's no guarantee that the writer will see the most up-to-date
reader info, but that's not required for correct operation. All we need is to know the upper bound on
the oldest reader; we don't care at all about the newest reader. So the only consequence of reading stale
information here is that old pages might hang around a while longer before being reclaimed. That's
actually good anyway, because the longer we delay reclaiming old pages, the more likely it is that a
string of contiguous pages can be found after coalescing old pages from many old transactions together.
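
The lock-free scan described above might look like the following sketch. The type scan_slot_t and the function oldest_reader() are hypothetical names for illustration, and the convention that a pid of 0 marks an unused slot is an assumption of this example. The function takes no locks; a stale read can only make the result older than necessary, which delays reclamation but never frees a page a reader still needs.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t txnid_t;

typedef struct {
    volatile txnid_t txnid;   /* snapshot in use; ignored when pid == 0 */
    volatile int pid;         /* 0 marks an unused slot (assumed convention) */
} scan_slot_t;

/* Return the oldest snapshot still in use by any reader, or
 * last_committed when no reader slot is active. */
static txnid_t oldest_reader(const scan_slot_t *slots, size_t n,
                             txnid_t last_committed)
{
    txnid_t oldest = last_committed;
    for (size_t i = 0; i < n; i++) {
        if (slots[i].pid != 0 && slots[i].txnid < oldest)
            oldest = slots[i].txnid;
    }
    return oldest;
}
```

Pages freed by transactions older than the returned ID are then candidates for reuse by the writer.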

Field Documentation

   uint32_t MDB_txbody::mtb_magic
       Stamp identifying this as an LMDB file. It must be set to MDB_MAGIC.

   uint32_t MDB_txbody::mtb_format
       Format of this lock file. Must be set to MDB_LOCK_FORMAT.

   mdb_mutex_t MDB_txbody::mtb_rmutex
       Mutex protecting access to this table. This is the reader table lock used with LOCK_MUTEX().

   volatile txnid_t MDB_txbody::mtb_txnid
       The ID of the last transaction committed to the database. This is recorded here only for convenience; the
       value can always be determined by reading the main database meta pages.

   volatile unsigned MDB_txbody::mtb_numreaders
       The number of slots that have been used in the reader table. This always records the maximum count; it is
       not decremented when readers release their slots.

Macro Definition Documentation

   #define DEFAULT_READERS 126
       Number of slots in the reader table. This value was chosen somewhat arbitrarily. 126 readers plus a
       couple mutexes fit exactly into 8KB on my development machine. Applications should set the table size
       using mdb_env_set_maxreaders().

   #define CACHELINE 64
       The size of a CPU cache line in bytes. We want our lock structures aligned to this size to avoid false
       cache line sharing in the lock table. This value works for most CPUs. For Itanium this should be 128.

   #define MDB_LOCK_FORMAT
       Value:
           ((uint32_t) \
            ((MDB_LOCK_VERSION) \
             /* Flags which describe functionality */ \
             + (((MDB_PIDLOCK) != 0) << 16)))
       Lockfile format signature: version, features and field layout.
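
To make the composition concrete, the sketch below builds a signature the same way and then takes it apart again. The EX_ prefixed macros and their values are hypothetical; the real MDB_LOCK_VERSION and MDB_PIDLOCK are defined in the LMDB source and may differ. The decoding helpers also assume the version number fits in the low 16 bits, which the macro itself does not enforce.

```c
#include <stdint.h>

/* Hypothetical values for illustration only. */
#define EX_LOCK_VERSION 2
#define EX_PIDLOCK      1

/* Same shape as MDB_LOCK_FORMAT: version in the low bits,
 * one feature flag at bit 16. */
#define EX_LOCK_FORMAT \
    ((uint32_t)((EX_LOCK_VERSION) + (((EX_PIDLOCK) != 0) << 16)))

/* Extract the version, assuming it occupies the low 16 bits. */
static uint32_t lock_format_version(uint32_t fmt)
{
    return fmt & 0xFFFFu;
}

/* Extract the PID-lock feature flag stored at bit 16. */
static int lock_format_pidlock(uint32_t fmt)
{
    return (int)((fmt >> 16) & 1u);
}
```

Two builds that disagree on either the version or the feature flag produce different signatures, so a mismatch can be detected when the lock file is opened.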

Name

       readers - Reader Lock Table

Struct MDB_reader

       The actual reader record, with cacheline padding.

   Data Fields

       union {
          MDB_rxbody mrx
          char pad [(sizeof(MDB_rxbody)+CACHELINE-1)
                 & ~(CACHELINE-1)]
       } mru

Struct MDB_rxbody

       The information we store in a single slot of the reader table. In addition to a transaction ID, we also
       record the process and thread ID that owns a slot, so that we can detect stale information, e.g. threads
       or processes that went away without cleaning up.

       Note
           We currently don't check for stale records. We simply re-init the table when we know that we're the
           only process opening the lock file.

   Data Fields

       volatile txnid_t mrb_txnid
       volatile MDB_PID_T mrb_pid
       volatile MDB_THR_T mrb_tid

Struct MDB_txbody

       The header for the reader table. The table resides in a memory-mapped file. (This file is separate
       from the one used for the main database.)

       For POSIX the actual mutexes reside in the shared memory of this mapped file. On Windows, mutexes are
       named objects allocated by the kernel; we store the mutex names in this mapped file so that other
       processes can grab them. This same approach is also used on MacOSX/Darwin (using named semaphores) since
       MacOSX doesn't support process-shared POSIX mutexes. For these cases where a named object is used, the
       object name is derived from a 64-bit FNV hash of the environment pathname. As such, naming collisions are
       extremely unlikely. If a collision occurs, the results are unpredictable.
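
       As an illustration of hashing a pathname to 64 bits, here is a minimal sketch of the FNV-1a variant. The
       text above says only "FNV", so the choice of FNV-1a is an assumption, as is the function name fnv1a_64();
       how the hash value is turned into an object name is not shown.

```c
#include <stdint.h>

/* 64-bit FNV-1a over a NUL-terminated string: XOR in each byte,
 * then multiply by the 64-bit FNV prime. */
static uint64_t fnv1a_64(const char *s)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* 64-bit FNV offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 0x100000001b3ULL;            /* 64-bit FNV prime */
    }
    return h;
}
```

       Because the name depends only on the pathname, every process that opens the same environment path derives
       the same name and so reaches the same kernel object.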

   Data Fields

       uint32_t mtb_magic
       uint32_t mtb_format
       mdb_mutex_t mtb_rmutex
       volatile txnid_t mtb_txnid
       volatile unsigned mtb_numreaders

Struct MDB_txninfo

       The actual reader table definition.

   Data Fields

       union {
          MDB_txbody mtb
          char pad [(sizeof(MDB_txbody)+CACHELINE-1)
                 & ~(CACHELINE-1)]
       } mt1
       union {
          mdb_mutex_t mt2_wmutex
          char pad [(MNAME_LEN+CACHELINE-1)
                 & ~(CACHELINE-1)]
       } mt2
       MDB_reader mti_readers [1]

Synopsis

   Data Structures
       struct MDB_rxbody
       struct MDB_reader
       struct MDB_txbody
       struct MDB_txninfo

   Macros
       #define DEFAULT_READERS   126
       #define CACHELINE   64
       #define MDB_LOCK_FORMAT