Pixels are arranged in memory in rows, with the top row first. Each row occupies an amount of memory
given by the pitch (sometimes known as the row stride in non-SDL APIs).
Within each row, pixels are arranged from left to right until the width is reached. Each pixel occupies a
number of bits appropriate for its format, with most formats representing each pixel as one or more whole
bytes (some indexed formats instead pack multiple pixels into each byte), and a byte order
given by the format. After the last pixel in a row, any bytes remaining before the pitch is reached are
padding used to achieve a desired alignment, and have undefined contents.
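The row and pitch layout described above can be sketched as simple pointer arithmetic. This is only an illustration for formats whose pixels occupy a whole number of bytes; `pixel_at` and `row_padding` are hypothetical helpers and are not part of SDL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an SDL function: returns a pointer to the first
 * byte of pixel (x, y). Assumes each pixel occupies bytes_per_pixel whole
 * bytes, so it does not apply to packed indexed formats. */
static uint8_t *pixel_at(void *pixels, int pitch, int bytes_per_pixel, int x, int y)
{
    assert(pixels != NULL && pitch > 0 && bytes_per_pixel > 0 && x >= 0 && y >= 0);
    /* The pixel must fit inside the row, before any padding ends. */
    assert((size_t)(x + 1) * (size_t)bytes_per_pixel <= (size_t)pitch);

    /* Rows start every `pitch` bytes, top row first; pixels run left to right. */
    return (uint8_t *)pixels + (size_t)y * (size_t)pitch
                             + (size_t)x * (size_t)bytes_per_pixel;
}

/* Hypothetical helper: number of padding bytes at the end of each row.
 * Their contents are undefined and should not be read as pixel data. */
static size_t row_padding(int width, int pitch, int bytes_per_pixel)
{
    return (size_t)pitch - (size_t)width * (size_t)bytes_per_pixel;
}
```

For instance, with 4 bytes per pixel, a width of 3 and a pitch of 16, each row carries 12 bytes of pixel data followed by 4 bytes of padding, and pixel (2, 1) starts 24 bytes into the buffer.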
When a surface holds YUV format data, the planes are assumed to be contiguous without padding between
them. For example, a 32x32 surface in NV12 format with a pitch of 32 would consist of 32x32 bytes of Y
plane followed by 32x16 bytes of UV plane.
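The NV12 example above can be worked through in code. This is a minimal sketch that assumes the UV plane uses the same pitch as the Y plane (as in the example) and that odd heights round the UV row count up, which the text does not state; `Nv12Layout` and `nv12_layout` are hypothetical names and are not part of SDL.

```c
#include <stddef.h>

/* Hypothetical description of where each NV12 plane sits in one buffer. */
typedef struct {
    size_t y_size;    /* bytes in the Y plane */
    size_t uv_offset; /* byte offset of the interleaved UV plane */
    size_t uv_size;   /* bytes in the UV plane */
    size_t total;     /* bytes in the whole image */
} Nv12Layout;

/* Hypothetical helper, not an SDL function. */
static Nv12Layout nv12_layout(int pitch, int height)
{
    Nv12Layout layout;

    /* One row of `pitch` bytes per luma row. */
    layout.y_size = (size_t)pitch * (size_t)height;
    /* Planes are contiguous, so UV starts right after Y. */
    layout.uv_offset = layout.y_size;
    /* Chroma has half as many rows; rounding up for odd heights is an assumption. */
    layout.uv_size = (size_t)pitch * (size_t)((height + 1) / 2);
    layout.total = layout.y_size + layout.uv_size;
    return layout;
}
```

With a pitch of 32 and a height of 32 this gives a 1024-byte Y plane followed by a 512-byte UV plane at offset 1024, matching the 32x32 and 32x16 figures above.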
When a surface holds MJPG format data, the pixels field points at the compressed JPEG image and the pitch
field holds the length of that data.
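In the MJPG case the buffer is therefore treated as one opaque blob rather than as rows. As a rough sketch, assuming the length is counted in bytes (the text does not give a unit), the data can be sanity-checked by looking for the JPEG start-of-image marker; `looks_like_jpeg` is a hypothetical helper and is not part of SDL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an SDL function: treats `pixels` as a compressed
 * JPEG stream of `pitch` bytes (byte units are an assumption) and checks
 * that it begins with the JPEG start-of-image marker 0xFF 0xD8. */
static bool looks_like_jpeg(const void *pixels, int pitch)
{
    const uint8_t *data = (const uint8_t *)pixels;

    if (data == NULL || pitch < 2) {
        return false;
    }
    return data[0] == 0xFF && data[1] == 0xD8;
}
```

A check like this only confirms that the data starts like a JPEG stream; it does not validate or decode the image.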