The audio data was initially read and written in chunks of 2352 bytes,
which is the usual sector size for audio CDs. However, this didn't work
on images with a 2448-byte sector size (2352 bytes of data + 96 bytes of
sub-channel data): we would end up reading data from the NRG footer.
The data is now read and written in 4 MiB chunks (which is faster
anyway), and we make sure to read the remaining bytes when the audio
data size is not a multiple of 4 MiB.
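
A minimal sketch of the chunked copy described here, in Python. It is an
illustration only, not the actual code of this change: the name
copy_audio and its parameters are hypothetical, and the source is
assumed to be a binary stream already positioned at the start of the
audio data, with the audio length in bytes known in advance.

```python
import io

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB


def copy_audio(src, dst, length, chunk_size=CHUNK_SIZE):
    """Copy exactly `length` bytes from src to dst (hypothetical helper).

    Each read is capped at the number of bytes still owed, so the last
    read covers only the remainder and never runs into whatever follows
    the audio data (for example a footer).
    """
    remaining = length
    while remaining > 0:
        data = src.read(min(chunk_size, remaining))
        if not data:
            # The stream ended before `length` bytes were available.
            raise EOFError(f"image truncated: {remaining} bytes missing")
        dst.write(data)
        remaining -= len(data)
    return length
```

Capping each read at min(chunk_size, remaining) makes the copy
independent of the sector size: 2352-byte and 2448-byte sectors are
handled the same way, because only the total length matters.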