    Lower the memory footprint of the initial message cache sync · 72a5703e
    Christoph Wurst authored
    
    The initial message sync has to fetch potentially large amounts of data
    and insert them into the database. To work around limitations of sync
    requests triggered by web requests, the process had already been made
    interruptible and resumable. This means we never insert all the data
    at once. Yet, the IMAP code fetched all UIDs before we capped them to a
    maximum number of results per sync attempt. Depending on the mailbox
    size, this operation could allocate a lot of memory. On some
    setups with lower memory limits, the process was aborted by the web
    server due to PHP memory exhaustion.
    
    This patch modifies the IMAP code to optimize the memory usage by
    limiting the amount of data that is fetched with each initial sync
    attempt. The algorithm works as follows.
    
    IMAP allows us to search in a range with a lower and an upper bound UID.
    While we know the highest known UID from the current cache values, we
    can't derive the range for the next page from that, as UIDs are not
    contiguous but might have holes due to deleted messages. If we assume
    that the messages of a mailbox are distributed roughly evenly across the
    assigned UIDs, we can guess the max UID for the next range.
    So we ask the server for the min and max UIDs. The min or our known
    highest UID is always the lower bound. Then we can calculate the
    distribution rate from the min, max and number of messages and build the
    upper bound. On average this will fetch about the expected number of
    messages. It could be more, but it could also be less. It shouldn't
    matter in most cases.
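
    The range estimate described above can be sketched roughly as follows.
    This is an illustrative Python sketch, not the patch's PHP code: the
    function name, its parameters, and the choice to start one past the
    highest known UID are assumptions made for the example.

```python
import math


def estimate_uid_range(min_uid, max_uid, total_messages, highest_known_uid, max_results):
    """Guess the (lower, upper) UID bounds for the next sync page.

    Hypothetical helper: assumes messages are spread roughly evenly over
    the UIDs between min_uid and max_uid. Returns None when nothing is left.
    """
    if total_messages <= 0 or max_results <= 0:
        raise ValueError("total_messages and max_results must be positive")

    # Lower bound: the mailbox minimum on the first attempt, otherwise
    # (assumption) the UID right after the highest one already cached.
    if highest_known_uid is None:
        lower = min_uid
    else:
        lower = max(min_uid, highest_known_uid + 1)
    if lower > max_uid:
        return None

    # Distribution rate: how many UIDs are "used up" per existing message.
    # It is 1.0 without holes and grows as more messages get deleted.
    uids_per_message = (max_uid - min_uid + 1) / total_messages

    # Upper bound: a window wide enough to hold about max_results messages,
    # never beyond the highest UID the server reported.
    width = max(1, math.ceil(uids_per_message * max_results))
    upper = min(max_uid, lower + width - 1)
    return lower, upper
```

    With UIDs 1 to 1000, 500 messages and a page size of 100, the first
    window would be 1:200 and the next one 201:400. Each window may hold
    more or fewer than 100 messages, depending on where the holes are.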
    
    Signed-off-by: Christoph Wurst <christoph@winzerhof-wurst.at>
ImapToDbSynchronizer.php 9.45 KiB