Schedule

  Arnt is away 2008-08-07 to -16 (good net access except mon-wed)


3.0.0: ?

  log-level default changed to significant, some logging adjustments
  to make significant DWIM.

  Added syslog support. Not done: Bypass logd, so we don't have to run
  logd at all if we're using syslog.

  Digest-MD5 disabled by default due to interoperability problems, can
  be enabled if desirable.

  IMAP VIEW command removed, views remain supported.

  PIDFILEDIR is now $(PREFIX)/lib/pidfiles.

  Note needed for upgraders: logrotate/newsyslog/whatever has to be
  told about the new location.

  Fetching is faster in many cases, particularly when many messages
  are involved.

  Added check-sender-addresses. It's neat.

  Schema changes: A partial index on header_fields, so we can locate
  messages based on message-id. Some indexes on (message) to speed up
  "aox vacuum" (by speeding up foreign key lookups to messages from
  date_fields and mailbox_messages).

  The connections table.

  Added protection against the race condition described in RFC 1047.

  The injector can now inject multiple messages with one transaction.

  aoximport uses the faster injector.

  The IMAP MULTIAPPEND extension is now supported (this is basically
  IMAP access to the faster injector).

  Bodyparts can have duplicate MD5 hashes now.

  aox add alias newalias@whatev.er otherlocaladdress@whatev.er

  aox show search. Needs more search types.

  aox undelete and aox add view now use search expressions, e.g.
    aox undelete /users/arnt/inbox from oryx.com

  New command aox tune database, to tune the database for the expected
  load. Three types: Mostly writing, mostly plain reading and mostly
  reading with much searching. (The third is a no-op still.)

  Not done: aox set retention, aox show retention.

  Not done: Views on the web.

  Not done: Turning thread-index into references.

  Not done: Axel wanted namespaces or something like that. I didn't
  quite understand, but I understood enough to think that it's a good
  idea and 3.0 is the right time to introduce it.

  Refer to Google's copy of jquery instead of our own.
  http://ajax.googleapis.com/ajax/libs/jquery/1.2.6/jquery.min.js


Release cycle

  Five weeks, t-35 to t (a Monday).

  t-38: we decide on a feature list for the release.

  t-14: Whatever hasn't been started is dropped, whatever cannot be
  completed before t-7 is laid aside for the next release. Write the
  announcement text.

  t-7: Fork the release and remove any code that needs to be removed.

  t-4: Roll the tarball and do the release chores.

  t-3: See t-38 above (t-3 of this release is t-38 of the next).

  t-0: A crontab does what it needs to do.


The value of features

  Archiveopteryx provides online archiving. We want the features we
  add to improve one or more of these:
  - adding mail to the archive
  - accessing the archive
  - managing the archive

  For nontechnical reasons I add a fourth goal:
  - making current users happy

  Each new feature should help in some way; the new features that help
  the most are the ones we need most (in the long term - in the short
  term we also need to help existing users).


The next eight months

  - Message retention
      User interface
      Message identification
      Logging changes to the message
  - aoximport improvements
  - Full text search
  - More documentation
      aox operations guide
      best practice papers
  - Better documentation
  - VIEW improvements
      INTHREAD
      multi-mailbox searches
      miscellaneous
  - Account management
      ACL management
      VIEW management
  - Recording the fate of outgoing mail
  - Proper full-text searching
  - Better exploratory search
  - Useful webmail
  - Address search
      Web UI
      IMAP X-extension
      Exporting to addressbooks
  - Monitoring
      Graphing


Installer doesn't take steps to ensure that the installation is usable

  It could do at least two things:

  Run all the same checks on the new installation as 'aox check' and
  archiveopteryx at startup.

  Try to connect to all the server addresses and if anything's
  listening anywhere, mention it on stdout.

  In addition to this, the installer does the wrong thing right now if
  it creates the database users and then fails to run psql to load the
  schema. It exits with an error, which means the randomly-generated
  passwords are lost, because the configuration file is not written.


db-address=localhost works, but needs improvement.

  - A new connect(addr, port) function resolves the given address and
    creates one SerialConnector object for each result. It starts the
    first one, which (after an error, or a delay of 1s) initiates the
    next connection in line and so on. The first one to connect swaps
    out the d of the original connection with its own, and makes the
    EventLoop act as though it had just connected.

  It works, but the code is a little ugly. The error handling logic
  needs a careful look after some time. Once that's solid, the other
  callers (SpoolManager/SmtpClient etc.) can be converted.


aox check schema

  This command would check several things.

  a) that dbuser has the needed rights
  b) that all the right tables are there, and all the right columns,
     with the right types, and no unexpected constraints
  c) that all the right indexes are there
  d) that dbowner owns everything
  i) that inserts that would duplicate a constraint are properly
     recognised

  As a bonus, perhaps it could list some unexpected/unknown deviations:
  e) locally added tables
  f) locally added columns
  g) locally added indices
  h) missing constraints


  Change 43939 and following move towards this: the idea is to
  introduce new functions e.g. Schema::checkIntegrity (in addition to
  checkRevision) and Schema::grantPrivileges, that can be used both by
  aox check schema/aox grant privileges/whatever, and also by the
  installer (instead of lib/grant-privileges, and instead of the
  half-hearted checking it does now). The server is essentially
  unaffected, it just uses Database::checkSchema/checkAccess for a
  quick check.

  This sounds ok, but it's ugly because Schema::execute is completely
  given to upgrading the schema, and neither can nor should be
  repurposed to do other things besides. So that means more static
  functions in Schema and separate EventHandlers to do the
  checking/granting/whatevering. But that's okay.


Items for web site

  In addition to stuff in the operations guide:

  - List/description of IMAP/POP/SMTP extensions
  - RFC pages
  - man pages (old ones too)
  - Source documentation
  - Best practices
  - FAQ
  - Version-specific pages
  - Download-specific pages
  - Client-specific pages


Various TODO items from earlier releases.

  Message retention policy

  - Soft-quota/archival stuff too?
  - Message arrival tag (for archiving)

  Miscellaneous:

  - Database replication support (local mirror)

  THREAD?

  Basic administration using the httpd

  Indexing for DOC bodyparts

  aox backup/restore (or similarly helpful procedure)

  Full-text search


The \Answered flag

  We could add a little code to help that flag.

  1. Disallow clearing it once set.
  2. Set it on messages when we see an outgoing reply.

  This would make it easier to force archiving.


5173

  5173 is new. I wrote an implementation, need to check that the RFC
  didn't shift and mention 5173 somewhere.


5183

  Want to implement? Not sure. I reviewed it some time ago.


5256

  Just mention (sort)


5258

  Listext needs verification.


5255

  Imapext-i18n. What?


5257

  No changes should be necessary


Cleartext passwords

  We can help sites migrate away from cleartext/plaintext passwords:

  1. We also store SCRAM and similar secrets in the DB (secrets which
     aren't password equivalents)
  2. We extend the users table with two new columns, 'last time
     cleartext was needed' and 'number of successful authentications
     without cleartext password usage since cleartext'.
  3. If a client uses SCRAM, we increment the counter.
  4. If a client uses CRAM or PLAIN, we reset counter and set the time
     to today.
  5. We provide some helping code to delete passwords for users with a
     high count and a long-ago time.
  6. We add documentation saying that if you disable auth-this and
     auth-that, you can disable store-plaintext-passwords.
  7. We add configuration/db sanity checks for ditto.
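
  Points 2-5 could be sketched roughly as below. This is only an
  illustration: CleartextUsage, recordSuccess and cleartextLooksUnused
  are made-up names, the thresholds are left as parameters because
  nothing above fixes them, and ignoring mechanisms other than SCRAM,
  CRAM and PLAIN is an assumption.

```cpp
#include <ctime>
#include <string>

// Hypothetical per-user bookkeeping for points 2-5. The names are
// illustrative; they are not existing columns or classes.
struct CleartextUsage {
    std::time_t lastCleartextNeeded = 0; // last time cleartext was needed
    unsigned int scramSuccesses = 0;     // successes without cleartext since then
};

// Update the bookkeeping after one successful authentication.
void recordSuccess(CleartextUsage &u, const std::string &mechanism,
                   std::time_t now) {
    if (mechanism.rfind("SCRAM-", 0) == 0) {
        // Point 3: no cleartext password was needed.
        ++u.scramSuccesses;
    } else if (mechanism == "PLAIN" || mechanism.rfind("CRAM-", 0) == 0) {
        // Point 4: cleartext was needed, so start counting again.
        u.scramSuccesses = 0;
        u.lastCleartextNeeded = now;
    }
    // Other mechanisms are left alone here; that is an assumption.
}

// Point 5: is this user a candidate for having the password deleted?
// The thresholds are parameters because the text does not fix them.
bool cleartextLooksUnused(const CleartextUsage &u, std::time_t now,
                          unsigned int minSuccesses, double minAgeSeconds) {
    return u.scramSuccesses >= minSuccesses &&
           std::difftime(now, u.lastCleartextNeeded) >= minAgeSeconds;
}
```

  The helper for point 5 would then just walk the users table and
  call something like cleartextLooksUnused() with site-chosen limits.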


Database schema range

  People occasionally need to access the db with an old version of
  mailstore. I suggest that we:

  a) add a 'writable_from' column specifying the oldest version that
     can write to the database.
  b) add a 'readable_from' column specifying the oldest version that
     can read the database

  aox upgrade schema would update writable_from to the oldest schema
  version for which a writer would do the right job. This would often
  change when a table changes, but not when a table is added.

  readable_from would be the oldest revision that can read the database.

  When the server starts up, it would check:

  - am I >=writable_from? If so, mailboxes can be read-write
  - else, am I >=readable_from? If so, startup can proceed, but all
    mailboxes are read-only. lmtp, smtp and smtp-submit do not start.
  - else, quit.
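
  The startup check above is small enough to sketch. The enum and
  function names are invented for illustration, revisions are assumed
  to be plain integers, and the comparison is taken to be inclusive
  (the oldest version that can write may itself write), which is my
  reading rather than something settled.

```cpp
// What the server may do with the database at startup.
enum class AccessMode { ReadWrite, ReadOnly, Refuse };

// Hypothetical helper: decide the access mode from the server's own
// schema revision and the two proposed columns.
AccessMode accessMode(unsigned int serverRevision,
                      unsigned int writableFrom,
                      unsigned int readableFrom) {
    if (serverRevision >= writableFrom)
        return AccessMode::ReadWrite; // mailboxes can be read-write
    if (serverRevision >= readableFrom)
        return AccessMode::ReadOnly;  // no lmtp, smtp or smtp-submit
    return AccessMode::Refuse;        // quit
}
```

  In the ReadOnly case the caller would skip starting the lmtp, smtp
  and smtp-submit listeners.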

  And in order to handle database updates, I suggest another table,
  'features', with a single string column. When aox update database
  fixes something, it inserts a row into features. A modern database
  would have two rows in this table, 'numbered address fields' and 'no
  nulls in bodyparts'.


Sieve

  Alexey suggests the following extensions (in the following order):
  Vacation, reject, imapflags, subaddress.

  We still haven't done imapflags.

Reject or ereject may want MDNs instead of DSNs

  Our sieve code generates DSNs if it can't generate protocol-level
  refusal, which it always can in practice. MDNs should be used in at
  least some cases, if a non-zero percentage of zero can be considered
  "some cases".


RFCs 2852 and 4865

  Easy to do once we have port 587; we need to set the start and end
  columns right, that's about it.


Message tracking

  RFCs 3885-8 specify ways to track messages that have been sent. We
  can implement that fairly easily.

  If we route outgoing mail via a smarthost, that smarthost has to
  support MTRK in order for tracking to work well.

  We can track mail provided that at least one of these is true:

  - we deliver directly to the end server (we don't know whether
    that's the case, though)

  - we deliver via an MTRK-capable server

  - we deliver into our own database

  Sounds likely to be true maybe 80-90% of the time.

  If none are true, we can at least say, easily, where we delivered,
  when, and why.

  We could implement the tracking protocol (and I'd write a query blah
  in mailchen), and also provide a query interface via the web.


Message tracking 2

  We can recognize the ESMTP id for the most common MTAs. Postfix says:

    250 Ok: queued as D1A324AC85

  Sendmail and exim surely say something similar. We could keep that ID
  in delivery_recipients and use it in DSNs.
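
  For the Postfix case, pulling the ID out of the reply might look
  like the sketch below. postfixQueueId is a made-up name, and the
  code assumes that the ID is the run of letters and digits after
  "queued as ". Other MTAs would need their own patterns.

```cpp
#include <cctype>
#include <optional>
#include <string>

// Hypothetical helper: extract the queue ID from a reply such as
// "250 Ok: queued as D1A324AC85". Returns nothing if the reply is not
// a 250 or has no recognisable ID.
std::optional<std::string> postfixQueueId(const std::string &reply) {
    const std::string marker = "queued as ";
    if (reply.compare(0, 4, "250 ") != 0)
        return std::nullopt;
    std::string::size_type pos = reply.find(marker);
    if (pos == std::string::npos)
        return std::nullopt;
    std::string::size_type start = pos + marker.size();
    std::string::size_type end = start;
    // Assumption: the ID consists of ASCII letters and digits only.
    while (end < reply.size() &&
           std::isalnum(static_cast<unsigned char>(reply[end])))
        ++end;
    if (end == start)
        return std::nullopt;
    return reply.substr(start, end - start);
}
```

  The result is what would go into delivery_recipients.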


Generating bounces

  Our bounces would look better if they included the entire SMTP
  conversation (starting with RSET or EHLO).


Thread-Index from Exchange

  We could look at Thread-Index and reconstruct In-Reply-To in order to
  properly thread messages we a) aoximport from Exchange, or b) receive
  during a conversation with an Exchange user.


Bounces and DSNs

  Mail is currently fairly reliable. There is one big exception:
  Bounces aren't 100% parsable. But generally, if you work hard, you
  can know whether a message was delivered or not, and mostly they are
  delivered.

  So we benefit from converting the most common nonstandard bounces to
  DSNs, and then treat them as DSNs.

  For nonstandard bounces (like those of qmail) we identify the
  message by trying hard, do some hacky parsing, use the bounce
  (excluding trailing message) as first part of the DSN multipart,
  cook up a new DSN report based on the parsing, and save
  text/822-headers as a third part.

  Then, searches that tie bounces together with messages sent work
  even better.

  (Another trick we can/should use is to see whether the host we
  deliver to seems to be the final destination based on earlier
  (answered) messages.)


Memory use for common operations

  Some use _vastly_ too much memory. I saw a single IMAP FETCH for a
  mere 4200 messages use 173MB yesterday. (Later note: This should be
  gone, gone, gone, but it would still be good to check that these
  problems don't reappear, reappear, reappear.)

  The most elegant way to solve that would be to supervise memory
  usage for a known sequence. Inject these ten thousand messages,
  check memory use (via the grapher), connect to the imap port, run
  this and that, check again, connect to the imap port, run this and
  that, check again, connect to the imap port, run this and that,
  check again... just a bunch of checks. Ignore output.


RFC 2554

  Have to look closer at that, I hadn't grasped the MAIL FROM AUTH
  issue and there may be more.

  addParams( "auth", ... ) in MAIL FROM needs consideration. Not
  important.


imapd/handlers/acl.cpp

  Different tasks, some shared code, same file. Separate this out into
  different classes inheriting something. Then add the right sort of
  logging statement to the end of parse().


Threads in Archiveopteryx 4.0

  I'm growing more and more fond of using a few threads, and not using
  server-processes any more.

  We'd replace server-processes with server-threads.

  The core event loop would create a queue of work to be done, and
  worker threads would take a piece of work, obtain the necessary
  lock, and do it.

  I think we'd have one lock for each SaslConnection and perhaps a
  couple more. Scope would know what locks are needed in order to work
  in that Scope.

  We'd be able to collect garbage without halts. We'd be able to serve
  all clients fairly, using all cores, using fewer database backends
  than with server-processes.


Full-text search

  There Be Problems.

  The code now assumes that the IMAP client searches for one or more
  words, rather than an arbitrary substring. Postgres uses word
  segmentation.

  If postgres were to use e.g. overlapping three-letter languageless
  substrings, we would do what IMAP wants. Sounds senseless.

  We also have a requirement to stem search arguments less.
  Specifically, a search for ARM7TDMI should not return messages about
  the ARM6 or about my left arm.


Convert more parsers to use AbnfParser

  There are still a few places where we roll our own messy parsers and
  suffer for it (e.g. HTTP, DigestMD5). We know they work, but making
  them use AbnfParser in a spare moment would be an act of kindness.


aox/conf/tls-certificate

  Those variables are not well described. We need a bit more.

  Also, -secret is probably misnamed, we use -password for other
  cases. I expect that's why aox show cf tls-certificate-secret yields
  while e.g. aox show cf db-password does not.


METADATA

  Needed for lemonade, as easy as annotate.


Autoresponder

  We have vacation now, but it isn't quite right for autoresponses.
  Sieve autorespond should be like this:

  1. :quote should quote the first text/plain part if all of the
     following are true:

     1. The message is signed, and the signature verified (using any
        supported signature mechanism, DKIM SHOULD be supported).
     2. The first text/plain part does not have a Content-Disposition
        other than inline.

     If any of the conditions aren't true, :quote shouldn't quote.

     If there's a signature block, :quote shouldn't quote that.

     If the quoted text would be more than ten lines, :quote may crop
     it down as much as it wants, ideally by skipping lines starting
     with '>', otherwise by removing the last lines.

  2. :subject, :from and :addresses as for vacation.

  3. :cc can be used to send a copy to the specified From address.

  4. The default :handle should not be based on the quoted text.

  5. Two text arguments, one for text before the quoted text, one for
     text after the quoted text.

  6. The autoresponse goes to the envelope sender, as some RFC
     requires. So we want an option to skip the response unless the
     return-path matches reply-to (if present) or From (unless
     reply-to is present).


Received fields

  We tend to say "with esmtp", and we probably could be more lucid
  about submission users. Authenticated user etc.


Message arrival tag

  Once annotate is done, we want a tag, ie. a magic annotation which
  stays glued to the message wherever it goes, even after copy/move.

  We also want a way to store the original RFC822 format somewhere
  inside and/or outside the database, indexed by the arrival tag
  identifier. It's good if the tag is split, so we can have "x-y"
  where X is the CD/DVD number and Y is the file on the CD/DVD. Or
  something like that.


Message Retention Policy Framework

  A lot of sites will want explicit policies regarding what mail may
  not be deleted, what may be deleted, and what must be deleted. We
  can support that well.


C/R

  C/R sucks. But it has its uses, so we can benefit from implementing
  it somehow. Here are some classes of messages we may want to treat
  specially:

  - replies to own mail
  - messages in languages not understood by the user
  - mail from previously unknown addresses
  - mail from freemail providers
  - vacation responses from unknowns
  - messages likely, but not certain to be out-of-office-autoreply
  - dkim/mass-signed messages (if verified)

  The questions are: How can we ensure that we almost never challenge
  real mail, while simultaneously challenging most/all messages that
  don't come from valid senders? How can we provide suitable
  configuration?


Allocator

  The binary tree used in Allocator is likely to be extremely
  unbalanced sometimes, perhaps often. Using a Patricia tree will work
  well.


Using rrdtool

  What could we want to graph with rrdtool? Lots.

  - CPU seconds used
  - database size
  - messages in the db
  - average response time
  - 95th percentile response time
  - messages per user
  - message size per user
  - average query execution time
  - average query queue size

  More?

  http://jwatt.org/svg/authoring/ is interesting for generating graphs
  via the web interface.


Box features

  1. web ui to set up view mailboxes (and to search the archive
     generally)

  2. web ui to configure sieve

  3. web administration, to add users, etc.

  4. i18n for all web-accessible anythings, and after that, for aox.

  5. rrd stuff available next to aox in the boxes, perhaps nagios
     stuff too


SASL NTLM authentication

  It may be odd and undocumented, and it may not be as strong as
  DIGEST-MD5, but it's implemented in Certain Clients ;)

  http://www.innovation.ch/java/ntlm.html seems to be a reasonable
  description. Cyrus also implements it.

  http://davenport.sourceforge.net/ntlm.html ?


Split the folder view into pages.

  Need to decide on what a page is. "Most recent 25 threads" is a
  slippery concept when a new thread is created between page views.

  One probably good way: Use "after" and "before", so "next" would
  point to "most recent 25 threads after the last one on this
  page". Doable, not bad, and with the aid of a new Session subclass
  we can even include a note when there's new activity.
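
  A sketch of the "before" half of that idea, under some assumptions:
  each thread has a numeric activity key (say, the time of its newest
  message), keys are unique, and the list is already sorted newest
  first. ThreadRef and pageBefore are invented names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical thread summary: an id plus a key that orders threads
// by recency (larger means more recent).
struct ThreadRef {
    long id;
    long lastActivity;
};

// Return up to pageSize threads strictly older than the cursor. The
// cursor is the lastActivity of the last thread on the previous page,
// so a thread created in the meantime cannot shift the page.
std::vector<ThreadRef> pageBefore(const std::vector<ThreadRef> &newestFirst,
                                  long cursor, std::size_t pageSize) {
    std::vector<ThreadRef> page;
    if (pageSize == 0)
        return page;
    for (const ThreadRef &t : newestFirst) {
        if (t.lastActivity >= cursor)
            continue; // on an earlier page, or newer than it
        page.push_back(t);
        if (page.size() == pageSize)
            break;
    }
    return page;
}
```

  Threads skipped because they are newer than the first page are
  exactly what the "new activity" note would report.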


We should be able to use a read-only local database mirror.

  That way, we can play nicely with most replication systems.

  The way to do it: add a new db-mirror setting pointing to a
  read-only database mirror. All queries that update are sent to
  db-address, all selects are sent to db-mirror. db-mirror defaults to
  db-address.
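
  The routing rule could be sketched as below. This is deliberately
  naive and the names are made up: it looks only at the first keyword,
  so it would misroute a select that has side effects (nextval(),
  SELECT ... FOR UPDATE), and it ignores transactions entirely.

```cpp
#include <cctype>
#include <string>

// Where a query should be sent.
enum class DbTarget { Address, Mirror };

// Hypothetical helper: plain selects go to db-mirror, everything else
// goes to db-address. Only the first keyword is inspected.
DbTarget targetFor(const std::string &sql) {
    const std::string keyword = "select";
    std::string::size_type i = 0;
    while (i < sql.size() &&
           std::isspace(static_cast<unsigned char>(sql[i])))
        ++i;
    if (sql.size() - i < keyword.size())
        return DbTarget::Address;
    for (std::string::size_type k = 0; k < keyword.size(); ++k) {
        if (std::tolower(static_cast<unsigned char>(sql[i + k])) != keyword[k])
            return DbTarget::Address;
    }
    // Make sure the keyword ends here ("selection" is not a select).
    std::string::size_type next = i + keyword.size();
    if (next < sql.size() &&
        (std::isalnum(static_cast<unsigned char>(sql[next])) ||
         sql[next] == '_'))
        return DbTarget::Address;
    return DbTarget::Mirror;
}
```

  A real version would probably want the caller to say whether a
  query updates, rather than guessing from the text, and would have
  to think about reading one's own writes from a lagging mirror.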


Add Maildir support to the migrator.

  Currently works, with two exceptions: Submaildirs don't work, and
  courier's extended flags don't work.


We should test multipart/signed and multipart/encrypted support.

  We must add a selection of RFC 1847 messages to canonical, and make
  sure they survive the round trip. No doubt there will be bugs.


We should store bodyparts.text for PDF/DOC.

  We need non-GPLed code to convert PDF and DOC to plaintext.

  Or maybe we need a generic interface to talk to plugins.


Switch to using named constraints everywhere.


Default c-t-e of PGP signatures

  Right now we give them binary. q-p or 7bit would be better, I think.

  What other application/* types are really text?

  From a conversation the other day: we could avoid base64 encoding an
  entity whose content-type is not text if it contains only printable
  ASCII. I don't know if it's worth doing, though.

  The problem with doing that is that it treats sequences of CR LF, CR
  and LF as equivalent. An application/foobar object that happens to
  contain only CR, LF and printable ASCII can be broken.
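
  One way around that problem would be a stricter test: skip base64
  only if the entity is printable ASCII and every line break is
  exactly CR LF, so that nothing is normalised. A sketch, with an
  invented function name and ignoring line-length limits:

```cpp
#include <string>

// Hypothetical check: true if the body contains only printable ASCII
// (0x20-0x7E) and CR LF pairs. A bare CR or bare LF makes it false,
// since rewriting those could change an application/* object.
bool safeWithoutBase64(const std::string &body) {
    for (std::string::size_type i = 0; i < body.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(body[i]);
        if (c == '\r') {
            if (i + 1 >= body.size() || body[i + 1] != '\n')
                return false; // bare CR
            ++i;              // skip the LF of this CR LF pair
        } else if (c == '\n') {
            return false;     // bare LF
        } else if (c < 0x20 || c > 0x7E) {
            return false;     // control character or non-ASCII
        }
    }
    return true;
}
```

  Whether that is worth doing is still the open question.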


recognising spam

  The good spam filters now all seem to require local training with
  both spam and nonspam corpora. We can do clever stuff... sometimes.

  Instead of filtering at delivery, we can filter when a message
  becomes \recent. When we increase first_recent, we hand each new
  message to the categoriser, and set $Spam or $Nonspam based on its
  answer.

  This lets the categoriser use all the information that's available
  right up to the moment the user looks at his mail.

  We can also build corpora for training easily. All messages to which
  users have replied are nonspam, replies to messages from local users
  are nonspam, messages in certain folders are spam, messages with a
  certain flag are spam.

  We can connect to a local server to ask whether a message is spam.
  They seem to work that way, but with n different protocols.


TLS client support (smtp, postgresql)


"Writing Secure Code"

  We have a page about security, /mailstore/security.html, and a
  section of the mailstore.7 man page mentions it too.

  We need to look at ISBN 0735617228 and improve security.html with
  points from it. It could also be that we'll improve the code itself.


udoc stuff:

  1. Support a single level of nested classes. (What file names to use
     for output?)
  2. Support enum annotation.
  3. Suppress empty <p>, duplicate anchor names in output.


Udoc web pages chores

  Add "Related Pages" etc. Clarify where background.html
  fits. usage.html is an orphan now; should it become a manpage?


Play with PITR and write /ams/pitr.html


Document IPC structure

  Some man page, or some web page, or both, should say who's
  connecting to who and why.


Add a web page about the charset encoding.

  It's a novel and good algorithm, so we can make a good page about
  it. We also can link to data sources there.

  The documentation for Codec::byString() should mention that page's
  URL.


Make a web page about our licensing

  Not sure what to say there. The purpose of the page would be to
  direct people to one of the two others, really. And to be linked to
  from the home page.


The "Database" link on home page

  Where should it go? People might click it wondering why to use a
  database instead of flat files and wanting to know what we do with
  databases.


Search ourselves, not via google

  Or maybe farm that out, get google to search with an approximation
  of our design. http://www.google.com/faq_freewebsearch.html may be
  interesting.


System flags in mailbox_messages

  We're doing a lot of work on system flags, and we update
  mailbox_messages for every single system flag change. Maybe we
  should go back to storing \seen and \deleted in mailbox_messages.

  Changes needed:

  1. Store needs to learn a new query. Dead easy
  2. Fetcher needs to learn to fetch the flags. Slightly
     tricky. An hour's work.
  3. Selector needs to learn to search on the special cases.
  4. Injector needs to set those flags if necessary.

  The schema change is a little evil. Lots of rows changed.

  Unfortunately I don't think this is doable in 3.0. 3.1, yes. 3.0,
  no. We want 2.11 to use the same schema as 3.0.0, and practically
  all of the code involved has been rewritten since 2.11.

  I think this is something we want to do in 3.1.0 and 3.0.2. We make
  2.11 schema-compatible with 3.0.0, 2.12 with 3.0.1, but after 2.12
  that stops. 3.0.2 can be schema-compatible with 3.1.0.


Logging to syslog

  The logd HUP handler can switch to syslog if it can't reopen the
  logfile instead of exiting.


Rendering webmail HTML is presumably good, but...

  We have a potential security hole: A malevolent HTML bodypart is
  forwarded as is on the "download bodypart" page.

  See http://ha.ckers.org/xss.html


Faster mapping from unicode to 8-bit encodings

  At the moment, we use a while loop to find the right codepoint in an
  array[256]. Mapping U+00EF to latin-1 requires looping from 0 to
  0xEF, checking those 239 entries.

  We could use a DAG of partial mappings to make it faster. Much
  faster. Mapping U+20AC to 8859-15 would require just one lookup: In
  the first partial table for 8859-15. Mapping U+0065 to 8859-15 would
  require three: In the first (U+20AC, one entry long), in the
  fallback (U+00A0, 96 entries long) and in the last (U+0000, 160
  entries long).

  Effectively, 8859-15 would be a first table of exceptions and then
  fall back to 8859-1.

  The tables could be built automatically, compiled in, and would be
  tested by our existing apparatus.

  Or we could do it simpler and perhaps even faster: Make a local
  array from unicode to target at the start, fill it in as we go, and
  do the slow scan only when we see a codepoint for the first time.
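
  The simpler variant might look like the sketch below. LazyEncoder
  is a made-up class, not the existing Codec code, and it uses a hash
  map where the text says "local array". That swap is mine, to keep
  the example short.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical encoder for one 8-bit charset. table[i] is the unicode
// codepoint of byte i. Each codepoint is looked up by the slow scan
// only once; after that the answer comes from the cache.
class LazyEncoder {
public:
    explicit LazyEncoder(const std::uint32_t (&table)[256])
        : table_(table) {}

    // Returns the byte for the codepoint, or -1 if the charset cannot
    // represent it. Negative answers are cached too.
    int encode(std::uint32_t codepoint) {
        auto hit = cache_.find(codepoint);
        if (hit != cache_.end())
            return hit->second;
        int result = -1;
        for (int i = 0; i < 256; ++i) {
            if (table_[i] == codepoint) {
                result = i;
                break;
            }
        }
        ++slowScans_;
        cache_[codepoint] = result;
        return result;
    }

    // Number of slow scans so far, for testing and curiosity.
    unsigned int slowScans() const { return slowScans_; }

private:
    const std::uint32_t *table_;
    std::unordered_map<std::uint32_t, int> cache_;
    unsigned int slowScans_ = 0;
};
```

  Typical text uses few distinct codepoints, so the slow scan would
  run a handful of times per message at most.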


Multipart/signed automatic processing

  We could check signatures automatically on delivery, and reject bad
  signed messages.

  The big benefit is that some forgeries are rejected, even though the
  reader and the reading MUA don't do anything different.

  The disadvantage is that we (probably?) can't verify all signatures,
  which gives a false sense of security for the undetectable forgeries.

  In case of PKCS7, it's possible to self-sign. Those we cannot
  check. In that case we remove the signature entirely from the MIME
  structure, so it doesn't look checked to the end-user.

  PGP cannot be checked, except it sort of can. We can have a small
  default keyring including the heise.de CA key and so on, and treat
  that as root CAs, using the keyservers to dig up intermediate keys.


3.1.0: ?

  Make Query::bind() completely binary.

  Move the most common flags into mailbox_messages.


PGP automatic processing

  Apparently there are five different PGP wrapping formats. We could
  detect four and transform them to the proper MIME format.


Plugins

  It's not given that we want to accept all mail. If we don't, who
  makes the decision? A sieve script may, and refuse/reject mail it
  does not like. And a little bit of pluginnery may. I think we'd do
  well to support the postfix plugin protocol, so all postfix policy
  servers can work with aox. (All? Or just half? Doesn't postfix have
  two types of policy plugins?)

  We may even support site-wide and group-wide sieve scripts and
  permit a sieve script to invoke the plugin. A sieve statement like
  this?

     UsePolicyServer localhost 10023 ;


BURL

  If the message is multipart and the boundary occurs in a part, that
  part needs encoding. Or else switch to a different body.


Cybertec replicator

  Is it good? What are people saying about it? What should we do about it?

  Ewald Geschwinde on IRC (the first time I've seen him; 2007-07-11) said:
  *egeschwinde* When I have time I will test your product on our
  multimaster cluster
  *egeschwinde* maybe also on the queuing system


Delaying seen-flag setting

  We can move the seenflagsetter to imapsession, build up flags to
  set, flush the write cache before fetch flags, store, state-altering
  commands and searches which use either modseq or flags.

  This ought to cut down the number of transactions issued per imap
  command nicely.


The web interface should offer a download link for attachments.

  For patches attached as text/plain with the right Content-Disposition,
  for example.


Per-group and systemwide sieves

  People always seem to want such things. It'll be easy to implement.
  Most of the tricky issues are described in
  http://tools.ietf.org/html/draft-degener-sieve-multiscript-00


The Sieve "header" test may fail

  Write a test or three that feeds the thing a 2047-encoded header
  field and checks that it's correctly matched/not matched. Then make
  it pass.


The subaddress specification says foo@ != foo+@ wrt. :detail

  The former causes any :detail tests to evaluate to false, while the
  latter treats :detail as an empty string. We treat both as an empty
  string.

  (We could set detail to a single null byte, to \0\r\0\n\0, to a
  sequence of private-use unicode characters, or even to
  Entropy::string( 8 ) if there is no separator. The chance of that
  appearing in an address is negligible.)
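
  Rather than a magic sentinel value, the distinction could be
  carried in the type. A sketch, with invented names and assuming '+'
  as the separator and a split at its first occurrence:

```cpp
#include <optional>
#include <string>

// Hypothetical result of splitting a localpart. detail is absent when
// there is no separator (foo@), and present but empty for foo+@.
struct Subaddress {
    std::string user;
    std::optional<std::string> detail;
};

Subaddress splitLocalpart(const std::string &localpart,
                          char separator = '+') {
    std::string::size_type pos = localpart.find(separator);
    if (pos == std::string::npos)
        return {localpart, std::nullopt};
    return {localpart.substr(0, pos), localpart.substr(pos + 1)};
}
```

  A :detail test would then evaluate to false whenever detail has no
  value, and compare against the (possibly empty) string otherwise.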


Distribution packages

  - RPM: silug has offered to help.
  - FreeBSD port: devin (Tod McQuillin) has offered to help.
  - Debian: ? (license problems, but we could provide a .deb).
  - Ubuntu: ?


SMTP extensions

  Here are the ones we still don't implement, but ought to implement
  at some point:

  SUBMITTER? perhaps
  DELIVERBY (RFC 2852): At some time.
  FUTURERELEASE (RFC 4865): At some time.
  MTRK: As soon as someone else does it.

  http://www.iana.org/assignments/mail-parameters

  DELIVERBY has the funny little characteristic that we can support it
  with great ease iff the smarthost does, so we ought to advertise it
  iff the smarthost does.


SUBMITTER

  The work needed to support that:

  1. Pass a submitter to SmtpClient.
  2. Pass the sieve owner or logged-in user's address as submitter.
  3. Send that as SUBMITTER= if the smarthost advertises SUBMITTER and
     the submitter is different from mail-from.

  4. Advertise SUBMITTER.
  5. If the SMTP/LMTP/Submit client sends SUBMITTER and it's different
     from the mail-from, record the address in Received.
  6. If there's a sieve extension specifying how, push the submitter
     into the envelope so the sieve can see it.
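
  Step 3 is about this much code. The function name and the way the
  smarthost's extensions are passed in are invented, and the sketch
  skips the xtext encoding that the SUBMITTER value needs, so it is
  only right for addresses without '+', '=' or other special
  characters.

```cpp
#include <set>
#include <string>

// Hypothetical helper: build the MAIL FROM command for the smarthost.
// SUBMITTER= is added only if the smarthost advertised SUBMITTER and
// the submitter differs from mail-from (compared byte for byte here).
std::string mailFromCommand(const std::string &mailFrom,
                            const std::string &submitter,
                            const std::set<std::string> &advertised) {
    std::string cmd = "MAIL FROM:<" + mailFrom + ">";
    if (!submitter.empty() && submitter != mailFrom &&
        advertised.count("SUBMITTER") > 0) {
        // Assumption: no xtext encoding is applied to the address.
        cmd += " SUBMITTER=" + submitter;
    }
    return cmd;
}
```

  SmtpClient would call something like this with the extensions it
  saw in the EHLO response.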

  Easy peasy. But there's no value to offset this (small) cost, is
  there? Maybe as a hack when one of us is fed up.


The groups and group_members tables seem a little underused

  We do not use them at all. We meant to use them for "advanced" ACL
  support, but nobody ever asked, and it didn't seem worthwhile.

  I now think it's worthwhile.

  Here's what I want to add:

  Make a superusers group, whose members can authenticate as anyone,
  and the notion of group admins, who can authenticate as other
  members of the group.

  Or maybe an administrator table, linking a user to either a group or
  to null. If a group, then the admin can authenticate as other
  members of that group and (importantly) has 'a' right on their
  mailboxes, if null, then ditto for all groups.

  Extend Permissions to link against group_members when selecting
  applicable permissions.

  Make groups be permissible ACL identifiers.


We need to be able to disable users

  - Reject mail with 5xx/4xx.
  - Prevent login.
  - 1+2.
  - a group admin can enable/disable group members
  - a superadmin can enable/disable anyone
  - a group admin cannot unblock an overall blockage


Assigning the "l" right automatically hurts shared hosting.

  Casey Shobe <casey@shobe.info> wants to host multiple customers in a
  single Archiveopteryx installation, and doesn't want them to be able
  to see each other's mailboxes. We could support that now (2008-01).


Showing email addresses in public archives is... well...

  There are three common solutions:

  1. Show the address. What we do now. Gives addresses to spammers,
     which is undesirable.

  2. Replace @ with at or some other easily reversible change. I
     assume gigamega.com, litefinder.net and other address scrapers
     already detect the common obfuscations, so this is pointless.

  3. Show part of the address, e.g. arnt@ory... or
     ar...@oryx.co... for arnt@oryx.com. Isn't reversible, so the
     spammers can't undo it, but also makes the archive less usable
     for people.

  We might change to 3, but with a captcha-protected option to show
  1. That would give us all the advantages and no serious disadvantage.
  But we would have to implement captchas somehow, which would be a
  moderate pain.
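
  Solution 3 is cheap to implement. A sketch, where the function name
  and the two cut-off parameters are arbitrary choices picked to
  reproduce the second example above:

```python
def obscure_address(address, keep_local=2, drop_domain=1):
    """Show only part of an address, in a way that can't be reversed.

    keep_local: how many leading characters of the localpart to show.
    drop_domain: how many trailing characters of the domain to hide.
    Both defaults are arbitrary.
    """
    local, at, domain = address.rpartition("@")
    if not at:
        # No @ at all, so just truncate whatever we were given.
        return address[:keep_local] + "..."
    kept_domain = domain[:max(len(domain) - drop_domain, 0)]
    return f"{local[:keep_local]}...@{kept_domain}..."
```

  With the defaults, obscure_address("arnt@oryx.com") gives
  "ar...@oryx.co...". Very short localparts are shown in full, which a
  real implementation would probably want to avoid.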


aox.org/badmail/

  Explain that aox can't store everything, why not (in short), that it
  has many workarounds and point to examples/, how to detect/report
  bad messages and how to fix things with reparse. Point to
  /aox/reparse for more detail.

  Subpages:

  badmail/examples/n for 1<=n<=8, with good and bad blah, generated
  from chosen canonicals, to show how we fix things up. Each page
  showing old and new, with differences indicated, and they should be
  ordered from reasonable/common to outrageous.

  badmail/examples/ summing up 1-8 and giving one or two truly
  hopeless cases. The hopeless case(s) should also be shown in
  anonymised form.

  badmail/examples/comparison if I feel nasty and bored one day,
  showing how a few IMAP servers handle messages 1-8 and the
  impossible one(s). Does "fetch envelope" return the right thing? 
  "fetch bodystructure"? Some choice searches? We don't want to link
  to this page very much. It gets a fine <table> containing many/few
  &#x2713; cells.

  Possibly we want to include screenshots showing how Thunderbird or
  another GUI client that uses envelope/bodystructure renders a
  mailbox containing 1-8. Screenshots using aox and using another
  server, one that gets few &#x2713; cells in the table. I'm not sure
  where to link to these screenshots. Apple Mail?

  We also need aox.org/aox/reparse and I suppose other /aox/<command>
  pages.


Autoexpunge

  Some IMAP clients, and some users, basically don't send
  expunge. They don't look at the \deleted mail, but also never
  expunge it. Outlook is a notorious case.

  If/when we move \seen and \deleted to mailbox_messages, we can store
  \deleted as a timestamptz. Null means not \deleted, a time means
  \deleted at that time. And we can add a new variable,
  auto-expunge-time, which automatically expunges mail when it's been
  marked \deleted for that long, and suggest setting it to 14 days or
  so on sites where Outlook and friends are a problem.
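
  The expunge decision itself would be trivial. A sketch, assuming
  \deleted has already been read as a timezone-aware timestamp (or
  None for null). The function name is made up, and auto_expunge_time
  stands in for the proposed auto-expunge-time variable:

```python
from datetime import datetime, timedelta, timezone


def should_auto_expunge(deleted_at, now, auto_expunge_time=timedelta(days=14)):
    """Return True if a message has been \\deleted for long enough.

    deleted_at: None if the message is not \\deleted, else the time
        at which it was marked \\deleted.
    """
    if deleted_at is None:
        return False
    return now - deleted_at >= auto_expunge_time


now = datetime(2008, 7, 1, tzinfo=timezone.utc)
old = datetime(2008, 6, 1, tzinfo=timezone.utc)
recent = datetime(2008, 6, 25, tzinfo=timezone.utc)
```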


Some unfortunate logging

  This is the tail end of a connection Thunderbird used to save
  something to the Sent folder, rewrapped for easier reading:

    imap/info: 2/1/1/2442/6: 2008-06-18 12:53:54.109
        Execution time 550ms
    imap/debug: 2/1/1/2442/6: 2008-06-18 12:53:54.109
        Finished
    imap/debug: 2/1/1/2442: 2008-06-18 12:53:54.109
        IMAP::runCommands, 1 commands
    imap/info: 2/1/1/2442/6: 2008-06-18 12:53:54.109
        Result: OK [APPENDUID 19] done
    imap/debug: 2/1/1/2442/6: 2008-06-18 12:53:54.110
        Retired
    imap/debug: 2/1/1/2442: 2008-06-18 12:53:54.110
        IMAP::runCommands, 0 commands
    imap/info: 2/1/1/2442: 2008-06-18 13:23:53.458
        Idle timeout
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.459
        Closing: IMAP server 195.30.37.30:143
                 connected to client 195.30.37.9:52209, on fd 26
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.459
        IMAP::runCommands, 0 commands
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.459
        IMAP::runCommands, 0 commands
    imap/info: 2/1/1/2442: 2008-06-18 13:23:53.459
        Unexpected close by client
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.459
        IMAP::runCommands, 0 commands
    general/debug: 2/1/1/2442/2/2/2: 2008-06-18 13:23:53.461
        Closing: Byte forwarder 127.0.0.1:56695
                 connected to client 127.0.0.1:2061, on fd 27
    general/info: 2/1/1/2442/2/2/2: 2008-06-18 13:23:53.461
        Shutting down byte forwarder due to peer close.
    imap/info: 2/1/1/2442: 2008-06-18 13:23:53.464
        Unexpected close by client
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.464
        IMAP::runCommands, 0 commands

  Hardly unexpected. Will fix later, not sure how.


Dynamically preparing often-used queries

  We can prepare queries cleverly.

  Inside Query, at submit time, we first check whether the Query's
  text matches a PreparedStatement, and use it if so.

  If not, we check whether the query looks preparable. The condition
  seems to be simple: Starts with 'select ' and contains no numbers.
  If it's preparable we add it to a cache, which is discarded at GC
  time.

  If a preparable query is used more than n times before the cache is
  discarded, we prepare the query and keep the PreparedStatement
  around.
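
  The bookkeeping could be sketched like this. It is an illustration
  only, not the Query/PreparedStatement code. The class, its method
  names and the statement names are hypothetical, and it assumes that
  $1-style placeholders do not count as "numbers".

```python
import re


class QueryPreparer:
    """Track query texts and decide when to prepare them."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # the "n" above
        self.prepared = {}          # query text -> statement name
        self.candidates = {}        # query text -> use count

    @staticmethod
    def looks_preparable(text):
        # Assumption: placeholders such as $1 are not "numbers".
        without_placeholders = re.sub(r"\$\d+", "", text)
        return (
            text.startswith("select ")
            and re.search(r"\d", without_placeholders) is None
        )

    def submit(self, text):
        """Return the prepared statement name to use, or None."""
        if text in self.prepared:
            return self.prepared[text]
        if not self.looks_preparable(text):
            return None
        self.candidates[text] = self.candidates.get(text, 0) + 1
        if self.candidates[text] > self.threshold:
            name = f"dyn{len(self.prepared)}"
            self.prepared[text] = name
            del self.candidates[text]
            return name
        return None

    def discard_candidates(self):
        """Call at GC time. Prepared statements are kept."""
        self.candidates.clear()
```

  With threshold 3, the fourth submit of the same preparable text
  within one GC interval returns a statement name, and every later
  submit reuses it.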


aox.org/clients/

  Move the list of clients from /imap/ and /pop here. Make a per-client
  page, e.g. /clients/outlook, with notes.

  1. Which protocols can you use? Usually IMAP+Submit.

  2. Any bugs worth speaking of?

  3. Any particular configuration advice? For /clients/outlook we say
     "enable use-smtps". For /clients/applemail we point to that IDLE
     plugin.

  That's it, right?


aox undelete is almost useless

  Sure, it does the job, but finding the message to undelete demands a
  more intimate knowledge of our schema than is justified. Every time I've
  needed to use it, I've wanted to undelete a message I've trashed a few
  seconds earlier by mistake. aox undelete could easily take a mailbox
  and tell me the from/subject and uid of the most recently deleted n
  messages. That would make it a whole lot more useful.


SMS gateway

  We want Archiveopteryx to work as installed. We want to support
  Sieve notify, including SMS.

  I think that means Oryx needs to operate an Archiveopteryx->SMS
  gateway, allow people to send a few SMSes, and provide people with
  the ability to operate a gateway of their own.

  There are many IP->SMS gateways in the world. Some free, but we
  don't want to use those, they're unreliable. Many paid for, those
  are reliable. Most of them work using HTTP requests: You POST a
  query with your credentials and the gateway reports.

  So my plan is as follows:

  1. Become a customer of someone like that.

  2. Write a program which accepts requests in a format we define,
     forwards them in the HTTP-based format our provider uses, and
     relays the response back.

  3. Provide that service to new Archiveopteryx installations, with
     limitations on use.

  4. Provide the gateway program along with Archiveopteryx, so people
     can run it themselves.

  I haven't thought of a good way to provide the service to
  Archiveopteryx users and weed out most other people.

  Perhaps a better alternative: Automatically register with clickatell
  if SMS is enabled and not configured when it's first used. (But it's
  not possible to register with clickatell without intervention.)


Bug confusing U+ED00 and U+0000 in the message cache

  When we write to the database, U+0000 (which occasionally occurs,
  mostly by mistake but sometimes on purpose) is transformed to
  U+ED00, and when we read, U+ED00 is transformed back to U+0000.

  So if U+ED00 is written to the DB, it comes back as U+0000.

  This means that Archiveopteryx works differently depending on
  whether the cache is used or not. That has to be resolved somehow.
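
  The problem in miniature, with two stand-in functions for the
  write-side and read-side transformations (the names are made up; the
  real code lives elsewhere):

```python
def to_database(text):
    """Write side: U+0000 is stored as U+ED00."""
    return text.replace("\u0000", "\ued00")


def from_database(text):
    """Read side: U+ED00 is turned back into U+0000."""
    return text.replace("\ued00", "\u0000")


original = "a\ued00b"                      # contains a genuine U+ED00
cached = original                          # the cache keeps it as-is
reread = from_database(to_database(original))

# The cached copy and the copy read back from the database differ.
mismatch = cached != reread
```

  A genuine U+0000 survives the round trip. Only text that already
  contains U+ED00 comes back changed.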


Fetching headers is too slow, really.

  Maybe fetching other stuff is, too. We know the results are ordered, so we should
  take advantage of the ordering and keep state across rows. Right now
  we're locating the Message/Bodypart/Header from first principles for
  each row.
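
  Keeping state across rows could look like the sketch below. The row
  layout (message id, part, field name, value) and the dict-based
  message objects are assumptions made for the example, not what the
  fetcher really uses.

```python
def attach_header_fields(rows, messages):
    """Add header fields to messages, using the ordering of rows.

    rows: iterable of (message_id, part, name, value), ordered by
        message_id and part.
    messages: dict mapping message_id to a dict of part -> field list.
    Returns the number of header lookups done, to show the saving.
    """
    current_key = None
    current_header = None
    lookups = 0
    for message_id, part, name, value in rows:
        key = (message_id, part)
        if key != current_key:
            # Only locate the header when the message or part changes.
            current_header = messages[message_id].setdefault(part, [])
            current_key = key
            lookups += 1
        current_header.append((name, value))
    return lookups
```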
