Schedule

  Nothing yet.


P/W: Stuff we may want to keep

  List::take()->remove() was good

  also keeping iterators working

  Scope

  Having EH make a new Log by default

  Simplifying the way Connection objects are added to the main loop

  List::append( List ) -> appendList()

  smtpclient has simpler logic... but don't break what works

  Transaction::enqueue( const char * )

  cancelQuery rewrite. not sure.

  Do (some of) these, one at a time, making sure nothing breaks after
  each one. None soon.


3.0.5: 2009-01-??

  Have some good ideas.


3.1.0: 2009-01-??

  IMAP RENAME now throws out any sessions accessing the old mailbox.
  Sometimes clients keep sessions active without asking the user; that
  prevented a rename with the old code.

  Add a script to do ldapsearch and adapt the aox user list
  accordingly. New ldap users result in aox add users, removed LDAP
  users are removed from the database too. (What to do about their
  mail? Move it to someone, keep the user, but disabled, or delete
  it? Configurable I think.)

  Move the most common flags into mailbox_messages.

  aox set retention, aox show retention.

  Views on the web.

  Thread-Index mostly becomes References.

  The l right is no longer granted to everyone. (Error301 needs
  consideration.)

  Now sends FLAGS updates, although their order may be less than
  ideal.

  aoxexport exists.

  Optimisations to defeat common webmail systems.

  Optimisation to defeat overeager use of STATUS.


Functions and views

  Views that'll be good for people:
  - Valid local email addresses and users
  - Sender information
    - How many earlier messages from that address
    - How many messages to that address
    - How many earlier messages to/from the domain

  Functions:
  - Create user
  - Delete user
  - Rename user
  - Change password
  - Change password, given old and new password, checking
  - Enable/disable alias
  - Add/delete/rename alias


Release cycle

  Five weeks, t-35 to t (a Monday).

  t-38: we decide on a feature list for the release.

  t-14: Whatever hasn't been started is dropped, whatever cannot be
  completed before t-7 is laid aside for the next release. Write the
  announcement text.

  t-7: Fork the release and remove any code that needs to be removed.

  t-4: Roll the tarball and do the release chores.

  t-3: See t-38 above.

  t-0: A crontab does what it needs to do.


The value of features

  Archiveopteryx provides online archiving. We want the features we
  add to improve one or more of these:
  - adding mail to the archive
  - accessing the archive
  - managing the archive

  For nontechnical reasons I add a fourth goal:
  - making current users happy

  Each new feature should help in some way; the new features that help
  the most are the ones we need most (in the long term - in the short
  term we also need to help existing users).


The next eight months

  - Message retention
      User interface
      Message identification
      Logging changes to the message
  - aoximport improvements
  - Full text search
  - More documentation
      aox operations guide
      best practice papers
  - Better documentation
  - VIEW improvements
      INTHREAD
      multi-mailbox searches
      miscellaneous
  - Account management
      ACL management
      VIEW management
  - Recording the fate of outgoing mail
  - Proper full-text searching
  - Better exploratory search
  - Useful webmail
  - Address search
      Web UI
      IMAP X-extension
      Exporting to addressbooks
  - Monitoring
      Graphing


Installer doesn't take steps to ensure that the installation is usable

  It could do at least two things:

  Run all the same checks on the new installation as 'aox check' and
  archiveopteryx at startup.

  Try to connect to all the server addresses and if anything's
  listening anywhere, mention it on stdout.

  In addition to this, the installer does the wrong thing right now if
  it creates the database users and then fails to run psql to load the
  schema. It exits with an error, which means the randomly-generated
  passwords are lost, because the configuration file is not written.


db-address=localhost works, but needs improvement.

  - A new connect(addr, port) function resolves the given address and
    creates one SerialConnector object for each result. It starts the
    first one, which (after an error, or a delay of 1s) initiates the
    next connection in line and so on. The first one to connect swaps
    out the d of the original connection with its own, and makes the
    EventLoop act as though it had just connected.

  It works, but the code is a little ugly. The error handling logic
  needs a careful look after some time. Once that's solid, the other
  callers (SpoolManager/SmtpClient etc.) can be converted.


aox check schema

  This command would check several things.

  a) that dbuser has the needed rights
  b) that all the right tables are there, and all the right columns,
     with the right types, and no unexpected constraints
  c) that all the right indexes are there
  d) that dbowner owns everything
  i) that inserts that would duplicate a constraint are properly
     recognised

  As a bonus, perhaps it could list some unexpected/unknown deviations:
  e) locally added tables
  f) locally added columns
  g) locally added indices
  h) missing constraints

  Change 43939 and following move towards this: the idea is to
  introduce new functions e.g. Schema::checkIntegrity (in addition to
  checkRevision) and Schema::grantPrivileges, that can be used both by
  aox check schema/aox grant privileges/whatever, and also by the
  installer (instead of lib/grant-privileges, and instead of the
  half-hearted checking it does now). The server is essentially
  unaffected; it just uses Database::checkSchema/checkAccess for a
  quick check.

  This sounds ok, but it's ugly because Schema::execute is completely
  given to upgrading the schema, and neither can nor should be
  repurposed to do other things besides. So that means more static
  functions in Schema and separate EventHandlers to do the
  checking/granting/whatevering. But that's okay.


aox upgrade schema, in reverse

  Each schema starting with 86 can write a pgsql function called
  downgrade_to_x (downgrade_to_85 in the case of stepTo86()), and if
  aox upgrade schema is called when the current schema version is
  higher than its goal, then it does:

     while ( current > target ) {
          // named f, not fn, so it doesn't hide the fn() helper
          EString f( "downgrade_to_" + fn( current-1 ) + "()" );
          t->enqueue( new Query( "select " + f, 0 ) );
          t->enqueue( new Query( "drop function " + f, 0 ) );
          current--;
     }
     t->enqueue( new Query( "update mailstore set revision=" +
                            fn( target ), 0 ) );

  Done. A little hacky but really quite good.

  stepTo85() would add the three functions necessary to step back to
  where 3.0.7 is. Then 3.0.7 doesn't need any substantial new code
  from 3.1.0, only bugfixes and the really quite small code above.


Items for web site

  In addition to stuff in the operations guide:

  - List/description of IMAP/POP/SMTP extensions
  - RFC pages
  - man pages (old ones too)
  - Source documentation
  - Best practices
  - FAQ
  - Version-specific pages
  - Download-specific pages
  - Client-specific pages


Various TODO items from earlier releases.

  Message retention policy

  - Soft-quota/archival stuff too?
  - Message arrival tag (for archiving)

  Miscellaneous:

  - Database replication support (local mirror)

  THREAD?

  Basic administration using the httpd

  Indexing for DOC bodyparts

  aox backup/restore (or similarly helpful procedure)


The \Answered flag

  We could add a little code to help that flag.

  1. Disallow clearing it once set.
  2. Set it on messages when we see an outgoing reply.

  This would make it easier to force archiving.


5255

  We already implement I18NLEVEL=1, so I advertised that. I18NLEVEL=2
  doesn't seem very useful. I also don't know how to do that in SQL.


Other RFCs

  I think we need to consider these RFCs, or at least mention them
  somewhere in the documentation so we know we've considered them:

  821
  934  (?)
  974
  1049 (should be handled fine, so check and mention)
  1641 (old mime?)
  1731 (old imap?)
  1893 (ditto)
  1894 (older version of something we handle, right?)
  2044 (?)
  2068 (HTTP? surely we do that)
  2222
  2244


Cleartext passwords

  We help migrating away from cleartext/plaintext passwords:

  1. We also store SCRAM and similar secrets in the DB (secrets which
     aren't password equivalents)
  2. We extend the users table with two new columns, 'last time
     cleartext was needed' and 'number of successful authentications
     without cleartext password usage since cleartext'.
  3. If a client uses SCRAM, we increment the counter.
  4. If a client uses CRAM or PLAIN, we reset counter and set the time
     to today.
  5. We provide some helping code to delete passwords for users with a
     high count and a long-ago time.
  6. We add documentation saying that if you disable auth-this and
     auth-that, you can disable store-plaintext-passwords.
  7. We add configuration/db sanity checks for ditto.
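
  Steps 2 to 5 above can be sketched roughly as follows. This is only
  an illustration: CleartextUsage, noteAuthentication() and
  cleartextDeletable() are made-up names, the day numbers and
  thresholds are placeholders, and leaving the counters alone for
  mechanisms the list does not mention is my assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical in-memory mirror of the two proposed users columns.
struct CleartextUsage {
    long lastCleartextDay = 0;   // day on which cleartext was last needed
    unsigned sinceCleartext = 0; // successful logins without it since then
};

// Steps 3 and 4: call this after each successful authentication.
void noteAuthentication( CleartextUsage & u, const std::string & mechanism,
                         long today )
{
    assert( today >= u.lastCleartextDay );
    if ( mechanism.compare( 0, 6, "SCRAM-" ) == 0 ) {
        u.sinceCleartext++;
    } else if ( mechanism == "CRAM-MD5" || mechanism == "PLAIN" ) {
        u.sinceCleartext = 0;
        u.lastCleartextDay = today;
    }
    // Assumption: any other mechanism leaves both values untouched.
}

// Step 5: a high count and a long-ago date make the password a
// candidate for deletion. Both thresholds are placeholders.
bool cleartextDeletable( const CleartextUsage & u, long today,
                         unsigned minCount, long minDays )
{
    return u.sinceCleartext >= minCount &&
           today - u.lastCleartextDay >= minDays;
}
```

  In the database this would presumably be a single UPDATE per login
  rather than an object, but the rules stay the same.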


Database schema range

  People occasionally need to access the db with an old version of
  mailstore. I suggest that we:

  a) add a 'writable_from' column specifying the oldest version that
     can write to the database.
  b) add a 'readable_from' column specifying the oldest version that
     can read the database

  aox upgrade schema would update writable_from to the oldest schema
  version for which a writer would do the right job. This would often
  change when a table changes, but not when a table is added.

  readable_from would be the oldest revision that can read the database.

  When the server starts up, it would check:

  - am I >=writable_from? If so, mailboxes can be read-write
  - else, am I >=readable_from? If so, startup can proceed, but all
    mailboxes are read-only. lmtp, smtp and smtp-submit do not start.
  - else, quit.
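
  A minimal sketch of that startup check, assuming revisions are plain
  integers. DbAccess and startupAccess() are hypothetical names. The
  comparison is inclusive because both columns are defined as the
  oldest version that can still do the job.

```cpp
#include <cassert>

enum class DbAccess { ReadWrite, ReadOnly, Refuse };

// All three arguments are schema revisions. writableFrom and
// readableFrom stand for the two proposed columns.
DbAccess startupAccess( int serverRevision, int writableFrom,
                        int readableFrom )
{
    // Assumption: whatever can write can also read.
    assert( readableFrom <= writableFrom );
    if ( serverRevision >= writableFrom )
        return DbAccess::ReadWrite;
    if ( serverRevision >= readableFrom )
        return DbAccess::ReadOnly; // lmtp, smtp and smtp-submit stay down
    return DbAccess::Refuse;       // the server quits
}
```

  For example, a server at revision 85 facing writable_from=88 and
  readable_from=80 would start with all mailboxes read-only.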

  And in order to handle database updates, I suggest another table,
  'features', with a single string column. When aox update database
  fixes something, it inserts a row into features. A modern database
  would have two rows in this table, 'numbered address fields' and 'no
  nulls in bodyparts'.


Sieve

  Alexey suggests the following extensions (in the following order):
  Vacation, reject, imapflags, subaddress.

  We still haven't done imapflags.


Reject or ereject may want MDNs instead of DSNs

  Our sieve code generates DSNs if it can't generate protocol-level
  refusal, which it always can in practice. MDNs should be used in at
  least some cases, if a non-zero percentage of zero can be considered
  "some cases".


RFCs 2852 and 4865

  Easy to do once we have port 587; we need to set the start and end
  columns right, that's about it.


Message tracking

  RFCs 3885-8 specify ways to track messages that have been sent. We
  can implement that fairly easily.

  If we route outgoing mail via a smarthost, that smarthost has to
  support MTRK in order for tracking to work well.

  We can track mail provided that at least one of these is true:

  - we deliver directly to the end server (we don't know whether
    that's the case, though)

  - we deliver via an MTRK-capable server

  - we deliver into our own database

  Sounds likely to be true maybe 80-90% of the time.

  If none are true, we can at least say, easily, where we delivered,
  when, and why.

  We could implement the tracking protocol (and I'd write a query blah
  in mailchen), and also provide a query interface via the web.


Message tracking 2

  We can recognize the ESMTP id for the most common MTAs. Postfix says:

    250 Ok: queued as D1A324AC85

  Sendmail and exim surely say something similar. We could keep that ID
  in delivery_recipients and use it in DSNs.
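
  Picking the ID out of the Postfix reply could look like this. It is
  a sketch only: queuedAsId() is a made-up name, and it handles just
  the "queued as" wording shown above. Other MTAs would need their own
  patterns.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns the queue ID from a reply such as
// "250 Ok: queued as D1A324AC85", or an empty string if the reply
// does not contain "queued as ".
std::string queuedAsId( const std::string & reply )
{
    const std::string marker = "queued as ";
    std::string::size_type i = reply.find( marker );
    if ( i == std::string::npos )
        return "";
    i += marker.size();
    assert( i <= reply.size() ); // find() matched the whole marker
    std::string id;
    while ( i < reply.size() &&
            std::isalnum( (unsigned char)reply[i] ) )
        id += reply[i++];
    return id;
}
```

  The caller would store the result next to the recipient and quote it
  in any DSN generated later.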


Generating bounces

  Our bounces would look better if they included the entire SMTP
  conversation (starting with RSET or EHLO).


Bounces and DSNs

  Mail is currently fairly reliable. There is one big exception:
  Bounces aren't 100% parsable. But generally, if you work hard, you
  can know whether a message was delivered or not, and mostly they are
  delivered.

  So we benefit from converting the most common nonstandard bounces to
  DSNs, and then treat them as DSNs.

  For nonstandard bounces (like those of qmail) we identify the
  message by trying hard, do some hacky parsing, use the bounce
  (excluding trailing message) as first part of the DSN multipart,
  cook up a new DSN report based on the parsing, and save
  text/822-headers as a third part.

  Then, searches that tie bounces together with messages sent work
  even better.

  (Another trick we can/should use is to see whether the host we
  deliver to seems to be the final destination based on earlier
  (answered) messages.)


Memory use for common operations

  Some use _vastly_ too much memory. I saw a single IMAP FETCH for a
  mere 4200 messages use 173MB yesterday. (Later note: This should be
  gone, gone, gone, but it would still be good to check that these
  problems don't reappear, reappear, reappear.)

  The most elegant way to solve that would be to supervise memory
  usage for a known sequence. Inject these ten thousand messages,
  check memory use (via the grapher), connect to the imap port, run
  this and that, check again, connect to the imap port, run this and
  that, check again, connect to the imap port, run this and that,
  check again... just a bunch of checks. Ignore output.


RFC 2554

  Have to look closer at that, I hadn't grasped the MAIL FROM AUTH
  issue and there may be more.

  addParams( "auth", ... ) in MAIL FROM needs consideration. Not
  important.


imapd/handlers/acl.cpp

  Different tasks, some shared code, same file. Separate this out into
  different classes inheriting something. Then add the right sort of
  logging statement to the end of parse().


Threads in Archiveopteryx 4.0

  I'm growing more and more fond of using a few threads, and not using
  server-processes any more.  We'd replace server-processes with
  server-threads, or just keep the name.

  The core event loop would create a queue of work to be done based on
  which file descriptors have input, and worker threads would take a
  piece of work, obtain the fd's lock, and do it.

  The Query would have an optional Connection pointer, the Transaction
  would have a mandatory one. Scope would probably have one. Perhaps
  Q+T could copy Scope's. A worker thread which processes database
  input would have to obtain the lock for the scope's fd whenever it
  enters the scope.

  We'd be able to collect garbage without halts. Large IMAP Fetch
  commands would also not cause halts. We'd be able to serve all
  clients fairly, using all cores, using fewer database backends than
  with server-processes.

  The Apple Autozone GC looks good for this.

  The default for server-processes ought to change to match the number
  of processors:

    Linux, Solaris, & AIX (per comments):
        numCPU = sysconf( _SC_NPROCESSORS_ONLN );

    FreeBSD, macosx, NetBSD, OpenBSD, etc.:
        int mib[2];
        int numCPU = 0;
        size_t len = sizeof( numCPU ); /* sysctl wants the buffer size */

        /* set the mib for hw.ncpu */
        mib[0] = CTL_HW;
        mib[1] = HW_AVAILCPU;  // where defined; otherwise use HW_NCPU

        /* get the number of CPUs from the system */
        sysctl( mib, 2, &numCPU, &len, NULL, 0 );

        if ( numCPU < 1 )
        {
             mib[1] = HW_NCPU;
             len = sizeof( numCPU );
             sysctl( mib, 2, &numCPU, &len, NULL, 0 );

             if ( numCPU < 1 )
                  numCPU = 1;
        }


Full-text search

  There Be Problems.

  The code now assumes that the IMAP client searches for one or more
  words, rather than an arbitrary substring. Postgres uses word
  segmentation.

  If postgres were to use e.g. overlapping three-letter languageless
  substrings, we would do what IMAP wants. Sounds senseless.

  We also have a requirement to stem search arguments less.
  Specifically, a search for ARM7TDMI should not return messages about
  the ARM6 or about my left arm.


Convert more parsers to use AbnfParser

  There are still a few places where we roll our own messy parsers and
  suffer for it (e.g. HTTP, DigestMD5). We know they work, but making
  them use AbnfParser in a spare moment would be an act of kindness.


aox/conf/tls-certificate

  Those variables are not well described. We need a bit more.

  Also, -secret is probably misnamed, we use -password for other
  cases. I expect that's why aox show cf tls-certificate-secret yields
  the value while e.g. aox show cf db-password does not.


METADATA

  Needed for lemonade, as easy as annotate.


Autoresponder

  We have vacation now, but it isn't quite right for autoresponses.
  Sieve autorespond should be like this:

  1. :quote should quote the first text/plain part if all of the
     following are true:

     1. The message is signed, and the signature verified (using any
        supported signature mechanism, DKIM SHOULD be supported).
     2. The first text/plain part does not have a Content-Disposition
        other than inline.

     If any of the conditions aren't true, :quote shouldn't quote.

     If there's a signature block, :quote shouldn't quote that.

     If the quoted text would be more than ten lines, :quote may crop
     it down as much as it wants, ideally by skipping lines starting
     with '>', otherwise by removing the last lines.

  2. :subject, :from and :addresses as for vacation.

  3. :cc can be used to send a copy to the specified From address.

  4. The default :handle should not be based on the quoted text.

  5. Two text arguments, one for text before the quoted text, one for
     text after the quoted text.

  6. The autoresponse goes to the envelope sender, as some RFC
     requires. So we want an option to skip the response unless the
     return-path matches reply-to (if present) or From (unless
     reply-to is present).


Message arrival tag

  Once annotate is done, we want a tag, ie. a magic annotation which
  stays glued to the message wherever it goes, even after copy/move.

  We also want a way to store the original RFC822 format somewhere
  inside and/or outside the database, indexed by the arrival tag
  identifier. It's good if the tag is split, so we can have "x-y"
  where X is the CD/DVD number and Y is the file on the CD/DVD. Or
  something like that.


Message Retention Policy Framework

  A lot of sites will want explicit policies regarding what mail may
  not be deleted, what may be deleted, and what must be deleted. We
  can support that well.


Sieve ihave

  There are three holes in our ihave rules.

  Single-child anyof doesn't promote the ihave:

    if anyof( ihave "foo" ) {
        foo; # errors should not be reported here
    }

  Not doesn't promote:

    if not not ihave "foo" {
        foo; # errors should not be reported here
    }

  Finally, if/elsif always applies the ihave to its own block, instead
  of walking along elsif/else to find the block that might be executed
  if ihave returns true:

    if not ihave "foo" {
        # errors should be reported here
    } else {
        foo; # but not here
    }


C/R

  C/R sucks. But it has its uses, so we can benefit from implementing
  it somehow. Here are some classes of messages we may want to treat
  specially:

  - replies to own mail
  - messages in languages not understood by the user
  - mail from previously unknown addresses
  - mail from freemail providers
  - vacation responses from unknowns
  - messages likely, but not certain to be out-of-office-autoreply
  - dkim/mass-signed messages (if verified)

  The questions are: How can we ensure that we almost never challenge
  real mail, while simultaneously challenging most/all messages that
  don't come from valid senders? How can we provide suitable
  configuration?

  Mail from freemail vendors tends to have a "Received: ... via HTT"
  field.


Squirrelmail

  Inefficiency has a name.

  1. Too many LOGINs. We can cache Users using a Cache, that'll solve
     that. But LOGIN isn't that slow, so I'm not sure it's worth it.

  2. Too many SELECTs and EXAMINEs. SessionCache and FirstUnreadCache
     solve that.

  3. Too many EXPUNGEs. If we keep a "last expunged at modseq" in
     ImapSession, check and set it in Expunge, and check and set it in
     store ("if the last expunge was the previous modseq, and I'm not
     adding any \deleted flags, then increase"), then we can turn
     those expunges into noops.

  That should speed up SM and probably other webmail systems nicely.
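
  The bookkeeping for point 3 might look like the sketch below.
  ExpungeTracker and its members are hypothetical names, and the sketch
  assumes the caller knows the mailbox modseq before and after each
  STORE. A change made by any other session moves the modseq without
  touching the tracker, so the next EXPUNGE is done for real.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-session state; the idea is to keep it in ImapSession.
struct ExpungeTracker {
    int64_t lastExpungedAt = -1; // modseq of the last expunge, -1 = never

    // Called when a STORE moves the mailbox from modseq 'before' to
    // 'after'. If we had just expunged and no \Deleted flag is being
    // added, there is still nothing to expunge at 'after'.
    void noteStore( int64_t before, int64_t after, bool addsDeleted ) {
        assert( after > before );
        if ( lastExpungedAt == before && !addsDeleted )
            lastExpungedAt = after;
    }

    // Returns true if EXPUNGE has to do real work at modseq 'current',
    // false if it can be answered as a noop.
    bool expungeNeeded( int64_t current ) {
        if ( lastExpungedAt == current )
            return false;
        lastExpungedAt = current;
        return true;
    }
};
```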


Allocator

  The binary tree used in Allocator is likely to be extremely
  unbalanced sometimes, perhaps often. Using a Patricia tree will work
  well.


Using rrdtool

  What could we want to graph with rrdtool? Lots.

  - CPU seconds used
  - database size
  - messages in the db
  - average response time
  - 95th percentile response time
  - messages per user
  - message size per user
  - average query execution time
  - average query queue size

  More?

  http://jwatt.org/svg/authoring/ is interesting for generating graphs
  via the web interface.


Box features

  1. web ui to set up view mailboxes (and to search the archive
     generally)

  2. web ui to configure sieve

  3. web administration, to add users, etc.

  4. i18n for all web-accessible anythings, and after that, for aox.

  5. rrd stuff available next to aox in the boxes, perhaps nagios
     stuff too


SASL NTLM authentication

  It may be odd and undocumented, and it may not be as strong as
  DIGEST-MD5, but it's implemented in Certain Clients ;)

  http://www.innovation.ch/java/ntlm.html seems to be a reasonable
  description. Cyrus also implements it.

  http://davenport.sourceforge.net/ntlm.html ?


Split the folder view into pages.

  Need to decide on what a page is. "Most recent 25 threads" is a
  slippery concept when a new thread is created between page views.

  One probably good way: Use "after" and "before", so "next" would
  point to "most recent 25 threads after the last one on this
  page". Doable, not bad, and with the aid of a new Session subclass
  we can even include a note when there's new activity.
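
  A rough sketch of that kind of paging, assuming each thread is
  ordered by the time of its newest message. ThreadInfo and
  pageBefore() are made-up names. A thread created after the first
  page was rendered has a newer timestamp than the boundary, so it
  cannot shift the contents of the next page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical thread summary: an id plus the time of its newest message.
struct ThreadInfo {
    int id;
    long newest;
};

// Returns up to n threads strictly older than 'before', newest first.
std::vector<ThreadInfo> pageBefore( std::vector<ThreadInfo> threads,
                                    long before, std::size_t n )
{
    assert( n > 0 );
    std::sort( threads.begin(), threads.end(),
               []( const ThreadInfo & a, const ThreadInfo & b ) {
                   return a.newest > b.newest;
               } );
    std::vector<ThreadInfo> page;
    for ( const ThreadInfo & t : threads )
        if ( t.newest < before && page.size() < n )
            page.push_back( t );
    return page;
}
```

  The "next" link would carry the newest-time of the last thread on
  the current page. Threads sharing exactly that timestamp would be
  skipped, so a real version probably needs (time, id) as the key.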


We should be able to use a read-only local database mirror.

  That way, we can play nicely with most replication systems.

  The way to do it: add a new db-mirror setting pointing to a
  read-only database mirror. All queries that update are sent to
  db-address, all selects are sent to db-mirror. db-mirror defaults to
  db-address.
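
  The routing rule itself is small. The sketch below uses made-up
  names (isPlainSelect(), databaseFor()) and classifies a query by its
  first keyword only, which is naive: "select ... for update", selects
  that call functions with side effects, and selects inside a
  transaction that has already written would all need to go to
  db-address.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// True if the first keyword of the query is "select" (any case).
bool isPlainSelect( const std::string & sql )
{
    std::string::size_type i = 0;
    while ( i < sql.size() && std::isspace( (unsigned char)sql[i] ) )
        i++;
    std::string word;
    while ( i < sql.size() && std::isalpha( (unsigned char)sql[i] ) )
        word += (char)std::tolower( (unsigned char)sql[i++] );
    return word == "select";
}

// Picks the server for a query. An empty dbMirror means the setting
// was not given, so it defaults to dbAddress.
std::string databaseFor( const std::string & sql,
                         const std::string & dbAddress,
                         const std::string & dbMirror )
{
    assert( !dbAddress.empty() );
    const std::string & mirror = dbMirror.empty() ? dbAddress : dbMirror;
    return isPlainSelect( sql ) ? mirror : dbAddress;
}
```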


We should test multipart/signed and multipart/encrypted support.

  We must add a selection of RFC 1847 messages to canonical, and make
  sure they survive the round trip. No doubt there will be bugs.


Per-user client certificates

  We could store zero or more client certificates (or fingerprints, or
  whatever) per user. When a user has logged in, we'd check whether
  that user has a non-zero list of certificates, and if so, we'd do a
  TLS renegotiation, this time demanding a client certificate. If the
  client certificate matches, we allow access, otherwise we don't (and
  we alert the user).

  A bit difficult to do with the hands-off tlsproxy.


We should store bodyparts.text for PDF/DOC.

  We need non-GPLed code to convert PDF and DOC to plaintext.

  Or maybe we need a generic interface to talk to plugins.


Switch to using named constraints everywhere.


Default c-t-e of PGP signatures

  Right now we give them binary. q-p or 7bit would be better, I think.

  What other application/* types are really text?

  From a conversation the other day: we could avoid base64 encoding an
  entity whose content-type is not text if it contains only printable
  ASCII. I don't know if it's worth doing, though.

  The problem with doing that is that it treats sequences of CR LF, CR
  and LF as equivalent. An application/foobar object that happens to
  contain only CR, LF and printable ASCII can be broken.


Recognising spam

  The good spam filters now all seem to require local training with
  both spam and nonspam corpora. We can do clever stuff... sometimes.

  Instead of filtering at delivery, we can filter when a message
  becomes \recent. When we increase first_recent, we hand each new
  message to the categoriser, and set $Spam or $Nonspam based on its
  answer.

  This lets the categoriser use all the information that's available
  right up to the moment the user looks at his mail.

  We can also build corpora for training easily. All messages to which
  users have replied are nonspam, replies to messages from local users
  are nonspam, messages in certain folders are spam, messages with a
  certain flag are spam.

  We can connect to a local server to ask whether a message is spam.
  They seem to work that way, but with n different protocols.


TLS client support (smtp, postgresql)


"Writing Secure Code"

  We have a page about security, /mailstore/security.html, and a
  section of the mailstore.7 man page mentions it too.

  We need to look at ISBN 0735617228 and improve security.html with
  points from it. It could also be that we'll improve the code itself.


Play with PITR and write /ams/pitr.html


Document IPC structure

  Some man page, or some web page, or both, should say who's
  connecting to who and why.


Reference counting GC

  We can switch to using shared_ptr, intrusive_ptr, or a clone of
  either.

  shared_ptr and intrusive_ptr both are particular to the pointer
  value, ie. if we allocate a MailboxData and it contains a UString
  member, pointing to that UString isn't safe.

  We can work around that with intrusive_ptr, if we provide a boolean
  in Garbage: allocatedUsingNew(), and a hack in the Garbage
  constructor to make sure we know.

    union {
        Garbage * surrounder;
        uint refcount;
    };

  intrusive_ptr_add_ref( Garbage * ) would have to do this:

    if ( g->allocatedUsingNew() )
        g->refcount++;
    else
        g->surrounder->refcount++;

  If we do this, then everything we currently do should be safe. But
  the hack would be hacky.

  Reference counting has problems with circular structures. We have
  two of those: Selector and Message.

  Selector is solvable. Its parent pointers are used only to access
  the root's d pointer, and we could change the implementation
  accordingly.

  Message seems more difficult. On one hand its circularity has
  brought problems before so it would be nice to change. On the other
  I don't know any way to get rid of the blah.  Functions like
  Bodypart::contentType() seem to need Multipart::parent().


Add a web page about the charset encoding.

  It's a novel and good algorithm, so we can make a good page about
  it. We also can link to data sources there.

  The documentation for Codec::byString() should mention that page's
  URL.


Make a web page about our licensing

  Not sure what to say there. The purpose of the page would be to
  direct people to one of the two others, really, and to be linked to
  from the home page.


The "Database" link on home page

  Where should it go? People might click it wondering why to use a
  database instead of flat files and wanting to know what we do with
  databases.


Search ourselves, not via google

  Or maybe farm that out, get google to search with an approximation
  of our design. http://www.google.com/faq_freewebsearch.html may be
  interesting.


System flags in mailbox_messages

  We're doing a lot of work on system flags, and we update
  mailbox_messages for every single system flag change. Maybe we
  should go back to storing \seen and \deleted in mailbox_messages.

  Changes needed:

  1. Store needs to learn a new query. Dead easy
  2. Fetcher needs to learn to fetch the flags. Slightly
     tricky. An hour's work.
  3. Selector needs to learn to search on the special cases.
  4. Injector needs to set those flags if necessary.

  The schema change is a little evil. Lots of rows changed.

  Unfortunately I don't think this is doable in 3.0. 3.1, yes. 3.0,
  no. We want 2.11 to use the same schema as 3.0.0, and practically
  all of the code involved has been rewritten since 2.11.

  I think this is something we want to do in 3.1.0 and 3.0.2. We make
  2.11 schema-compatible with 3.0.0, 2.12 with 3.0.1, but after 2.12
  that stops. 3.0.2 can be schema-compatible with 3.1.0.


Logging to syslog

  The logd HUP handler can switch to syslog if it can't reopen the
  logfile instead of exiting.


Rendering webmail HTML is presumably good, but...

  We have a potential security hole: A malevolent HTML bodypart is
  forwarded as is on the "download bodypart" page.

  See http://ha.ckers.org/xss.html


Faster mapping from unicode to 8-bit encodings

  At the moment, we use a while loop to find the right codepoint in an
  array[256]. Mapping U+00EF to latin-1 requires looping from 0 to
  0xEF, checking those 239 entries.

  We could use a DAG of partial mappings to make it faster. Much
  faster. Mapping U+20AC to 8859-15 would require just one lookup: In
  the first partial table for 8859-15. Mapping U+0065 to 8859-15 would
  require three: In the first (U+20AC, one entry long), in the
  fallback (U+00A0, 96 entries long) and in the last (U+0000, 160
  entries long).

  Effectively, 8859-15 would be a first table of exceptions and then
  fall back to 8859-1.

  The tables could be built automatically, compiled in, and would be
  tested by our existing apparatus.

  Or we could do it simpler and perhaps even faster: Make a local
  array from unicode to target at the start, fill it in as we go, and
  do the slow scan only when we see a codepoint for the first time.
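
  The simpler variant could be sketched like this. LazyReverseMap is a
  hypothetical class, and it uses a hash map where the text says local
  array, only to keep the example short. The slow scan is the existing
  linear search over the 256-entry table and runs at most once per
  distinct codepoint.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Wraps a codec's byte-to-codepoint table (256 entries) and remembers
// every reverse lookup it has done.
class LazyReverseMap {
public:
    explicit LazyReverseMap( const uint32_t * table ) : table_( table ) {
        assert( table );
    }

    // Returns the byte for codepoint cp, or -1 if the encoding cannot
    // represent it. Failures are cached as well.
    int toByte( uint32_t cp ) {
        auto hit = cache_.find( cp );
        if ( hit != cache_.end() )
            return hit->second;
        int b = -1;
        for ( int i = 0; i < 256 && b < 0; i++ )
            if ( table_[i] == cp )
                b = i;
        slowScans++;
        cache_[cp] = b;
        return b;
    }

    int slowScans = 0; // how often the linear scan actually ran

private:
    const uint32_t * table_;
    std::unordered_map<uint32_t, int> cache_;
};
```

  Since most messages use a small set of distinct characters, nearly
  all lookups after the first few should hit the cache.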


Multipart/signed automatic processing

  We could check signatures automatically on delivery, and reject bad
  signed messages.

  The big benefit is that some forgeries are rejected, even though the
  reader and the reading MUA don't do anything different.

  The disadvantage is that we (probably?) can't verify all signatures,
  which gives a false sense of security for the undetectable forgeries.

  In case of PKCS7, it's possible to self-sign. Those we cannot
  check. In that case we remove the signature entirely from the MIME
  structure, so it doesn't look checked to the end-user.

  PGP cannot be checked, except it sort of can. We can have a small
  default keyring including the heise.de CA key and so on, and treat
  that as root CAs, using the keyservers to dig up intermediate keys.


PGP automatic processing

  Apparently there are five different PGP wrapping formats. We could
  detect four and transform them to the proper MIME format.


Plugins

  It's not given that we want to accept all mail. If we don't, who
  makes the decision? A sieve script may, and refuse/reject mail it
  does not like. And a little bit of pluginnery may. I think we'd do
  well to support the postfix plugin protocol, so all postfix policy
  servers can work with aox. (All? Or just half? Doesn't postfix have
  two types of policy plugins?)

  We may even support site-wide and group-wide sieve scripts and
  permit a sieve script to invoke the plugin. A sieve statement like
  this?

     UsePolicyServer localhost 10023 ;


BURL

  If the message is multipart and the boundary occurs in a part, that
  part needs encoding. Or else switch to a different body.


Cybertec replicator

  Is it good? What are people saying about it? What should we do about it?

  Ewald Geschwinde on IRC (the first time I've seen him; 2007-07-11) said:
  *egeschwinde* When I have time I will test your product on our
  multimaster cluster
  *egeschwinde* maybe also on the queuing system


Delaying seen-flag setting

  We can move the seenflagsetter to imapsession, build up flags to
  set, flush the write cache before fetch flags, store, state-altering
  commands and searches which use either modseq or flags.

  This ought to cut down the number of transactions issued per imap
  command nicely.
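
  A minimal sketch of the buffering, assuming a simplified set of
  command categories (the Command enum and the PendingSeen class are
  invented here and do not match the real IMAP command classes):

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical command categories, for illustration only.
enum class Command { FetchBody, FetchFlags, Store, StateAltering,
                     SearchFlagsOrModseq, OtherSearch };

class PendingSeen {
public:
    // Remember that \Seen should be set on this UID; no database
    // work happens yet.
    void add( std::uint32_t uid ) { uids.insert( uid ); }

    // The commands listed above, before which the buffered flags
    // must be written.
    static bool mustFlushBefore( Command c ) {
        return c == Command::FetchFlags || c == Command::Store ||
               c == Command::StateAltering ||
               c == Command::SearchFlagsOrModseq;
    }

    // Call before running each command. Returns the UIDs to write in
    // a single transaction, or an empty set if no flush is needed.
    std::set<std::uint32_t> takeIfNeeded( Command c ) {
        std::set<std::uint32_t> out;
        if ( mustFlushBefore( c ) )
            out.swap( uids );
        return out;
    }

    std::size_t pending() const { return uids.size(); }

private:
    std::set<std::uint32_t> uids;
};
```

  A run of body fetches then costs one flag-setting transaction at the
  next flush point instead of one per fetch.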


The web interface should offer a download link for attachments.

  For patches attached as text/plain with the right Content-Disposition,
  for example.


Per-group and systemwide sieves

  People always seem to want such things. It'll be easy to implement.
  Most of the tricky issues are described in
  http://tools.ietf.org/html/draft-degener-sieve-multiscript-00


The Sieve "header" test may fail

  Write a test or three that feeds the thing a 2047-encoded header
  field and checks that it's correctly matched/not matched. Then make
  it pass.


The subaddress specification says foo@ != foo+@ wrt. :detail

  The former causes any :detail tests to evaluate to false, while the
  latter treats :detail as an empty string. We treat both as an empty
  string.

  (We could set detail to a single null byte, to \0\r\0\n\0, to a
  sequence of private-use unicode characters, or even to
  Entropy::string( 8 ) if there is no separator. The chance of that
  appearing in an address is negligible.)
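
  Instead of a sentinel value, the two cases could also be kept apart
  explicitly. A sketch, assuming '+' as the separator and splitting at
  its first occurrence; detail() and detailIs() are hypothetical names,
  not existing functions:

```cpp
#include <optional>
#include <string>

// Hypothetical helper: the :detail part of a localpart, or nothing
// at all if the localpart contains no separator.
std::optional<std::string> detail( const std::string & localpart,
                                   char separator = '+' ) {
    std::string::size_type i = localpart.find( separator );
    if ( i == std::string::npos )
        return std::nullopt;
    return localpart.substr( i + 1 );
}

// A ":detail :is" style test. It is false when there is no
// separator, and compares against the (possibly empty) detail
// otherwise.
bool detailIs( const std::string & localpart, const std::string & key ) {
    std::optional<std::string> d = detail( localpart );
    return d.has_value() && *d == key;
}
```

  With this, foo@ fails every :detail test while foo+@ matches the
  empty string, which is the distinction the specification makes.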


Distribution packages

  - RPM: silug has offered to help.
  - FreeBSD port: devin (Tod McQuillin) has offered to help.
  - Debian: ? (license problems, but we could provide a .deb).
  - Ubuntu: ?


SMTP extensions

  Here are the ones we still don't implement, but ought to implement
  at some point:

  SUBMITTER? perhaps
  DELIVERBY (RFC 2852): At some time.
  FUTURERELEASE (RFC 4865): At some time.
  MTRK: As soon as someone else does it.

  http://www.iana.org/assignments/mail-parameters

  DELIVERBY has the funny little characteristic that we can support it
  with great ease iff the smarthost does, so we ought to advertise it
  iff the smarthost does.


SUBMITTER

  The work needed to support that:

  1. Pass a submitter to SmtpClient.
  2. Pass the sieve owner or logged-in user's address as submitter.
  3. Send that as SUBMITTER= if the smarthost advertises SUBMITTER and
     the submitter is different from mail-from.

  4. Advertise SUBMITTER.
  5. If the SMTP/LMTP/Submit client sends SUBMITTER and it's different
     from the mail-from, record the address in Received.
  6. If there's a sieve extension specifying how, push the submitter
     into the envelope so the sieve can see it.

  Easy peasy. But there's no value to offset this (small) cost, is
  there? Maybe as a hack when one of us is fed up.


The groups and group_members tables seem a little underused

  We do not use them at all. We meant to use them for "advanced" ACL
  support, but nobody ever asked, and it didn't seem worthwhile.

  I now think it's worthwhile.

  Here's what I want to add:

  Make a superusers group, whose members can authenticate as anyone,
  and the notion of group admins, who can authenticate as other
  members of the group.

  Or maybe an administrator table, linking a user to either a group or
  to null. If a group, then the admin can authenticate as other
  members of that group and (importantly) has the 'a' right on their
  mailboxes; if null, then ditto for all groups.

  Extend Permissions to link against group_members when selecting
  applicable permissions.

  Make groups be permissible ACL identifiers.


We need to be able to disable users

  - Reject mail with 5xx/4xx.
  - Prevent login.
  - Both of the above.
  - a group admin can enable/disable group members
  - a superadmin can enable/disable anyone
  - a group admin cannot unblock an overall blockage


Assigning the "l" right automatically hurts shared hosting.

  Casey Shobe <casey@shobe.info> wants to host multiple customers in a
  single Archiveopteryx installation, and doesn't want them to be able
  to see each other's mailboxes. We could support that now (2008-01).


Showing email addresses in public archives is... well...

  There are three common solutions:

  1. Show the address. What we do now. Gives addresses to spammers,
     which is undesirable.

  2. Replace @ with at or some other easily reversible change. I
     assume gigamega.com, litefinder.net and other address scrapers
     already detect the common obfuscations, so this is pointless.

  3. Show part of the address, e.g. arnt@ory... or
     ar...@oryx.co... for arnt@oryx.com. Isn't reversible, so the
     spammers can't undo it, but also makes the archive less usable
     for people.

  We might change to 3, but with a captcha-protected option to show
  1. That would give us all the advantages and no serious disadvantage.
  But we would have to implement captchas somehow, which would be a
  moderate pain.
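
  Option 3 could be as simple as the following sketch, which produces
  the first form shown above (arnt@ory...). The function name and the
  choice of keeping three characters of the domain are assumptions.

```cpp
#include <string>

// Hypothetical helper: keeps the localpart and the first few
// characters of the domain, and replaces the rest with "...".
// Strings without an '@' are returned unchanged.
std::string abbreviated( const std::string & address,
                         std::string::size_type keep = 3 ) {
    std::string::size_type at = address.rfind( '@' );
    if ( at == std::string::npos )
        return address;
    std::string domain = address.substr( at + 1 );
    return address.substr( 0, at + 1 ) + domain.substr( 0, keep ) + "...";
}
```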


aox.org/badmail/

  Explain that aox can't store everything, why not (in short), that it
  has many workarounds and point to examples/, how to detect/report
  bad messages and how to fix things with reparse. Point to
  /aox/reparse for more detail.

  Subpages:

  badmail/examples/n for 1<=n<=8, with good and bad blah, generated
  from chosen canonicals, to show how we fix things up. Each page
  showing old and new, with differences indicated, and they should be
  ordered from reasonable/common to outrageous.

  badmail/examples/ summing up 1-8 and giving one or two truly
  hopeless cases. The hopeless case(s) should also be shown in
  anonymised form.

  badmail/examples/comparison if I feel nasty and bored one day,
  showing how a few IMAP servers handle messages 1-8 and the
  impossible one(s). Does "fetch envelope" return the right thing?
  "fetch bodystructure"? Some choice searches? We don't want to link
  to this page very much. It gets a fine <table> containing many/few
  &#x2713; cells.

  Possibly we want to include screenshots showing how Thunderbird or
  another GUI client that uses envelope/bodystructure renders a
  mailbox containing 1-8. Screenshots using aox and using another
  server, one that gets few &#x2713; cells in the table. I'm not sure
  where to link to these screenshots. Apple Mail?

  We also need aox.org/aox/reparse and I suppose other /aox/<command>
  pages.


Some unfortunate logging

  This is the tail end of a connection Thunderbird used to save
  something to the Sent folder, rewrapped for easier reading:

    imap/info: 2/1/1/2442/6: 2008-06-18 12:53:54.109
        Execution time 550ms
    imap/debug: 2/1/1/2442/6: 2008-06-18 12:53:54.109
        Finished
    imap/debug: 2/1/1/2442: 2008-06-18 12:53:54.109
        IMAP::runCommands, 1 commands
    imap/info: 2/1/1/2442/6: 2008-06-18 12:53:54.109
        Result: OK [APPENDUID 19] done
    imap/debug: 2/1/1/2442/6: 2008-06-18 12:53:54.110
        Retired
    imap/debug: 2/1/1/2442: 2008-06-18 12:53:54.110
        IMAP::runCommands, 0 commands
    imap/info: 2/1/1/2442: 2008-06-18 13:23:53.458
        Idle timeout
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.459
        Closing: IMAP server 195.30.37.30:143
                 connected to client 195.30.37.9:52209, on fd 26
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.459
        IMAP::runCommands, 0 commands
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.459
        IMAP::runCommands, 0 commands
    imap/info: 2/1/1/2442: 2008-06-18 13:23:53.459
        Unexpected close by client
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.459
        IMAP::runCommands, 0 commands
    general/debug: 2/1/1/2442/2/2/2: 2008-06-18 13:23:53.461
        Closing: Byte forwarder 127.0.0.1:56695
                 connected to client 127.0.0.1:2061, on fd 27
    general/info: 2/1/1/2442/2/2/2: 2008-06-18 13:23:53.461
        Shutting down byte forwarder due to peer close.
    imap/info: 2/1/1/2442: 2008-06-18 13:23:53.464
        Unexpected close by client
    imap/debug: 2/1/1/2442: 2008-06-18 13:23:53.464
        IMAP::runCommands, 0 commands

  Hardly unexpected. Will fix later, not sure how.


Dynamically preparing often-used queries

  We can prepare queries cleverly.

  Inside Query, at submit time, we first check whether a Query's text
  matches a PreparedStatement, and use it if so.

  If not, we check whether the query looks preparable. The condition
  seems to be simple: Starts with 'select ' and contains no numbers.
  If it's preparable we add it to a cache, which is discarded at GC
  time.

  If a preparable query is used more than n times before the cache is
  discarded, we prepare the query and keep the PreparedStatement
  around.
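
  A sketch of that logic, using a stand-in class rather than the real
  Query and PreparedStatement. "Contains no numbers" is read here as
  "contains no ASCII digit", which is an assumption; note that this
  reading also excludes text with $1-style placeholders.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in; not the real Query/PreparedStatement code.
class PreparableCache {
public:
    explicit PreparableCache( unsigned threshold ) : n( threshold ) {}

    // The condition from the text: starts with 'select ' and
    // contains no digits.
    static bool looksPreparable( const std::string & text ) {
        if ( text.compare( 0, 7, "select " ) != 0 )
            return false;
        for ( unsigned char c : text )
            if ( std::isdigit( c ) )
                return false;
        return true;
    }

    // Called at submit time. Returns true if the query should be
    // run as a prepared statement.
    bool submit( const std::string & text ) {
        if ( prepared.count( text ) )
            return true;
        if ( !looksPreparable( text ) )
            return false;
        if ( ++counts[text] > n ) {
            // Used more than n times: prepare it and keep it.
            prepared.insert( text );
            counts.erase( text );
            return true;
        }
        return false;
    }

    // Called at GC time: the use counts are discarded, the prepared
    // statements are kept.
    void discardCounts() { counts.clear(); }

private:
    unsigned n;
    std::unordered_map<std::string, unsigned> counts;
    std::unordered_set<std::string> prepared;
};
```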


aox.org/clients/

  Move the list of clients from /imap/ and /pop/ here. Make a per-client
  page, e.g. /clients/outlook, with notes.

  1. Which protocols can you use? Usually IMAP+Submit.

  2. Any bugs worth speaking of?

  3. Any particular configuration advice? For /clients/outlook we say
     "enable use-smtps". For /clients/applemail we point to that IDLE
     plugin.

  That's it, right?


SMS gateway

  We want Archiveopteryx to work as installed. We want to support
  Sieve notify, including SMS.

  I think that means Oryx needs to operate an Archiveopteryx->SMS
  gateway, allow people to send a few SMSes, and provide people with
  the ability to operate a gateway of their own.

  There are many IP->SMS gateways in the world. Some free, but we
  don't want to use those, they're unreliable. Many paid for, those
  are reliable. Most of them work using HTTP requests: You POST a
  query with your credentials and the gateway reports.

  So my plan is as follows:

  1. Become a customer of someone like that.

  2. Write a program which accepts requests in a format we define,
     forwards them in the HTTP-based format our provider uses, and
     relays the response back.

  3. Provide that service to new Archiveopteryx installations, with
     limitations on use.

  4. Provide the gateway program along with Archiveopteryx, so people
     can run it themselves.

  I haven't thought of a good way to provide the service to
  Archiveopteryx users and weed out most other people.

  Perhaps a better alternative: Automatically register with clickatell
  if SMS is enabled and not configured when it's first used. (But it's
  not possible to register with clickatell without intervention.)


New RFCs

  5435: sieve notify
  5436: sieve notify mailto
  5437: sieve notify xmpp
  5463: sieve ihave

  5442: lemonade profile-bis
  

Bug confusing U+ED00 and U+0000 in the message cache

  When we write to the database, U+0000 (which occasionally occurs,
  mostly by mistake but sometimes on purpose) is transformed to
  U+ED00, and when we read it, back.

  So if U+ED00 is written to the DB, it comes back as U+0000.

  This means that Archiveopteryx works differently depending on
  whether the cache is used or not. That has to be resolved somehow.
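
  The asymmetry is easy to demonstrate. The two functions below are
  hypothetical stand-ins for the write and read paths, not the actual
  database code:

```cpp
#include <string>

// Hypothetical stand-ins for the database write and read paths.
const char32_t Nul = 0x0000;
const char32_t Substitute = 0xED00;

// On write, U+0000 becomes U+ED00.
std::u32string toDatabase( std::u32string s ) {
    for ( char32_t & c : s )
        if ( c == Nul )
            c = Substitute;
    return s;
}

// On read, U+ED00 becomes U+0000, whether or not it started out
// as U+0000.
std::u32string fromDatabase( std::u32string s ) {
    for ( char32_t & c : s )
        if ( c == Substitute )
            c = Nul;
    return s;
}
```

  A string containing U+ED00 that is served from the cache keeps its
  U+ED00, while the same string read back from the database has
  U+0000 in its place.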


Axel, /Mime problem

  The problem is that the VCF file contains literal NUL bytes, but is
  sent with a text/* MIME type, and we're mangling the NULs during
  charset conversion (or so I guess, given that they become '?'s
  instead).


Various alias-related feature requests

  e.g. Benjamin wants empty localparts, a number of people want multiple
  targets (Axel, Ingo).


String::wrapped() vs. canonical 13

  Arnt commented out some code which shouldn't have made i/13 fail, but
  did for him. Must investigate.


Axel, /Unable to fetch 12MB mail

  Some sort of loop in the fetcher? I didn't look.


Axel, /2.10 status

  Mail.app occasionally shows a panel saying "SSL error: Identity of
  mail3.chaos1.de can't be verified, hit 'continue' or 'abort'".


Simon, /Segfault when accessing web-archive

  Not looked at yet, but probably the same as the one on archives. Will
  fix with the other known bugs.


Caches that aren't

  There's a session "cache" in HTTP.


Mike Geiger's Thunderbird problems

  TLS breakage (not yet reproduced) and copy-to-sent hanging.


Problems found in 3.0.0 by Timo

  - SEARCH SENTON/SENTBEFORE/SENTAFTER have some bugs.
    (Not verified because of segfault; will check later.)

    Not fixed; I lean towards fixing it if it's the only thing
    imaptest complains about.


Use current_setting('server_version_num') instead of parsing version()

  (But only under 8.2+)


HTML representation of selectors

  We need:

    - An HTML representation of selectors.

      Should be easy to generate, display reasonably even without CSS,
      and easy to parse when we see it again.

        <div id=1>
         <span class=selector>and 2 3</span>
         <span class=translated>Alle Bedingungen müssen für jede Nachricht zutreffen</span>
         <div id=2>
          <span class=selector>header 4</span>
          <span class=translated>Headers enthalten <span id=4>arnt</span></span>
         </div>
         <div id=3>
          <span class=selector>body 5</span>
          <span class=translated>Körpertext enthält <span id=5>expunge</span></span>
         </div>
        </div>

      (We don't want to parse translated text because of the ambiguity
      in mapping between texts and conditions if e.g. two conditions map
      to the same Bangla text. Hence the selector/translated split.)

    - A parser to turn HTML into a selector; and a way to turn a
      selector into HTML (plus tests for the conversion).

      We have HtmlParser.

    - A component to display a view, and a component to edit a view
      (complete with evil Javascript) and a magic URL to accept the
      edited representation.


Caching search results

  If a selector is !dynamic(), its results can be cached until the next
  modseq change on the mailbox.
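
  A sketch of such a cache, keyed on mailbox and the selector's string
  form. The class and its types are assumptions made for illustration;
  the caller is expected to store only results of selectors that are
  !dynamic().

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cache; not an existing aox class.
class SearchCache {
public:
    using Uids = std::vector<std::uint32_t>;

    // Remember the result of a search, along with the mailbox's
    // modseq at the time the search ran.
    void store( std::uint32_t mailbox, const std::string & selector,
                std::int64_t modseq, Uids uids ) {
        entries[std::make_pair( mailbox, selector )] =
            Entry{ modseq, std::move( uids ) };
    }

    // Returns the cached result only if the mailbox's modseq is
    // unchanged since the result was stored.
    std::optional<Uids> find( std::uint32_t mailbox,
                              const std::string & selector,
                              std::int64_t currentModseq ) const {
        auto it = entries.find( std::make_pair( mailbox, selector ) );
        if ( it == entries.end() ||
             it->second.modseq != currentModseq )
            return std::nullopt;
        return it->second.uids;
    }

private:
    struct Entry {
        std::int64_t modseq;
        Uids uids;
    };
    std::map<std::pair<std::uint32_t, std::string>, Entry> entries;
};
```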



