This directory contains lists of common words in UTF-8. Each file is
named for a language and contains common words in that language, one
word per line.

Any lines starting with '#' are disregarded.
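
The format described above can be read with a few lines of code. The
following Python sketch is only an illustration, not the code that
ships in this source tree: the function names parse_word_list and
load_word_list are hypothetical, and skipping blank lines is an
assumption that the description above does not state.

```python
from typing import Iterable, List


def parse_word_list(lines: Iterable[str]) -> List[str]:
    """Return the words found in `lines`, one word per line."""
    words = []
    for line in lines:
        # Lines starting with '#' are disregarded, as described above.
        if line.startswith("#"):
            continue
        word = line.rstrip("\r\n")
        # Assumption: a blank line carries no word and is skipped.
        if word:
            words.append(word)
    return words


def load_word_list(path: str) -> List[str]:
    """Read a word file (hypothetical helper); the data is UTF-8."""
    with open(path, encoding="utf-8") as handle:
        return parse_word_list(handle)
```

For example, load_word_list("german.words") would return the words of
that file in file order, without the comment lines.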

A note regarding licensing:

The code and data in this directory are licensed under the OSL 2.1 by
virtue of being in this source tree. Please write to info@oryx.com if
that's a problem for you. If anyone else wants to use this algorithm,
we'll be very flexible.

The data files in this directory are based on the following sources:

1. http://wortschatz.uni-leipzig.de/html/wliste.html

   The files german.words, dutch.words and french.words are based on
   Wortschatz material, transcoded to UTF-8.

2. Eva Schlittermann via email

   The file czech.words is largely based on a list supplied by Eva
   Schlittermann. Additions are welcome.

3. Ten web pages listing the 10,000 most frequent words in Norwegian
   newspapers, as counted by the University of Oslo's Tekstlab project
   (http://www.hf.uio.no/tekstlab/).

   The original web pages were deleted some time after we fetched
   them. Archive.org has copies:

   http://web.archive.org/web/20050324200652/http://www.hf.uio.no/tekstlab/frekvensordlister/aviser.frek.html
   .../aviser.frek2.html etc
   ...
   .../aviser.frek10.html

4. ftp://ftp.spraakbanken.gu.se/pub/statistik/PAROLE/parole_most_freq_10k.tgz

   Note that swedish.words contains less than 25% of
   parole_most_freq_10k and has been slightly modified. For any
   purpose other than this algorithm, we recommend going to the
   source, http://spraakbanken.gu.se.

   GU distributes its language data under the following license:

   # --------------------------------------------------------- #
   # ---- license                                         ---- #
   #---------------------------------------------------------- #
   Copyright (c) 2003 Språkbanken, Göteborgs universitet

   Permission is hereby granted, free of charge, to any person obtaining a
   copy of this resource and associated documentation files (the
   "Resource"), to deal in the Resource without restriction, including
   without limitation the rights to use, copy, modify, merge, publish,
   distribute, sublicense, and/or sell copies of the Resource, and to
   permit persons to whom the Resource is furnished to do so, subject to
   the following conditions:

   The above copyright notice and this permission notice shall be included
   in all copies or substantial portions of the Resource.

   THE RESOURCE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
   IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
   CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
   TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
   RESOURCE OR THE USE OR OTHER DEALINGS IN THE RESOURCE.
   #---------------------------------------------------------- #

