Swedish Letter Frequencies

All text files provided are encoded in UTF-8. The frequencies on this page were generated from about 90 million characters of Swedish text, sourced from Wortschatz. The text files containing the counts can be used with ngram_score.py for breaking ciphers; see this page for details, and simply substitute the Swedish n-gram file you want for the English one. If you want to compute the letter frequencies of your own piece of text, you can use this page.
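
As a rough illustration of how such frequencies can be computed for your own text, here is a minimal Python sketch. It is not the tool linked above: the function name `ngram_percentages` and the `SWEDISH_ALPHABET` constant are made up for this example, and it assumes that text is uppercased, that everything outside the 29 letters A-Z, Å, Ä, Ö is dropped, and that n-grams are counted across word boundaries.

```python
from collections import Counter

# Assumption: the 29-letter alphabet used in the tables on this page.
SWEDISH_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ"


def ngram_percentages(text, n=1, alphabet=SWEDISH_ALPHABET):
    """Return {ngram: percent} over all length-n windows of the cleaned text."""
    # Uppercase, then keep only letters of the alphabet (spaces and
    # punctuation are dropped, so windows run across word boundaries).
    cleaned = "".join(ch for ch in text.upper() if ch in alphabet)
    counts = Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: 100.0 * c / total for gram, c in counts.items()}


# "Det är en bok" cleans to the 10 letters DETÄRENBOK; E occurs twice.
mono = ngram_percentages("Det är en bok", n=1)
print(round(mono["E"], 2))  # 20.0
```

Calling the same function with `n=2`, `n=3` or `n=4` gives bigram, trigram and quadgram percentages in the same way.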

Monogram Frequencies §

Swedish single-letter frequencies are as follows (in percent):

A :  9.38        K :  3.14        U :  1.92
B :  1.54        L :  5.28        V :  2.42
C :  1.49        M :  3.47        W :  0.14
D :  4.70        N :  8.54        X :  0.16
E : 10.15        O :  4.48        Y :  0.71
F :  2.03        P :  1.84        Z :  0.07
G :  2.86        Q :  0.02        Å :  1.34
H :  2.09        R :  8.43        Ä :  1.80
I :  5.82        S :  6.59        Ö :  1.31
J :  0.61        T :  7.69                 

The swedish_monograms.txt file provides the counts used to generate the frequencies above.

Common Swedish Words §

The following table shows the 30 most common Swedish words. The percentages represent how often each word occurs, e.g. OCH accounts for around 3.45% of all words in Swedish text.

    I :  3.55          MED :  1.25        UNDER :  0.42
  OCH :  3.45          FÖR :  1.08          VID :  0.38
   EN :  2.13          DET :  0.93          MEN :  0.36
   AV :  2.03          ETT :  0.83          SIG :  0.35
  SOM :  1.95           DE :  0.80          MAN :  0.33
   ÄR :  1.79          HAN :  0.77         ÄVEN :  0.33
  ATT :  1.45          VAR :  0.73         INTE :  0.33
   PÅ :  1.36          HAR :  0.69           ÅR :  0.32
  DEN :  1.32         FRÅN :  0.60        ELLER :  0.30
 TILL :  1.27           OM :  0.43        EFTER :  0.30

The accompanying word list file contains the counts for all the words used to generate the percentages above. These words come from a 'news' corpus, so the counts may be skewed towards that kind of text. You can use the word list to rank text with Word Statistics as a Fitness Measure.
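
To give a feel for how a word list can rank candidate text, here is a deliberately crude Python sketch. It is not the Word Statistics method referred to above, just a simpler stand-in: it measures what share of the words in a text belong to the 30 common words in the table. The percentages in the table add up to about 31.8%, so ordinary Swedish text might be expected to score somewhere in that region, while gibberish should score near zero. The name `common_word_share` is invented for this example, and the sketch assumes that word boundaries are preserved in the text being ranked.

```python
import re

# The 30 words from the table above.
COMMON_WORDS = {
    "I", "OCH", "EN", "AV", "SOM", "ÄR", "ATT", "PÅ", "DEN", "TILL",
    "MED", "FÖR", "DET", "ETT", "DE", "HAN", "VAR", "HAR", "FRÅN", "OM",
    "UNDER", "VID", "MEN", "SIG", "MAN", "ÄVEN", "INTE", "ÅR", "ELLER",
    "EFTER",
}


def common_word_share(text):
    """Percentage of words in `text` that are among the 30 common words."""
    # Split into runs of Swedish letters after uppercasing.
    words = re.findall(r"[A-ZÅÄÖ]+", text.upper())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in COMMON_WORDS)
    return 100.0 * hits / len(words)


# 6 of the 8 words (HAN, ÄR, INTE, OCH, DET, ÄR) are in the list.
print(common_word_share("Han är inte här och det är bra"))  # 75.0
```

Short sentences can score well above or below the corpus average, so a measure like this only becomes meaningful on longer texts.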

Bigram Frequencies §

We can't list all of the bigram frequencies here; the top 30 are as follows (in percent):

EN :  2.44        RE :  1.10        OM :  0.88
DE :  2.11        ND :  1.07        RI :  0.86
ER :  2.10        TA :  1.03        NG :  0.82
AN :  1.75        TI :  1.01        SK :  0.80
AR :  1.61        NA :  0.98        KA :  0.80
ET :  1.27        NS :  0.97        OC :  0.78
ST :  1.27        TT :  0.97        ME :  0.77
IN :  1.22        LL :  0.94        CH :  0.75
RA :  1.21        AT :  0.91        EL :  0.73
TE :  1.18        LA :  0.88        ÖR :  0.73

The swedish_bigrams.txt file provides the counts used to generate the frequencies above.

Trigram Frequencies §

We can't list all of the trigram frequencies here; the top 30 are as follows (in percent):

OCH :  0.65        ILL :  0.38        DES :  0.30
FÖR :  0.55        ATT :  0.37        DER :  0.29
DEN :  0.48        TIL :  0.34        GEN :  0.29
NDE :  0.47        DET :  0.33        NIN :  0.28
AND :  0.47        ERA :  0.33        REN :  0.27
ADE :  0.46        ARE :  0.33        ANS :  0.26
TER :  0.45        SKA :  0.32        ETT :  0.25
ING :  0.44        STA :  0.31        HAN :  0.25
SOM :  0.41        MED :  0.30        LAN :  0.25
ENS :  0.38        VAR :  0.30        ERS :  0.25

The swedish_trigrams.txt file provides the counts used to generate the frequencies above.

Quadgram Frequencies §

We can't list all of the quadgram frequencies here; the top 30 are as follows (in percent):

TILL :  0.33        NGEN :  0.12        ENSK :  0.09
NING :  0.23        FRÅN :  0.12        ERAD :  0.09
ANDE :  0.19        NOCH :  0.12        FTER :  0.09
LAND :  0.16        INGE :  0.11        NFÖR :  0.09
NDER :  0.15        RADE :  0.10        INGA :  0.09
ADES :  0.13        ROCH :  0.10        ÄREN :  0.09
FÖRS :  0.13        ISKA :  0.10        INTE :  0.09
UNDE :  0.13        NSKA :  0.10        EFTE :  0.09
ERNA :  0.13        LLER :  0.10        STER :  0.08
TION :  0.12        RATT :  0.10        STOR :  0.08

The swedish_quadgrams.txt file provides the counts used to generate the frequencies above.
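
Counts like these are typically turned into a fitness score by summing log probabilities over every n-gram window of a candidate plaintext. The sketch below shows one way this might be done; it is not the code of ngram_score.py. It assumes each line of a counts file holds an n-gram and its count separated by whitespace, which the page does not state, and the names `load_counts` and `make_scorer` are invented for this example. Entries such as NOCH and ROCH in the table suggest the counts run across word boundaries, so the text to be scored is assumed to be uppercase with spaces and punctuation removed.

```python
import math


def load_counts(path):
    """Read 'NGRAM COUNT' lines (an assumed format) from a UTF-8 file."""
    counts = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            parts = line.split()
            if len(parts) == 2:
                counts[parts[0]] = int(parts[1])
    return counts


def make_scorer(counts):
    """Build a function that scores text by summed log10 n-gram probability."""
    if not counts:
        raise ValueError("counts must not be empty")
    n = len(next(iter(counts)))  # n-gram length, taken from the first key
    total = sum(counts.values())
    log_probs = {g: math.log10(c / total) for g, c in counts.items() if c > 0}
    # Unseen n-grams get a small floor instead of log(0).
    floor = math.log10(0.01 / total)

    def score(text):
        return sum(log_probs.get(text[i:i + n], floor)
                   for i in range(len(text) - n + 1))

    return score


# Made-up toy counts for illustration only, not values from the real file.
toy_counts = {"TILL": 33, "NING": 23, "ANDE": 19}
score = make_scorer(toy_counts)
print(score("TILLNING") > score("XQZWKPVJ"))  # True
```

With the real file one would call `make_scorer(load_counts("swedish_quadgrams.txt"))`, provided the file matches the assumed format. Higher (less negative) scores indicate text that looks more like Swedish.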

