nGramIzer

created by Károly Füzessi and Lilla Magyari

This program calculates the forward and backward conditional probabilities of Norwegian Bokmål words in the sentences of a text file. The calculation is based on the formulae in Onnis et al. (2022) and uses data from the n-gram database of the National Library of Norway.
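As an illustration of the two quantities, here is a minimal sketch that uses a common bigram definition: forward probability as count(previous word, word) / count(previous word), and backward probability as count(previous word, word) / count(word). This definition is an assumption; the exact formulae from Onnis et al. (2022) as implemented in ngram.py may differ. The function names and the toy corpus are hypothetical, and the toy counts merely stand in for the counts in the n-gram database.

```python
from collections import Counter


def count_ngrams(sentences):
    """Count unigrams and bigrams in a list of tokenised sentences."""
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def forward_probability(prev_word, word, unigrams, bigrams):
    """Assumed definition: P(word | prev_word)."""
    if unigrams[prev_word] == 0:
        return None  # previous word never seen, probability undefined
    return bigrams[(prev_word, word)] / unigrams[prev_word]


def backward_probability(prev_word, word, unigrams, bigrams):
    """Assumed definition: P(prev_word | word)."""
    if unigrams[word] == 0:
        return None  # word never seen, probability undefined
    return bigrams[(prev_word, word)] / unigrams[word]


# Toy corpus (made up for illustration, not taken from the database).
toy_corpus = [
    ["jeg", "liker", "kaffe"],
    ["jeg", "liker", "te"],
    ["du", "liker", "kaffe"],
]
unigrams, bigrams = count_ngrams(toy_corpus)
fwd = forward_probability("liker", "kaffe", unigrams, bigrams)
bwd = backward_probability("liker", "kaffe", unigrams, bigrams)
print(f"forward={fwd:.3f} backward={bwd:.3f}")
```

In the toy corpus, "liker" is followed by "kaffe" in two of its three occurrences, while "kaffe" is always preceded by "liker", so the backward probability is higher than the forward one.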

Prerequisites

Database of N-grams in Norwegian Bokmål

Download the Norwegian Bokmål n-gram database from the National Library of Norway to the project directory.
Decompress the archive, e.g. by opening a command line in the project directory and running: tar xf ngram_nob.tar.gz

After a few minutes, you should have a bokm folder in the project directory.
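If the tar command is not available, the archive can also be unpacked with Python's standard tarfile module. The following is a minimal sketch; the helper name extract_archive is hypothetical and not part of ngram.py.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Unpack a tar archive into dest_dir and return the top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:*") as archive:
        archive.extractall(dest, **kwargs)
    return sorted(entry.name for entry in dest.iterdir())


# Example call from the project directory (the archive must already be downloaded):
# extract_archive("ngram_nob.tar.gz")
```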

Python

Download and install Python 3.x.

Run

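Run nGramIzer on one or more text files with the command

py -3 ngram.py input_file [input_file_2]...

The py launcher is specific to Windows. On other systems, invoke the script with your Python 3 interpreter, e.g. python3 ngram.py input_file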

Output files are generated with a _result.csv suffix.

Note: the program builds its dictionaries before the actual analysis runs, which takes some time.

Output format

The generated CSV will contain the following columns:

Sentence Number
Word Number
Word
Forward Probability
Backward Probability
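The result files can be post-processed with Python's standard csv module. The sketch below assumes that the file starts with a header row using exactly the column names listed above, that fields are comma-separated, and that the probabilities are written as plain decimal numbers; check a generated file before relying on these assumptions. The helper name read_results and the sample values are hypothetical.

```python
import csv
import io


def read_results(handle):
    """Parse result rows into dictionaries with typed values."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "sentence": int(row["Sentence Number"]),
            "position": int(row["Word Number"]),
            "word": row["Word"],
            # An empty field is kept as None instead of raising an error.
            "forward": float(row["Forward Probability"]) if row["Forward Probability"] else None,
            "backward": float(row["Backward Probability"]) if row["Backward Probability"] else None,
        })
    return rows


# Made-up sample in the assumed layout (not real program output).
SAMPLE = (
    "Sentence Number,Word Number,Word,Forward Probability,Backward Probability\n"
    "1,1,jeg,,\n"
    "1,2,liker,0.25,0.5\n"
)
rows = read_results(io.StringIO(SAMPLE))
print(rows[1]["word"], rows[1]["forward"], rows[1]["backward"])
```

For a real file, pass an open file object instead, e.g. open(path, newline="", encoding="utf-8"), where the encoding is again an assumption.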

Funding statement

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 845343. Project website: https://www.uis.no/nb/lesesenteret/fictdial
