miserman/dictionary_builder

React app to help create and analyze text analysis dictionaries.

Features

Import an existing dictionary (such as from osf.io/y6g5b), or start from scratch.

Dictionaries are saved locally with every edit, and the saved copies can optionally be encrypted.

As part of creation, you can...

  • add fixed or fuzzy (glob or regex) terms,
  • add suggested terms based on word-form matching, embeddings-based similarity, or wordnet-based similarity,
  • and assign terms a sense and category weights (directly, or automatically based on similarity to a given set of terms).
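As a rough illustration of similarity-based weighting, the sketch below scores a term by its mean cosine similarity to a set of seed terms. It is not taken from the app's source: the `Vector` type, the `cosine` and `similarityWeight` functions, and the toy embeddings are all hypothetical, and the app may compute weights differently.

```typescript
// Hypothetical types and names for illustration; not the app's actual code.
type Vector = number[];

// Cosine similarity between two equal-length vectors; 0 if either has zero norm.
function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Weight for a term: mean cosine similarity to the seed terms found in the embeddings.
// Terms or seeds missing from the embeddings contribute nothing.
function similarityWeight(term: string, seeds: string[], embeddings: Map<string, Vector>): number {
  const target = embeddings.get(term);
  if (!target) return 0;
  const sims = seeds
    .map(seed => embeddings.get(seed))
    .filter((v): v is Vector => v !== undefined)
    .map(v => cosine(target, v));
  return sims.length ? sims.reduce((sum, s) => sum + s, 0) / sims.length : 0;
}

// Toy two-dimensional embeddings (made up for this example).
const toyEmbeddings = new Map<string, Vector>([
  ['happy', [1, 0]],
  ['glad', [1, 0]],
  ['table', [0, 1]],
]);
const gladWeight = similarityWeight('glad', ['happy'], toyEmbeddings);
const tableWeight = similarityWeight('table', ['happy'], toyEmbeddings);
```

With these toy vectors, a term pointing in the same direction as the seed gets a weight of 1, and an orthogonal term gets 0.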

As part of analysis, the tool will...

  • expand fuzzy terms using a word list extracted from embeddings,
  • suggest senses from a wordnet, ranked by similarity to other terms that share a category,
  • and calculate similarity to terms within selected categories to visualize as a graph.
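To make the idea of expanding a fuzzy term concrete, here is a minimal sketch that converts a glob term to a regular expression and filters a word list with it. The function names (`globToRegExp`, `expandTerm`) and the word list are hypothetical, and treating `*` as "any run of characters" is an assumption about the glob syntax, not something confirmed by the app.

```typescript
// Hypothetical helpers for illustration; the app's own matching rules may differ.

// Convert a glob term to an anchored RegExp, assuming `*` matches any run of characters.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters, leaving the glob wildcard `*` untouched.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Return every word in the list that the pattern matches in full.
function expandTerm(pattern: RegExp, wordList: string[]): string[] {
  return wordList.filter(word => pattern.test(word));
}

// Made-up word list standing in for one extracted from embeddings.
const words = ['happy', 'happier', 'happiness', 'unhappy', 'hat'];
const expanded = expandTerm(globToRegExp('happ*'), words);
// expanded is ['happy', 'happier', 'happiness']
```

Anchoring the pattern at both ends keeps `happ*` from matching `unhappy`; a regex term could be passed to `expandTerm` directly instead of going through `globToRegExp`.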

Export dictionaries in common formats, such as those accepted by lingmatch for processing in R, and adicat for processing in the browser.
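As an example of what an exported file could look like, the sketch below writes categories in a LIWC-style .dic layout: a `%` line, numbered category names, another `%` line, then one tab-separated line per term listing its category numbers. This is an assumption about one plausible export target, not a description of the app's exporter; `CategoryTerms` and `toDic` are hypothetical names, and this layout does not carry term weights.

```typescript
// Hypothetical shape: category name -> terms in that category.
type CategoryTerms = Record<string, string[]>;

// Serialize categories to a LIWC-style .dic string (unweighted).
function toDic(categories: CategoryTerms): string {
  const names = Object.keys(categories);
  // Collect, for each term, the 1-based ids of the categories it belongs to.
  const termIds = new Map<string, number[]>();
  names.forEach((name, index) => {
    for (const term of categories[name]) {
      const ids = termIds.get(term) ?? [];
      ids.push(index + 1);
      termIds.set(term, ids);
    }
  });
  const header = names.map((name, index) => `${index + 1}\t${name}`);
  const body = Array.from(termIds, ([term, ids]) => `${term}\t${ids.join('\t')}`);
  return ['%', ...header, '%', ...body].join('\n');
}

// Made-up dictionary with one term shared across categories.
const dicText = toDic({
  positive: ['happy', 'glad*'],
  negative: ['sad', 'glad*'],
});
```

Here `glad*` appears once in the output, followed by both category numbers, because terms are keyed by their text rather than repeated per category.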

Sources

Term associations come from the pre-trained embeddings spaces available at osf.io/489he.

Synsets are from the Open English WordNet, with some added information:

Additional associated terms are from ConceptNet.

The preprocess.R script was used to build the resources used within the app from these sources.

Some background to this tool is discussed in Introduction to Dictionary Creation.

Testing

Tests depend on a running dev server, which can be started with the test-serve command:

npm run test-serve

After the site has been compiled, the tests can be run:

npm run test

Note: visit http://localhost:3000/dictionary_builder to compile before running tests.

See current coverage.
