React app to help create and analyze text analysis dictionaries.
Import an existing dictionary (such as from osf.io/y6g5b), or start from scratch.
Imported dictionaries are saved locally with every edit, and the saved copies can be encrypted.
As part of creation, you can...
- add fixed or fuzzy (glob or regex) terms,
- add suggested terms based on word-form matching, embeddings-based similarity, or wordnet-based similarity,
- and assign terms a sense and category weights (directly, or automatically based on similarity to a given set of terms).
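To illustrate the difference between fixed and fuzzy terms, here is a minimal sketch of how a glob term could be turned into a regular expression and matched against a word list. The helper names (`globToRegExp`, `expandTerm`), the whole-word anchoring, and the case-insensitive matching are assumptions made for this example, not the app's actual implementation.

```typescript
// Hypothetical helper: convert a glob-style term (e.g. "happ*") into an
// anchored, case-insensitive regular expression. Anchoring is an assumption.
function globToRegExp(term: string): RegExp {
  // Escape regex metacharacters, leaving the glob wildcards * and ? alone.
  const escaped = term.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  // "*" matches any run of characters (including none); "?" matches exactly one.
  const pattern = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${pattern}$`, "i");
}

// Hypothetical helper: list the words in a vocabulary that a fuzzy term matches.
function expandTerm(term: string, vocabulary: string[]): string[] {
  const matcher = globToRegExp(term);
  return vocabulary.filter((word) => matcher.test(word));
}

// Toy vocabulary, made up for illustration.
const vocabulary = ["happy", "happiness", "unhappy", "hap", "sad"];
console.log(expandTerm("happ*", vocabulary)); // ["happy", "happiness"]
```

A fixed term is the special case with no wildcards: `expandTerm("sad", vocabulary)` matches only the exact word.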
As part of analysis, the tool will...
- expand fuzzy terms using a word list extracted from embeddings,
- suggest senses from a wordnet, ranked by similarity to other terms that share a category,
- and calculate similarity to terms within select categories to visualize as a graph.
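The similarity-based steps above can be sketched with cosine similarity between embedding vectors. This is only an illustration: the `Embeddings` type, the function names, the use of cosine similarity averaged over category terms, and the toy vectors are all assumptions, and the app may compute and aggregate similarity differently.

```typescript
// Hypothetical representation of an embedding space: term -> vector.
type Embeddings = Record<string, number[]>;

// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank candidate terms by their mean similarity to the terms in a category.
// Terms missing from the space are skipped.
function rankBySimilarity(
  candidates: string[],
  categoryTerms: string[],
  space: Embeddings
): { term: string; similarity: number }[] {
  const known = categoryTerms.filter((t) => t in space);
  return candidates
    .filter((c) => c in space)
    .map((term) => {
      const sims = known.map((t) => cosineSimilarity(space[term], space[t]));
      const mean = sims.length ? sims.reduce((s, v) => s + v, 0) / sims.length : 0;
      return { term, similarity: mean };
    })
    .sort((x, y) => y.similarity - x.similarity);
}

// Toy two-dimensional vectors, made up for illustration.
const toySpace: Embeddings = {
  happy: [1, 0],
  glad: [0.9, 0.1],
  joyful: [0.8, 0.2],
  sad: [-1, 0],
};
const ranked = rankBySimilarity(["sad", "joyful"], ["happy", "glad"], toySpace);
console.log(ranked.map((r) => r.term)); // ["joyful", "sad"]
```

The same pairwise similarities could serve as edge weights when drawing terms as a graph.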
Export dictionaries in common formats, such as those accepted by lingmatch for processing in R, and adicat for processing in the browser.
Term associations come from the pre-trained embedding spaces available at osf.io/489he.
Synsets are from the Open English WordNet, with some added information:
- clusters from a Coarse Sense Inventory
- frequencies from an evaluation framework, which come from SemCor and OMSTI
- BabelNet IDs from another evaluation framework, as mapped from SemCor labels
Additional associated terms are from ConceptNet.
The preprocess.R script was used to build the resources used within the app from these sources.
Some background to this tool is discussed in Introduction to Dictionary Creation.
Tests depend on a running dev server, which can be started with the test-serve
command:
npm run test-serve
After the site has been compiled, the tests can be run:
npm run test
Note: visit http://localhost:3000/dictionary_builder
first so that the site compiles before the tests are run.
See current coverage.