Now that the first Creative Commons complete arXiv dataset has been published (ipfs-inactive/archives#2), it's time to build some cool apps on top of it!
Some ideas (please add more in the comments):
- search engine for LaTeX equations, like http://latexsearch.com/
- publishing papers in alternative formats (HTML, etc.)
- building a citation graph
- training a topic model (latent Dirichlet allocation / hierarchical Dirichlet processes)
- automatically extracting definitions to build a dictionary of terminology
- semantic markup of math (connecting variables in equations to their textual descriptions, disambiguating notation, etc.)
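
To make the equation search idea a bit more concrete, here is a rough sketch of one possible starting point: pull display equations out of raw LaTeX source with a regex and build a small inverted index over their tokens. Everything here is an assumption for illustration (the function names `extract_equations`, `tokenize`, `build_index`, and `search`, the paper IDs, and the idea that each paper is available as a single LaTeX string); it says nothing about how the dataset is actually laid out. A regex will also miss nested or custom math environments, so a real tool would probably want a proper LaTeX parser.

```python
import re
from collections import defaultdict

# Display math in a few common forms: $$...$$, \[...\], and
# equation/align environments. Inline $...$ is deliberately skipped.
EQUATION_RE = re.compile(
    r"\$\$(.+?)\$\$"
    r"|\\\[(.+?)\\\]"
    r"|\\begin\{(equation|align)\*?\}(.+?)\\end\{\3\*?\}",
    re.DOTALL,
)

# A token is either a LaTeX command (\frac, \alpha, ...) or a single
# letter or digit. Braces, operators, and whitespace are ignored.
TOKEN_RE = re.compile(r"\\[A-Za-z]+|[A-Za-z0-9]")


def extract_equations(tex):
    """Return the bodies of all display equations found in a LaTeX string."""
    equations = []
    for match in EQUATION_RE.finditer(tex):
        body = match.group(1) or match.group(2) or match.group(4)
        if body and body.strip():
            equations.append(body.strip())
    return equations


def tokenize(equation):
    """Split an equation body into commands and single-character symbols."""
    return TOKEN_RE.findall(equation)


def build_index(papers):
    """Map each token to the set of (paper_id, equation_number) it occurs in.

    `papers` is a hypothetical dict of paper ID -> LaTeX source string.
    """
    index = defaultdict(set)
    for paper_id, tex in papers.items():
        for number, equation in enumerate(extract_equations(tex)):
            for token in tokenize(equation):
                index[token].add((paper_id, number))
    return index


def search(index, query):
    """Return the equations that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return hits
```

Usage would look something like `search(build_index(papers), r"\frac{a}{b}")`, which returns the `(paper_id, equation_number)` pairs whose equations contain `\frac`, `a`, and `b`. This is bag-of-tokens matching only, so it ignores structure; ranking and structural matching are left open.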