tmbdev-tutorials/icdar2019-readings

Readings for the ICDAR2019 Deep Learning Tutorial

Original Convolutional Networks

  • 1995-lecun-convolutional
    • convolutional networks, sigmoid, average pooling
    • precursor of R-CNN for multi-object recognition
    • digits and handwriting
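
The "sigmoid, average pooling" combination noted above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes non-overlapping 2x2 windows and a plain logistic sigmoid, and the function names (`average_pool_2x2`, `subsample`) are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, the squashing nonlinearity used in early convnets."""
    return 1.0 / (1.0 + math.exp(-x))


def average_pool_2x2(feature_map):
    """Non-overlapping 2x2 average pooling over a list-of-lists feature map.

    A trailing odd row or column is dropped (an assumption of this sketch).
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        out_row = []
        for c in range(0, cols - 1, 2):
            total = (feature_map[r][c] + feature_map[r][c + 1]
                     + feature_map[r + 1][c] + feature_map[r + 1][c + 1])
            out_row.append(total / 4.0)
        pooled.append(out_row)
    return pooled


def subsample(feature_map):
    """Average-pool, then squash each pooled value with the sigmoid."""
    return [[sigmoid(v) for v in row] for row in average_pool_2x2(feature_map)]
```

Halving the resolution this way reduces sensitivity to small shifts of the input, which is the usual motivation for pooling between convolutional layers.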

Convolutional Networks on GPUs

OCR:

Segmentation, Superresolution with Convolutional Networks

OCR:

RCNN and Overfeat

  • 2014-lecun-overfeat
    • convolutional network, generic feature extraction
    • sliding window at multiple scales across image
    • regression network
  • 2015-liu-multibox
    • input image and ground truth boxes
  • 2015-ren-faster-rcnn-v3
    • region proposal network (object/not-object score and box coordinates at each location)
    • translation invariant anchors
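
The "translation invariant anchors" idea can be illustrated with a short sketch: the same set of reference boxes is placed at every feature-map location, so the set at one location is a shifted copy of the set at any other. The function name `make_anchors` is hypothetical, and the defaults (stride 16, three scales, three aspect ratios) are assumptions of this sketch rather than values taken from these notes.

```python
import math


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchor boxes in image coordinates.

    The same len(scales) * len(ratios) boxes are centred on every
    feature-map cell, which is what makes the scheme translation invariant.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Centre of this feature-map cell in input-image pixels.
            cx = (x + 0.5) * stride
            cy = (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    # r is height/width; the box area stays close to s * s.
                    w = s / math.sqrt(r)
                    h = s * math.sqrt(r)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors
```

A region proposal network would then predict, for each of these anchors, an object/not-object score and offsets that refine the box; that prediction step is not shown here.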

OCR:

  • 2014-jaderberg-convnet-ocr-wild
    • convnet, R-CNN, bounding box regression
    • synthetic, ICDAR scene text, IIIT Scene Text, IIIT 5k-word, IIIT Sports-10k, BBC News
    • no bounding boxes in general; initial detector trained on positive word samples, negative images
    • 10k proposals per image

Saliency, Attention, Visualization

LSTM, CTC, GRU

OCR:

2D LSTM

OCR:

Seq2Seq, Attention

OCR:

  • 2015-sahu-s2s-ocr
    • standard seq2seq encoder/decoder approach
    • t-SNE visualizations of encoded word images
    • word images from scanned books

Visual Attention

OCR:
