raw embedding file

  • GoogleNews-vectors-negative300.bin is downloaded from Google

processed file

  • *.dat is an array stored in a binary file on disk, accessed through the numpy.memmap API
  • .vocab is the vocabulary that goes with the .dat file
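
A minimal sketch of how a .dat/.vocab pair like this can be written and read back with numpy.memmap. The file names, the one-word-per-line vocab layout, the float32 dtype, and the toy vectors are all assumptions for illustration; the real files may be laid out differently.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for the real vocabulary and its 300-d vectors.
words = ["the", "cat", "sat"]
vectors = np.random.rand(len(words), 300).astype(np.float32)

workdir = tempfile.mkdtemp()
dat_path = os.path.join(workdir, "embedding.dat")      # assumed file name
vocab_path = os.path.join(workdir, "embedding.vocab")  # assumed file name

# Write: the .dat file holds only raw float32 values, no header.
mm = np.memmap(dat_path, dtype=np.float32, mode="w+", shape=vectors.shape)
mm[:] = vectors
mm.flush()
del mm

# Write: one word per line, in the same order as the rows of the array.
with open(vocab_path, "w", encoding="utf-8") as f:
    f.write("\n".join(words))

# Read: the shape is not stored in the .dat file, so it is rebuilt from
# the vocabulary size and the (assumed) embedding dimension.
with open(vocab_path, encoding="utf-8") as f:
    vocab = f.read().split("\n")
word_to_row = {w: i for i, w in enumerate(vocab)}

emb = np.memmap(dat_path, dtype=np.float32, mode="r", shape=(len(vocab), 300))
cat_vec = np.array(emb[word_to_row["cat"]])  # copies a single row into memory
```

Because the array is memory-mapped, only the rows that are actually indexed get read from disk, which is the main reason to convert the large .bin file into this form.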

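SNLI models

Get to know the state-of-the-art models.

The leaderboard

A lot of models have been benchmarked on this dataset; maybe it is getting hard to push further? Even so, 3 more models were added in 2018.
The later entries are all ensemble models.

Still, this is a very good dataset for validating a model, and following it is a way to keep up with state-of-the-art models.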

NLI using Bi-LSTM and Inner-Attention | 2016 Yang Liu

Directional self-attention network encoders (code)

NN - NLI - Yang Liu 2016.jpg

code

Blogs to read

tfidf

>>> from sklearn.feature_extraction.text import TfidfTransformer
>>> transformer = TfidfTransformer(smooth_idf=False)
>>> transformer
TfidfTransformer(norm=...'l2', smooth_idf=False, sublinear_tf=False,
use_idf=True)
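
With smooth_idf=False, the transformer weights each term count by idf(t) = ln(n / df(t)) + 1, and with norm='l2' it then scales every row to unit length. Below is a minimal NumPy sketch of that computation, not scikit-learn's own code. The function name tfidf_l2 and the small count matrix are illustrative, and the sketch assumes every term occurs in at least one document.

```python
import numpy as np


def tfidf_l2(counts):
    """Hypothetical helper: tf-idf with idf = ln(n / df) + 1 and L2 row norm."""
    counts = np.asarray(counts, dtype=np.float64)
    n_docs = counts.shape[0]
    # Document frequency: number of documents in which each term occurs.
    df = (counts > 0).sum(axis=0)
    # Unsmoothed idf; assumes df > 0 for every term.
    idf = np.log(n_docs / df) + 1.0
    weighted = counts * idf
    # L2-normalise each row, leaving all-zero rows untouched.
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return weighted / norms


# Rows are documents, columns are term counts.
counts = [[3, 0, 1],
          [2, 0, 0],
          [3, 0, 0],
          [4, 0, 0],
          [3, 2, 0],
          [3, 0, 2]]
tfidf = tfidf_l2(counts)
```

The first term occurs in every document, so its idf is ln(6/6) + 1 = 1; the rarer terms get a larger weight, which is why the third column ends up contributing noticeably to the first row despite its low count.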

web camera using Java

I have tried a number of ways to do this over a long time.

JMF - now dead
FMJ - now dead too
VLCJ - overkill, because I am not creating a music/video player, and it expects VLC to be installed
Xuggler - overkill and hard to work with
JMyron - didn’t work
JavaFX - I thought it could do it, but it seems like it can’t

JavaCV http://docs.opencv.org/2.4/doc/tutorials/introduction/desktop_java/java_dev_intro.html

https://github.com/sarxos/webcam-capture


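document layout OCR

Global layout –

Can Tesseract OCR output style information?
Parent-child (hierarchical) and sibling (parallel) relationships.

Structural analysis of legal clauses.

OCR structure – content recognition: the two reinforce each other. Multi-task learning.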

collection –