Apache Solr Language Identifier


This module is intended for use while indexing documents. It is implemented as an UpdateProcessor to be placed in an UpdateChain. Its purpose is to identify the language of each document and tag the document with the corresponding language code.
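The tagging step described above can be sketched in plain Java. This is a toy illustration under stated assumptions, not the module's implementation: the class LangIdSketch, its methods, the field names title, body and language_s, and the stopword lists are all hypothetical, and simple stopword counting stands in for a real detector such as the langdetect library listed among the dependencies below.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: none of these names are Solr classes.
class LangIdSketch {

    // Toy stopword lists standing in for a real language model (assumption).
    private static final Map<String, Set<String>> STOPWORDS = new LinkedHashMap<>();
    static {
        STOPWORDS.put("en", new HashSet<>(Arrays.asList("the", "and", "is", "of", "to")));
        STOPWORDS.put("de", new HashSet<>(Arrays.asList("der", "die", "und", "ist", "das")));
        STOPWORDS.put("fr", new HashSet<>(Arrays.asList("le", "la", "et", "est", "les")));
    }

    // Returns the language whose stopwords occur most often, or the fallback
    // code when no stopword matches at all.
    static String detect(String text, String fallback) {
        String[] tokens = text.toLowerCase().split("[^\\p{L}]+");
        String best = fallback;
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> lang : STOPWORDS.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (lang.getValue().contains(token)) {
                    hits++;
                }
            }
            if (hits > bestHits) { // ties keep the earlier language
                bestHits = hits;
                best = lang.getKey();
            }
        }
        return best;
    }

    // Mimics what an update processor does at index time: read the configured
    // input fields, detect the language, and write the code to langField.
    static Map<String, Object> process(Map<String, Object> doc, String[] inputFields,
                                       String langField, String fallback) {
        StringBuilder text = new StringBuilder();
        for (String field : inputFields) {
            Object value = doc.get(field);
            if (value != null) {
                text.append(value).append(' ');
            }
        }
        doc.put(langField, detect(text.toString(), fallback));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("title", "Der Hund und die Katze");
        doc.put("body", "Das ist ein Test");
        process(doc, new String[] {"title", "body"}, "language_s", "en");
        System.out.println(doc.get("language_s"));
    }
}
```

In an actual Solr setup this logic would not be hand-written. The processor is declared in an update chain in the configuration, and the input fields, the language field and the fallback code are supplied as parameters.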

Compile dependencies (124)

Group / Artifact Version Newer version
com.cybozu.labs » langdetect 1.1-20120112 None
org.locationtech.spatial4j » spatial4j 0.6 0.8
org.eclipse.jetty » jetty-http 9.3.14.v20161028 10.0.6
org.eclipse.jetty » jetty-xml 9.3.14.v20161028 9.4.44.v20210927
org.eclipse.jetty » jetty-io 9.3.14.v20161028 9.4.44.v20210927
org.eclipse.jetty » jetty-util 9.3.14.v20161028 9.4.44.v20210927
io.airlift » slice 0.10 2.0
org.tukaani » xz 1.5 1.9
org.eclipse.jetty » jetty-webapp 9.3.14.v20161028 9.4.44.v20210927
jdom » jdom 1.0 1.1
com.pff » java-libpst 0.8.1 0.9.3
com.fasterxml.jackson.core » jackson-core 2.5.4 2.17.1
org.eclipse.jetty » jetty-server 9.3.14.v20161028 9.4.44.v20210927
org.eclipse.jetty » jetty-jmx 9.3.14.v20161028 9.4.44.v20210927
com.github.ben-manes.caffeine » caffeine 1.0.1 3.1.8
org.eclipse.jetty » jetty-servlet 9.3.14.v20161028 10.0.12
org.eclipse.jetty » jetty-security 9.3.14.v20161028 9.4.44.v20210927
net.arnx » jsonic 1.2.7 1.3.10
log4j » log4j 1.2.17 None
org.eclipse.jetty » jetty-continuation 9.3.14.v20161028 9.4.44.v20210927
dom4j » dom4j 1.6.1 1.4-dev-8
org.eclipse.jetty » jetty-servlets 9.3.14.v20161028 10.0.11
commons-codec » commons-codec 1.10 1.15
commons-cli » commons-cli 1.2 1.4
com.tdunning » t-digest 3.1 3.3
org.apache.xmlbeans » xmlbeans 2.6.0 5.0.1
net.sourceforge.jmatio » jmatio 1.0 None
org.apache.zookeeper » zookeeper 3.4.6 3.6.3
org.apache.httpcomponents » httpmime 4.4.1 4.5.12
org.apache.httpcomponents » httpclient 4.4.1 4.5.11
com.facebook.presto » presto-parser 0.122 0.286
com.adobe.xmp » xmpcore 5.1.2 6.1.11
org.apache.james » apache-mime4j-dom 0.7.2 0.8.4
org.apache.james » apache-mime4j-core 0.7.2 0.8.4
commons-configuration » commons-configuration 1.6 1.10
org.apache.pdfbox » jempbox 1.8.12 1.8.16
org.apache.httpcomponents » httpcore 4.4.1 4.4.15
com.ibm.icu » icu4j 56.1 73.1
org.restlet.jee » org.restlet 2.3.0 None
org.restlet.jee » org.restlet.ext.servlet 2.3.0 None
joda-time » joda-time 2.2 2.12.7
com.carrotsearch » hppc 0.7.1 0.9.0
org.apache.pdfbox » fontbox 2.0.1 3.0.0-alpha2
org.apache.pdfbox » pdfbox 2.0.1 3.0.0-alpha2
commons-fileupload » commons-fileupload 1.3.2 1.4
org.apache.htrace » htrace-core 3.2.0-incubating 4.0.0-incubating
org.apache.pdfbox » pdfbox-tools 2.0.1 2.0.24
com.google.guava » guava 14.0.1 33.0.0-jre
org.bouncycastle » bcprov-jdk15 1.45 1.46
org.bouncycastle » bcmail-jdk15 1.45 1.46
de.l3s.boilerpipe » boilerpipe 1.1.0 None
org.slf4j » slf4j-api 1.7.7 2.0.12
io.dropwizard.metrics » metrics-jetty9 3.1.2 4.2.23
org.apache.hadoop » hadoop-common 2.7.2 3.3.1
org.apache.solr » solr-solrj 6.4.0 9.6.0
org.apache.hadoop » hadoop-auth 2.7.2 3.3.1
org.slf4j » jul-to-slf4j 1.7.7 2.0.12
com.googlecode.mp4parser » isoparser 1.1.18 1.1.22
rome » rome 1.0 None
io.dropwizard.metrics » metrics-graphite 3.1.2 4.2.23
org.apache.hadoop » hadoop-annotations 2.7.2 3.3.1
org.slf4j » jcl-over-slf4j 1.7.7 2.0.12
io.dropwizard.metrics » metrics-ganglia 3.1.2 3.2.6
org.apache.commons » commons-exec 1.3 None
org.apache.solr » solr-core 6.4.0 9.6.0
org.slf4j » slf4j-log4j12 1.7.7 2.0.12
org.apache.tika » tika-core 1.13 1.27
org.apache.tika » tika-xmp 1.13 1.27
org.apache.tika » tika-parsers 1.13 1.27
io.dropwizard.metrics » metrics-jvm 3.1.2 4.2.23
io.dropwizard.metrics » metrics-core 3.1.2 4.2.23
org.antlr » antlr4-runtime 4.5.1-1 4.13.1
org.ow2.asm » asm 5.1 9.2
org.ow2.asm » asm-commons 5.1 9.2
com.google.protobuf » protobuf-java 2.5.0 3.25.3
org.codehaus.woodstox » woodstox-core-asl 4.4.1 None
org.apache.tika » tika-java7 1.13 1.27
javax.servlet » javax.servlet-api 3.1.0 4.0.1
org.codehaus.jackson » jackson-core-asl 1.9.13 None
com.fasterxml.jackson.dataformat » jackson-dataformat-smile 2.5.4 2.17.1
org.codehaus.jackson » jackson-mapper-asl 1.9.13 None
org.apache.poi » poi 3.15-beta1 5.0.0
org.ccil.cowan.tagsoup » tagsoup 1.2.1 None
org.apache.poi » poi-ooxml-schemas 3.15-beta1 4.1.2
org.apache.poi » poi-ooxml 3.15-beta1 5.0.0
org.codehaus.woodstox » stax2-api 3.1.4 4.2.1
org.apache.poi » poi-scratchpad 3.15-beta1 5.0.0
info.ganglia.gmetric4j » gmetric4j 1.0.7 1.0.10
org.eclipse.jetty » jetty-deploy 9.3.14.v20161028 10.0.6
org.apache.curator » curator-recipes 2.8.0 5.2.0
org.apache.curator » curator-framework 2.8.0 5.2.0
org.apache.curator » curator-client 2.8.0 5.2.0
commons-io » commons-io 2.5 2.11.0
org.noggit » noggit 0.6 0.8
commons-collections » commons-collections 3.2.2 None
org.gagravarr » vorbis-java-core 0.8 None
org.apache.lucene » lucene-queryparser 6.4.0 9.9.1
org.apache.lucene » lucene-sandbox 6.4.0 9.9.1
org.gagravarr » vorbis-java-tika 0.8 None
org.apache.lucene » lucene-misc 6.4.0 9.9.1
commons-lang » commons-lang 2.6 None
xerces » xercesImpl 2.9.1 RELEASE
com.healthmarketscience.jackcess » jackcess 2.1.3 4.0.6
org.apache.lucene » lucene-backward-codecs 6.4.0 9.9.1
org.apache.lucene » lucene-queries 6.4.0 9.9.1
org.apache.lucene » lucene-spatial-extras 6.4.0 9.9.1
org.apache.commons » commons-compress 1.11 1.21
org.apache.lucene » lucene-codecs 6.4.0 9.9.1
com.googlecode.juniversalchardet » juniversalchardet 1.0.3 None
com.drewnoakes » metadata-extractor 2.8.1 2.16.0
org.apache.lucene » lucene-join 6.4.0 9.9.1
org.apache.hadoop » hadoop-hdfs 2.7.2 3.3.1
org.apache.lucene » lucene-memory 6.4.0 9.9.1
org.apache.lucene » lucene-grouping 6.4.0 9.9.1
org.apache.lucene » lucene-analyzers-common 6.4.0 8.10.1
org.apache.lucene » lucene-suggest 6.4.0 9.9.1
org.aspectj » aspectjrt 1.8.0 1.9.21.2
org.apache.lucene » lucene-expressions 6.4.0 9.9.1
org.eclipse.jetty » jetty-rewrite 9.3.14.v20161028 10.0.11
org.apache.lucene » lucene-analyzers-kuromoji 6.4.0 8.10.1
org.apache.lucene » lucene-highlighter 6.4.0 9.9.1
org.apache.lucene » lucene-analyzers-phonetic 6.4.0 8.10.1
org.apache.lucene » lucene-classification 6.4.0 9.9.1
org.apache.lucene » lucene-core 6.4.0 9.9.1

Test dependencies (4)