Apache Solr Language Identifier


This module is intended for use while indexing documents. It is implemented as an UpdateProcessor to be placed in an UpdateChain. Its purpose is to identify the language of each document and tag the document with a language code.
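
The flow described above (a processor in a chain inspects a document's text and adds a language code before the document is indexed) can be sketched as follows. This is a toy illustration only and not Solr's API: the names `guess_language`, `make_language_tagger`, and `run_chain`, the `language` field, and the stopword-counting heuristic are all assumptions made for the example. Real detectors rely on statistical models rather than word lists.

```python
from typing import Callable, Dict, List

Document = Dict[str, str]
Processor = Callable[[Document], Document]

# Tiny stopword lists, for illustration only (hypothetical data, not a real model).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to"},
    "de": {"der", "und", "ist", "das", "nicht"},
    "fr": {"le", "et", "est", "les", "des"},
}


def guess_language(text: str, fallback: str = "unknown") -> str:
    """Return the code whose stopwords occur most often, or the fallback."""
    words = text.lower().split()
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else fallback


def make_language_tagger(source_fields: List[str], lang_field: str) -> Processor:
    """Build a processor that writes the guessed language code into lang_field."""
    def process(doc: Document) -> Document:
        text = " ".join(doc.get(field, "") for field in source_fields)
        tagged = dict(doc)  # copy, so the incoming document is left untouched
        tagged[lang_field] = guess_language(text)
        return tagged
    return process


def run_chain(doc: Document, chain: List[Processor]) -> Document:
    """Pass the document through each processor in order, as a chain would."""
    for processor in chain:
        doc = processor(doc)
    return doc


chain = [make_language_tagger(["title", "body"], "language")]
doc = {"id": "1", "title": "Der Hund", "body": "Das ist nicht der Weg und das Ziel"}
print(run_chain(doc, chain)["language"])
```

In this sketch the tagger is an ordinary function in a list, so further processing steps could be appended to `chain` without changing the detection code.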

Compile dependencies (118)

Group / Artifact Version Newer version
org.apache.pdfbox » jempbox 1.8.12 1.8.16
net.arnx » jsonic 1.2.7 1.3.10
org.apache.xmlbeans » xmlbeans 2.6.0 5.0.1
com.github.ben-manes.caffeine » caffeine 1.0.1 3.1.8
com.fasterxml.jackson.core » jackson-core 2.5.4 2.17.1
dom4j » dom4j 1.6.1 1.4-dev-8
com.pff » java-libpst 0.8.1 0.9.3
com.fasterxml.jackson.core » jackson-annotations 2.5.4 2.17.1
org.tukaani » xz 1.5 1.9
io.airlift » slice 0.10 2.0
org.eclipse.jetty » jetty-deploy 9.3.8.v20160314 10.0.6
org.apache.solr » solr-solrj 6.2.0 9.6.0
org.apache.solr » solr-core 6.2.0 9.6.0
org.apache.poi » poi-ooxml-schemas 3.15-beta1 4.1.2
org.apache.poi » poi-scratchpad 3.15-beta1 5.0.0
com.tdunning » t-digest 3.1 3.3
commons-cli » commons-cli 1.2 1.4
org.eclipse.jetty » jetty-rewrite 9.3.8.v20160314 10.0.11
commons-codec » commons-codec 1.10 1.15
log4j » log4j 1.2.17 None
org.apache.lucene » lucene-join 6.2.0 9.9.1
com.healthmarketscience.jackcess » jackcess 2.1.3 4.0.6
org.eclipse.jetty » jetty-servlets 9.3.8.v20160314 10.0.11
org.apache.lucene » lucene-misc 6.2.0 9.9.1
org.apache.lucene » lucene-core 6.2.0 9.9.1
org.apache.lucene » lucene-memory 6.2.0 9.9.1
org.ow2.asm » asm-commons 5.1 9.2
org.eclipse.jetty » jetty-continuation 9.3.8.v20160314 9.4.44.v20210927
org.apache.tika » tika-xmp 1.13 1.27
org.apache.lucene » lucene-grouping 6.2.0 9.9.1
org.ow2.asm » asm 5.1 9.2
org.apache.lucene » lucene-analyzers-phonetic 6.2.0 8.10.1
commons-fileupload » commons-fileupload 1.3.1 1.4
org.apache.lucene » lucene-classification 6.2.0 9.9.1
org.apache.poi » poi 3.15-beta1 5.0.0
org.apache.curator » curator-framework 2.8.0 5.2.0
org.apache.curator » curator-recipes 2.8.0 5.2.0
org.apache.poi » poi-ooxml 3.15-beta1 5.0.0
org.apache.lucene » lucene-queries 6.2.0 9.9.1
org.apache.curator » curator-client 2.8.0 5.2.0
org.apache.lucene » lucene-codecs 6.2.0 9.9.1
org.apache.lucene » lucene-analyzers-common 6.2.0 8.10.1
org.noggit » noggit 0.6 0.8
commons-collections » commons-collections 3.2.2 None
org.apache.lucene » lucene-suggest 6.2.0 9.9.1
org.apache.lucene » lucene-sandbox 6.2.0 9.9.1
org.apache.lucene » lucene-analyzers-kuromoji 6.2.0 8.10.1
org.apache.tika » tika-java7 1.13 1.27
org.apache.lucene » lucene-highlighter 6.2.0 9.9.1
org.apache.lucene » lucene-backward-codecs 6.2.0 9.9.1
org.slf4j » slf4j-api 1.7.7 2.0.12
org.apache.commons » commons-exec 1.3 None
org.slf4j » slf4j-log4j12 1.7.7 2.0.12
xerces » xercesImpl 2.9.1 RELEASE
commons-lang » commons-lang 2.6 None
org.eclipse.jetty » jetty-servlet 9.3.8.v20160314 10.0.12
org.slf4j » jcl-over-slf4j 1.7.7 2.0.12
org.eclipse.jetty » jetty-security 9.3.8.v20160314 9.4.44.v20210927
org.apache.hadoop » hadoop-annotations 2.7.2 3.3.1
com.googlecode.mp4parser » isoparser 1.1.18 1.1.22
org.eclipse.jetty » jetty-server 9.3.8.v20160314 9.4.44.v20210927
org.slf4j » jul-to-slf4j 1.7.7 2.0.12
org.eclipse.jetty » jetty-jmx 9.3.8.v20160314 9.4.44.v20210927
org.apache.hadoop » hadoop-common 2.7.2 3.3.1
org.apache.hadoop » hadoop-auth 2.7.2 3.3.1
org.eclipse.jetty » jetty-webapp 9.3.8.v20160314 9.4.44.v20210927
org.apache.lucene » lucene-expressions 6.2.0 9.9.1
org.apache.lucene » lucene-queryparser 6.2.0 9.9.1
org.aspectj » aspectjrt 1.8.0 1.9.21.2
org.apache.tika » tika-core 1.13 1.27
org.apache.lucene » lucene-spatial-extras 6.2.0 9.9.1
org.apache.tika » tika-parsers 1.13 1.27
org.apache.hadoop » hadoop-hdfs 2.7.2 3.3.1
com.drewnoakes » metadata-extractor 2.8.1 2.16.0
com.googlecode.juniversalchardet » juniversalchardet 1.0.3 None
com.google.protobuf » protobuf-java 2.5.0 3.25.3
org.codehaus.woodstox » woodstox-core-asl 4.4.1 None
org.antlr » antlr4-runtime 4.5.1-1 4.13.1
org.gagravarr » vorbis-java-core 0.8 None
org.gagravarr » vorbis-java-tika 0.8 None
org.ccil.cowan.tagsoup » tagsoup 1.2.1 None
org.eclipse.jetty » jetty-xml 9.3.8.v20160314 9.4.44.v20210927
org.eclipse.jetty » jetty-io 9.3.8.v20160314 9.4.44.v20210927
org.eclipse.jetty » jetty-http 9.3.8.v20160314 10.0.6
org.eclipse.jetty » jetty-util 9.3.8.v20160314 9.4.44.v20210927
commons-configuration » commons-configuration 1.6 1.10
net.sourceforge.jmatio » jmatio 1.0 None
org.apache.commons » commons-compress 1.11 1.21
org.bouncycastle » bcprov-jdk15 1.45 1.46
de.l3s.boilerpipe » boilerpipe 1.1.0 None
org.bouncycastle » bcmail-jdk15 1.45 1.46
org.apache.zookeeper » zookeeper 3.4.6 3.6.3
org.codehaus.woodstox » stax2-api 3.1.4 4.2.1
commons-io » commons-io 2.5 2.11.0
rome » rome 1.0 None
javax.servlet » javax.servlet-api 3.1.0 4.0.1
com.fasterxml.jackson.dataformat » jackson-dataformat-smile 2.5.4 2.17.1
com.google.guava » guava 14.0.1 33.0.0-jre
org.apache.httpcomponents » httpcore 4.4.1 4.4.15
com.facebook.presto » presto-parser 0.122 0.286
org.apache.httpcomponents » httpmime 4.4.1 4.5.12
joda-time » joda-time 2.2 2.12.7
org.apache.httpcomponents » httpclient 4.4.1 4.5.11
jdom » jdom 1.0 1.1
com.cybozu.labs » langdetect 1.1-20120112 None
org.apache.pdfbox » pdfbox-tools 2.0.1 2.0.24
org.apache.pdfbox » fontbox 2.0.1 3.0.0-alpha2
org.apache.pdfbox » pdfbox 2.0.1 3.0.0-alpha2
com.adobe.xmp » xmpcore 5.1.2 6.1.11
org.apache.james » apache-mime4j-core 0.7.2 0.8.4
org.apache.james » apache-mime4j-dom 0.7.2 0.8.4
com.fasterxml.jackson.core » jackson-databind 2.5.4 2.17.1
com.ibm.icu » icu4j 56.1 73.1
com.carrotsearch » hppc 0.7.1 0.9.0
org.apache.htrace » htrace-core 3.2.0-incubating 4.0.0-incubating
org.restlet.jee » org.restlet.ext.servlet 2.3.0 None
org.restlet.jee » org.restlet 2.3.0 None
org.locationtech.spatial4j » spatial4j 0.6 0.8

Test dependencies (4)