Stardog Academy Training FAQ: Working with Text

I am not seeing the expected results from full-text search. How can I investigate and fix the issue?

This behavior may relate to the Lucene analyzer, which is applied both when the text is indexed and when your search request is processed. Try exploring alternative analyzers, for example with Luke, the Lucene tool that is part of the standard Lucene package ($LUCENE/luke/luke.sh).
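To see why the choice of analyzer can change search results, the following minimal sketch compares three toy analyzers on the same text. The function names and the sample text are made up for illustration, and the analyzers are simplified stand-ins rather than Lucene's own implementations.

```python
import re

# Toy stand-ins for analyzers; these are NOT Lucene's implementations.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in"}

def whitespace_analyzer(text):
    """Split on whitespace only; case and punctuation are preserved."""
    return text.split()

def simple_analyzer(text):
    """Lowercase, then split on any character that is not a-z or 0-9."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def stop_analyzer(text):
    """Like simple_analyzer, but also drop common stopwords."""
    return [t for t in simple_analyzer(text) if t not in STOPWORDS]

def matches(analyzer, document, query):
    """True if every analyzed query term occurs among the analyzed document terms."""
    indexed = set(analyzer(document))
    return all(term in indexed for term in analyzer(query))

doc = "The State-of-the-Art Graph Database"
for analyzer in (whitespace_analyzer, simple_analyzer, stop_analyzer):
    print(analyzer.__name__, analyzer(doc), matches(analyzer, doc, "graph"))
```

In this sketch the query "graph" misses under the whitespace-only analyzer, because the indexed term keeps its capital letter, but matches under the two lowercasing analyzers. Inspecting the terms an analyzer actually produces, as Luke lets you do for a real index, is usually the quickest way to explain a missing result.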

We've observed considerable latency when starting the server, and building the text index seems to take time. Our queries use only a small subset of the text properties in the graph. What are the options for restricting indexing to the parts we need?

Currently, indexing is driven by the data types of the literal objects. An upcoming Stardog release will also let you select which properties to index.
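As a rough illustration of what datatype-driven selection implies, the sketch below filters a handful of made-up triples by the datatype of their literal. The sample data, the set of indexed datatypes, and the helper name are all hypothetical. This is not Stardog's indexing code, only a model of the selection rule described above.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"

# Hypothetical set of datatypes selected for text indexing.
INDEXED_DATATYPES = {XSD + "string", RDF_LANGSTRING}

# (subject, predicate, literal value, literal datatype); made-up sample data.
triples = [
    ("ex:doc1", "ex:title", "Graph databases", XSD + "string"),
    ("ex:doc1", "ex:abstract", "An overview of graph storage", XSD + "string"),
    ("ex:doc1", "ex:pages", "12", XSD + "integer"),
    ("ex:doc1", "ex:label", "Graphdatenbanken", RDF_LANGSTRING),
]

def literals_to_index(triples, datatypes):
    """Keep every triple whose literal datatype is selected, whatever its property."""
    return [t for t in triples if t[3] in datatypes]

selected = literals_to_index(triples, INDEXED_DATATYPES)
print([predicate for _, predicate, _, _ in selected])
```

Here ex:title, ex:abstract, and ex:label are all selected even if queries only ever search titles, because the rule looks at the datatype and not at the property. That is why narrowing the datatypes can shrink the index only so far, and why per-property selection addresses the situation in the question more directly.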

Stardog claims to integrate unstructured content by processing documents. What functionality can we expect?

By default, Stardog supports extracting the plain document content as well as identifying and linking named entities based on the supplied language models.
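To make "identifying and linking" concrete, here is a minimal sketch of the kind of output such a step produces: mentions found in the text, each tied to an entity IRI. It uses a plain dictionary lookup instead of a language model, and the IRIs, the lookup table, and the function name are hypothetical, so treat it as an illustration of the result shape and not of how Stardog performs the task.

```python
import re

# Hypothetical lookup from lowercase surface forms to entity IRIs in the graph.
KNOWN_ENTITIES = {
    "stardog": "ex:Stardog",
    "lucene": "ex:ApacheLucene",
}

def link_entities(text, known=KNOWN_ENTITIES):
    """Return (mention, start offset, IRI) for each known surface form found in text."""
    links = []
    for match in re.finditer(r"\w+", text):
        iri = known.get(match.group().lower())
        if iri is not None:
            links.append((match.group(), match.start(), iri))
    return links

text = "Stardog builds its text index with Lucene."
mentions = link_entities(text)
for mention, offset, iri in mentions:
    print(mention, offset, iri)
```

A model-based extractor replaces the dictionary lookup with statistical recognition, but the result is similar in spirit: spans of document text connected to nodes in the graph.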

We have developed our own language models for spaCy. Can these be reused within Stardog?

Yes. Stardog is happy to provide examples of integrating spaCy and other text-processing frameworks via a custom extractor.