Last updated 3rd June 2021
Objective
Apache Solr is a scalable and fault-tolerant search index.
Solr search is provided with generic schemas, and custom schemas are also supported. See the Solr documentation for more information.
Supported versions
| Grid |
| ---- |
| 7.7 |
| 8.0 |
| 8.4 |
| 8.6 |
Deprecated versions
The following versions are available but are not receiving security updates from upstream, so their use is not recommended. They will be removed at some point in the future.
| Grid |
| ---- |
| 3.6 |
| 4.10 |
| 6.3 |
| 6.6 |
| 7.6 |
Relationship
The format exposed in the `$PLATFORM_RELATIONSHIPS` environment variable:
```json
{
    "service": "solr86",
    "ip": "169.254.251.226",
    "hostname": "csjsvtdhmjrdre2uaoeim22xjy.solr86.service._.eu-3.platformsh.site",
    "cluster": "rjify4yjcwxaa-master-7rqtwti",
    "host": "solr.internal",
    "rel": "solr",
    "path": "solr\/maincore",
    "scheme": "solr",
    "type": "solr:8.6",
    "port": 8080
}
```
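If you prefer not to use one of the config reader libraries shown below, here is a minimal sketch of reading these credentials directly in Python, assuming a relationship named `solrsearch` as in the usage example that follows:

```python
import base64
import json
import os

# $PLATFORM_RELATIONSHIPS holds a base64-encoded JSON document describing
# every relationship defined in .platform.app.yaml.
relationships = json.loads(base64.b64decode(os.environ["PLATFORM_RELATIONSHIPS"]))

# "solrsearch" is the relationship name used in the usage example below.
solr = relationships["solrsearch"][0]
base_url = "http://{host}:{port}/{path}".format(**solr)
print(base_url)
```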
Usage example
In your `.platform/services.yaml`:

```yaml
searchsolr:
    type: solr:8.6
    disk: 256
```
In your `.platform.app.yaml`:

```yaml
relationships:
    solrsearch: "searchsolr:solr"
```
You will need to use the `solr` type when defining the service:

```yaml
# .platform/services.yaml
service_name:
    type: solr:version
    disk: 256
```

and the endpoint `solr` when defining the relationship:

```yaml
# .platform.app.yaml
relationships:
    relationship_name: "service_name:solr"
```

Your `service_name` and `relationship_name` are defined by you, but we recommend making them distinct from each other.
You can then use the service in a configuration file of your application with something like:
```go
package examples

import (
	"fmt"

	psh "github.com/platformsh/config-reader-go/v2"
	gosolr "github.com/platformsh/config-reader-go/v2/gosolr"
	solr "github.com/rtt/Go-Solr"
)

func UsageExampleSolr() string {
	// Create a NewRuntimeConfig object to ease reading the Web PaaS environment variables.
	// You can alternatively use os.Getenv() yourself.
	config, err := psh.NewRuntimeConfig()
	checkErr(err)

	// Get the credentials to connect to the Solr service.
	credentials, err := config.Credentials("solr")
	checkErr(err)

	// Retrieve Solr formatted credentials.
	formatted, err := gosolr.FormattedCredentials(credentials)
	checkErr(err)

	// Connect to Solr using the formatted credentials.
	connection := &solr.Connection{URL: formatted}

	// Add a document and commit the operation.
	docAdd := map[string]interface{}{
		"add": []interface{}{
			map[string]interface{}{"id": 123, "name": "Valentina Tereshkova"},
		},
	}

	respAdd, err := connection.Update(docAdd, true)
	checkErr(err)

	// Select the document.
	q := &solr.Query{
		Params: solr.URLParamMap{
			"q": []string{"id:123"},
		},
	}

	resSelect, err := connection.CustomSelect(q, "query")
	checkErr(err)

	// Delete the document and commit the operation.
	docDelete := map[string]interface{}{
		"delete": map[string]interface{}{
			"id": 123,
		},
	}

	resDel, err := connection.Update(docDelete, true)
	checkErr(err)

	message := fmt.Sprintf(`Adding one document - %s<br>
Selecting document (1 expected): %d<br>
Deleting document - %s<br>
`, respAdd, resSelect.Results.NumFound, resDel)

	return message
}
```
```java
package sh.platform.languages.sample;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;
import sh.platform.config.Config;
import sh.platform.config.Solr;

import java.io.IOException;
import java.util.function.Supplier;

public class SolrSample implements Supplier<String> {

    @Override
    public String get() {
        StringBuilder logger = new StringBuilder();

        // Create a new config object to ease reading the Web PaaS environment variables.
        // You can alternatively use getenv() yourself.
        Config config = new Config();

        Solr solr = config.getCredential("solr", Solr::new);
        try {
            final HttpSolrClient solrClient = solr.get();
            solrClient.setParser(new XMLResponseParser());

            // Add a document.
            SolrInputDocument document = new SolrInputDocument();
            final String id = "123456";
            document.addField("id", id);
            document.addField("name", "Ada Lovelace");
            document.addField("city", "London");
            solrClient.add(document);
            final UpdateResponse response = solrClient.commit();
            logger.append("Adding one document. Status (0 is success): ")
                    .append(response.getStatus()).append('\n');

            // Select one document.
            SolrQuery query = new SolrQuery();
            query.set("q", "city:London");
            QueryResponse queryResponse = solrClient.query(query);
            SolrDocumentList results = queryResponse.getResults();
            logger.append(String.format("Selecting documents (1 expected): %d \n", results.getNumFound()));

            // Delete one document.
            solrClient.deleteById(id);
            logger.append(String.format("Deleting one document. Status (0 is success): %s \n",
                    solrClient.commit().getStatus()));
        } catch (SolrServerException | IOException exp) {
            throw new RuntimeException("An error occurred while executing the Solr operations", exp);
        }
        return logger.toString();
    }
}
```
```php
<?php

declare(strict_types=1);

use Platformsh\ConfigReader\Config;
use Solarium\Client;

// Create a new config object to ease reading the Web PaaS environment variables.
// You can alternatively use getenv() yourself.
$config = new Config();

// Get the credentials to connect to the Solr service.
$credentials = $config->credentials('solr');

try {
    $config = [
        'endpoint' => [
            'localhost' => [
                'host' => $credentials['host'],
                'port' => $credentials['port'],
                'path' => "/" . $credentials['path'],
            ]
        ]
    ];

    $client = new Client($config);

    // Add a document.
    $update = $client->createUpdate();

    $doc1 = $update->createDocument();
    $doc1->id = 123;
    $doc1->name = 'Valentina Tereshkova';

    $update->addDocuments(array($doc1));
    $update->addCommit();

    $result = $client->update($update);
    print "Adding one document. Status (0 is success): " . $result->getStatus() . "<br />\n";

    // Select one document.
    $query = $client->createQuery($client::QUERY_SELECT);
    $resultset = $client->execute($query);
    print "Selecting documents (1 expected): " . $resultset->getNumFound() . "<br />\n";

    // Delete one document.
    $update = $client->createUpdate();
    $update->addDeleteById(123);
    $update->addCommit();
    $result = $client->update($update);
    print "Deleting one document. Status (0 is success): " . $result->getStatus() . "<br />\n";
} catch (Exception $e) {
    print $e->getMessage();
}
```
```python
import pysolr
from xml.etree import ElementTree as et
import json

from platformshconfig import Config


def usage_example():
    # Create a new Config object to ease reading the Web PaaS environment variables.
    # You can alternatively use os.environ yourself.
    config = Config()

    try:
        # Get the pysolr-formatted connection string.
        formatted_url = config.formatted_credentials('solr', 'pysolr')

        # Create a new Solr client using config variables.
        client = pysolr.Solr(formatted_url)

        # Add a document.
        message = ''
        doc_1 = {
            "id": 123,
            "name": "Valentina Tereshkova"
        }
        result0 = client.add([doc_1], commit=True)
        client.commit()
        message += 'Adding one document. Status (0 is success): {} <br />'.format(
            json.loads(result0)['responseHeader']['status'])

        # Select one document.
        query = client.search('*:*')
        message += '\nSelecting documents (1 expected): {} <br />'.format(str(query.hits))

        # Delete one document.
        result1 = client.delete(doc_1['id'])
        client.commit()
        message += '\nDeleting one document. Status (0 is success): {}'.format(
            et.fromstring(result1)[0][0].text)

        return message

    except Exception as e:
        return e
```
Solr 4
For Solr 4, Web PaaS supports only a single core per server, called `collection1`.

You must provide your own Solr configuration via a `core_config` key in your `.platform/services.yaml`:
```yaml
searchsolr:
    type: solr:4.10
    disk: 1024
    configuration:
        core_config: !archive "<directory>"
```
The `directory` parameter points to a directory in the Git repository, in or below the `.platform/` folder. This directory needs to contain everything that Solr needs to start a core; at a minimum, that means `solrconfig.xml` and `schema.xml`. For example, place them in `.platform/solr/conf/` such that the `schema.xml` file is located at `.platform/solr/conf/schema.xml`. You can then reference that path like this:
```yaml
searchsolr:
    type: solr:4.10
    disk: 1024
    configuration:
        core_config: !archive "solr/conf/"
```
Solr 6 and later
For Solr 6 and later, Web PaaS supports multiple cores via different endpoints. Cores and endpoints are defined separately, with endpoints referencing cores. Each core may have its own configuration or share a configuration. This is best illustrated with an example:
```yaml
searchsolr:
    type: solr:8.4
    disk: 1024
    configuration:
        cores:
            mainindex:
                conf_dir: !archive "core1-conf"
            extraindex:
                conf_dir: !archive "core2-conf"
        endpoints:
            main:
                core: mainindex
            extra:
                core: extraindex
```
The above definition defines a single Solr 8.4 server. That server has two cores defined: `mainindex`, whose configuration is in the `.platform/core1-conf` directory, and `extraindex`, whose configuration is in the `.platform/core2-conf` directory.
It then defines two endpoints: `main` is connected to the `mainindex` core, while `extra` is connected to the `extraindex` core. Two endpoints may be connected to the same core, but at this time there would be no reason to do so. Additional options may be defined in the future.
Each endpoint is then available in the relationships definition in `.platform.app.yaml`. For example, to allow an application to talk to both of the cores defined above, its `.platform.app.yaml` file should contain the following:
```yaml
relationships:
    solrsearch1: 'searchsolr:main'
    solrsearch2: 'searchsolr:extra'
```
That is, the application's environment would include a `solrsearch1` relationship that connects to the `main` endpoint, which is the `mainindex` core, and a `solrsearch2` relationship that connects to the `extra` endpoint, which is the `extraindex` core.
The relationships array would then look something like the following:
```json
{
    "solrsearch1": [
        {
            "path": "solr/mainindex",
            "host": "248.0.65.197",
            "scheme": "solr",
            "port": 8080
        }
    ],
    "solrsearch2": [
        {
            "path": "solr/extraindex",
            "host": "248.0.65.197",
            "scheme": "solr",
            "port": 8080
        }
    ]
}
```
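As a rough sketch of how an application might use both endpoints, the following hypothetical Python snippet builds one client per relationship. It assumes the `pysolr` and `platformshconfig` libraries from the Python usage example above and the relationship names defined in this section:

```python
import pysolr
from platformshconfig import Config

config = Config()

# Each endpoint is exposed to the application as its own relationship,
# so each core gets its own client.
main_client = pysolr.Solr(config.formatted_credentials('solrsearch1', 'pysolr'))
extra_client = pysolr.Solr(config.formatted_credentials('solrsearch2', 'pysolr'))

# A document added through one client only lands in that endpoint's core.
main_client.add([{"id": 1, "name": "only in mainindex"}], commit=True)
print(extra_client.search('id:1').hits)  # 0 expected: extraindex is a separate core
```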
Configsets
For even more customizability, it's also possible to define Solr configsets. For example, the following snippet would define one configset, which would be used by all cores. Specific details can then be overridden by individual cores using `core_properties`, which is equivalent to the Solr `core.properties` file.
```yaml
searchsolr:
    type: solr:8.4
    disk: 1024
    configuration:
        configsets:
            mainconfig: !archive "configsets/solr8"
        cores:
            english_index:
                core_properties: |
                    configSet=mainconfig
                    schema=english/schema.xml
            arabic_index:
                core_properties: |
                    configSet=mainconfig
                    schema=arabic/schema.xml
        endpoints:
            english:
                core: english_index
            arabic:
                core: arabic_index
```
In this example, the directory `.platform/configsets/solr8` contains the configuration definition for multiple cores. Two cores are then created: `english_index` uses the defined configset, but specifically the `.platform/configsets/solr8/english/schema.xml` file, while `arabic_index` is identical except for using the `.platform/configsets/solr8/arabic/schema.xml` file. Each of those cores is then exposed as its own endpoint.
Note that not all `core.properties` features make sense to specify in `core_properties`. Some keys, such as `name` and `dataDir`, are not supported, and may result in a solrconfig that fails to work as intended, or at all.
Default configuration
If no configuration is specified, the default configuration is equivalent to:
```yaml
searchsolr:
    type: solr:8.4
    configuration:
        cores:
            collection1: {}
        endpoints:
            solr:
                core: collection1
```
The default configuration is based on an older version of the Drupal 8 Search API Solr module that is no longer in use. While it may work for generic cases, defining your own custom configuration, core, and endpoint is strongly recommended.
Limitations
The recommended maximum size for configuration directories (zipped) is 2 MB. Monitor these directories to ensure they stay below that limit: if the zipped configuration directories grow larger, performance will decline and deploys will take longer. The directory archives are compressed and string-encoded. You can run the following Bash pipeline inside the directory to get an idea of the archive size: `echo $(($(tar czf - . | base64 | wc -c )/(1024*1024))) Megabytes`.

The configuration directory is a collection of configuration data, like a data dictionary, e.g. small collections of key/value sets. The best way to keep the size small is to restrict the directory contents to plain configuration. Including binary data like plugin `.jar` files will inflate the archive size and is not recommended.
Accessing the Solr server administrative interface
Because Solr uses HTTP for both its API and its admin interface, it's possible to access the admin interface over an SSH tunnel:

```bash
webpaas tunnel:open
```
That will open an SSH tunnel to all services on the current environment and give output similar to:

```text
SSH tunnel opened on port 30000 to relationship: solrsearch
SSH tunnel opened on port 30001 to relationship: database
Logs are written to: /home/myuser/.platformsh/tunnels.log

List tunnels with: webpaas tunnels
View tunnel details with: webpaas tunnel:info
Close tunnels with: webpaas tunnel:close
```
In this example, you can now open `http://localhost:30000/solr/` in a browser to access the Solr admin interface. Note that you cannot create indexes or users this way, but you can browse the existing indexes and manipulate the stored data.
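The tunnel also lets you script against the Solr HTTP API from your local machine. As a rough sketch, assuming the tunnel from the example above is listening on port 30000, you could query the core status endpoint of the Solr admin API like this:

```python
import json
from urllib.request import urlopen

# Port 30000 comes from the example tunnel output above; adjust it to match yours.
with urlopen("http://localhost:30000/solr/admin/cores?action=STATUS&wt=json") as response:
    status = json.loads(response.read())

# Print the names of the cores available on the tunnelled Solr server.
print(list(status["status"].keys()))
```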
Web PaaS Dedicated users can use `ssh -L 8888:localhost:8983 <user>@<cluster-name>.ent.platform.sh` to open a tunnel instead, after which the Solr server administrative interface will be available at `http://localhost:8888/solr/`.
Upgrading
The Solr data format sometimes changes between versions in incompatible ways. Solr does not include a data upgrade mechanism, as it is expected that all indexes can be regenerated from stable data if needed. To upgrade (or downgrade) Solr, you will need to create a new service from scratch.
There are two ways of doing that.
Destructive
In your `services.yaml` file, change the version of your Solr service and its name. Then update the name in the `.platform.app.yaml` relationships block, as sketched below.
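As an illustrative sketch (the service names, versions, and relationship name here are hypothetical), renaming `searchsolr` to `searchsolr86` while moving from `solr:8.4` to `solr:8.6` might look like this:

```yaml
# .platform/services.yaml
searchsolr86:            # new name, so a fresh service is created
    type: solr:8.6       # new version
    disk: 256
```

```yaml
# .platform.app.yaml
relationships:
    solrsearch: "searchsolr86:solr"   # same relationship name, pointed at the new service
```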
When you push that to Web PaaS, the old service will be deleted and a new one created under the new name, with no data. You can then have your application reindex data as appropriate.

This approach is simple but has the downside of temporarily having an empty Solr instance, which your application may or may not handle gracefully, and of needing to rebuild your index afterward. Depending on the size of your data, that could take a while.
Transitional
For a transitional approach, you will temporarily have two Solr services. Add a second Solr service with the new version and a new name, and give it a new relationship in `.platform.app.yaml`, as sketched below. You can optionally run in that configuration for a while to allow your application to populate indexes in the new service as well.
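A rough sketch of this intermediate state, with both services defined side by side (all names and versions here are illustrative):

```yaml
# .platform/services.yaml
searchsolr:              # existing service on the old version
    type: solr:8.4
    disk: 1024

searchsolr86:            # new service on the target version
    type: solr:8.6
    disk: 1024
```

```yaml
# .platform.app.yaml
relationships:
    solrsearch: "searchsolr:solr"          # existing relationship
    solrsearch_new: "searchsolr86:solr"    # temporary relationship to the new service
```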
Once you're ready to cut over, remove the old Solr service and relationship. You may optionally have the new Solr service use the old relationship name if that's easier for your application to handle. Your application is now using the new Solr service.
This approach has the benefit of never being without a working Solr instance. On the downside, it requires two running Solr servers temporarily, each of which will consume resources and need adequate disk space. Depending on the size of your data, that may be a lot of disk space.