OpenSearch made easy with Index As A Service
Use the power of OpenSearch without managing a cluster.
Last updated 13th October, 2022
OpenSearch is one of the main components of the Logs Data Platform and is regarded as one of the most powerful search and analytics engines available. From the outset we offered the possibility to host an OpenSearch Dashboards index for your OpenSearch Dashboards metadata; Index As A Service is the next step of this functionality. You can now use a fully unlocked index for almost any purpose, be it complex documents, reports or even logs. Thanks to the OpenSearch API, you will be able to use most of the tools of the OpenSearch ecosystem.
This is what you need to know to get you started:
There are two ways to create an OpenSearch Index:
To create an OpenSearch index with the Logs Data Platform manager, go to the index page and click on Add a new index in the OpenSearch index section.
You just have to choose a suffix for your index. The final name will follow this convention: logs-<username>-i-<suffix>.
For each index, you can specify the number of shards. A shard is the main storage component of an index; its maximum storage capacity is set to 25 GB (per shard). Multiple shards mean more volume and more parallelism in your requests, and thus more performance. Optionally, you can also be notified when your index is close to its critical size. Once your index is created, you can use it right away.
When you create an index through the OpenSearch API, you can also specify the number of shards. Note that the maximum number of shards per index is limited to 16. OpenSearch compatible tools can create indices on the cluster as long as they follow the naming convention logs-<username>-i-<suffix>. Here is an example with a curl command for the user logs-ab-12345 and the index logs-ab-12345-i-another-index on the gra2 cluster:
$ curl -u logs-ab-12345:mypassword -XPUT -H 'Content-Type: application/json' 'https://gra2.logs.ovh.com:9200/logs-ab-12345-i-another-index' -d '{ "settings" : {"number_of_shards" : 1}}'
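To check that the index was created with the settings you asked for, you can, for example, query its settings (same example user and cluster, same index name as above):
$ curl -u logs-ab-12345:mypassword -XGET 'https://gra2.logs.ovh.com:9200/logs-ab-12345-i-another-index/_settings?pretty'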
There is more information about the API support in the dedicated section below.
Whatever method you use, you will be able to query and visualize your documents on Logs Data Platform using the API.
Logs Data Platform OpenSearch indices are compatible with the OpenSearch REST API. Therefore, you can use simple HTTP requests to index and search your data. The API is accessible behind a secured https endpoint with mandatory authentication. We recommend that you use tokens to authenticate yourself. You can retrieve the endpoint of the API on the Home page of your service. Here is a simple example of indexing a document with curl, with an index on the cluster <ldp-cluster>.logs.ovh.com.
$ curl -u token:<your-token-value> -XPUT -H 'Content-Type: application/json' 'https://<ldp-cluster>.logs.ovh.com:9200/logs-<username>-i-<suffix>/_doc/1' -d '{ "user" : "Oles", "company" : "OVH", "message" : "Hello World !", "post_date" : "1999-11-02T23:01:00" }'
Here is a quick explanation of this command:
Content-Type: application/json is the mandatory header indicating that the data will be in JSON format.
This command will return a simple payload indicating whether the document has been indexed by all the shards involved:
{
"_id": "1",
"_index": "logs-<username>-i-<suffix>",
"_primary_term": 1,
"_seq_no": 0,
"_shards": {
"failed": 0,
"successful": 2,
"total": 2
},
"_type": "_doc",
"_version": 1,
"result": "created"
}
There are multiple ways to search your data; this is one area where the OpenSearch REST API excels. You can either retrieve your data directly by using a GET request, or search it with the Search APIs. To get the document indexed previously, use the following curl request:
$ curl -XGET -u token:<your-token-value> 'https://<ldp-cluster>.logs.ovh.com:9200/logs-<username>-i-<suffix>/_doc/1'
{"_id":"1","_index":"logs-<username>-i-<suffix>","_primary_term":1,"_seq_no":0,"_source":{"company":"OVH","message":"Hello World !","post_date":"1999-11-02T23:01:00","user":"Oles"},"_type":"_doc","_version":1,"found":true}
To issue a simple search you can either use the Query DSL or a URI search. Here is a simple example with a URI search:
$ curl -XGET -u token:<your-token-value> 'https://<ldp-cluster>.logs.ovh.com:9200/logs-<username>-i-<suffix>/_search?q=user:Oles'
{"_shards":{"failed":0,"skipped":0,"successful":1,"total":1},"hits":{"hits":[{"_id":"1","_index":"newindice","_score":0.2876821,"_source":{"company":"OVH","message":"Hello World !","post_date":"1999-11-02T23:01:00","user":"Oles"},"_type":"_doc"}],"max_score":0.2876821,"total":1},"timed_out":false,"took":31}
The following shows how your e-commerce application logs can be sent to the Logs Data Platform whenever a product is ordered. The application logs the customer order using an ID instead of the customer's name: for performance reasons, or maybe by design, it doesn't fetch the full name of the client or other information from the customer database just to produce a log. You can add this information on the fly by using an OpenSearch index and a Logstash collector on the Logs Data Platform.
The first thing to do is to index some client information. The snippet below is one entry of the client index.
{
"firstName": "Jon",
"lastName": "Snow",
"age": 22,
"address":
{
"streetAddress": "21 2nd Street",
"city": "Winterfell",
"state": "North",
"postalCode": "14578",
"geolocation":
{
"lat": 54.369488,
"long": -5.574768
}
},
"phoneNumber":
[
{
"type": "home",
"number": "212 555-1234"
},
{
"type": "mobile",
"number": "102 555-4567"
}
]
}
To index several documents at once, it is more efficient to use the Bulk API. Here is a small snippet of 3 users you can use to test it.
{ "index" : { "_index" : "logs-<username>-i-<suffix>", "_type" : "_doc" } }
{ "userId": "1", "firstName": "Jon","lastName": "Snow", "age": 22, "address": { "streetAddress": "21 2nd Street", "city": "Winterfell", "state": "North", "postalCode": "14578", "geolocation": { "lat": 54.369488, "long": -5.574768 } },"phoneNumber": [ { "type": "home", "number": "212 555-1234" }, { "type": "mobile", "number": "102 555-4567" } ] }
{ "index" : { "_index" : "logs-<username>-i-<suffix>", "_type" : "_doc" } }
{ "userId": "2", "firstName": "Cersei","lastName": "Lannister", "age": 43, "address": { "streetAddress": "1 Palace Street", "city": "King's Landing", "state": "The Crownlands", "postalCode": "26863", "geolocation": { "lat": 42.639758, "long": 18.1094725 } },"phoneNumber": [ { "type": "home", "number": "212 555-6789" }, { "type": "mobile", "number": "102 555-8901" } ] }
{ "index" : { "_index" : "logs-<username>-i-<suffix>", "_type" : "_doc" } }
{ "userId": "3", "firstName": "Daenerys","lastName": "Targaryen", "age": 22, "address": { "streetAddress": "3 Blackwater Bay Ave", "city": "Dragonstone", "state": "Dragonstone", "postalCode": "75197", "geolocation": { "lat": 43.300097, "long": -2.261580 } },"phoneNumber": [ { "type": "home", "number": "212 555-1234" }, { "type": "mobile", "number": "102 555-2345" } ] }
A bulk request is a succession of JSON objects with this structure:
action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
...
action_and_meta_data\n
optional_source\n
In a single request, you can ask OpenSearch to index, update or delete several documents. Save the content of the previous snippet in a file named bulk and use the following call to index these 3 users:
$ curl -u token:<your-token-value> -XPUT -H 'Content-Type: application/json' 'https://<ldp-cluster>.logs.ovh.com:9200/logs-<username>-i-<suffix>/_bulk' --data-binary "@bulk"
This call will take the content of the bulk file and execute each index operation. Note that you have to use the --data-binary option and not -d, in order to preserve the newline after each JSON object. You can check that your data is properly indexed with the following call:
$ curl -u token:<your-token-value> -XGET 'https://<ldp-cluster>.logs.ovh.com:9200/logs-<username>-i-<suffix>/_search?pretty=true'
This will give you back the documents of your index:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [ {
"_index" : "logs-<username>-i-<suffix>",
"_type" : "_doc",
"_id" : "AV3HvbQAz85mIBfrJjkV",
"_score" : 1.0,
"_source" : {
"userId" : "1",
"firstName" : "Jon",
"lastName" : "Snow",
"age" : 22,
"address" : {
"streetAddress" : "21 2nd Street",
"city" : "Winterfell",
"state" : "North",
"postalCode" : "14578",
"geolocation" : {
"lat" : 54.369488,
"long" : -5.574768
}
},
"phoneNumber" : [ {
"type" : "home",
"number" : "212 555-1234"
}, {
"type" : "mobile",
"number" : "102 555-4567"
} ]
}
}, {
"_index" : "logs-<username>-i-<suffix>",
"_type" : "_doc",
"_id" : "AV3HvbQAz85mIBfrJjkW",
"_score" : 1.0,
"_source" : {
"userId" : "2",
"firstName" : "Cersei",
"lastName" : "Lannister",
"age" : 43,
"address" : {
"streetAddress" : "1 Palace Street",
"city" : "King's Landing",
"state" : "The Crownlands",
"postalCode" : "26863",
"geolocation" : {
"lat" : 42.639758,
"long" : 18.1094725
}
},
"phoneNumber" : [ {
"type" : "home",
"number" : "212 555-6789"
}, {
"type" : "mobile",
"number" : "102 555-8901"
} ]
}
}, {
"_index" : "logs-<username>-i-<suffix>",
"_type" : "_doc",
"_id" : "AV3HvbQAz85mIBfrJjkX",
"_score" : 1.0,
"_source" : {
"userId" : "3",
"firstName" : "Daenerys",
"lastName" : "Targaryen",
"age" : 22,
"address" : {
"streetAddress" : "3 Blackwater Bay Ave",
"city" : "Dragonstone",
"state" : "Dragonstone",
"postalCode" : "75197",
"geolocation" : {
"lat" : 43.300097,
"long" : -2.26158
}
},
"phoneNumber" : [ {
"type" : "home",
"number" : "212 555-1234"
}, {
"type" : "mobile",
"number" : "102 555-2345"
} ]
}
} ]
}
}
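As mentioned earlier, a bulk request can also carry update and delete actions. Here is a minimal sketch, for illustration only (you do not need to run it to follow the rest of this guide; replace the document ids with the ones returned by your own index):
{ "update" : { "_index" : "logs-<username>-i-<suffix>", "_id" : "AV3HvbQAz85mIBfrJjkV" } }
{ "doc" : { "age" : 23 } }
{ "delete" : { "_index" : "logs-<username>-i-<suffix>", "_id" : "AV3HvbQAz85mIBfrJjkX" } }
Save it in a file and send it to the _bulk endpoint with --data-binary, exactly as above.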
Now that you have some data, you can enrich your logs with it. For this we will use a Logstash collector and the elasticsearch filter plugin (some Elasticsearch tools are compatible with OpenSearch).
If you don't know how to create a Logstash collector, please refer to the Logstash guide. Edit the configuration of Logstash: for this example we will use an SSL TCP input with the GELF codec. Here is the input configuration.
tcp {
port => 12202
type => gelf
ssl_enable => true
ssl_verify => false
ssl_cert => "/etc/ssl/private/server.crt"
ssl_key => "/etc/ssl/private/server.key"
ssl_extra_chain_certs => ["/etc/ssl/private/ca.crt"]
codec => gelf { delimiter => "\x00" }
}
The most important part in this configuration is the filter part:
elasticsearch {
hosts => ["https://gra2.logs.ovh.com:9200"]
index => "logs-<username>-i-<suffix>"
user => "token"
password => "y762pm8j2yhge9c2idpdaqs456dshr78nb2313eaze4656oue45psla"
enable_sort => false
query => "userId:%{[userId]}"
fields => {
"firstName" => "firstName"
"lastName" => "lastName"
"address" => "address"
}
}
if "_elasticsearch_lookup_failure" not in [tags] {
mutate {
add_field => {
"address_geolocation" => "%{[address][geolocation][lat]},%{[address][geolocation][long]}"
}
remove_field => [ "address" ]
}
}
The filter part is composed of two plugins: the elasticsearch plugin and the mutate plugin. The elasticsearch plugin is configured with the cluster endpoint, the index to query, the token used as credentials, a lookup query based on the userId field of the incoming log, and the fields to copy from the matching document into the log event.
The mutate plugin is here to show how you can combine different subfield information into one top-level field. Here we combine a latitude and a longitude field to create a geolocation field, then we remove the original address top-level field.
One simple way to test your new Logstash configuration is to send a log by using echo and openssl. Check the examples below:
$ echo -e '{"version":"1.1", "host": "little bird", "short_message": "Warrior from the North", "level":1, "_userId": "2", "_unitType": "Westerosis", "_power_num": 200 }'\0 | openssl s_client -quiet -no_ign_eof -connect <input-hostname>:<port>
$ echo -e '{"version":"1.1", "host": "little bird", "short_message": "A legendary dragon", "level":1, "_userId": "3", "_unitType": "Dragon", "_power_num": 200000 }'\0 | openssl s_client -quiet -no_ign_eof -connect <input-hostname>:<port>
As you can see, we just specify the userId this order belongs to. Sending this log to your Logstash input will enrich it on the fly before it is stored.
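For the second message above (userId 2), the resulting document would look roughly like this (an illustrative rendering based on the filter configuration; the exact field layout depends on your stream view):
{
  "host" : "little bird",
  "message" : "Warrior from the North",
  "userId" : "2",
  "unitType" : "Westerosis",
  "power_num" : 200,
  "firstName" : "Cersei",
  "lastName" : "Lannister",
  "address_geolocation" : "42.639758,18.1094725"
}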
The log has been automatically enriched with the fields we declared in our filter. Linking information from an index to your logs allows you to create more meaningful dashboards based on this information.
In this dashboard, you can see that the first widget is a "quick values" widget based on the firstName field of the logs we retrieved.
The maximum size of your index is fixed and depends on the number of shards. Shards are the unit of parallelism in OpenSearch, so if search performance is critical, you should choose an index with the highest number of shards you can afford. Thanks to the high performance nodes we use, we managed to send thousands of logs to Logstash and enrich all of them within seconds using only one shard.
It is not possible to change the number of shards of an existing index, so you will have to keep an eye on the storage used by your index. Once your index is full, it will be blocked for write requests and you will have no choice but to use Delete By Query requests to free space on your index.
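For example, here is a sketch of a Delete By Query call that removes all documents whose post_date is older than a given date (the field name and the date are just examples; adapt them to your own documents):
$ curl -u token:<your-token-value> -XPOST -H 'Content-Type: application/json' 'https://<ldp-cluster>.logs.ovh.com:9200/logs-<username>-i-<suffix>/_delete_by_query' -d '{ "query" : { "range" : { "post_date" : { "lt" : "2000-01-01" } } } }'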
Note that you can monitor the size of the index yourself by using the following curl query:
$ curl -u token:<your-token-value> -XGET 'https://<ldp-cluster>.logs.ovh.com:9200/logs-<username>-i-<suffix>/_stats/store?pretty'
This command will give you a document with the following format:
{
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_all" : {
"primaries" : {
"store" : {
"size_in_bytes" : 876787361
}
},
"total" : {
"store" : {
"size_in_bytes" : 1746852820
}
}
},
"indices" : {
"logs-<username>-i-<suffix>" : {
"uuid" : "JC0IWkd3QYSBNd4B2bBZGg",
"primaries" : {
"store" : {
"size_in_bytes" : 876787361
}
},
"total" : {
"store" : {
"size_in_bytes" : 1746852820
}
}
}
}
}
The size in bytes used to compute your billing is the one under the following path:
"indices" -> "logs-
On the Logs Data Platform, we allow users to use the OpenSearch API to handle the lifecycle of their indices. You can create and delete indices directly with the OpenSearch API. You can also create aliases and delete them. We even support templates, allowing you to set up your mapping automatically at index creation!
To create an index on Logs Data Platform, use the following call:
$ curl -u <username>:<mypassword> -XPUT -H 'Content-Type: application/json' 'https://gra2.logs.ovh.com:9200/<username>-i-<suffix>' -d '{ "settings" : {"number_of_shards" : 1}}'
You have to follow the Logs Data Platform naming convention <username>-i-<your-suffix> to create your index. Your username is the one you use to connect to Graylog or to use the API. The suffix can contain any alphanumeric character.
To delete an index, use the following call:
$ curl -u <username>:<password> -XDELETE -H 'Content-Type: application/json' 'https://gra2.logs.ovh.com:9200/<username>-i-<suffix>'
Here we use the DELETE HTTP command to delete the index.
As with indices, you can use API calls to create and delete aliases on your indices. The only difference is the naming convention for your alias: it must be formatted as <username>-a-<suffix>. Here is an example call:
$ curl -u <username>:<password> -XPUT -H 'Content-Type: application/json' 'https://gra2.logs.ovh.com:9200/<username>-i-<suffix>/_alias/<username>-a-<alias_suffix>'
This call creates an individual alias on an index you have previously created.
You can also use the generic aliases call to create aliases:
$ curl -u <username>:<password> -XPOST "https://gra2.logs.ovh.com:9200/_aliases?pretty" -H 'Content-Type: application/json' -d'
{
"actions" : [
{ "remove" : { "index" : "<username>-i-<one-suffix>", "alias" : "<username>-a-<suffix>" } },
{ "add" : { "index" : "<username>-i-<other-suffix>", "alias" : "<username>-a-<suffix>" } },
{ "remove_index": { "index": "<username>-i-<one-suffix>" } }
]
}'
All the actions (alias change, alias creation and index deletion) will be done in a single call. All the indices and aliases involved must follow the convention, otherwise an error will be thrown.
Logs Data Platform supports your custom templates. As for indices and aliases, a template must follow some rules in order to work:
- The template name must contain your username <username>. It can be anywhere in the name string.
- Each index pattern must follow the index naming convention: it must start with <username>-i-, and the "*" wildcard character must come after this prefix.
- Any alias declared in the template must follow the alias naming convention <username>-a-<suffix>.
Here is an example of a template for the user logs-ab-12345:
$ curl -u <username>:<password> -XPUT -H 'Content-Type: application/json' 'https://gra2.logs.ovh.com:9200/_template/template_for_logs-ab-12345_indices' -d '
{
"index_patterns" : [ "logs-ab-12345-i-debug*","logs-ab-12345-i-test*" ],
"settings": {
"number_of_shards" : 1,
},
"aliases" : {
"logs-ab-12345-a-all" : {},
"logs-ab-12345-a-debug" : { "filter" : { "term" : { "type" : "debug" } } }
}
}'
This template will be applied for every new index matching the index pattern.
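To see it in action, you can, for example, create an index matching one of the patterns above and check that the aliases were attached to it (the index name here is just an example matching the logs-ab-12345-i-debug* pattern):
$ curl -u logs-ab-12345:mypassword -XPUT 'https://gra2.logs.ovh.com:9200/logs-ab-12345-i-debug-2022'
$ curl -u logs-ab-12345:mypassword -XGET 'https://gra2.logs.ovh.com:9200/logs-ab-12345-i-debug-2022/_alias?pretty'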
All the items you create through the OpenSearch API will be displayed in your manager, where they can be deleted or monitored.
Here, the first index was created through the API; its description was filled automatically.
Index as a Service has some specificities on our platform. This additional technical information can help you to use it properly.