Using the Analytics Data Migration tool


Use the Analytics Data Migration tool to migrate the tenant-specific data from an Elasticsearch instance to an OpenSearch instance.

Important

  1. Ensure that sufficient CPU and memory resources are available on Elasticsearch and OpenSearch systems. The migration process from Elasticsearch to OpenSearch is resource-intensive and time-consuming.

    To minimize the impact on system performance, plan the migration during off-peak hours.

    As a guideline, exporting 5 GB of data typically takes around 1 hour, and importing it into OpenSearch takes an additional 1 hour. Thus, the total time required will scale with the size of the data.

  2. Do not use the Analytics console to generate new data after the export process begins. Plan for this downtime by using the time estimate in item 1.

  3. Migration is supported only on Linux and Windows operating systems.

  4. Ensure that the health status of your OpenSearch cluster is Green before you start the data migration.

    You can use the following API to check the health status of the cluster.

    protocol://<open-search-ip>:<port>/_cluster/health
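
The time guideline and the health check above can be sketched in shell as follows. This is a minimal sketch and not part of the tool: the function names estimate_hours and is_green are hypothetical, the estimate assumes that the guideline of 1 hour per 5 GB scales linearly for both export and import, and the status check assumes the usual _cluster/health JSON response with a "status" field.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they are not shipped with the migration tool.

# Print the estimated total migration time in hours for a data size in whole GB.
# Assumes about 1 hour per 5 GB for export plus the same for import,
# with each phase rounded up to the next whole hour.
estimate_hours() {
  echo $(( (($1 + 4) / 5) * 2 ))
}

# Succeed only if the _cluster/health JSON read from stdin reports "green".
is_green() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"green"'
}

# Example usage (replace the placeholders with your own values):
#   estimate_hours 50
#   curl -s "<protocol>://<open-search-ip>:<port>/_cluster/health" | is_green \
#     && echo "Cluster is Green" || echo "Cluster is not Green"
```

If your cluster requires authentication or a custom CA certificate, the curl call needs the corresponding options for your environment.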

Prerequisites

  1. Enable Elasticsearch access.

    On-prem deployment

    Ensure that the Elasticsearch HTTP port is exposed on the Analytics Server. Use the following command to check through cURL or REST.

     GET http://<hostname>:<port>/
    

    The default HTTP port is 9500.

    If the endpoint is not accessible, add the following JNDI property to the server.xml file of the Analytics deployment and restart the instance.

    <jndiEntry jndiName="analytics/http.enabled" value='"true"'/>

    Cloud - Kubernetes deployment

    To enable the export process to access Elasticsearch data temporarily from outside the cluster, create a Kubernetes Ingress resource to expose the Elasticsearch service securely. For more information on configuration details, see the provided ingress.yaml template.

  2. Ensure OpenSearch is up and does not have any data or templates for the tenant. Use the following commands to check through cURL or REST.
    GET https://{hostname_port}/_cat/indices/worklight*/?v&s=index
    GET https://{hostname_port}/_index_template/worklight*
    
  3. You need Node.js version 19 or later to run the tool. If it is already installed on your system, skip to the next step.

    If Node.js is not installed, use the portable version that is bundled with the tool by using the following command.

    Linux

    export PATH=<installer>/tools/elasticsearch-opensearch-migration/node-linux-x64/bin:$PATH
    

    Windows

    (command prompt)

    set PATH=<installer>\tools\elasticsearch-opensearch-migration\node-win-x64;%PATH%
    

    Where,

    <installer> is the full path of the extracted installer directory that contains the tools directory.

  4. Install application prerequisites by using the following command inside the analytics-data-migration-tool directory.

    npm install
    
  5. Ensure sufficient disk space is available for a successful data export from Elasticsearch.

    On the Analytics Console, navigate to Administration (Spanner icon) and check the Bytes Used column to determine the size of the data to be exported.

    For example, if the data size shown on the Analytics console is 50 GB, you need free disk space equal to that size plus an additional 65–70%. This means that exporting 50 GB of data requires approximately 85 GB of available storage.

  6. Before initiating the data import or migration process to OpenSearch, ensure that none of the nodes in the OpenSearch cluster have disk utilization exceeding 90%.

    You can use the following API to verify current disk usage.

    protocol://<open-search-ip>:<port>/_cat/nodes?v&h=name,disk.used_percent

    If the disk.used_percent value for any node is above 90%, or is expected to exceed this threshold during migration, the process fails.

    Since OpenSearch requires approximately the same amount of storage as Elasticsearch, ensure that equivalent storage capacity is available on the OpenSearch cluster prior to starting the migration.

  7. Data export from Elasticsearch and import into OpenSearch must run sequentially. If export is triggered in the background, ensure it completes fully before starting the data migration to OpenSearch.
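
The two disk-space prerequisites above can be checked with a short shell sketch. It is illustrative only: the function names required_export_gb and nodes_over_threshold are hypothetical, the space estimate uses the upper end (70%) of the 65-70% guideline, and the node check assumes the plain-text _cat/nodes output in which the last column is disk.used_percent.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they are not shipped with the migration tool.

# Print the free disk space (whole GB, rounded up) needed to export a data set
# of the given size in GB: the data size plus an additional 70%.
required_export_gb() {
  echo $(( ($1 * 170 + 99) / 100 ))
}

# Read the output of _cat/nodes?v&h=name,disk.used_percent from stdin and
# print every node line whose disk usage is above the limit (default 90).
# The header row is skipped automatically because it is not numeric.
nodes_over_threshold() {
  awk -v limit="${1:-90}" '$NF + 0 > limit { print }'
}

# Example usage (replace the placeholders with your own values):
#   required_export_gb 50
#   curl -s "<protocol>://<open-search-ip>:<port>/_cat/nodes?v&h=name,disk.used_percent" \
#     | nodes_over_threshold 90
```

An empty result from nodes_over_threshold means that no node is currently above the limit. It does not account for the storage that the migration itself will consume.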

Procedure

  1. Navigate to the <installer>\tools\elasticsearch-opensearch-migration\analytics-data-migration-tool directory.

  2. Update the following properties in the config.json file.

    Property                            Description                                               Example
    ----------------------------------  --------------------------------------------------------  ----------------------
    elasticsearch.url                   Base URL of the Elasticsearch instance.                   http://localhost:9200/
    opensearch.url                      Base URL of the OpenSearch instance.                      https://localhost:9200/
    opensearch.username                 Username for authenticating with OpenSearch.              admin
    opensearch.password                 Password for authenticating with OpenSearch.              admin
    opensearch.tenant                   Tenant name used for OpenSearch indexing and templates.   worklight
    opensearch.number_of_shards         Number of primary shards for OpenSearch indices.          1
    opensearch.number_of_replicas       Number of replica shards for OpenSearch indices.          0
    elasticsearch.exportdir             Directory path where Elasticsearch exports are stored.    es_export/

    Elasticsearch security settings
    elasticsearch.connection.security   Security mode for Elasticsearch.                          insecure, tls
    elasticsearch.tls.ca_cert_path      Path to CA certificate for server verification.           tls

    OpenSearch security settings
    opensearch.connection.security      Security mode for OpenSearch.                             insecure, tls
    opensearch.tls.ca_cert_path         Path to CA certificate for server verification.           tls

    Important:

    • Before using the tool, replace the placeholder IPs or hosts (<update-this-to-ip-or-host>) with the actual target address.
    • All URLs specified in the configuration (for example, elasticsearch.url and opensearch.url) must end with a trailing slash (/).
    • Ensure that the port numbers are correct for your Elasticsearch and OpenSearch clusters.
    • Shard and replica settings affect performance and availability. Adjust them according to your cluster size and requirements. Do not carry forward the Elasticsearch settings from the pre-migration environment without analyzing the anticipated data load and network calls. To determine the appropriate shard and replica configuration, see the Hardware sizing calculator.
    • The tenant name does not have to be the same as it was in Elasticsearch. You can set a custom tenant name of your choice, but it must not contain any special characters or capital letters. If you use a custom tenant name during export, configure the same tenant name after migration, before you start the Analytics server instance.
  3. Go to the analytics-data-migration-tool directory.

    a. Export data from the Elasticsearch instance by using the following command.

    npm run export -- [options]
    

    Options

    -r, --reset - Delete all previously exported data files and start the process from the beginning.

    -b, --batchSize <number> - Batch size for the export task.

    Use a custom batch size for data export only on systems with exceptionally high RAM and CPU capacity. Even then, keep the batch size at or below 1000 records to avoid overloading the system.

    Example

    npm run export -- -b 500
    

    b. Migrate the exported data to the OpenSearch instance by using the following command.

    npm run migrate -- [options]
    

    Options

    -b, --batchSize <number> - Batch size for the migration task.

    -i, --includeExisting - Import the exported data into the OpenSearch instance regardless of whether the document counts match.

    -v, --validateData - Validate the document count of the exported data against the OpenSearch instance.

    Use a custom batch size for data migration only on systems with exceptionally high RAM and CPU capacity. Even then, keep the batch size at or below 1000 records to avoid overloading the system.

    Example

    npm run migrate -- -b 100
    
  4. Verify that the migration completed successfully by using the following command.

    npm run migrate -- -v
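
Because export and migration must run one after the other, the procedure can be wrapped in a small script that stops at the first failure. The following is a minimal sketch under stated assumptions: run_step, run_migration, and the DRY_RUN variable are hypothetical names, the script is expected to run from the analytics-data-migration-tool directory, and it relies on each npm command returning a non-zero exit code on failure, which the tool's documentation does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper; it is not shipped with the migration tool.

# Run a command, or only print it when DRY_RUN=1 is set.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: run_migration <export-batch-size> <migrate-batch-size>
# Runs export, migrate, and validation in sequence. Each step starts only
# if the previous step succeeded.
run_migration() {
  for size in "$1" "$2"; do
    # Follow the guidance to keep batch sizes at or below 1000 records.
    [ "$size" -le 1000 ] || { echo "Batch size $size exceeds 1000" >&2; return 1; }
  done
  run_step npm run export -- -b "$1" &&
  run_step npm run migrate -- -b "$2" &&
  run_step npm run migrate -- -v
}

# Example usage, with the batch sizes from the examples in this procedure:
#   DRY_RUN=1 run_migration 500 100   # print the commands only
#   run_migration 500 100             # run the commands
```

Running with DRY_RUN=1 first is a cheap way to confirm the command sequence before you commit to a long export.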
    

Post migration tasks

Update the existing Analytics server.

  1. Ensure that the following properties are configured in the server.xml file of the Analytics server to connect to the OpenSearch datastore.

    JNDI name                     Description                                                           Default value              Required
    ----------------------------  --------------------------------------------------------------------  -------------------------  --------
    analytics/tenant              The tenant identifier used for multi-tenant indexing in OpenSearch.   worklight                  Optional
    analytics/datastore_url       The base URL of the OpenSearch instance. Example: https://host:port/  None (must be configured)  Yes
    analytics/datastore_username  The username for authenticating with OpenSearch.                     None (must be configured)  Yes
    analytics/datastore_password  The password for authenticating with OpenSearch.                     None (must be configured)  Yes
    analytics/shards              Number of primary shards to be created per index.                     1                          Optional
    analytics/replicas_per_shard  Number of replica shards to be created per primary shard.             0                          Optional
  2. If you set a custom tenant name during data export (the default is worklight), update the analytics/tenant JNDI property in the server.xml file with the same custom tenant name.
  3. Start the PMF Analytics Application server.
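
Before you reuse a custom tenant name in server.xml, you can check it against the naming restriction with a short shell sketch. The function name is_valid_tenant is hypothetical, and the check assumes that "no special characters or capital letters" means lowercase ASCII letters and digits only, which might be stricter than what the tool actually accepts.

```shell
#!/usr/bin/env bash
# Hypothetical helper; it is not shipped with the migration tool.

# Succeed only if the tenant name consists of lowercase ASCII letters and
# digits. This character set is an assumption, not a documented rule.
is_valid_tenant() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]+$'
}

# Example usage:
#   is_valid_tenant "worklight" && echo "Tenant name looks valid"
```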

Troubleshooting

For more information, see Troubleshooting Elasticsearch to OpenSearch data migration.
