Migrate analytics data from OnPrem MFP 8.0 to PMF Cloud 9.1 on Kubernetes

Assumption:

Analytics data is migrated to a fresh PMF 9.1 installation. The PMF Elasticsearch component runs in the same namespace as the other PMF components.

Prerequisite:

  • Take a snapshot of the analytics data into a directory.
  • Note the number of shards configured in the OnPrem Elasticsearch environment.

Steps:

1. Ensure that the Elasticsearch API works.

Verify that the following Elasticsearch API call works:

curl --location --request GET 'http://{ElasticSearchHost}:9200/_nodes/process?pretty'

where {ElasticSearchHost} is the IP address or hostname of the machine where the Elasticsearch service is running.
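
The check above can be wrapped in a small helper so that it returns a clear pass or fail. This is a minimal sketch: the function names and the example address are illustrative assumptions, and only the port 9200 and the _nodes/process endpoint come from this page.

```shell
#!/bin/sh
# Sketch only: helper names and the example address are illustrative.

# Print the URL of an Elasticsearch endpoint on HTTP port 9200.
es_url() {
  printf 'http://%s:9200/%s\n' "$1" "$2"
}

# Return 0 if the node process API answers with a 2xx status, non-zero otherwise.
es_api_ok() {
  curl --silent --fail --output /dev/null --location --request GET \
    "$(es_url "$1" '_nodes/process?pretty')"
}

# Example (not run here):
#   es_api_ok 192.0.2.10 && echo "Elasticsearch API is reachable"
```

The --fail option makes curl exit with a non-zero status on an HTTP error, which is what lets the helper be used in a condition.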

2. Add the JNDI property to the server.xml file on the machine where the Elasticsearch service is running.
Add the following property to the server.xml file. This property defines the storage path for the snapshot data.

<jndiEntry jndiName="analytics/path.repo" value='"D:/Backup"'/>

3. Create a repository for the backup.
Run the following Elasticsearch API call to register the backup path. The location must match the path.repo value set in step 2.

curl --location --request PUT 'http://{ElasticSearchHost}:9200/_snapshot/my_backup' --header 'Content-Type: application/json' --data '{"type": "fs","settings": {"location": "D:/Backup"}}'

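4. Create a snapshot of the data.
Run the following Elasticsearch API call to create a snapshot of the existing Elasticsearch data. Creating a snapshot requires a PUT request; a GET request on the same URL only returns information about an existing snapshot.

curl --location --request PUT 'http://{ElasticSearchHost}:9200/_snapshot/my_backup/snapshot_1'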
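
Before copying any files it helps to confirm that the snapshot finished. The sketch below assumes the standard Elasticsearch snapshot API behaviour: a PUT on the snapshot URL creates the snapshot, the wait_for_completion=true query parameter makes the call block until it finishes, and a GET on the same URL reports its state. The helper names are illustrative; check that your Elasticsearch version supports the parameter.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative.

# Print the URL of a snapshot: host, repository name, snapshot name.
snapshot_url() {
  printf 'http://%s:9200/_snapshot/%s/%s\n' "$1" "$2" "$3"
}

# Create the snapshot and block until Elasticsearch reports completion.
create_snapshot() {
  curl --silent --location --request PUT \
    "$(snapshot_url "$1" "$2" "$3")?wait_for_completion=true"
}

# Return 0 if a snapshot response body (passed as $1) reports state SUCCESS.
snapshot_succeeded() {
  printf '%s' "$1" | grep -Eq '"state"[[:space:]]*:[[:space:]]*"SUCCESS"'
}

# Example (not run here):
#   body=$(curl --silent "$(snapshot_url 192.0.2.10 my_backup snapshot_1)")
#   snapshot_succeeded "$body" && echo "snapshot_1 is complete"
```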

5. Create a directory /es_backup on the VMs where the Elasticsearch pods (ibm-es-pr-esclient, ibm-es-pr-esmaster, ibm-es-pr-esdata) are running. To create the directory, ssh into each VM and run the following command:

mkdir /es_backup

6. Copy the snapshot data taken from the OnPrem Elasticsearch installation into the /es_backup directory on each of those VMs.
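
How the files are copied depends on your environment. As one possible approach, the sketch below prints an rsync command for each VM so that the commands can be reviewed before they are run. It assumes that rsync and SSH access are available, that a copy of the snapshot directory already exists on the machine you run it from, and that the node names and the /tmp/Backup path are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: node names and the source path are hypothetical;
# rsync and SSH access to the VMs are assumed.

# Print (without running) one rsync command per node that copies the
# contents of the local snapshot directory into /es_backup on that node.
print_copy_commands() {
  src=$1
  shift
  for node in "$@"; do
    printf 'rsync -a %s/ %s:/es_backup/\n' "$src" "$node"
  done
}

# Example (prints two commands to review):
#   print_copy_commands /tmp/Backup worker1.example.com worker2.example.com
```

The trailing slash on the source path tells rsync to copy the contents of the directory rather than the directory itself, so the snapshot files land directly under /es_backup.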

7. Edit the ibm-es-prod-* configmap. To edit the configmap, run the following command (change the configmap name to match your installation before running it):

kubectl edit configmap ibm-es-prod-9-1728914385-es-configmap

and make the following changes under “elasticsearch.yml:”

a. Add “path.repo: /es_restore\n” after “script.inline: true\n”.
b. Change the “number_of_shards” value to match the number of shards in your OnPrem Elasticsearch configuration. The default number of shards in PMF Cloud is “3”.

The following sample shows the configmap after path.repo was added and the number_of_shards value was changed to 5:

apiVersion: v1
data:
  elasticsearch.yml: "cluster:\n  name: ${CLUSTER_NAME}\nnode:\n  master: ${NODE_MASTER}\n
    \ data: ${NODE_DATA}\n  name: ${NODE_NAME}\n  ingest: ${NODE_INGEST}\nindex:\n
    \ number_of_replicas: 1\n  number_of_shards: 5\n  mapper.dynamic: true\npath:\n
    \ data: /data/data_${staticname}\n  logs: /data/log_${staticname}\n  plugins:
    /elasticsearch/plugins\n  work: /data/work_${staticname}      \nprocessors: ${PROCESSORS:1}\nbootstrap:\n
    \ memory_lock: false\nhttp:\n  enabled: ${HTTP_ENABLE}\n  compression: true\n
    \ cors:\n    enabled: true\n    allow-origin: \"*\"\ncloud:\n  k8s:\n    service:
    ${DISCOVERY_SERVICE}\n    namespace: ${NAMESPACE}\ndiscovery:\n  type: io.fabric8.elasticsearch.discovery.k8s.K8sDiscoveryModule\n
    \ zen:\n    ping.multicast.enabled: false\n    minimum_master_nodes: 1\nxpack.security.enabled:
    false\nxpack.ml.enabled: false\ncompress.lzf.decoder: optimal\ndiscovery.zen.ping.multicast.enabled:
    false\nbootstrap.mlockall: true\ncompress.lzf.decoder: safe\nscript.inline: true\npath.repo:
    /es_restore\n"
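
After saving the configmap, the two changes can be read back to confirm that they were stored. This is a minimal sketch with illustrative helper names; it assumes kubectl is pointed at the namespace that holds the configmap.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative.

# Read elasticsearch.yml text on stdin; succeed if path.repo is set to $1.
has_path_repo() {
  grep -Eq "^path\.repo:[[:space:]]*$1[[:space:]]*\$"
}

# Print the number_of_shards value found in the elasticsearch.yml text on stdin.
shard_count() {
  sed -n 's/^[[:space:]]*number_of_shards:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Print elasticsearch.yml from the named configmap.
# The backslash escapes the dot inside the key name.
configmap_yaml() {
  kubectl get configmap "$1" -o jsonpath='{.data.elasticsearch\.yml}'
}

# Example (not run here; replace the configmap name with your own):
#   configmap_yaml ibm-es-prod-9-1728914385-es-configmap | has_path_repo /es_restore
#   configmap_yaml ibm-es-prod-9-1728914385-es-configmap | shard_count
```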

8. Create a hostPath volume mount in the ibm-es-pr-esclient and ibm-es-pr-esmaster deployments and in the ibm-es-pr-esdata statefulset. If you are using PMF 9.1 on Kubernetes:

a. Edit the deployment ibm-es-pr-esclient. To edit it, run the following command:

kubectl edit deployment ibm-es-pr-esclient

i. Add the following mountPath under “volumeMounts:”

mountPath: /es_restore
name: hp-volume

Example code after adding the mountPath:

        volumeMounts:
        - mountPath: /es_restore
          name: hp-volume
        - mountPath: /data
          name: storage
        - mountPath: /elasticsearch/config/elasticsearch.yml
          name: config
          subPath: elasticsearch.yml
        - mountPath: /elasticsearch/config/log4j2.properties
          name: config
          subPath: log4j2.properties 

ii. Add the following hostPath under “volumes:”

hostPath:
  path: /es_backup
  type: DirectoryOrCreate
name: hp-volume      

Example code after adding the hostPath:

      volumes:
      - hostPath:
          path: /es_backup
          type: DirectoryOrCreate
        name: hp-volume
      - emptyDir: {}
        name: storage
      - configMap:
          defaultMode: 420
          name: ibm-es-prod-9-1728914385-es-configmap
        name: config

Save and exit

b. Edit the deployment ibm-es-pr-esmaster. To edit it, run the following command:

kubectl edit deployment ibm-es-pr-esmaster

i. Add the following mountPath under “volumeMounts:”

mountPath: /es_restore
name: hp-volume

Example code after adding the mountPath:

        volumeMounts:
        - mountPath: /es_restore
          name: hp-volume
        - mountPath: /data
          name: storage
        - mountPath: /elasticsearch/config/elasticsearch.yml
          name: config
          subPath: elasticsearch.yml
        - mountPath: /elasticsearch/config/log4j2.properties
          name: config
          subPath: log4j2.properties 

ii. Add the following hostPath under “volumes:”

hostPath:
  path: /es_backup
  type: DirectoryOrCreate
name: hp-volume      

Example code after adding the hostPath:

      volumes:
      - hostPath:
          path: /es_backup
          type: DirectoryOrCreate
        name: hp-volume
      - emptyDir: {}
        name: storage
      - configMap:
          defaultMode: 420
          name: ibm-es-prod-9-1728914385-es-configmap
        name: config

Save and exit

c. Edit the statefulset ibm-es-pr-esdata. To edit it, run the following command:

kubectl edit statefulset ibm-es-pr-esdata

i. Add the following mountPath under “volumeMounts:”

mountPath: /es_restore
name: hp-volume

Example code after adding the mountPath:

        volumeMounts:
        - mountPath: /es_restore
          name: hp-volume
        - mountPath: /data
          name: storage
        - mountPath: /elasticsearch/config/elasticsearch.yml
          name: config
          subPath: elasticsearch.yml
        - mountPath: /elasticsearch/config/log4j2.properties
          name: config
          subPath: log4j2.properties 

ii. Add the following hostPath under “volumes:”

hostPath:
  path: /es_backup
  type: DirectoryOrCreate
name: hp-volume      

Example code after adding the hostPath:

      volumes:
      - hostPath:
          path: /es_backup
          type: DirectoryOrCreate
        name: hp-volume
      - emptyDir: {}
        name: storage
      - configMap:
          defaultMode: 420
          name: ibm-es-prod-9-1728914385-es-configmap
        name: config

Save and exit

9. Ensure that all elasticsearch pods are up and running before moving to the next steps.
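
One way to wait for the pods is to ask kubectl for the rollout status of each edited workload. The sketch below assumes the three workload names used on this page and the current kubectl namespace; the 300 second timeout is an arbitrary illustrative value. Note that kubectl reports rollout status for a statefulset only when it uses the RollingUpdate strategy.

```shell
#!/bin/sh
# Sketch only: the 300s timeout is an arbitrary illustrative value.

# Print the workloads edited in step 8 in kind/name form.
es_workloads() {
  printf '%s\n' \
    deployment/ibm-es-pr-esclient \
    deployment/ibm-es-pr-esmaster \
    statefulset/ibm-es-pr-esdata
}

# Block until each workload reports a finished rollout; stop at the first failure.
wait_for_es() {
  for workload in $(es_workloads); do
    kubectl rollout status "$workload" --timeout=300s || return 1
  done
}

# Example (not run here):
#   wait_for_es && echo "all Elasticsearch workloads are rolled out"
```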

10. Ensure that the Elasticsearch API works.

ssh into the VM and verify that the following curl command works:

curl --location --request GET 'http://{ES_MASTER_POD_IP}:9200/_nodes/process?pretty'

where {ES_MASTER_POD_IP} is the IP address of the ibm-es-pr-esmaster pod.

To find the IP addresses of the pods, run the following command:

kubectl get pods -o wide
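
If you prefer not to read the IP address off the table by hand, it can be extracted with a custom column listing. This is a minimal sketch: the helper names are illustrative, and it assumes that the master pod name starts with ibm-es-pr-esmaster and that kubectl is pointed at the PMF namespace.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative.

# Read "name ip" lines on stdin; print the IP of the first pod whose
# name starts with the prefix given as $1.
pick_pod_ip() {
  awk -v prefix="$1" 'index($1, prefix) == 1 { print $2; exit }'
}

# Print the IP address of the first ibm-es-pr-esmaster pod.
master_pod_ip() {
  kubectl get pods --no-headers \
    -o custom-columns=NAME:.metadata.name,IP:.status.podIP |
    pick_pod_ip ibm-es-pr-esmaster
}

# Example (not run here):
#   ES_MASTER_POD_IP=$(master_pod_ip)
```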

11. Run the following curl command to restore the snapshot:

curl --location --request POST 'http://{ES_MASTER_POD_IP}:9200/_snapshot/my_backup/snapshot_1/_restore'

Success Response:

{"accepted":true}

If you have already run the close index API, open the index with the following curl command:

curl --location --request POST 'http://{ES_MASTER_POD_IP}:9200/global_all_tenants/_open'

where “global_all_tenants” is the index that was closed.

Possible issues while running the restore API, and their resolutions:

  • Issue: {"error":"RepositoryMissingException[[my_backup] missing]","status":404}

    Resolution: Run the following curl command, then retry the restore command:

curl --location --request PUT 'http://{ES_MASTER_POD_IP}:9200/_snapshot/my_backup' --header 'Content-Type: application/json' --data '{"type": "fs","settings": {"location": "/es_restore"}}'

  • Issue: {"error":"RepositoryVerificationException[[my_backup] path is not accessible on master node]; nested: FileNotFoundException[/es_restore/tests-vkJXwqhRQB-VrXfiRQSPcA-master (Permission denied)]; ","status":500}

    Resolution: Ignore the error and retry the restore API.

  • Issue: {"error":"SnapshotRestoreException[[my_backup:snapshot_1] cannot restore index [global_all_tenants] because it's open]","status":500}

    Resolution: Run the following curl command, then retry the restore command:

curl --location --request POST 'http://{ES_MASTER_POD_IP}:9200/global_all_tenants/_close'

where “global_all_tenants” is the index name. Make sure you enter the index name reported in the error.
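
The responses listed above can be told apart by the exception name in the response body. The sketch below maps a response to the next action to take. The function name and the action labels are illustrative assumptions, and the matching relies only on the response texts shown on this page.

```shell
#!/bin/sh
# Sketch only: the function name and action labels are illustrative.

# Print the next action for a restore API response body passed as $1.
next_action() {
  case $1 in
    *'"accepted":true'*)               echo done ;;
    *RepositoryMissingException*)      echo register-repository ;;
    *RepositoryVerificationException*) echo retry ;;
    *SnapshotRestoreException*open*)   echo close-index ;;
    *)                                 echo unknown ;;
  esac
}

# Example (not run here):
#   body=$(curl --silent --location --request POST \
#     "http://$ES_MASTER_POD_IP:9200/_snapshot/my_backup/snapshot_1/_restore")
#   next_action "$body"
```

A response that matches none of the known cases is reported as unknown so that it can be inspected by hand rather than retried blindly.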
