Migrate analytics data from MFP Cloud 8.0 to PMF Cloud 9.1 on Kubernetes
Assumption:
Analytics data is migrated from an existing MFP Cloud 8.0 deployment to PMF 9.1 as part of an in-place upgrade.
Prerequisites:
- Take a snapshot of the analytics data into a directory (the steps below describe how).
- Note the number of shards configured in the Elasticsearch environment of the existing deployment (see the sketch after this list).
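One way to find the current shard and replica counts per index is the _cat/indices API. This is a minimal sketch; the placeholder <es-master-ip> stands for your Elasticsearch master pod IP (obtained with kubectl get pods -o wide, as shown later in this document):
curl --location --request GET 'http://<es-master-ip>:9200/_cat/indices?v&h=index,pri,rep'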
Steps:
Take a snapshot of the analytics data (that is, back up the analytics data).
To take the snapshot, follow the steps below:
#1. Upgrade es-operator to version 8.1.32. In your existing MFP 8.0 setup package, go to the es/deploy directory and modify the operator.yaml file, replacing the existing es-operator image version with 8.1.32. Ensure the version is updated on the lines below (3 places) in operator.yaml:
Note: The PMF 9.1 package contains the es-operator:8.1.32 image. Push it to your internal Docker registry before applying the changes to operator.yaml (a sketch of the push commands follows the snippet below). Refer to the PMF documentation for how to push images to a private Docker registry.
release: es-operator-8.1.32
...
release: es-operator-8.1.32
...
image: es-operator:8.1.32
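For reference, pushing the bundled image to a private registry typically looks like the following sketch. The tarball name and the registry host myregistry.example.com:5000 are hypothetical; substitute the actual image archive from the PMF 9.1 package and your own registry:
# Load the image from the package archive, then re-tag and push it to the private registry
docker load -i es-operator-8.1.32.tar.gz
docker tag es-operator:8.1.32 myregistry.example.com:5000/es-operator:8.1.32
docker push myregistry.example.com:5000/es-operator:8.1.32
If your cluster pulls images by their fully qualified name, the image: line in operator.yaml may also need the registry-qualified reference.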
#2. Apply operator.yaml changes
kubectl apply -f operator.yaml
#3. Wait until all ibm-es-* pods come up before proceeding further (one way to watch the rollouts is sketched below).
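A sketch of one way to wait for the rollouts to finish, assuming the resource names shown later in this document (ibm-es-esclient, ibm-es-esmaster, ibm-es-esdata):
kubectl rollout status deployment/ibm-es-esclient
kubectl rollout status deployment/ibm-es-esmaster
kubectl rollout status statefulset/ibm-es-esdata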
#4. Check whether the ibm-es-es-configmap configmap reflects path.repo. Run the command below and look for 'path.repo' (a quick non-interactive check is also sketched after the example below):
kubectl edit configmap ibm-es-es-configmap
If the change is not reflected in the ibm-es-es-configmap configmap, edit the configmap manually:
kubectl edit configmap ibm-es-es-configmap
Update path.repo: /es_backup\n in the elasticsearch.yml section.
Example:
apiVersion: v1
data:
elasticsearch.yml: "cluster:\n name: ${CLUSTER_NAME}\nnode:\n master: ${NODE_MASTER}\n
\ data: ${NODE_DATA}\n name: ${NODE_NAME}\n ingest: ${NODE_INGEST}\nindex:\n
\ number_of_replicas: 1\n number_of_shards: 3\n mapper.dynamic: true\npath:\n
\ data: /data/data_${staticname}\n logs: /data/log_${staticname}\n plugins:
/elasticsearch/plugins\n work: /data/work_${staticname} \nprocessors: ${PROCESSORS:1}\nbootstrap:\n
\ memory_lock: false\nhttp:\n enabled: ${HTTP_ENABLE}\n compression: true\n
\ cors:\n enabled: true\n allow-origin: \"*\"\ncloud:\n k8s:\n service:
${DISCOVERY_SERVICE}\n namespace: ${NAMESPACE}\ndiscovery:\n type: io.fabric8.elasticsearch.discovery.k8s.K8sDiscoveryModule\n
\ zen:\n ping.multicast.enabled: false\n minimum_master_nodes: 1\nxpack.security.enabled:
false\nxpack.ml.enabled: false\ncompress.lzf.decoder: optimal\ndiscovery.zen.ping.multicast.enabled:
false\nbootstrap.mlockall: true\ncompress.lzf.decoder: safe\nscript.inline: true\npath.repo:
/es_backup\n"
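A non-interactive way to confirm the setting, as a quick sanity check:
kubectl get configmap ibm-es-es-configmap -o yaml | grep path.repo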
#5. Create PersistentVolume (PV) and PersistentVolumeClaim (PVC)
a. Create PersistentVolume as below
# es-pv-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/es_backup"
kubectl apply -f es-pv-volume.yaml
b. Create PersistentVolumeClaim as below
# es-pv-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
kubectl apply -f es-pv-claim.yaml
[Change the storage size as per your requirements]
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
es-pv-volume 2Gi RWX Retain Bound mfpmig2/es-pv-claim manual <unset> 12s
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
es-pv-claim Bound es-pv-volume 2Gi RWX manual <unset> 9s
#6. Make volumeMounts and volumes changes in elasticsearch deployments & statefulset
To see the deployments,
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
es-operator 1/1 1 1 40h
ibm-es-esclient 1/1 1 1 40h
ibm-es-esmaster 1/1 1 1 40h
ibm-mf-analytics 1/1 1 1 41h
ibm-mf-analytics-recvr 1/1 1 1 41h
ibm-mf-server 1/1 1 1 41h
mf-operator 1/1 1 1 41
To see the statefulsets,
kubectl get statefulsets
NAME READY AGE
ibm-es-esdata 1/1 40h
Modify the ibm-es-esclient and ibm-es-esmaster deployments and the ibm-es-esdata statefulset to add the volumeMounts and volumes shown below (a non-interactive kubectl patch alternative is sketched after these examples):
a. Modify ibm-es-esclient deployment
kubectl edit deployment ibm-es-esclient
Add below volumeMount:
- mountPath: /es_backup
  name: hp-volume
and add below volume:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
Example after adding volume and volumeMount in ibm-es-esclient deployment:
volumeMounts:
- mountPath: /es_backup
  name: hp-volume
- mountPath: /data
  name: storage
- mountPath: /elasticsearch/config/elasticsearch.yml
  name: config
  subPath: elasticsearch.yml
- mountPath: /elasticsearch/config/log4j2.properties
  name: config
  subPath: log4j2.properties
...
...
...
volumes:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
- emptyDir: {}
  name: storage
- configMap:
    defaultMode: 420
    name: ibm-es-es-configmap
  name: config
b. Modify ibm-es-esmaster deployment
kubectl edit deployment ibm-es-esmaster
Add below volumeMount:
- mountPath: /es_backup
  name: hp-volume
Add below volume:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
Example after adding volume and volumeMount in ibm-es-esmaster deployment:
volumeMounts:
- mountPath: /es_backup
  name: hp-volume
- mountPath: /data
  name: storage
- mountPath: /elasticsearch/config/elasticsearch.yml
  name: config
  subPath: elasticsearch.yml
- mountPath: /elasticsearch/config/log4j2.properties
  name: config
  subPath: log4j2.properties
...
...
...
volumes:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
- emptyDir: {}
  name: storage
- configMap:
    defaultMode: 420
    name: ibm-es-es-configmap
  name: config
c. Modify ibm-es-esdata statefulset
kubectl edit statefulset ibm-es-esdata
Add below volumeMount:
- mountPath: /es_backup
  name: hp-volume
Add below volume:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
Example after adding volume and volumeMount in ibm-es-esdata statefulset:
volumeMounts:
- mountPath: /es_backup
  name: hp-volume
- mountPath: /data
  name: analytics-data
- mountPath: /elasticsearch/config/elasticsearch.yml
  name: config
  subPath: elasticsearch.yml
- mountPath: /elasticsearch/config/log4j2.properties
  name: config
  subPath: log4j2.properties
...
...
...
volumes:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
- name: analytics-data
  persistentVolumeClaim:
    claimName: mfanalyticsvolclaim2
- configMap:
    defaultMode: 420
    name: ibm-es-es-configmap
  name: config
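If you prefer a non-interactive alternative to kubectl edit, the same volume and volumeMount can be appended with a JSON patch. This is a minimal sketch for the ibm-es-esclient deployment, assuming the Elasticsearch container is the first (and only) container in the pod spec; repeat with the matching resource name for ibm-es-esmaster, and use kubectl patch statefulset ibm-es-esdata with the same patch body for the statefulset:
# Append the hp-volume volume and its /es_backup mount to the pod template
kubectl patch deployment ibm-es-esclient --type=json -p '[
  {"op": "add", "path": "/spec/template/spec/volumes/-",
   "value": {"name": "hp-volume", "persistentVolumeClaim": {"claimName": "es-pv-claim"}}},
  {"op": "add", "path": "/spec/template/spec/containers/0/volumeMounts/-",
   "value": {"mountPath": "/es_backup", "name": "hp-volume"}}
]'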
#7. Verify that the /es_backup path is mounted in each of the Elasticsearch pods
ibm-es-esclient-69f4974f8f-brczt 1/1 Running 0 42h
ibm-es-esdata-0 1/1 Running 0 42h
ibm-es-esmaster-6548b4ddf9-l884h 1/1 Running 0 42h
a. Verify the /es_backup path in the ibm-es-esclient pod
kubectl exec -it ibm-es-esclient-7cb8768cf5-kv5lc -- bash
bash-4.4$ cat /elasticsearch/config/elasticsearch.yml
cluster:
name: ${CLUSTER_NAME}
...
...
...
xpack.ml.enabled: false
compress.lzf.decoder: optimal
discovery.zen.ping.multicast.enabled: false
bootstrap.mlockall: true
compress.lzf.decoder: safe
script.inline: true
path.repo: /es_backup
bash-4.4$ ls -ld /es_backup/
You should see path.repo: /es_backup in elasticsearch.yml, and the /es_backup directory should exist as a mounted path.
b. Similarly, verify the /es_backup path in the ibm-es-esdata-* and ibm-es-esmaster-* pods.
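A quick way to check the mount across all three pods at once; the pod names below are from the example output above, so substitute your own:
# Loop over the Elasticsearch pods and confirm /es_backup exists in each
for pod in ibm-es-esclient-69f4974f8f-brczt ibm-es-esdata-0 ibm-es-esmaster-6548b4ddf9-l884h; do
  kubectl exec "$pod" -- ls -ld /es_backup
done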
#8. Take the snapshot using the Elasticsearch API
a. Get the Elasticsearch master pod IP address
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
es-operator-658779fbb4-xbm76 1/1 Running 0 42h 172.16.77.139 master-node <none> <none>
ibm-es-esclient-69f4974f8f-brczt 1/1 Running 0 42h 172.16.77.138 master-node <none> <none>
ibm-es-esdata-0 1/1 Running 0 42h 172.16.77.135 master-node <none> <none>
ibm-es-esmaster-6548b4ddf9-l884h 1/1 Running 0 42h 172.16.77.143 master-node <none> <none>
ibm-mf-analytics-58574df7d6-49bft 1/1 Running 0 42h 172.16.77.142 master-node <none> <none>
ibm-mf-analytics-recvr-7bf6857955-xw4ct 1/1 Running 0 42h 172.16.77.191 master-node <none> <none>
ibm-mf-defaultsecrets-job-2bcg9 0/2 Completed 0 42h 172.16.77.163 master-node <none> <none>
ibm-mf-server-867d785477-bmzmp 1/1 Running 0 42h 172.16.77.184 master-node <none> <none>
mf-operator-5c499fd5d8-84sfr 1/1 Running 0 42h 172.16.77.154 master-node <none> <none>
The Elasticsearch master pod IP address in this example is 172.16.77.143.
b. Execute the below API to verify that the Elasticsearch cluster is up and running:
curl --location --request GET 'http://172.16.77.143:9200/_nodes/process?pretty'
c. Execute the below API to register the backup directory as a snapshot repository:
curl --location --request PUT 'http://172.16.77.143:9200/_snapshot/my_backup' --header 'Content-Type: application/json' --data '{"type": "fs","settings": {"location": "/es_backup"}}'
Success response: {"acknowledged":true}
d. Execute the below API to take the snapshot (an optional status check is sketched after step e):
curl --location --request PUT 'http://172.16.77.143:9200/_snapshot/my_backup/snapshot_1'
Success response: {"accepted":true}
e. Verify that the snapshot is created [run this command on the VM/node where the /es_backup volume is created]:
ls /es_backup/
index indices metadata-snapshot_1 snapshot-snapshot_1
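Optionally, before proceeding with the upgrade, confirm that the snapshot completed with state SUCCESS via the snapshot API (using the example master pod IP from above):
curl --location --request GET 'http://172.16.77.143:9200/_snapshot/my_backup/snapshot_1?pretty'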
Execute the below steps after upgrading to PMF 9.1.
Restore MFP 8.0 analytics snapshot data in PMF 9.1
#1. Create a PersistentVolume and PersistentVolumeClaim and point them to the MFP snapshot (the /es_backup directory located on the node/VM). Also ensure that elasticsearch.yml reflects the path.repo location, which is /es_backup.
a. Create PersistentVolume as below
# es-pv-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/es_backup"
kubectl apply -f es-pv-volume.yaml
b. Create PersistentVolumeClaim as below
# es-pv-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
kubectl apply -f es-pv-claim.yaml
[Change the storage size as per your requirements]
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
es-pv-volume 2Gi RWX Retain Bound mfpmig2/es-pv-claim manual <unset> 12s
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
es-pv-claim Bound es-pv-volume 2Gi RWX manual <unset> 9s
#2. Make volumeMounts and volumes changes in elasticsearch deployments & statefulset
To see the deployments,
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
es-operator 1/1 1 1 40h
ibm-es-esclient 1/1 1 1 40h
ibm-es-esmaster 1/1 1 1 40h
ibm-mf-analytics 1/1 1 1 41h
ibm-mf-analytics-recvr 1/1 1 1 41h
ibm-mf-server 1/1 1 1 41h
mf-operator 1/1 1 1 41
To see the statefulsets
kubectl get statefulsets
NAME READY AGE
ibm-es-esdata 1/1 40h
Modify the ibm-es-esclient and ibm-es-esmaster deployments and the ibm-es-esdata statefulset to add the volumeMounts and volumes shown below (the same kubectl patch approach sketched in the backup phase can be used here as well):
a. Modify ibm-es-esclient deployment
kubectl edit deployment ibm-es-esclient
Add below volumeMount:
- mountPath: /es_backup
  name: hp-volume
Add below volume:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
Example after adding volume and volumeMount in ibm-es-esclient deployment:
volumeMounts:
- mountPath: /es_backup
  name: hp-volume
- mountPath: /data
  name: storage
- mountPath: /elasticsearch/config/elasticsearch.yml
  name: config
  subPath: elasticsearch.yml
- mountPath: /elasticsearch/config/log4j2.properties
  name: config
  subPath: log4j2.properties
...
...
...
volumes:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
- emptyDir: {}
  name: storage
- configMap:
    defaultMode: 420
    name: ibm-es-es-configmap
  name: config
b. Modify ibm-es-esmaster deployment
kubectl edit deployment ibm-es-esmaster
Add below volumeMount:
- mountPath: /es_backup
  name: hp-volume
Add below volume:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
Example after adding volume and volumeMount in ibm-es-esmaster deployment:
volumeMounts:
- mountPath: /es_backup
  name: hp-volume
- mountPath: /data
  name: storage
- mountPath: /elasticsearch/config/elasticsearch.yml
  name: config
  subPath: elasticsearch.yml
- mountPath: /elasticsearch/config/log4j2.properties
  name: config
  subPath: log4j2.properties
...
...
...
volumes:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
- emptyDir: {}
  name: storage
- configMap:
    defaultMode: 420
    name: ibm-es-es-configmap
  name: config
c. Modify ibm-es-esdata statefulset
kubectl edit statefulset ibm-es-esdata
Add below volumeMount:
- mountPath: /es_backup
  name: hp-volume
Add below volume:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
Example after adding volume and volumeMount in ibm-es-esdata statefulset:
volumeMounts:
- mountPath: /es_backup
  name: hp-volume
- mountPath: /data
  name: analytics-data
- mountPath: /elasticsearch/config/elasticsearch.yml
  name: config
  subPath: elasticsearch.yml
- mountPath: /elasticsearch/config/log4j2.properties
  name: config
  subPath: log4j2.properties
...
...
...
volumes:
- name: hp-volume
  persistentVolumeClaim:
    claimName: es-pv-claim
- name: analytics-data
  persistentVolumeClaim:
    claimName: mfanalyticsvolclaim2
- configMap:
    defaultMode: 420
    name: ibm-es-es-configmap
  name: config
#3. Verify that the /es_backup path is mounted in each of the Elasticsearch pods
ibm-es-esclient-69f4974f8f-brczt 1/1 Running 0 42h
ibm-es-esdata-0 1/1 Running 0 42h
ibm-es-esmaster-6548b4ddf9-l884h 1/1 Running 0 42h
a. Verify the /es_backup path in the ibm-es-esclient pod
kubectl exec -it ibm-es-esclient-7cb8768cf5-kv5lc -- bash
bash-4.4$ cat /elasticsearch/config/elasticsearch.yml
cluster:
name: ${CLUSTER_NAME}
...
...
...
xpack.ml.enabled: false
compress.lzf.decoder: optimal
discovery.zen.ping.multicast.enabled: false
bootstrap.mlockall: true
compress.lzf.decoder: safe
script.inline: true
path.repo: /es_backup
bash-4.4$ ls -ld /es_backup/
You should see path.repo: /es_backup in elasticsearch.yml, and the /es_backup directory should exist as a mounted path.
b. Similarly, verify the /es_backup path in the ibm-es-esdata-* and ibm-es-esmaster-* pods.
c. Verify that the snapshot data is available inside the /es_backup directory
$ ls /es_backup/
index indices metadata-snapshot_1 snapshot-snapshot_1
#4. Restore the snapshot using the Elasticsearch API
a. Get the Elasticsearch master pod IP address
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
es-operator-658779fbb4-xbm76 1/1 Running 0 42h 172.16.77.139 master-node <none> <none>
ibm-es-esclient-69f4974f8f-brczt 1/1 Running 0 42h 172.16.77.138 master-node <none> <none>
ibm-es-esdata-0 1/1 Running 0 42h 172.16.77.135 master-node <none> <none>
ibm-es-esmaster-6548b4ddf9-l884h 1/1 Running 0 42h 172.16.77.137 master-node <none> <none>
ibm-mf-analytics-58574df7d6-49bft 1/1 Running 0 42h 172.16.77.142 master-node <none> <none>
ibm-mf-analytics-recvr-7bf6857955-xw4ct 1/1 Running 0 42h 172.16.77.191 master-node <none> <none>
ibm-mf-defaultsecrets-job-2bcg9 0/2 Completed 0 42h 172.16.77.163 master-node <none> <none>
ibm-mf-server-867d785477-bmzmp 1/1 Running 0 42h 172.16.77.184 master-node <none> <none>
mf-operator-5c499fd5d8-84sfr 1/1 Running 0 42h 172.16.77.154 master-node <none> <none>
The Elasticsearch master pod IP address in this example is 172.16.77.137.
b. Execute the below API to verify that the Elasticsearch cluster is up and running:
curl --location --request GET 'http://172.16.77.137:9200/_nodes/process?pretty'
c. Execute the below API to register the backup directory as a snapshot repository:
curl --location --request PUT 'http://172.16.77.137:9200/_snapshot/my_backup' --header 'Content-Type: application/json' --data '{"type": "fs","settings": {"location": "/es_backup"}}'
Success response: {"acknowledged":true}
d. Execute the below API to restore the snapshot
curl --location --request POST 'http://172.16.77.137:9200/_snapshot/my_backup/snapshot_1/_restore'
Success response: {"accepted":true}
e. If you get an open index error, close the index and retry the restore command.
To close an index, execute the below API:
curl --location --request POST 'http://172.16.77.137:9200/global_all_tenants/_close'
where 'global_all_tenants' is the index to be closed. For each open index error, execute this API with the reported open index.
f. Retry the restore after closing all the open indices
curl --location --request POST 'http://172.16.77.137:9200/_snapshot/my_backup/snapshot_1/_restore'
Success Response: {"accepted":true}
g. Once the restore is successful, open all the closed indices.
To open an index, execute the below API:
curl --location --request POST 'http://172.16.77.137:9200/global_all_tenants/_open'
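If several indices were closed in step e, the following sketch lists index status via the _cat API and opens every closed index in one pass (using the example master pod IP; adjust to your environment):
# List each index with its open/close status, then open the closed ones
curl --location --request GET 'http://172.16.77.137:9200/_cat/indices?h=index,status'
for idx in $(curl -s 'http://172.16.77.137:9200/_cat/indices?h=index,status' | awk '$2 == "close" {print $1}'); do
  curl --location --request POST "http://172.16.77.137:9200/$idx/_open"
done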
h. Restart the analytics pod
kubectl delete pod ibm-mf-analytics-58574df7d6-49bft
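After the analytics pod restarts, you can optionally confirm that the restored indices are present and healthy (using the example master pod IP):
curl --location --request GET 'http://172.16.77.137:9200/_cat/indices?v'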