Persistent Mobile Foundation Analytics Server Configuration Guide
Overview
Some configuration for the PMF Analytics Server is required. Some of the configuration parameters apply to a single node, and some apply to the whole cluster, as indicated.
Jump to
Properties
For a complete list of configuration properties and how to set them in your application server, see Configuration properties.
- The discovery.zen.minimum_master_nodes property must be set to ceil((number of master-eligible nodes in the cluster / 2) + 1) to avoid split-brain syndrome.
- Elasticsearch nodes in a cluster that are master-eligible must establish a quorum to decide which master-eligible node is the master.
- If you add a master eligible node to the cluster, the number of master-eligible nodes changes, and thus the setting must change. You must modify the setting if you introduce new master-eligible nodes to the cluster. For more information about how to manage your cluster, see Cluster management and Elasticsearch.
- Give your cluster a name by setting the clustername property in all of your nodes.
- Name the cluster to prevent a developer’s instance of Elasticsearch from accidentally joining a cluster that is using a default name.
- Give each node a name by setting the nodename property in each node.
- By default, Elasticsearch names each node after a random Marvel character, and the node name is different on every node restart.
- Explicitly declare the file system path to the data directory by setting the datapath property in each node.
- Explicitly declare the dedicated master nodes by setting the masternodes property in each node.
Cluster Recovery Settings
After you scaled out to a multi-node cluster, you might find that an occasional full cluster restart is necessary. When a full cluster restart is required, you must consider the recovery settings. If the cluster has 10 nodes, and as the cluster is brought up, one node at a time, the master node assumes that it needs to start balancing data immediately upon the arrival of each node into the cluster. If the master is allowed to behave this way, much unnecessary rebalancing is required. You must configure the cluster settings to wait for a minimum number of nodes to join the cluster before the master is allowed to start instructing the nodes to rebalance. It can reduce cluster restarts from hours down to minutes.
- The gateway.recover_after_nodes property must be set to your preference to prevent Elasticsearch from starting a rebalance until the specified number of nodes in the cluster are up and joined. If your cluster has 10 nodes, a value of 8 for the gateway.recover_after_nodes property might be a reasonable setting.
- The gateway.expected_nodes property must be set to the number of nodes that you expect to be in the cluster. In this example, the value for the gateway.expected_nodes property is 10.
- The gateway.recover_after_time property must be set to instruct the master to wait to send rebalanced instructions until after the set time elapsed from the start of the master node.
The combination of the previous settings means that Elasticsearch waits for the value of gateway.recover_after_nodes nodes to be present. Then, it begins recovering after the value of gateway.recover_after_time minutes or after the value of gateway.expected_nodes nodes joined the cluster, whichever comes first.
What not to do
- Do not ignore your production cluster.
- Clusters need monitoring and nurturing. Many good Elasticsearch monitoring tools are available that are dedicated to the task.
- Do not use network-attached storage (NAS) for your datapath setting. NAS introduces more latency, and a single point of failure. Always use the local hosts disks.
- Avoid clusters that span data centers and definitely avoid clusters that span large geographic distances. The latency between nodes is a severe performance bottleneck.
- Roll your own cluster configuration management solution. Many good configuration management solutions, such as Puppet, Chef, and Ansible, are available.
Configuration properties
The PMF Analytics Server can start successfully without any additional configuration.
Configuration is done through JNDI properties on both the Persistent Mobile Foundation Server and the PMF Analytics Server. Additionally, the PMF Analytics Server supports the use of environment variables to control configuration. Environment variables take precedence over JNDI properties.
The Analytics runtime web application must be restarted for any changes in these properties to take effect. It is not necessary to restart the entire application server.
To set a JNDI property on WebSphere Application Server Liberty, add a tag to the server.xml file as follows.
<jndiEntry jndiName="{PROPERTY NAME}" value="{PROPERTY VALUE}}" />
To set a JNDI property on Tomcat, add a tag to the context.xml file as follows.
<Environment name="{PROPERTY NAME}" value="{PROPERTY VALUE}" type="java.lang.String" override="false" />
Persistent Mobile Foundation Server
The following table shows the properties that can be set in the PMF server.
Property | Description | Default Value |
---|---|---|
mfp/mfp.analytics.console.url | Set this property to the URL of your PMF Analytics Console. For example, http://hostname:port/analytics/console. Setting this property enables the analytics icon on the PMF Operations Console. | None |
mfp/mfp.analytics.url | Required. The URL that is exposed by the PMF Analytics Server that receives incoming analytics data. For example, http://hostname:port/analytics-service/rest. | None |
mfp/mfp.analytics.logs.forward | If this property it set to true, server logs that are recorded on the PMF are captured in Persistent Mobile Foundation Analytics. | true |
The following table shows the properties that can be set in the PMF Analytics Server.
Property | Description | Default Value |
---|---|---|
analyticsconsole/mfp.analytics.url | Optional. Full URI of the Analytics REST services. In a scenario with a firewall or a secured reverse proxy, this URI must be the external URI, not the internal URI inside the local LAN. This value can contain * in places of the URI protocol, host name, or port, to denote the corresponding part from the incoming URL. ://:*/analytics-service, with the protocol, host name, and port dynamically determined | |
analytics/mfp.analytics.username | The user name that is used if the data entry point is protected with basic authentication. | None |
analytics/mfp.analytics.password | The password that is used if the data entry point is protected with basic authentication. | None |
Persistent Mobile Foundation Admin Service
The following table shows the properties that can be set in the PMF Admin Service. These properties are required only for inapp feedback feature of analytics service.
Property | Description | Default Value |
---|---|---|
mfpadmin/mfp.analytics.authorization.client.id | The identifier of the confidential client that handles OAuth authorization for the analytics inapp feedback API. Mandatory only if the inapp feedback feature enabled | None |
mfpadmin/mfp.analytics.authorization.client.secret | The secret of the confidential client that handles OAuth authorization for the analytics inapp feedback API. Mandatory only if the inapp feedback feature enabled | None |
Persistent Mobile Foundation Analytics Server
The following table shows the properties that can be set in the PMF Analytics Server.
Property | Description | Default Value |
---|---|---|
analytics/nodetype | Defines the Elasticsearch node type. Valid values are master and data. If this property is not set, then the node acts as both a master-eligible node and a data node. | None |
analytics/shards | The number of shards per index. This value can be set only by the first node that is started in the cluster and cannot be changed. | 1 |
analytics/replicas_per_shard | The number of replicas for each shard in the cluster. This value can be changed dynamically in a running cluster. | 0 |
analytics/masternodes | A comma-delimited string that contains the host name and ports of the master-eligible nodes. | None |
analytics/clustername | Name of the cluster. Set this value if you plan to have multiple clusters that operate in the same subset and need to uniquely identify them. | worklight |
analytics/nodename | Name of a node in the cluster. | A randomly generated string |
analytics/datapath | The path that analytics data is saved to on the file system. | ./analyticsData |
analytics/settingspath | The path to an Elasticsearch settings file. For more information, see Elasticsearch. | None |
analytics/transportport | The port that is used for node-to-node communication. | 9600 |
analytics/httpport | The port that is used for HTTP communication to Elasticsearch. | 9500 |
analytics/http.enabled | Enables or disables HTTP communication to Elasticsearch. | false |
analytics/serviceProxyURL | The analytics UI WAR file and analytics service WAR file can be installed to separate application servers. If you choose to do so, you must understand that the JavaScript run time in the UI WAR file can be blocked by cross-site scripting prevention in the browser. To bypass this block, the UI WAR file includes Java proxy code so that the JavaScript run time retrieves REST API responses from the origin server. But the proxy is configured to forward REST API requests to the analytics service WAR file. Configure this property if you installed your WAR files to separate application servers. | None |
analytics/bootstrap.mlockall | This property prevents any Elasticsearch memory from being swapped to disk. | true |
analytics/multicast | Enables or disables multicast node discovery. | false |
analytics/warmupFrequencyInSeconds | The frequency at which warmup queries are run. Warmup queries run in the background to force query results into memory, which improves web console performance. Negative values disable the warmup queries. | 600 |
analytics/tenant | Name of the main Elasticsearch index. worklight | |
analytics/analytics.authorization.server.url | The URL of the OAuth authorization server that is used by in-app feedback feature of analytics service. If the property is not set properly, the in-app feedback will not be able to sent to the analytics service. | |
analytics/analytics.authorization.client.id | The identifier of the confidential client that handles OAuth authorization for the analytics service. Mandatory only if in-app feedback feature is enabled in application. | |
analytics/analytics.authorization.client.secret | The secret of the confidential client that handles OAuth authorization for the analytics service. Mandatory only if in-app feedback feature is enabled in application. | |
analytics/analytics.security.plugin | The plugin implementation class for the confidential client that handles OAuth authorization for the analytics service. By default it should be assigned to com.ibm.mobile.analytics.server.extensions.plugins.OAuthSecurityPlugin . Mandatory only if in-app feedback feature is enabled in application. |
In all cases where the key does not contain a period (like httpport but not http.enabled), the setting can be controlled by system environment variables where the variable name is prefixed with ANALYTICS_. When both the JNDI property and the system environment variable are set, the system environment variable takes precedence. For example, if you have both the analytics/httpport JNDI property and the ANALTYICS_httpport system environment variable set, the value for ANALYTICS_httpport is used.
Confidential Client and Scope Mapping
Confidential clients are clients that are capable of maintaining the confidentiality of their authentication credentials. In-app feedback API protects resources such as adapters from unauthorized access by assigning a scope to the resource. Following steps are mandatory only if the in-app feedback feature is enabled in the application. For more details see confidential client and scope mapping
Registering the confidential client
Following step is required only if confidential client for analytics service is not added by default. Ignore if its already added.
In the navigation sidebar of the PMF Operations Console, click Runtime Settings → Confidential Clients. Click New to add a new entry. You must provide the following information:
- Display Name - an optional display name that is used to refer to the confidential client. For example: MyExternalServer.
- ID - A unique identifier for the confidential client (can be considered as a username). The ID can contain only ASCII characters.
- Secret - A private passphrase to authorize access from the confidential client (can be considered as an API key). The secret can contain only ASCII characters.
- Allowed Scope - A confidential client that uses ID and Secret combination is automatically granted the scope that is defined here. By default it should be authorization.introspect
Scope mapping
Map the analytics.mobileclient scope element to the application. Mandatory only if in-app feedback feature is enabled in the application.
- Load the PMF Operations Console and navigate to [your application] → Security → Scope-Elements Mapping, click New.
-
Enter MyExternalServer.mobileclient in the Scope element field. Then, click Add.
Document Time to Live (TTL)
TTL is effectively how you can establish and maintain a data retention policy. Your decisions have dramatic consequences on your system resource needs. The long you keep data, the more RAM, disk, and scaling is likely needed.
Each document type has its own TTL. Setting a document’s TTL enables automatic deletion of the document after it is stored for the specified amount of time.
Each TTL JNDI property is named analytics/TTL_[document-type]. For example, the TTL setting for NetworkTransaction is named analytics/TTL_NetworkTransaction.
These values can be set by using basic time units as follows.
- 1w = 1 week
- 1d = 1 day
- 1h = 1 hour
- 1m = 1 minute
- 1s = 1 second
List of supported document-types are as follows:
- TTL_PushNotification
- TTL_PushSubscriptionSummarizedHourly
- TTL_ServerLog
- TTL_AppLog
- TTL_NetworkTransaction
- TTL_AppSession
- TTL_AppSessionSummarizedHourly
- TTL_NetworkTransactionSummarizedHourly
- TTL_CustomData
- TTL_AppPushAction
- TTL_AppPushActionSummarizedHourly
- TTL_PushSubscription
By default, TTL value is set as 90 days if the TTL entry for the particular document-type is not defined in JNDI property.
Elasticsearch
The underlying storage and clustering technology that serves the PMF Analytics Console is Elasticsearch.
Elasticsearch provides many tunable properties, mostly for performance tuning. Many of the JNDI properties are abstractions of properties that are provided by Elasticsearch.
All properties that are provided by Elasticsearch can also be set by using JNDI properties with analytics/ prepended before the property name. For example, threadpool.search.queue_size is a property that is provided by Elasticsearch. It can be set with the following JNDI property.
<jndiEntry jndiName="analytics/threadpool.search.queue_size" value="100" />
These properties are normally set in a custom settings file. If you are familiar with Elasticsearch and the format of its properties files, you can specify the path to the settings file by using the settingspath JNDI property, as follows.
<jndiEntry jndiName="analytics/settingspath" value="/home/system/elasticsearch.yml" />
Unless you are an expert Elasticsearch IT manager, identified a specific need, or were instructed by your services or support team, do not be tempted to fiddle with these settings.
Backing up PMF Analytics data
Learn about how to back up your Persistent Mobile Foundation Analytics data.
The data for PMF analytics is stored as a set of files on the PMF Analytics Server file system. The location of this folder is specified by the analytics/datapath JNDI property in the PMF Analytics Server configuration. For more information about the JNDI properties, see Configuration properties.
The PMF Analytics Server configuration is also stored on the file system, and is called server.xml.
You can back up these files by using any existing server backup procedures that you might already have in place. No special procedure is required when you back up these files, other than ensuring that the PMF Analytics Server is stopped. Otherwise, the data might change while the backup is occurring, and the data that is stored in memory might not yet be written to the file system. To avoid inconsistent data, stop the PMF Analytics Server before you start your backup.
Cluster management and Elasticsearch
Manage clusters and add nodes to relieve memory and capacity strain.
Add a Node to the Cluster
You can add a new node to the cluster by installing the PMF Analytics Server or by running a standalone Elasticsearch instance.
If you choose the standalone Elasticsearch instance, you relieve some cluster strain for memory and capacity requirements, but you do not relieve data ingestion strain. Data reports must always go through the PMF Analytics Server for preservation of data integrity and data optimization prior to going to persistent store.
You can mix and match.
The underlying Elasticsearch data store expects nodes to be homogenous, so do not mix a powerful 8-core 64 GB RAM rack system with a leftover surplus notebook in your cluster. Use similar hardware among the nodes.
Adding a PMF Analytics Server to the cluster
Learn how to add a PMF Analytics Server to the cluster.
Because Elasticsearch is embedded in the PMF Analytics Server, use the Elasticsearch setup to define cluster behavior. Do not, for example, create a WebSphere Application Server Liberty farm or use other application server setups.
In the following sample instructions, do not configure the node to be a master node or a data node. Instead, configure the node as a “search load balancer” whose purpose is to be up temporarily so that the Elasticsearch REST API is exposed for monitoring and dynamic configuration.
Notes:
- Remember to configure the hardware and operating system of this node according to the System requirements.
- Port 9600 is the transport port that is used by Elasticsearch. Therefore, port 9600 must be open through any firewalls between cluster nodes.
- Install the analytics service WAR file and the analytics UI WAR file (if you want the UI) to the application server on the newly allocated system. Install this instance of the PMF Analytics Server to any of the supported app servers.
-
Edit the application server’s configuration file for JNDI properties (or use system environment variables) to configure at least the following flags.
Flag Value (example) Default Note cluster.name worklight worklight The cluster that you intend this node to join. discovery.zen.ping.multicast.enabled false true Set to false to avoid accidental cluster join. discovery.zen.ping.unicast.hosts [“9.8.7.6:9600”] None List of master nodes in the existing cluster. Change the default port of 9600 if you specified a transport port setting on the master nodes. node.master false true Do not allow this node to be a master. node.data false true Do not allow this node to store data. http.enabled true true Open unsecured HTTP port 9200 for Elasticsearch REST API. - Consider all configuration flags in production scenarios. You might want Elasticsearch to keep the plug-ins in a different file system directory than the data, so you must set the path.plugins flag.
- Run the application server and start the WAR applications if necessary.
- Confirm that this new node joined the cluster by watching the console output on this new node, or by observing the node count in the Cluster and Node section of the Administration page in PMF Analytics Console.
Adding a stand-alone Elasticsearch node to the cluster
Learn how to add a stand-alone Elasticsearch node to the cluster.
You can add a stand-alone Elasticsearch node to your existing Persistent Mobile Foundation Analytics cluster in just a few simple steps. However, you must decide the role of this node. Is it going to be a master-eligible node? If so, remember to avoid the split-brain issue. Is it going to be a data node? Is it going to be a client-only node? Perhaps you want a client-only node so that you can start a node temporarily to expose Elasticsearch’s REST API directly to affect dynamic configuration changes to your running cluster.
In the following sample instructions, do not configure the node to be a master node or a data node. Instead, configure the node as a “search load balancer” whose purpose is to be up temporarily so that the Elasticsearch REST API is exposed for monitoring and dynamic configuration.
Notes:
- Remember to configure the hardware and operating system of this node according to the System requirements.
- Port 9600 is the transport port that is used by Elasticsearch. Therefore, port 9600 must be open through any firewalls between cluster nodes.
- Download Elasticsearch from https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.7.5.tar.gz.
- Decompress the file.
-
Edit the config/elasticsearch.yml file and configure at least the following flags.
Flag Value (example) Default Note cluster.name worklight worklight The cluster that you intend this node to join. discovery.zen.ping.multicast.enabled false true Set to false to avoid accidental cluster join. discovery.zen.ping.unicast.hosts [“9.8.7.6:9600”] None List of master nodes in the existing cluster. Change the default port of 9600 if you specified a transport port setting on the master nodes. node.master false true Do not allow this node to be a master. node.data false true Do not allow this node to store data. http.enabled true true Open unsecured HTTP port 9200 for Elasticsearch REST API. - Consider all configuration flags in production scenarios. You might want Elasticsearch to keep the plug-ins in a different file system directory than the data, so you must set the path.plugins flag.
- Run
./bin/plugin -i elasticsearch/elasticsearch-analytics-icu/2.7.0
to install the ICU plug-in. - Run
./bin/elasticsearch
. - Confirm that this new node joined the cluster by watching the console output on this new node, or by observing the node count in the Cluster and Node section of the Administration page in PMF Analytics Console.
Circuit breakers
Learn about Elasticsearch circuit breakers.
Elasticsearch contains multiple circuit breakers that are used to prevent operations from causing an OutOfMemoryError. For example, if a query that serves data to the PMF Operations Console results in using 40% of the JVM heap, the circuit breaker triggers, an exception is raised, and the console receives empty data.
Elasticsearch also has protections for filling up the disk. If the disk on which the Elasticsearch data store is configured to write fills to 90% capacity, the Elasticsearch node informs the master node in the cluster. The master node then redirects new document-writes away from the nearly full node. If you have only one node in your cluster, no secondary node to which data can be written is available. Therefore, no data is written and is lost.
▲