Distributed Caching Using Ehcache on Amazon EC2
This post elaborates the setup of Ehcache to enable distributed caching in JEE applications. It also contains a specific Ehcache configuration for use in applications deployed on Amazon EC2.
Ehcache, since its takeover by Terracotta®, has evolved to incorporate enterprise features like ‘search’, BigMemory in the Ehcache DX, EX and FX product editions.
However, you can still achieve a high degree of distributed caching by just using the ‘core’ Ehcache library.
This post focuses on a minimalistic approach to incorporate distributed caching into an application. Deployment approaches that involve a large number of instances in a cluster/farm and require cache interactions to participate in JTA-transactions are not covered here. Distributed Caching that caters to these requirements can be achieved using a Terracotta Server Array (TSA).
There are 4 available mechanisms that can be employed to realize a distributed and replicated cache using Ehcache:
RMI – Default remoting mechanism in Java
JGroups – Requires adding JGroups to the application stack
JMS – Requires a Messaging System Provider
Cache Server – Realized using a Terracotta Server (/Array)
This post will cover the use of Java’s default remoting mechanism – RMI.
Configuring an application for Distributed Caching
Enabling distributed and replicated caching in a Java/JEE application is surprisingly simple using Ehcache. The procedure involves 3 steps:
- Download Ehcache (core) from http://ehcache.org
- Place ehcache-core-xxx.jar, slf4j-api-xxx.jar, slf4j-log4jxx-xxx.jar into the library folders of your application (e.g. classpath, WEB-INF/lib)
- Place the Ehcache configuration file (ehcache.xml) in the application classpath (e.g. classpath, WEB-INF/classes)
Ehcache Configuration File (ehcache.xml)
The configuration file, which configures Ehache’s Cache Manager, must have the following elements to enable distributed caching:
- A CacheManagerPeerListenerFactory
- A CacheManagerPeerProviderFactory
- At least one Cache configuration containing:
- A CacheEventListenerFactory
- A BootstrapCacheLoaderFactory
Configurations for environments with no multicast limitations:
Using RMI as the replication mechanism, a simple configuration of the Cache Manager would be as follows:
<?xml version="1.0" encoding="UTF-8"?>This configuration file can be copied into all deployments of the application that need to be part of, and have access to the distributed cache.
<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="ehcache.xsd"
updateCheck="false" monitoring="autodetect"
dynamicConfig="true" name="my cache">
<cacheManagerPeerListenerFactory
class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"/>
<cacheManagerPeerProviderFactory
class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
properties="peerDiscovery=automatic,
multicastGroupAddress=230.0.0.1,
multicastGroupPort=4446,
timeToLive=32"/>
<cache name="cache1"
maxElementsInMemory="100"
eternal="true"
overflowToDisk="false">
<cacheEventListenerFactory
class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"/>
<bootstrapCacheLoaderFactory
class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory"/>
</cache>
</ehcache>
Configurations for environments with multicast limitations:
If the application is deployed within an environment where multicast is not allowed or not available (like Amazon EC2) then the following configuration can be used:
Example configuration on machine 1 [server1]:
<?xml version="1.0" encoding="UTF-8"?>
<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="ehcache.xsd"
updateCheck="false" monitoring="autodetect"
dynamicConfig="true" name="my cache">
<!-- For Non-Multicast supported environments [for server1] START -->
<cacheManagerPeerListenerFactory
class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
properties="port=40001,
remoteObjectPort=40002,
socketTimeoutMillis=120000"/>
<cacheManagerPeerProviderFactory
class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
properties="peerDiscovery=manual,
rmiUrls=//server2:40001/cache1"/>
<!-- For Non-Multicast supported environments [for server1] END -->
<cache name="cache1"
maxElementsInMemory="100"
eternal="true"
overflowToDisk="false">
<cacheEventListenerFactory
class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"/>
<bootstrapCacheLoaderFactory
class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory"/>
</cache>
</ehcache>
Example configuration on machine 2 [server2]:
<?xml version="1.0" encoding="UTF-8"?>If there are more than 2 servers in a replication group, then additional servers can be specified in the ‘rmiUrls’ property using a pipe character ‘|’ as the delimiter.
<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="ehcache.xsd"
updateCheck="false" monitoring="autodetect"
dynamicConfig="true" name="my cache">
<!-- For Non-Multicast supported environments [for server2] START -->
<cacheManagerPeerListenerFactory
class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
properties="port=40001,
remoteObjectPort=40002,
socketTimeoutMillis=120000"/>
<cacheManagerPeerProviderFactory
class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
properties="peerDiscovery=manual,
rmiUrls=//server1:40001/cache1"/>
<!-- For Non-Multicast supported environments [for server2] END -->
<cache name="cache1"
maxElementsInMemory="100"
eternal="true"
overflowToDisk="false">
<cacheEventListenerFactory
class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"/>
<bootstrapCacheLoaderFactory
class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory"/>
</cache>
</ehcache>
This configuration relies on manual peer discovery and requires the enumeration of all machines participating in the distributed cache to be specified.
Another important point to note is that if the machines are separated by firewalls then the RMI remoteObjectPort needs to be specified (as shown in the example above) and firewall configurations would need to be made accordingly.
For the above configurations, create your servers/virtual machines in a security group and authorize the group to itself (this allows all virtual machines in the group free access to each other).
ec2-add-group ehcache -d "ehcache replication ports”
ec2-authorize ehcache -o ehcache –u [aws account id]
If this is not possible and each virtual machine has its own group then allow ports 40001 and 40002 between the groups (you can read http://manmaza.blogspot.com /2011/03/creating-dmz-configurations-on-amazon.html">this post for details on how to do this).
Post a Comment