<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ecomZera &#187; technology</title>
	<atom:link href="http://www.ecomzera.com/category/technology/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ecomzera.com</link>
	<description>world class ecommerce services provider</description>
	<lastBuildDate>Sat, 10 Dec 2011 12:25:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Today&#039;s Software development is &quot;enabled by design&quot; to change</title>
		<link>http://www.ecomzera.com/2010/04/13/todays-software-development-are-enabled-by-design-to-change/</link>
		<comments>http://www.ecomzera.com/2010/04/13/todays-software-development-are-enabled-by-design-to-change/#comments</comments>
		<pubDate>Tue, 13 Apr 2010 09:58:35 +0000</pubDate>
		<dc:creator>pankaj</dc:creator>
				<category><![CDATA[ecomzera]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://www.ecomzera.com/?p=760</guid>
		<description><![CDATA[While reading a paper to understand the application life cycle, I came across a summary of how the focus of technology development has changed. We have already been doing development almost the way today&#8217;s software is visualized and designed. Yesterday’s technologies, teams, applications Today’s technologies, teams, applications Designed to last Designed to change [...]]]></description>
			<content:encoded><![CDATA[<p>While reading a paper to understand the application life cycle, I came across a summary of how the focus of technology development has changed. We have already been doing development almost the way today&#8217;s software is visualized and designed.</p>
<table border="0" cellspacing="0" rules="NONE">
<col width="305"></col>
<col width="285"></col>
<tbody>
<tr>
<td style="border: 1px solid #000000" width="305" height="17" align="LEFT"><strong>Yesterday’s technologies, teams, applications</strong></td>
<td style="border: 1px solid #000000" width="285" align="LEFT"><strong>Today’s technologies, teams, applications</strong></td>
</tr>
<tr>
<td style="border: 1px solid #000000" height="17" align="LEFT">Designed to last</td>
<td style="border: 1px solid #000000" align="LEFT">Designed to change</td>
</tr>
<tr>
<td style="border: 1px solid #000000" height="17" align="LEFT">Tightly coupled</td>
<td style="border: 1px solid #000000" align="LEFT">Loosely coupled, modular</td>
</tr>
<tr>
<td style="border: 1px solid #000000" height="17" align="LEFT">Integrated silos</td>
<td style="border: 1px solid #000000" align="LEFT">Compositions (of services, of applications)</td>
</tr>
<tr>
<td style="border: 1px solid #000000" height="17" align="LEFT">Code-oriented</td>
<td style="border: 1px solid #000000" align="LEFT">Process-oriented</td>
</tr>
<tr>
<td style="border: 1px solid #000000" height="17" align="LEFT">Rigid sequential development</td>
<td style="border: 1px solid #000000" align="LEFT">Interactive and iterative development</td>
</tr>
<tr>
<td style="border: 1px solid #000000" height="17" align="LEFT">Cost-centered</td>
<td style="border: 1px solid #000000" align="LEFT">Business-oriented</td>
</tr>
<tr>
<td style="border: 1px solid #000000" height="17" align="LEFT">Homogeneous</td>
<td style="border: 1px solid #000000" align="LEFT">Heterogeneous</td>
</tr>
</tbody>
</table>
<p>The table above describes the direction in which development is moving. Features and dashboards of a website are envisioned as plug-ins that are simply enabled or disabled. Layers of a framework are designed to interact collectively or to provide a service independently.</p>
<p>ref : <a title="Business Agility Revealed: How IT can Enable Business Change with Application Lifecycle Management" href="http://go.techtarget.com/r/11311925/8333341/3?kasid=1268999265257" target="_blank">http://go.techtarget.com/r/11311925/8333341/3?kasid=1268999265257</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.ecomzera.com/2010/04/13/todays-software-development-are-enabled-by-design-to-change/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scaling your J2EE Applications &#8211; Wang Yu</title>
		<link>http://www.ecomzera.com/2009/07/14/scaling-your-j2ee-applications-wang-yu/</link>
		<comments>http://www.ecomzera.com/2009/07/14/scaling-your-j2ee-applications-wang-yu/#comments</comments>
		<pubDate>Tue, 14 Jul 2009 12:39:36 +0000</pubDate>
		<dc:creator>lijesh</dc:creator>
				<category><![CDATA[technology]]></category>
		<category><![CDATA[The server side]]></category>

		<guid isPermaLink="false">http://www.ecomzera.com/?p=481</guid>
		<description><![CDATA[If an application is useful, its user base will at some point grow very quickly. As more and more mission-critical applications run on Java EE, many Java developers are concerned about scalability. However, most popular Web 2.0 sites are built with scripting languages, and there are a lot of [...]]]></description>
			<content:encoded><![CDATA[<p>If an application is useful, its user base will at some point grow very quickly. As more and more mission-critical applications run on Java EE, many Java developers are concerned about scalability. However, most popular Web 2.0 sites are built with scripting languages, and many voices doubt the scalability of Java applications. In this article, Wang Yu uses real-world cases to explain how to scale Java applications, based on his experience with laboratory projects, and brings together practice, science, algorithms, frameworks, and lessons from failed projects to help readers build highly scalable Java applications.</p>
<p>I have been working in an internal laboratory for years. This laboratory is always equipped with the latest big servers from our company and is free for our partners to test the performance of their products and solutions. Part of my job is to help them tune performance on all kinds of powerful CMT and SMP servers.</p>
<p>Over these years, I have helped test dozens of Java applications in a variety of solutions. Many products target the same industry domains and have very similar functionality, but their scalability differs widely: some can not only scale up on 64-CPU servers but also scale out to more than 20 server nodes, while others can run only on machines with no more than 2 CPUs.</p>
<p>The key to the difference lies in the vision of the architect when designing the product. All the Java applications that scaled well were prepared for scalability throughout the product&#8217;s life cycle, from requirements collection through system design to implementation. The scalability of your Java application really depends on your vision.</p>
<p>Scalability, as a property of systems, is generally difficult to define, and is often confused with &#8220;performance&#8221;. Scalability is indeed closely related to performance, and its purpose is to achieve high performance. But &#8220;scalability&#8221; is measured differently from &#8220;performance&#8221;. In this article, we will use the definitions from Wikipedia:</p>
<p>Scalability is a desirable property of a system, a network, or a process,     which indicates its ability to either handle growing amounts of work in a graceful       manner, or to be readily enlarged. For example, it can refer to the capability       of a system to increase total throughput under an increased load when resources       (typically hardware) are added.</p>
<p>To scale vertically (or scale up) means to add resources to a single node   in a system, typically involving the addition of CPUs or memory to a single   computer. Such vertical scaling of existing systems also enables them to leverage   virtualization technology more effectively, as it provides more resources for   the hosted set of operating systems and application modules to share.</p>
<p>To scale horizontally (or scale out) means to add more nodes to a system,   such as adding a new computer to a distributed software application. An example   might be scaling out from one web server system to three. As computer prices   drop and performance continues to increase, low cost &#8220;commodity&#8221; systems   can be used for high performance computing applications such as seismic analysis   and biotechnology workloads that could in the past only be handled by supercomputers.   Hundreds of small computers may be configured in a cluster to obtain aggregate   computing power which often exceeds that of traditional RISC processor based   scientific computers.</p>
<p>The first installment of this article will discuss scaling Java applications     vertically.</p>
<h2>How to scale Java EE applications vertically</h2>
<p>Many software designers and developers treat functionality as the most important factor in a product, and think of performance and scalability as add-on features to be handled afterwards. Most of them believe that expensive hardware can close any performance gap.</p>
<p>Sometimes they are wrong. Last month, there was an urgent project in our laboratory. After the product failed to meet its customer&#8217;s performance requirement on a 4-CPU machine, the partner wanted to test the product on a bigger (8-CPU) server. The result was that performance was worse than on the 4-CPU server.</p>
<p>Why did this happen? Basically, if your system is a multiprocess or multithreaded application and is running out of CPU resources, then it will most likely scale well when more CPUs are added.</p>
<p>Java technology-based applications embrace threading in a fundamental way. Not only does the Java language facilitate multithreaded applications, but the JVM is itself a multithreaded process that provides scheduling and memory management for Java applications. Java applications that can benefit directly from multi-CPU resources include application servers such as BEA&#8217;s WebLogic, IBM&#8217;s WebSphere, and the open-source Glassfish and Tomcat. All applications that use a Java EE application server can immediately benefit from CMT &amp; SMP technology.</p>
<p>But in my laboratory, I found that a lot of products cannot make full use of the CPU resources. Some of them use no more than 20% of the CPU resources on an 8-CPU server. Such applications benefit little when more CPU resources are added.</p>
<h1>Hot lock is the key enemy of scalability</h1>
<p>The primary tool for managing coordination between threads in Java programs   is the synchronized keyword. Because of the rules involving cache flushing   and invalidation, a synchronized block in the Java language is generally more   expensive than the critical section facilities offered by many platforms. Even   when a program contains only a single thread running on a single processor, a synchronized method call is still slower than an un-synchronized method call.</p>
<p>To observe the problems caused by the synchronized keyword, just send a QUIT signal to the JVM process, which gives you a thread dump. If you see a lot of thread stacks like the following in the thread dump, your system has hit the &#8220;hot lock&#8221; problem.</p>
<pre>...........
"Thread-0" prio=10 tid=0x08222eb0 nid=0x9 waiting for monitor entry
[0xf927b000..0xf927bdb8]
	at testthread.WaitThread.run(WaitThread.java:39)
	- waiting to lock &lt;0xef63bf08&gt; (a java.lang.Object)
	- locked &lt;0xef63beb8&gt; (a java.util.ArrayList)
	at java.lang.Thread.run(Thread.java:595)
.........</pre>
<p>The synchronized keyword forces the scheduler to serialize operations on the synchronized block. If many threads compete for a contended lock, only one thread can execute the synchronized block, and all other threads waiting to enter that block are stalled. If no other threads are available for execution, processors may sit idle. In such situations, more CPUs help little with performance.</p>
<p>A hot lock may also involve multiple thread switches and system calls. When multiple threads contend for the same monitor, the JVM has to maintain a queue of threads waiting for that monitor (and this queue must be synchronized across processors), which means more time spent in JVM or OS code and less time spent in your program code.</p>
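<p>The same symptom can be observed from inside the JVM. The following is a minimal sketch, not a diagnostic tool: the class name HotLockProbe and the lock object are hypothetical, and it relies only on Thread.getState(), which reports BLOCKED for a thread waiting to enter a synchronized block. The lambda syntax assumes Java 8 or later.</p>

```java
import java.util.concurrent.CountDownLatch;

class HotLockProbe {
    // Hypothetical shared lock that every worker contends for.
    private static final Object HOT_LOCK = new Object();

    /** Counts how many of the given threads are BLOCKED waiting for a monitor. */
    static int countBlocked(Thread[] threads) {
        int blocked = 0;
        for (Thread t : threads) {
            if (t.getState() == Thread.State.BLOCKED) blocked++;
        }
        return blocked;
    }

    /** Holds HOT_LOCK while `workers` threads try to enter it; returns how many were seen blocked. */
    static int demo(int workers) {
        CountDownLatch started = new CountDownLatch(workers);
        Thread[] threads = new Thread[workers];
        int blocked = 0;
        try {
            synchronized (HOT_LOCK) {
                for (int i = 0; i < workers; i++) {
                    threads[i] = new Thread(() -> {
                        started.countDown();
                        synchronized (HOT_LOCK) { /* contended critical section */ }
                    });
                    threads[i].start();
                }
                started.await();
                // Poll briefly until every worker has reached the monitor.
                for (int attempt = 0; attempt < 200 && blocked < workers; attempt++) {
                    Thread.sleep(10);
                    blocked = countBlocked(threads);
                }
            }
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked;
    }
}
```

<p>While the main thread holds the lock, every worker is parked in the BLOCKED state, which is exactly what the &#8220;waiting for monitor entry&#8221; lines in a thread dump show.</p>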
<p>To avoid the hot lock problem, the following suggestions may be helpful:</p>
<p><strong>Make synchronized blocks as short as possible</strong></p>
<p>When you shorten the time a thread holds a given lock, the probability that another thread competes for the same lock becomes lower. So while you should use synchronization to access shared variables, you should move thread-safe code outside of the synchronized block. Take the following code as an example:</p>
<pre><em>Code list 1:</em>
public boolean updateSchema(HashMap nodeTree) {
	synchronized (schema) {
		String nodeName = (String) nodeTree.get("nodeName");
		List nodeAttributes = (List) nodeTree.get("attributes");
		if (nodeName == null)
			return false;
		else
			return schema.update(nodeName, nodeAttributes);
	}
}</pre>
<p>This piece of code is meant to protect the shared variable &#8220;schema&#8221; while updating it. But the code that reads the attribute values touches only the nodeTree argument, not the shared schema, so it can be moved out of the block, making the synchronized block shorter:</p>
<pre><em>Code list 2:</em>
public boolean updateSchema(HashMap nodeTree) {
	// Reads only the method argument, so no lock is needed here.
	String nodeName = (String) nodeTree.get("nodeName");
	List nodeAttributes = (List) nodeTree.get("attributes");
	synchronized (schema) {
		if (nodeName == null)
			return false;
		else
			return schema.update(nodeName, nodeAttributes);
	}
}</pre>
<p><strong> Reducing lock granularity </strong></p>
<p>When you use the &#8220;synchronized&#8221; keyword, you have two choices of granularity: &#8220;method locks&#8221; or &#8220;block locks&#8221;. If you put &#8220;synchronized&#8221; on an instance method, you implicitly lock on the &#8220;this&#8221; object.</p>
<pre><em>Code list 3:</em>
public class SchemaManager {
	private HashMap schema;
	private HashMap treeNodes;
	....
	public synchronized boolean updateSchema(HashMap nodeTree) {
		String nodeName = (String) nodeTree.get("nodeName");
		List nodeAttributes = (List) nodeTree.get("attributes");

		if (nodeName == null) return false;
		else return schema.update(nodeName, nodeAttributes);
	}

	public synchronized boolean updateTreeNodes() {
		......
	}
}</pre>
<p>Compared with Code list 2, this piece of code is worse, because it locks the entire object whenever the &#8220;updateSchema&#8221; method is called. To achieve finer granularity, lock only the &#8220;schema&#8221; instance variable instead of the whole &#8220;SchemaManager&#8221; instance, so that different methods can run in parallel.</p>
<p><strong>Avoid locking on static methods</strong></p>
<p>The worst solution is to put the &#8220;synchronized&#8221; keyword on static methods. This locks on the Class object, so every thread calling any synchronized static method of that class is serialized, no matter which instance it works with. One of the projects tested in our laboratory had such an issue. During testing, we found almost all worker threads waiting for a static lock (a Class lock):</p>
<pre>--------------------------------
at sun.awt.font.NativeFontWrapper.initializeFont(Native Method)
- waiting to lock &lt;0xeae43af0&gt; (a java.lang.Class)
at java.awt.Font.initializeFont(Font.java:316)
at java.awt.Font.readObject(Font.java:1185)
at sun.reflect.GeneratedMethodAccessor147.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:838)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1736)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1835)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1759)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1835)
at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452)
at com.fr.report.CellElement.readObject(Unknown Source)
.........</pre>
<p>When Java2D was used to generate font objects for the reports, the threads hit a static lock around the native &#8220;initializeFont&#8221; method. To be fair to the developers, this was caused by Sun&#8217;s JDK 1.4 (HotSpot). After changing to JDK 5.0, the static lock disappeared.</p>
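<p>The granularity advice in this section can be sketched as follows. This is an illustration under assumptions, not the code of any tested product: the class FineGrainedSchemaManager, its fields, and its method signatures are hypothetical. Each independent piece of state gets its own private lock object, so neither &#8220;this&#8221; nor the Class object is ever locked.</p>

```java
import java.util.HashMap;
import java.util.Map;

class FineGrainedSchemaManager {
    // One private lock per independent piece of state (hypothetical names).
    private final Object schemaLock = new Object();
    private final Object treeLock = new Object();
    private final Map<String, String> schema = new HashMap<>();
    private final Map<String, String> treeNodes = new HashMap<>();

    boolean updateSchema(String nodeName, String attributes) {
        if (nodeName == null) return false; // no shared state touched, no lock needed
        synchronized (schemaLock) {
            schema.put(nodeName, attributes);
            return true;
        }
    }

    boolean updateTreeNode(String nodeName, String value) {
        if (nodeName == null) return false;
        synchronized (treeLock) { // does not contend with updateSchema()
            treeNodes.put(nodeName, value);
            return true;
        }
    }

    int schemaSize() {
        synchronized (schemaLock) { return schema.size(); }
    }

    int treeSize() {
        synchronized (treeLock) { return treeNodes.size(); }
    }
}
```

<p>A thread inside updateSchema() never delays a thread inside updateTreeNode(). Had both methods been declared static synchronized, they would share the single lock of FineGrainedSchemaManager.class instead.</p>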
<p><strong>Using lock-free data structures in Java SE 5.0</strong></p>
<p>The &#8220;synchronized&#8221; keyword in Java is a relatively coarse-grained coordination mechanism, and as such is fairly heavy for managing a simple operation such as incrementing a counter or updating a value, as in the following code:</p>
<pre><em>Code list 4:</em>
public class OnlineNumber {
	private int totalNumber;
	public synchronized int getTotalNumber() { return totalNumber; }
	public synchronized int increment() { return ++totalNumber; }
	public synchronized int decrement() { return --totalNumber; }
}</pre>
<p>The above code is just locking very simple operations, and the &#8220;synchronized&#8221;   blocks are very short. However, if the lock is heavily contended (threads frequently   ask to acquire the lock when it is already held by another thread), throughput   can suffer, and contended synchronization can be quite expensive.</p>
<p>Fortunately, in Java SE 5.0 and above, you can write wait-free, lock-free algorithms with the help of hardware synchronization primitives, without using native code. Almost all modern processors have instructions for updating shared variables in a way that can either detect or prevent concurrent access from other processors. These instructions are called compare-and-swap, or CAS.</p>
<p>A CAS operation takes three parameters: a memory location, the expected old value, and a new value. The processor updates the location to the new value if the value that is there matches the expected old value; otherwise it does nothing. It returns the value that was at that location prior to the CAS instruction. One way to use CAS for synchronization is as follows:</p>
<pre><em>Code list 5:</em>
public int increment() {
	int oldValue = value.getValue();
	int newValue = oldValue + 1;
	while (value.compareAndSwap(oldValue, newValue) != oldValue) {
		// CAS failed: re-read and recompute before retrying
		oldValue = value.getValue();
		newValue = oldValue + 1;
	}
	return newValue;
}</pre>
<p>First, we read a value from the address, then perform a computation to derive a new value (in this example, just adding one), and then use CAS to change the value at the address from the old value to the new value. The CAS succeeds if the value at the address has not been changed in the meantime. If another thread modified the variable in the meantime, the CAS fails; the code detects this and retries in the while loop. The best thing about CAS is that it is implemented in hardware and is extremely lightweight. If 100 threads execute this increment() method at the same time, in the worst case each thread will have to retry at most 99 times before its increment is complete.</p>
<p>The <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/package-summary.html" target="_blank">java.util.concurrent.atomic</a> package in Java SE 5.0 and above provides classes that support lock-free thread-safe programming on single variables. The atomic variable classes all expose a compare-and-set primitive, which is implemented using the fastest native construct available on the platform. Nine flavors of atomic variables are provided in this package, including AtomicInteger, AtomicLong, AtomicReference, and AtomicBoolean; array forms of atomic integer, long, and reference; and atomic marked reference and stamped reference classes, which atomically update a pair of values.</p>
<p>Using the atomic package is easy. The increment method of Code list 5 can be rewritten as:</p>
<pre><em>Code list 6:</em>
import java.util.concurrent.atomic.*;
....

   private AtomicInteger value = new AtomicInteger(0);
   public int increment() {
     // incrementAndGet() returns the updated value, matching Code list 5
     return value.incrementAndGet();
   }
....</pre>
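<p>For readers who want to run the retry loop themselves, here is a self-contained sketch that spells it out with AtomicInteger.compareAndSet(). The class name CasCounter and the helper hammer() are hypothetical. In practice incrementAndGet() does the same job in one call; the explicit loop is shown only to make the retry visible. The lambda syntax assumes Java 8 or later.</p>

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    /** Explicit CAS retry loop, equivalent to value.incrementAndGet(). */
    int increment() {
        while (true) {
            int oldValue = value.get();
            int newValue = oldValue + 1;
            if (value.compareAndSet(oldValue, newValue)) {
                return newValue;
            }
            // Another thread changed the value in between; read again and retry.
        }
    }

    int get() {
        return value.get();
    }

    /** Runs `threads` threads that each call increment() `perThread` times. */
    static int hammer(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) counter.increment();
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

<p>Calling CasCounter.hammer(8, 10000) performs 80,000 increments without a single synchronized block, and no update is lost.</p>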
<p>Nearly all the classes in the java.util.concurrent package use atomic     variables instead of synchronization, either directly or indirectly. Classes     like ConcurrentLinkedQueue use atomic variables to directly implement wait-free     algorithms, and classes like ConcurrentHashMap use ReentrantLock for locking     where needed. ReentrantLock, in turn, uses atomic variables to maintain the     queue of threads waiting for the lock.</p>
<p>One success story for lock-free algorithms is a financial system tested in our laboratory: after the &#8220;Vector&#8221; data structure was replaced with &#8220;ConcurrentHashMap&#8221;, performance on our CMT machine (8 cores) increased more than 3 times.</p>
<h1>Race conditions can also cause scalability problems</h1>
<p>Too many &#8220;synchronized&#8221; keywords cause scalability problems. But in some special cases, a lack of &#8220;synchronized&#8221; can also cause the system to fail to scale vertically. A lack of &#8220;synchronized&#8221; can cause race conditions, allowing two or more threads to modify shared resources at the same time, which may corrupt shared data. Why do I say this causes a scalability problem?</p>
<p>Let&#8217;s take a real-world case as an example. This was an ERP system for manufacturing. When we tested its performance on one of our latest CMT servers (2 CPUs, 16 cores, 128 strands), we found that CPU usage was more than 90%. This was a big surprise, because few applications scale so well on this type of machine. Our excitement lasted just 5 minutes, until we discovered that the average response time was very high and the throughput was unbelievably low. What were these CPUs doing? Weren&#8217;t they busy? What were they busy with? Using the tracing tools in the OS, we found that almost all the CPUs were doing the same thing, &#8220;HashMap.get()&#8221;, and it seemed that all of them were in infinite loops. Then we tested this application on servers with different numbers of CPUs. The result was that the more CPUs the server had, the more likely the infinite loop was to occur.</p>
<p>The root cause of the infinite loop was an unprotected shared variable: a &#8220;HashMap&#8221; data structure. After we added the &#8220;synchronized&#8221; keyword to all the access methods, everything was normal. Checking the source code of &#8220;HashMap&#8221; (Java SE 5.0), we found that unsynchronized concurrent modification can corrupt its internal structure in a way that produces such an infinite loop. As the following code shows, if the entries in a bucket of the HashMap come to form a cycle, then &#8220;e.next&#8221; will never be null.</p>
<pre><em>Code list 7:</em>
public V get(Object key) {
if (key == null) return getForNullKey();
	int hash = hash(key.hashCode());
	for (Entry&lt;K,V&gt; e = table[indexFor(hash, table.length)];
		e != null;
		e = e.next) {
		Object k;
		if (e.hash == hash &amp;&amp; ((k = e.key) == key || key.equals(k)))
			return e.value;
		}
	return null;
}</pre>
<p>Not only get(), but also put() and other methods are exposed to this risk. Is this a bug in the JVM? No. It was reported long ago (please refer to <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6423457" target="_blank">http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6423457</a>). Sun engineers did not consider it a bug, and instead suggested using &#8220;ConcurrentHashMap&#8221;. So take this into consideration when building a scalable system.</p>
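<p>As a rough illustration of that suggestion, the sketch below shares one ConcurrentHashMap among several writer threads without any external locking. The class SharedHitCounter and its methods are hypothetical, and merge() and getOrDefault() assume Java 8 or later (on Java SE 5.0 the same effect needs putIfAbsent() and replace() in a retry loop).</p>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedHitCounter {
    // Safe for concurrent reads and writes without external synchronization.
    private final Map<String, Integer> hits = new ConcurrentHashMap<>();

    void record(String page) {
        // On ConcurrentHashMap, merge() is an atomic read-modify-write per key.
        hits.merge(page, 1, Integer::sum);
    }

    int hitsFor(String page) {
        return hits.getOrDefault(page, 0);
    }

    /** Several threads record hits on 16 shared keys; returns the grand total. */
    static int demo(int threads, int perThread) {
        SharedHitCounter counter = new SharedHitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) counter.record("page-" + (j % 16));
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int total = 0;
        for (int k = 0; k < 16; k++) total += counter.hitsFor("page-" + k);
        return total;
    }
}
```

<p>With a plain HashMap in place of the ConcurrentHashMap, the same program could lose updates or corrupt the table in the way described above.</p>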
<h1>Non-Blocking IO vs. Blocking IO</h1>
<p>The java.nio package, introduced in Java 1.4, allows developers to achieve greater performance in data processing and offers better scalability. The non-blocking I/O operations provided by NIO allow Java applications to perform I/O more like what is available in lower-level languages such as C. There are many NIO frameworks today, such as Mina from Apache and Grizzly from Sun, which are widely used by many projects and products.</p>
<p>During the last 5 months, two Java EE projects were hosted in our laboratory whose only goal was to test their products&#8217; performance on both traditional blocking I/O servers and non-blocking I/O servers, to see the difference. They chose Tomcat 5 as the blocking I/O server and Glassfish as the non-blocking I/O server.</p>
<p>First, they tested a few simple JSP pages and servlets, and got the following results (on a 4-CPU server):</p>
<table border="1" cellspacing="0" cellpadding="4" align="center">
<tbody>
<tr>
<td rowspan="2">
<div><strong>Concurrent Users</strong></div>
</td>
<td colspan="2">
<div><strong>Average Response Time (ms)</strong></div>
</td>
</tr>
<tr>
<td>
<div><strong>Tomcat</strong></div>
</td>
<td>
<div><strong>Glassfish</strong></div>
</td>
</tr>
<tr>
<td>
<div>5</div>
</td>
<td>
<div>30</div>
</td>
<td>
<div>138</div>
</td>
</tr>
<tr>
<td>
<div>15</div>
</td>
<td>
<div>35</div>
</td>
<td>
<div>142</div>
</td>
</tr>
<tr>
<td>
<div>30</div>
</td>
<td>
<div>37</div>
</td>
<td>
<div>142</div>
</td>
</tr>
<tr>
<td>
<div>50</div>
</td>
<td>
<div>41</div>
</td>
<td>
<div>151</div>
</td>
</tr>
<tr>
<td>
<div>100</div>
</td>
<td>
<div>65</div>
</td>
<td>
<div>155</div>
</td>
</tr>
</tbody>
</table>
<p>According to the test results, the performance of Glassfish was far behind Tomcat&#8217;s. The customer doubted the advantage of non-blocking I/O. Why do so many articles and technical reports praise the performance and scalability of NIO?</p>
<p>After testing more scenarios, they changed their minds, as the results showed the power of NIO little by little. What they tested:</p>
<ol>
<li>More complex scenarios instead of simple JSPs and servlets, involving EJB, databases, file I/O, JMS, and transactions.</li>
<li>Simulating more concurrent users, from 1,000 up to 10,000.</li>
<li>Testing in different hardware environments, from 2 CPUs and 4 CPUs up to 16 CPUs.</li>
</ol>
<p>The figure below shows the results of the testing           on a 4-CPU server.</p>
<div>
<table border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td align="left" valign="top"><img src="http://www.theserverside.com/tt/articles/content/ScalingYourJavaEEApplications/clip_image002.gif" alt="" width="511" height="276" /></td>
</tr>
</tbody>
</table>
</div>
<p align="center"><strong>Figure 1: Throughput on a 4-CPU server</strong></p>
<p>Traditional blocking I/O uses a dedicated worker thread for each incoming request. The assigned thread is responsible for the whole life cycle of the request: reading the request data from the network, decoding the parameters, computing or calling other business logic functions, encoding the result, and sending it out to the requester. The thread then returns to the thread pool to be reused by other requests. This model, used in Tomcat 5, is very effective when dealing with simple logic and a small number of concurrent users in a perfect network environment.</p>
<p>But if the request involves complex logic, or interacts with external systems such as file systems, databases, or a message server, the worker thread is blocked for most of the processing time, waiting for system calls or network transfers to return. The blocked thread is held by the request until it finishes, while the operating system parks the thread so that the CPU can deal with other requests. If the network between the clients and the server is not very good, network latency blocks the threads even longer. Moreover, when keep-alive is required, the current worker thread stays blocked for a long time after request processing is finished. To better utilize the CPU resources, more worker threads are needed.</p>
<p>Tomcat uses a thread pool, and each request is served by an idle thread from the pool. &#8220;maxThreads&#8221; sets the maximum number of threads that Tomcat can create to service requests. If we set &#8220;maxThreads&#8221; too small, we cannot fully utilize the CPU resources and, more importantly, many requests will be dropped or rejected by the server as the number of concurrent users increases. In this test, we set &#8220;maxThreads&#8221; to &#8220;1000&#8221; (which is too large and unfair to Tomcat). Under such settings, Tomcat spawns a lot of threads when the number of concurrent users rises to a high level.</p>
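<p>The effect of a thread cap that is too small can be imitated with a plain java.util.concurrent pool. This is only an analogy under stated assumptions, not how Tomcat is implemented: the class BoundedWorkerPool is hypothetical, there is no request queue at all, and every task simply blocks the way a thread waiting on slow I/O would.</p>

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedWorkerPool {
    /** Submits `requests` blocking tasks to a pool capped at `maxThreads`; returns how many were rejected. */
    static int rejectedCount(int maxThreads, int requests) {
        // SynchronousQueue has no capacity, so a request is either handed to a
        // thread immediately or rejected by the default AbortPolicy.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulates a worker blocked on I/O
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // every worker is busy: the request is turned away
                }
            }
        } finally {
            release.countDown(); // let the blocked workers finish
            pool.shutdown();
        }
        return rejected;
    }
}
```

<p>With two worker threads and five simultaneous blocking requests, three requests are rejected even though the CPUs are idle. That is the trade-off behind choosing a value for &#8220;maxThreads&#8221;.</p>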
<p>A large number of Java threads keeps the JVM and OS busy with scheduling and maintaining these threads instead of processing business logic. Moreover, more threads consume more memory (each thread stack occupies some memory) and cause more frequent garbage collection.</p>
<p>Glassfish doesn&#8217;t need so many threads. With non-blocking I/O, a worker thread is not bound to a dedicated request. If a request blocks for any reason, its thread is reused by other requests. In this way, Glassfish can handle thousands of concurrent users with only tens of worker threads. By limiting thread resources, non-blocking I/O scales better (refer to the figure below). That is why Tomcat 6 has embraced non-blocking I/O too.</p>
<div>
<table border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td align="left" valign="top"><img src="http://www.theserverside.com/tt/articles/content/ScalingYourJavaEEApplications/clip_image002_0000.gif" alt="" width="501" height="290" /></td>
</tr>
</tbody>
</table>
</div>
<p align="center"><strong>Figure 2: Scalability test results</strong></p>
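<p>To make the non-blocking model more tangible, here is a minimal sketch of a java.nio selector loop in which one thread serves every connection. It is not the Grizzly or Glassfish implementation. The class NioEchoServer and the helper roundTrip() are hypothetical, the server merely echoes bytes back, and error handling is reduced to the bare minimum.</p>

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class NioEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    NioEchoServer() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    /** One thread multiplexes all connections; no thread is parked per client. */
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        try {
            while (selector.isOpen()) {
                selector.select(); // wait until some channel is ready
                if (!selector.isOpen()) break;
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buf.clear();
                        if (client.read(buf) < 0) { client.close(); continue; }
                        buf.flip();
                        // Simplistic: spins if the socket buffer is full.
                        while (buf.hasRemaining()) client.write(buf);
                    }
                }
            }
        } catch (IOException | IllegalStateException e) {
            // Selector closed or I/O failure: stop serving.
        }
    }

    public void close() throws IOException {
        selector.close(); // also wakes up a pending select()
        server.close();
    }

    /** Starts the server, sends one message through a blocking client, returns the echo. */
    static String roundTrip(String message) {
        try (NioEchoServer echo = new NioEchoServer()) {
            Thread loop = new Thread(echo, "nio-echo");
            loop.setDaemon(true);
            loop.start();
            try (SocketChannel client =
                     SocketChannel.open(new InetSocketAddress("127.0.0.1", echo.port()))) {
                ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                while (out.hasRemaining()) client.write(out);
                ByteBuffer in = ByteBuffer.allocate(out.capacity());
                while (in.hasRemaining() && client.read(in) >= 0) { /* keep reading */ }
                return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

<p>The loop thread is busy only when a channel actually has data, which is why a handful of such threads can serve thousands of mostly idle connections. A production server would add write-readiness handling, per-connection buffers, and a worker pool for the business logic.</p>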
<h1>Single thread task problem</h1>
<p>A Java EE-based ERP system was tested in our laboratory some months ago, and one of its test scenarios was to generate a very complex annual report. We tested this scenario on different servers and found that the cheapest AMD PC server gave the best performance. This AMD server has only two 2.8 GHz CPUs and 4 GB of memory, yet it outperformed the expensive 8-CPU SPARC server shipped with 32 GB of memory.</p>
<p>The reason is that this scenario is a single-threaded task that can only be run by one user (concurrent access by many users is meaningless in this case), so it uses just one CPU while running. Such a task cannot scale to multiple processors. Most of the time, CPU clock frequency is the dominant factor for performance in such cases.</p>
<p>Parallelization is the solution. To parallelize a single-threaded task, you must find a certain level of independence in the order of operations, then use multiple threads to exploit it. In this case, the customer refined their &#8220;annual report generation&#8221; task to generate monthly reports first, then generate the annual report from those 12 monthly reports. The &#8220;monthly reports&#8221; are only intermediate results (the scenario asks for the annual report), but they can be generated concurrently and then used to produce the final report quickly. In this way, the scenario scaled very well on 4-CPU SPARC servers and outperformed the AMD server by more than 80%.</p>
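<p>The monthly-then-annual decomposition can be sketched with a plain java.util.concurrent thread pool. This is a minimal illustration, not the customer&#8217;s code: the class AnnualReportSketch and the method buildMonthlyReport are hypothetical, and the &#8220;report&#8221; is reduced to a number so that the example stays self-contained.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AnnualReportSketch {

    /** Stand-in for the expensive monthly computation (hypothetical). */
    static long buildMonthlyReport(int month) {
        long total = 0;
        for (int day = 1; day <= 30; day++) {
            total += (long) month * day;
        }
        return total;
    }

    /** Generates the 12 monthly reports concurrently, then combines them. */
    static long buildAnnualReport(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> monthly = new ArrayList<Future<Long>>();
            for (int m = 1; m <= 12; m++) {
                final int month = m;
                // Each month is independent of the others, so it can run on any thread.
                monthly.add(pool.submit(new Callable<Long>() {
                    public Long call() {
                        return buildMonthlyReport(month);
                    }
                }));
            }
            long annual = 0;
            for (Future<Long> f : monthly) {
                annual += f.get(); // waits until that month has been computed
            }
            return annual;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("annual total = " + buildAnnualReport(4));
    }
}
```

<p>Only the final combining step is sequential; the result is the same whether the pool has one thread or four, which is what makes the monthly split safe to parallelize.</p>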
<p>Re-architecting and re-coding the whole solution is time-consuming and error-prone. One of the projects in our laboratory used <a href="http://www.hipecc.wichita.edu/jomp.html" target="_blank">JOMP</a> and achieved parallelization of its single-threaded tasks. JOMP is a Java API for thread-based SMP parallel programming. Just like OpenMP, JOMP uses compiler directives to insert parallel programming constructs into a regular program. In a Java program, the JOMP directives take the form of comments beginning with //omp. A JOMP program is run through a precompiler that processes the directives and produces the actual Java program, which is then compiled and executed. JOMP supports most features of OpenMP, including work-sharing parallel loops and parallel sections, shared variables, thread-local variables, and reduction variables. The following code is an example of JOMP programming.</p>
<pre><em>Code list 8:</em>
LinkedList&lt;String&gt; c = new LinkedList&lt;String&gt;();
c.add("this");
c.add("is");
c.add("a");
c.add("demo");
//#omp parallel iterator
for (String s : c)
    System.out.println(s);</pre>
<p>Like most parallelizing compilers, JOMP focuses on loop-level and collection parallelism, that is, on how to execute different iterations simultaneously. To be parallelized, two iterations must not have any data dependency: neither should rely on calculations that the other one performs.</p>
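<p>The independence requirement can be illustrated without JOMP, using plain Java threads. In this sketch (the class LoopParallelSketch and its method are illustrative assumptions, not JOMP output), each iteration depends only on its own index, so the loop can be cut into blocks; each thread accumulates into its own slot, which is roughly what an OpenMP-style reduction variable does.</p>

```java
class LoopParallelSketch {

    /** Sums i * i for i in [0, n) using `threads` worker threads. */
    static long parallelSumOfSquares(final int n, final int threads) {
        final long[] partial = new long[threads]; // one slot per thread, nothing shared
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    // Each thread takes one contiguous block of iterations.
                    int from = (int) ((long) n * id / threads);
                    int to = (int) ((long) n * (id + 1) / threads);
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += (long) i * i; // depends only on i, not on other iterations
                    }
                    partial[id] = sum;
                }
            });
            workers[t].start();
        }
        long total = 0;
        for (int t = 0; t < threads; t++) {
            try {
                workers[t].join(); // after join(), partial[t] is visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
            total += partial[t];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(parallelSumOfSquares(1000, 4));
    }
}
```

<p>A loop such as a running prefix sum, where iteration i reads the value written by iteration i-1, could not be split this way without restructuring the algorithm.</p>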
<p>Writing a JOMP program is not easy. First, you should be familiar with the OpenMP directives and with how the JVM memory model maps to those directives; then you need to know your business logic well enough to put the right directives in the right places.</p>
<p>Another choice is to use <a href="http://www.cs.rit.edu/%7Eark/pj.shtml" target="_blank">Parallel Java</a>. Parallel Java (PJ), like JOMP, supports most features of OpenMP; but unlike JOMP, PJ&#8217;s parallel constructs are obtained by instantiating library classes rather than by inserting precompiler directives. Thus, Parallel Java needs no extra precompilation step. Parallel Java is useful not only for parallelization on multiple CPUs, but also for scalability across multiple nodes. The following code is an example of Parallel Java programming.</p>
<pre><em>Code list 9:</em>
static double[][] d;
new ParallelTeam().execute (new ParallelRegion()
    {
    public void run() throws Exception
        {
        for (int ii = 0; ii &lt; n; ++ ii)
            {
            final int i = ii;
            execute (0, n-1, new IntegerForLoop()
                {
                public void run (int first, int last)
                    {
                    for (int r = first; r &lt;= last; ++ r)
                        {
                        for (int c = 0; c &lt; n; ++ c)
                            {
                            d[r][c] = Math.min (d[r][c], d[r][i] + d[i][c]);
                            }
                        }
                    }
                });
            }
        }
    });</pre>
<h1>Scale Up to More Memory</h1>
<p>Memory is an important resource for your applications. Enough memory is critical to performance in any application, especially for database systems and other I/O-focused systems. More memory means a larger shared memory space and larger data buffers, which let applications read more data from memory instead of from slow disks.</p>
<p>Java garbage collection relieves programmers of the burden of freeing allocated memory, and in doing so makes them more productive. The disadvantage of a garbage-collected heap is that it halts almost all working threads while garbage is being collected. In addition, programmers in a garbage-collected environment have less control over the scheduling of CPU time devoted to freeing objects that are no longer needed. For near-real-time applications, such as Telco systems and stock trading systems, this kind of delay and less controllable behavior is a big risk.</p>
<p>Coming back to the question of whether Java applications scale when given more memory, the answer is yes, sometimes. Too little memory causes garbage collection to happen too frequently. Enough memory keeps the JVM processing your business logic most of the time instead of collecting garbage.</p>
<p>But this is not always true. A real-world case in my laboratory is a Telco system built on a 64-bit JVM. By using a 64-bit JVM, the application can break the 4 GB memory limit of a 32-bit JVM. It was tested on a 4-CPU server with 16 GB of memory, of which 12 GB was given to the Java application. To improve performance, the developers cached more than 3,000,000 objects in memory at initialization, to avoid creating too many objects at run time. The product ran very fast during the first hour of testing; then, suddenly, the system halted for more than 30 minutes. We determined that it was garbage collection that had stopped the system for half an hour.</p>
<p>Garbage collection is the process of reclaiming memory taken up by unreferenced objects. Unreferenced objects are ones the application can no longer reach because all references to them have gone out of scope. If a huge number of live objects exist in memory (such as the 3,000,000 cached objects), the garbage collector takes a long time to traverse all of them. That&#8217;s why the system halted for such a long and unacceptable time.</p>
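<p>One way to see how much time a running JVM has spent collecting garbage is the standard java.lang.management API. The sketch below is an assumption about how such a check might be written, not the tooling used in the laboratory test; the class name GcTimeSketch is illustrative.</p>

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcTimeSketch {

    /** Total milliseconds the JVM reports having spent in garbage collection so far. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long ms = gc.getCollectionTime(); // -1 if this collector does not report it
            if (ms > 0) {
                total += ms;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector: its name, how often it ran, and for how long.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total GC time: " + totalGcMillis() + " ms");
    }
}
```

<p>Sampling this value periodically and comparing it with wall-clock time gives a rough idea of whether the application is processing business logic or mostly collecting garbage.</p>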
<p>In other memory-centric Java applications tested in our laboratory, we found   the following characteristics:</p>
<ol>
<li>Every request-processing action needed big and complex objects.</li>
<li>Too many objects were kept in the HttpSession for every session.</li>
<li>The HttpSession timeout was too long, and the HttpSession was not explicitly invalidated.</li>
<li>The thread pool, EJB pool or other object pools were set too large.</li>
<li>The object cache was set too large.</li>
</ol>
<p>Applications of this kind don&#8217;t scale well. As the number of concurrent users increases, their memory usage grows sharply. If large numbers of live objects cannot be recycled in time, the JVM will spend considerable time on garbage collection. On the other hand, if given too much memory (in a 64-bit JVM), the JVM will still spend considerable time on garbage collection after running for a relatively long time.</p>
<p>The conclusion is that Java applications do NOT scale when given too much memory. In most cases, 3 GB of memory assigned to the Java heap (through the &#8220;-Xmx&#8221; option) is enough (on some operating systems, such as Windows and Linux, you may not be able to use more than 2 GB of memory in a 32-bit JVM). If you have more memory than the JVM can use (memory is cheap these days), give it to other applications on the same system, or just leave it to the operating system. Most OSs will use spare memory for data buffers and caches to improve IO performance.</p>
<p>The Real Time JVM (JSR001) lets the programmer control memory collection. Applications can use this feature to say to the JVM, &#8220;Hi, this huge space of memory is my cache; I will take care of it myself, so please don&#8217;t collect it automatically&#8221;. This functionality can make Java applications scale on huge memory resources. Hopefully JVM vendors will bring it into the normal free JVM versions in the near future.</p>
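<p>To confirm what heap limit a JVM actually received from &#8220;-Xmx&#8221;, the standard Runtime API can be queried at startup. This is a small sketch; the class name HeapLimitSketch is an illustrative assumption.</p>

```java
class HeapLimitSketch {

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        System.out.println("max heap:   " + maxHeapMegabytes() + " MB");
        System.out.println("allocated:  " + rt.totalMemory() / mb + " MB");
        System.out.println("free in it: " + rt.freeMemory() / mb + " MB");
    }
}
```

<p>Running it as, for example, <code>java -Xmx3g HeapLimitSketch</code> shows whether the requested limit was honored on that platform; the exact figure reported may be somewhat lower than the value passed on the command line.</p>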
<p>To scale these memory-centric Java applications, you need multiple JVM     instances, or multiple machine nodes.</p>
<h1>Other Scale Up Problems</h1>
<p>Some scalability problems in Java EE applications are not caused by the applications themselves. Limitations of external systems sometimes become the scalability bottleneck. Such bottlenecks may include:</p>
<ul>
<li>Database management system: This is the most common bottleneck for most enterprise and Web 2.0 applications, because the database is normally shared by the JVM threads. The efficiency of database access and the isolation levels between database transactions therefore affect scalability significantly. We have seen a lot of projects where most of the business logic resides in the database as stored procedures, while the Web tier is kept very lightweight and only performs simple data filtering and invokes the stored procedures in the database. This architecture causes a lot of scalability issues as the number of requests grows.</li>
<li>Disk IO and network IO</li>
<li>Operating system: Sometimes the scalability bottleneck lies in a limitation of the operating system. For example, putting too many files under the same directory can make the file system slow when creating and finding a file.</li>
<li>Synchronous logging: This is a common scalability problem. In some cases, the problem was solved by using a logging server such as Apache log4j. Others have used JMS messages to convert synchronous logging into asynchronous logging.</li>
</ul>
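<p>The idea behind converting synchronous logging into asynchronous logging can be sketched with an in-process queue. This is neither log4j nor JMS: the class AsyncLoggerSketch is a hypothetical illustration in which request threads only enqueue a message, and a single background thread performs the slow write (here, appending to an in-memory list as a stand-in for disk or network output).</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncLoggerSketch {
    // Unique sentinel object; compared by identity so no real message can match it.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
    private final List<String> sink = Collections.synchronizedList(new ArrayList<String>());
    private final Thread writer;

    AsyncLoggerSketch() {
        writer = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        String line = queue.take(); // blocks until a message arrives
                        if (line == STOP) {
                            return;
                        }
                        sink.add(line); // a real logger would write to disk or network here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        writer.start();
    }

    /** Returns immediately; the caller never waits for the slow write. */
    void log(String message) {
        queue.add(message);
    }

    /** Drains everything queued so far, stops the writer, and returns what was written. */
    List<String> close() {
        queue.add(STOP);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sink;
    }

    public static void main(String[] args) {
        AsyncLoggerSketch logger = new AsyncLoggerSketch();
        logger.log("request 1 handled");
        logger.log("request 2 handled");
        System.out.println(logger.close());
    }
}
```

<p>The queue in this sketch is unbounded, so a writer that falls behind will consume memory; a production design would need a bound and a policy for what happens when the queue is full.</p>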
<p>These are problems not only for Java EE applications, but for all systems on any platform. Resolving them requires help from database administrators, system engineers and network analysts at all levels of the system.</p>
<p>The second installment of this article will discuss problems with   scaling horizontally.</p>
<h1>About the Author</h1>
<p>Wang Yu presently works for the ISVE group of Sun Microsystems as a Java technology engineer and technology architecture consultant. His duties include supporting local ISVs, and evangelizing and consulting on important Java technologies such as Java EE, EJB, JSP/Servlet, JMS and Web services technologies. He can be reached at wang.yu@sun.com.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ecomzera.com/2009/07/14/scaling-your-j2ee-applications-wang-yu/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>5 things to love (and hate) about IE8</title>
		<link>http://www.ecomzera.com/2009/03/18/5-things-to-love-and-hate-about-ie8/</link>
		<comments>http://www.ecomzera.com/2009/03/18/5-things-to-love-and-hate-about-ie8/#comments</comments>
		<pubDate>Wed, 18 Mar 2009 14:06:59 +0000</pubDate>
		<dc:creator>webmaster</dc:creator>
				<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://www.ecomzera.com/ez-blogs/ecomzera/?permalink=5-things-to-love-and-hate-about-IE8.html</guid>
		<description><![CDATA[IE8 is faster than FF and opera but it uses lots of system resource IE8 has a compatibility view mode which tries to render the page nicely but it won&#8217;t work in all cases while using web developer tool for debugging in system having 4gb RAM and 300GB HD got hanged&#8230;. Imagine in ordinary system [...]]]></description>
			<content:encoded><![CDATA[<ol>
<li>IE8 is faster than Firefox and Opera, but it uses a lot of system resources.</li>
<li>IE8 has a Compatibility View mode that tries to render pages nicely, but it won&#8217;t work in all cases.</li>
<li>While I was using the web developer tool for debugging, a system with 4 GB of RAM and a 300 GB hard disk hung&#8230; Imagine how it will behave on an ordinary system.</li>
<li>Security is one area that is going to help users.</li>
<li>Some other added features in IE8 are domain name highlighting, tab grouping and restoring the previous session, and, last but not least, it won&#8217;t crash as often as previous versions of IE.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.ecomzera.com/2009/03/18/5-things-to-love-and-hate-about-ie8/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tech Crunch Dead Pool</title>
		<link>http://www.ecomzera.com/2009/02/27/tech-crunch-dead-pool/</link>
		<comments>http://www.ecomzera.com/2009/02/27/tech-crunch-dead-pool/#comments</comments>
		<pubDate>Fri, 27 Feb 2009 09:18:15 +0000</pubDate>
		<dc:creator>webmaster</dc:creator>
				<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://www.ecomzera.com/ez-blogs/ecomzera/whatever/?permalink=Tech-Crunch-Dead-Pool.html</guid>
		<description><![CDATA[Here is the blog that follows tech companies that went kaput. The first entry I see is dated June 3, 2006.]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.techcrunch.com/tag/deadpool">Here</a> is the blog that follows tech companies that went kaput. The first entry I see is dated June 3, 2006.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ecomzera.com/2009/02/27/tech-crunch-dead-pool/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Web Content Accessibility Guidelines (WCAG)</title>
		<link>http://www.ecomzera.com/2009/02/27/web-content-accessibility-guidelines-wcag/</link>
		<comments>http://www.ecomzera.com/2009/02/27/web-content-accessibility-guidelines-wcag/#comments</comments>
		<pubDate>Fri, 27 Feb 2009 08:02:06 +0000</pubDate>
		<dc:creator>webmaster</dc:creator>
				<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://www.ecomzera.com/ez-blogs/ecomzera/tech-talk/?permalink=Web-Content-Accessibility-Guidelines-WCAG.html</guid>
		<description><![CDATA[WCAG is part of a series of accessibility guidelines, including the Authoring Tool Accessibility Guidelines (ATAG) and the User Agent Accessibility Guidelines (UAAG). Essential Components of Web Accessibility explains the relationship between the different guidelines. We need to make sure that we use this information for all new UI that we are generating.]]></description>
			<content:encoded><![CDATA[<p>WCAG is part of a series of accessibility guidelines, including the Authoring Tool Accessibility Guidelines (ATAG) and the User Agent Accessibility Guidelines (UAAG). <a href="http://www.w3.org/WAI/intro/components">Essential Components of Web Accessibility</a> explains the relationship between the different guidelines.</p>
<p>We need to make sure that we use this information for all new UI that we are generating.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ecomzera.com/2009/02/27/web-content-accessibility-guidelines-wcag/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

