Saturday, 27 August 2016

This week 13/2016

This week I was building a new piece of functionality: an independent service that exports a large amount of data to an MS Excel file, plus the part of the service that retrieves the data from the database. The xls format is limited to about 65 thousand rows per sheet, so I decided to use the xlsx format, which I thought was unlimited (more about that later). My requirement was to export a large data set from the database to Excel without killing the application.

First of all I focused on the output. The solution was to use SXSSFWorkbook instead of XSSFWorkbook. My application currently uses an old version of Apache POI, v3.7, where SXSSFWorkbook is not implemented, so with that version it is impossible to solve my problem. SXSSFWorkbook is available from v3.8.
So what could I do after upgrading the library? I checked, and it is possible to export a huge amount of data using less than 64MB of heap memory. The SXSSFWorkbook implementation writes simple data out as a stream. Creating the file is split into two phases. In the first phase the implementation saves the processed data into a temporary file (on Linux it is a /tmp/.... file). In the second phase the temporary XML file is compressed, together with additional files containing styles and other information, into the final file.
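
The two phases can be illustrated without POI at all. The sketch below is not SXSSFWorkbook's code, only an assumed analogue built on java.util.zip: rows are first streamed to a temporary file, so only one row is in memory at a time, and then the temporary file is compressed into the final archive. The class name, the entry name sheet1.xml and the row format are made up for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class TwoPhaseExportSketch {

    // Phase 1: stream rows to a temporary file, one row at a time.
    static Path writeRowsToTempFile(int rowCount) throws IOException {
        Path temp = Files.createTempFile("export-sketch", ".xml");
        try (BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            for (int i = 1; i <= rowCount; i++) {
                // Hypothetical row format, not the real xlsx sheet XML.
                writer.write("<row r=\"" + i + "\"><c>value " + i + "</c></row>");
                writer.newLine();
            }
        }
        return temp;
    }

    // Phase 2: compress the temporary file into the final archive, then delete it.
    static void compress(Path temp, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("sheet1.xml"));
            Files.copy(temp, zip); // copies in chunks, never loads the whole file
            zip.closeEntry();
        } finally {
            Files.deleteIfExists(temp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("export-sketch", ".zip");
        compress(writeRowsToTempFile(100_000), target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

With POI itself the corresponding steps are, as far as I know, creating an SXSSFWorkbook with a row window size, calling write(OutputStream) and finally dispose() to remove the temporary files.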

By the way, I found out that xlsx is not unlimited: every sheet can have at most a little more than one million rows (2^20 = 1,048,576) and about 16 thousand columns (2^14 = 16,384).
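
For a large export this limit means the data may have to be spread over several sheets. A minimal sketch of the arithmetic follows; the class and method names are my own, and a header row per sheet is ignored.

```java
class XlsxLimits {
    static final int MAX_ROWS = 1 << 20;    // 1,048,576 rows per sheet
    static final int MAX_COLUMNS = 1 << 14; // 16,384 columns per sheet

    // Number of sheets needed to hold totalRows rows (ceiling division).
    static long sheetsNeeded(long totalRows) {
        return (totalRows + MAX_ROWS - 1) / MAX_ROWS;
    }

    public static void main(String[] args) {
        System.out.println(sheetsNeeded(3_000_000)); // prints 3
    }
}
```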

After I had found out how to export a large volume of data to xlsx, I looked for a way to retrieve the data from the database row by row. I wanted to separate the input service from the output service. I created a DataProvider interface and injected a RowMapper and the other types used in NamedParameterStatement's query method into it, but that didn't work. Finally I used a ScrollableResultSet with the Forward option and a limit on how much data is retrieved at once, and it works.
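
The separation of input from output can be sketched as a provider that pushes rows one by one to a handler. This is only an assumed shape, not my real code: RowHandler, forEachRow and fromIterator are hypothetical names, and the iterator stands in for a forward-only database cursor.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class RowByRowSketch {

    // Output side: called once per row, never sees the whole result set.
    interface RowHandler {
        void handle(Object[] row);
    }

    // Input side: pushes rows to the handler one at a time.
    interface DataProvider {
        void forEachRow(RowHandler handler);
    }

    // Stand-in provider backed by an Iterator; a real one would walk a forward-only cursor.
    static DataProvider fromIterator(final Iterator<Object[]> rows) {
        return handler -> {
            while (rows.hasNext()) {
                handler.handle(rows.next());
            }
        };
    }

    // Writes each row as a ';'-separated line and returns the number of rows written.
    static int export(DataProvider provider, final StringBuilder out) {
        final int[] count = {0};
        provider.forEachRow(row -> {
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    out.append(';');
                }
                out.append(row[i]);
            }
            out.append('\n');
            count[0]++;
        });
        return count[0];
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(new Object[]{1, "a"}, new Object[]{2, "b"});
        StringBuilder out = new StringBuilder();
        int written = export(fromIterator(rows.iterator()), out);
        System.out.print(out);
        System.out.println("rows written: " + written);
    }
}
```

The exporter depends only on the DataProvider interface, so the database side can be replaced without touching the Excel side.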


Saturday, 20 August 2016

This week 12/2016

I watched a few presentations about the JVM and garbage collection. Today I will write about the topics that were new to me.

I didn't know that:

1. The JVM has a lot of parameters (a few hundred) and some of them are manageable: their values can be changed at runtime (about a hundred).

2. Bytecode is compiled by the JIT into processor code at runtime, but there are a few kinds of compiled code, depending on:
- free memory for compiled code - every platform has a different amount of memory,
- frequency of usage - frequently used code is optimised better than code used once,
- the number of processors and cores,
- the complexity of the piece of code.

3. Log4J finds the line of code that logged a message by dumping a stack trace. It is an expensive operation.

4. The new GC called G1 deals with memory fragmentation by splitting the memory area into blocks.
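
Point 3 is easy to reproduce with plain Java. The sketch below is not Log4J's code, only the same idea under my own names: a stack trace is captured on every call, and the frame just below the logging method tells where the call came from.

```java
class CallerLocationSketch {

    // Prefixes the message with the caller's class, method and line number.
    static String format(String message) {
        // Capturing the stack trace is the expensive part.
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is format() itself, stack[1] is whoever called it.
        StackTraceElement caller = stack[1];
        return caller.getClassName() + "." + caller.getMethodName()
                + ":" + caller.getLineNumber() + " - " + message;
    }

    // Example call site; the prefix of the result points at this method.
    static String demo() {
        return format("hello");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```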

Saturday, 6 August 2016

This week 11/2016

This week I was interested in collecting statistics from my application. I have never done that before, and I think a few times it could have helped to resolve some problems. Currently there is another motivation: my half-year bonus depends on it.

OK, so I have my Java application. What's next?
I can measure e.g. how the cache, heap memory, CPU etc. are used.
How can I get this information?
I can log it to a file or another adapter, or I can expose it through a JMX MBean. Logging to a file takes up disk space and the files can get heavy. JMX is exposed as a service, is lightweight, and also allows changing some application parameters.

Anyway, I started with logging statistics to a file. What did I do?

JVM:
Everything about JVM memory and threads is available through the static methods of the ManagementFactory class. You only have to dump it to the log. These MBeans are also registered in the JMX server by default.
I didn't find a counter of threads that are blocked but not deadlocked, so I prepared my own method to update my MBean, as below.

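        // Take a snapshot of all live threads (including monitor and synchronizer info).
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(true, true);
        int blocked = 0;
        for(ThreadInfo ti : infos){
            // Count the threads currently waiting for a monitor lock.
            if(Thread.State.BLOCKED.equals(ti.getThreadState())){
                blocked++;
            }
        }
        mbean.setBlocked(blocked);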
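
The memory and thread values themselves can be read in a few lines. A minimal sketch, assuming the result is simply written to the log as one line; the class name and the line format are my own.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

class JvmStatsSketch {

    // Builds one log line with the current heap and thread numbers.
    static String snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return "heapUsed=" + heap.getUsed()
                + " heapMax=" + heap.getMax()
                + " threads=" + threads.getThreadCount()
                + " peakThreads=" + threads.getPeakThreadCount();
    }

    public static void main(String[] args) {
        // System.out stands in for the application logger here.
        System.out.println(snapshot());
    }
}
```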


Hibernate statistic:
I had to add the hibernate.generate_statistics parameter to the configuration, and then I could get the Statistics interface from the SessionFactory and enable statistics collection.

    statistics = sessionFactory.getStatistics();
    statistics.setStatisticsEnabled(true);

All values are available in the Statistics object.

If you need JMX, you have to register the Statistics object as an MBean in the JMX server.



EHCache statistics:
To get EHCache statistics, the statistics attribute has to be added to every cache container definition.

    statistics="true"

Then it is possible to get statistics from every cache manager.
I noticed that when there are several cache factories, they have to be set as not shared. Only the cache manager shared with Hibernate should be shared. Otherwise I couldn't get to the cache manager used by Hibernate; I saw only the one I had defined in the Spring configuration.

In my case I had two different configurations and therefore two cache managers. I had to share only the one based on the Hibernate ehcache file. The others must not be shared, otherwise I couldn't get a reference to the Hibernate cache manager.


By the way, here is how to register your own MBean in your JVM. You need to get the JMX server and register your MBean in it. The MBean class has to implement an interface whose name ends with the MBean suffix. All methods included in the interface are available from the JMX console: getters present values, setters change them, and other methods can be executed from jconsole. Example code is shown below.

    // MyHello is the class, MyHelloMBean is the interface it implements.
    MyHelloMBean mbean = new MyHello();
    MBeanServer mbs =  ManagementFactory.getPlatformMBeanServer();
    ObjectName name = new ObjectName("com.example:type=Hello");
    mbs.registerMBean(mbean, name);
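
For completeness, here is a self-contained sketch of the whole thing: the interface, the class and a read of the value through the MBean server, the same way a JMX console would read it. The Blocked attribute and the sayHello operation are made up for illustration. The interface has to be public and named after the class plus the MBean suffix.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class HelloMBeanSketch {

    // Management interface: public, named <ImplementationClass>MBean.
    public interface MyHelloMBean {
        int getBlocked();             // shown as attribute "Blocked"
        void setBlocked(int blocked); // makes the attribute writable
        String sayHello();            // shown as an operation
    }

    public static class MyHello implements MyHelloMBean {
        private volatile int blocked;

        public int getBlocked() { return blocked; }
        public void setBlocked(int blocked) { this.blocked = blocked; }
        public String sayHello() { return "hello"; }
    }

    // Registers the MBean and reads the attribute back through the server.
    static Object registerAndRead(int blocked) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.example:type=Hello");
        if (mbs.isRegistered(name)) {
            mbs.unregisterMBean(name); // allow repeated runs in the same JVM
        }
        MyHello mbean = new MyHello();
        mbean.setBlocked(blocked);
        mbs.registerMBean(mbean, name);
        return mbs.getAttribute(name, "Blocked");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Blocked = " + registerAndRead(3));
    }
}
```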