May 29, 2016


Byte code manipulation using Java instrumentation and jboss Javassist

In the previous post we discussed how to use Java instrumentation to find the size of an object. In this post we will discuss how to modify Java bytecode using a transformer class (a class implementing the ClassFileTransformer interface) and the JBoss Javassist bytecode modification library.
Let's understand the importance of the transformer class, how to register it with the instrumentation instance and finally how it helps in bytecode instrumentation.

Transformer class:- A class which implements the ClassFileTransformer interface is called a transformer class. It implements the transform() method. The transformer is registered with the instrumentation instance so that every class loaded afterwards (that is, after the agent class has been loaded by the system class loader) is passed through the transform method of this transformer. The signature of the transform method is as follows:
public byte[] transform(ClassLoader loader, String className, Class classBeingRedefined,
ProtectionDomain protectionDomain, byte[] classfileBuffer){
           // returns  modified bytecode of class  or null, if not modified.
 }

The transform method is invoked for each class being loaded. It receives the class loader that is defining the class (or null if the class is loaded by the bootstrap class loader), the class name and the bytecode of the class file.
Note:- The bytecode passed in as "byte[] classfileBuffer" must not be modified directly. We make a copy of classfileBuffer, modify the copy and return it; the class loader then loads this modified bytecode.

What is the use of javassist.jar :- This library provides a high level API to modify the bytecode of a class file so that the behaviour of the class can be changed dynamically (either by adding new methods or by modifying existing ones). Other libraries such as Apache BCEL and ASM serve the same purpose but in a different way.
Bytecode modification frameworks can be broadly classified in two categories: those which provide a high level API that lets us avoid learning low level opcodes and JVM internals (e.g. Javassist and CGLIB), and low level frameworks which require an understanding of the JVM or the use of bytecode generation tools (ASM and BCEL).

Byte code injection demonstration:- 
To show how bytecode injection works with the Javassist library, we create a Java project with the following classes and a manifest file, as in the previous post (Fundamental of Instrumentation).

Create a class "InstrumentationAgent.java" and copy following code lines in it.(Adjust package name accordingly)

package com.devinline.instrumentation;

import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

public class InstrumentationAgent {
 /*
  * System classloader after loading the Agent Class, then invokes the
  * premain (premain is roughly equal to main method for normal Java classes)
  */
 public static void premain(String agentArgs, Instrumentation inst) {
   inst.addTransformer(new Transformer());
 }

 public static void agentmain(String agentArgs, Instrumentation inst)
   throws ClassNotFoundException, UnmodifiableClassException {

 }
}
Create another class "Transformer.java" which implements ClassFileTransformer interface and its transform() method. Use following code lines.
package com.devinline.instrumentation;

import java.io.ByteArrayInputStream;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;

import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;

public class Transformer implements ClassFileTransformer {

 @Override
 public byte[] transform(ClassLoader classLoader, String className,
   Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
   byte[] classfileBuffer) throws IllegalClassFormatException {
  byte[] modifiedClassByteCode = classfileBuffer;
  // className may be null for some classes, so compare from the constant side
  if ("com/devinline/client/SampleClass".equals(className)) {
   try {
    ClassPool classPool = ClassPool.getDefault();
    CtClass ctClass = classPool.makeClass(new ByteArrayInputStream(
      classfileBuffer));
    CtMethod[] methods = ctClass.getDeclaredMethods();
    for (CtMethod method : methods) {
     if (method.getName().equalsIgnoreCase("method2")) {
      System.out.println("Start Instrumentation in method " + method.getName());
      method.addLocalVariable("elapsedTime", CtClass.longType);
      method.insertBefore("elapsedTime = System.currentTimeMillis();");
      method.insertAfter("{elapsedTime = System.currentTimeMillis() - elapsedTime;"
        + "System.out.println(\"Method Executed in ms: \" + elapsedTime);}");
     }

    }
    modifiedClassByteCode = ctClass.toBytecode();
    ctClass.detach();
    System.out.println("Instrumentation complete.");
   } catch (Throwable ex) {
    System.out.println("Exception: " + ex);
    ex.printStackTrace();
   }
  }
  return modifiedClassByteCode;
 }

}
Now create "SampleClass.java". Its method2 is modified by the transform method, which is invoked when this class is loaded by the class loader.
package com.devinline.client;

public class SampleClass {
 public void method1() throws InterruptedException {
  // randomly sleeps between 1000ms and 3000ms
  long randomSleepDuration = (long) (1000 + Math.random() * 2000);
  System.out.printf("Sleeping for %d ms ..\n", randomSleepDuration);
  Thread.sleep(randomSleepDuration);
 }

 public void method2(String input) throws InterruptedException {
  // randomly sleeps between 1000ms and 3000ms
  long randomSleepDuration = (long) (1000 + Math.random() * 2000);
  System.out.printf("Sleeping for %d ms ..\n", randomSleepDuration);
  Thread.sleep(randomSleepDuration);
 }

}
Finally, create a main class which starts class loading, triggers the method executions of the sample class and lets us verify whether the method has been instrumented or not.
package com.devinline.client;

public class TestInstrumentation {

 public static void main(String args[]) throws InterruptedException {
  SampleClass l = new SampleClass();
  System.out.println("Executing method1(Not instrumented) ");
  l.method1();
  System.out.println("Executing method2(Instrumented) ");
  l.method2("Hello");

 }
}

Update MANIFEST.MF with the properties "premain-class" and "Boot-Class-Path". premain-class tells the JVM the name of the agent class, and Boot-Class-Path adds an external jar (e.g. javassist.jar) to the bootstrap class path so that the agent can use it. Create a file manifest.txt and add the following two entries to it:
premain-class: com.devinline.instrumentation.InstrumentationAgent
Boot-Class-Path: ../lib/javassist.jar
Note:- we have created lib folder in java project directory and placed javassist.jar inside it.

Create agent jar file:-
Execute the following command from the bin directory to create the agent jar, which will be passed to the JVM instance with -javaagent:<agent.jar>.
> jar -cmf manifest.txt agent.jar com
It will create a jar file named agent.jar in the bin directory.

Execute main method with agent.jar  

Use the following command to run the main method of TestInstrumentation with agent.jar
> java -javaagent:agent.jar -cp . com.devinline.client.TestInstrumentation

Referring to the above diagram of the program execution, we can see that method2 is instrumented: its execution time is printed, and it is more than the sleep duration.

Java instrumentation fundamentals(Part 1): Find size of Object using instrumentation

Java instrumentation was introduced in Java 5. It gives the developer the flexibility to get a handle on the bytecode of a class being loaded by the JVM (class loader). Once we get hold of the bytecode, it can be manipulated using a library like Apache BCEL, ASM, etc. and returned to the JVM. It can be summarized as: get access to the bytecode, append custom code at run time and return the modified bytecode to the JVM, so that the client is not aware that the bytecode has been modified.
Instrumentation is used in various tools like monitoring agents, profilers, coverage analyzers and event loggers.
In this post we will get an overview of Java instrumentation and build a simple application of it: how to find the size of an object. Later we discuss the constraint associated with finding object size (what happens if the object contains a reference to another object).

Let's start with the fundamentals of Java instrumentation. According to the Javadoc, there are two ways to obtain an instance of the Instrumentation interface:

1. Loading the agent class via the command-line interface :- When the JVM is launched in a way that indicates an agent class, an Instrumentation instance is passed to the premain method of the agent class.
The agent is specified with the following command-line option:
-javaagent:jarpath[=options] where jarpath is the path to the agent JAR file and options is the agent options. This switch may be used multiple times on the same command line, thus creating multiple agents.
2. Lazy loading of the agent class (starting agents after VM startup) :- A JVM may provide a mechanism to start agents some time after the JVM is launched. In that case an Instrumentation instance is passed to the agentmain method of the agent class.

Agent class and its methods(premain and agentmain):-
An agent is deployed as a JAR file. An attribute in the JAR file manifest specifies the agent class which will be loaded to start the agent. The manifest of the agent JAR file must contain the attribute Premain-Class or Agent-Class, depending on which of the two approaches we are following to obtain the instrumentation instance.
The agent class is the entry point of instrumentation. When the JVM is started with an agent, the system class loader loads the agent class. Depending on option 1 or 2 discussed above, premain or agentmain is executed and a handle to the Instrumentation object is passed to that method.
Signatures of the premain method :
public static void premain(String agentArgs, Instrumentation instObj){ }
public static void premain(String agentArgs){ }
Signatures of the agentmain method :
public static void agentmain(String agentArgs, Instrumentation instObj){ }
public static void agentmain(String agentArgs) { }
When the agent is started using the command-line option (-javaagent), the agentmain method is not invoked. When the agent is started after JVM startup, the premain method is not invoked and agentmain is invoked instead. Here the premain/agentmain method is like the main method of any Java class, and agentArgs plays a role similar to "String[] args" of the main method, except that the agent options are passed as a single string.

Manifest – manifest.mf file :- Following discussion is using approach 1(Loading agent class via Command-Line Interface).
The JVM must be informed about the agent class so that it can be loaded by the system class loader. We do this with the "premain-class" property in the manifest.mf file: while preparing the jar file we include the premain-class property with the fully qualified name of the agent class in manifest.mf. Sample manifest file with the premain-class property:
Manifest-Version: 1.0
Build-Jdk: 1.7.0_25
premain-class:  com.devinline.instrumentation.AgentClassName

These are the building blocks required to obtain an Instrumentation instance from the JVM. Now we will use the instrumentation instance received from the JVM to find the size of an object.

Note:- In the next post we will see how to use the Instrumentation instance for bytecode instrumentation with Javassist and understand the extensive flexibility it provides.

Create a class "InstrumentationAgent.java" and copy following code lines in it.(Adjust package name accordingly)
package com.devinline.instrumentation;

import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

public class InstrumentationAgent {
 /*
  * System classloader after loading the Agent Class, invokes the premain
  * (premain is roughly equal to main method for normal Java classes)
  */
 private static volatile Instrumentation instrumentation;

 public static void premain(String agentArgs, Instrumentation instObj) {
  // instObj is handle passed by JVM
  instrumentation = instObj;
 }

 public static void agentmain(String agentArgs, Instrumentation instObj)
   throws ClassNotFoundException, UnmodifiableClassException {

 }

 public static long findSizeOfObject(Object obj) {
  // use instrumentation to find size of object obj
  if (instrumentation == null) {
   throw new IllegalStateException("Agent not initialised");
  } else {
   return instrumentation.getObjectSize(obj);
  }
 }
}
Create another class "TestInstrumentation.java" to find the size of an object of class "SampleClass", which has int, String and Long fields.
package com.devinline.client;

import com.devinline.instrumentation.InstrumentationAgent;

public class TestInstrumentation {

 public static void main(String[] args) {
  long sizeOfObject = InstrumentationAgent
    .findSizeOfObject(new SampleClass(12, "Hello", (long) 2345));
  System.out.println("Size of object is " + sizeOfObject);
 }
}

class SampleClass {
 int number;  // 4 bytes
 String name; // counted as a reference only; the String object's own fields are not included
 Long ssn;    // reference to a Long wrapper object, not an 8-byte primitive long

 public SampleClass() {
 }

 public SampleClass(int number, String name, Long ssn) {
  super();
  this.number = number;
  this.name = name;
  this.ssn = ssn;
 }
}
Create "manifest.txt" in bin directory and make an entry in this file (Add property "premain-class")
premain-class: com.devinline.instrumentation.InstrumentationAgent

Jar(javaAgent) file creation:-
Create the jar file from these class files and manifest.txt using the following command (execute it from the bin directory).
 C:\JavaInstrumentationPart1\bin> jar -cmf manifest.txt agent.jar com
The above command creates agent.jar, and its MANIFEST.MF contains the property "premain-class".

Execute agent.jar with following command
> java -javaagent:agent.jar -cp . com.devinline.client.TestInstrumentation
Size of object is 24

Limitation of finding size of Object using above approach:-
If the object contains a reference to another object, the size of the inner object is not computed; it is counted just as a reference.

How to find size of object which contains reference of another object ?

One option is classmexer, a utility built on the instrumentation API that also follows the references an object holds.

In the next post, we will explore the instrumentation concept further and see how bytecode can be manipulated with it: how to use the ClassFileTransformer interface and create a transformer class to register with the instrumentation agent.

May 28, 2016

What is significance of __name__ in python

In Python, a module can be loaded by importing it in some other module or executed directly as a standalone program. So, how does Python detect at run time whether a particular module was imported or executed directly as a standalone application?
The simple answer is: using the built-in variable __name__.

When a module is imported, the __name__ variable contains the module name. However, __name__ contains the string "__main__" if the module is executed directly.

Illustration:-
Create a file named "parentmodule.py" with the following code lines.
def importedmodule():
    print "I have been imported."

def directexecution():
    print "I have run as application."
    
#Executed directly as application
if __name__ == '__main__':
    directexecution()
else:
    importedmodule()

Create another file "test.py" with following code lines

import parentmodule

def importedmodule():
    print "I have been imported."

def directexecution():
    print "I have run as application."
    
Now run parentmodule.py directly and it calls directexecution(). However, if we run test.py, the importedmodule() function of parentmodule is executed, because parentmodule has been imported in test.py, so its __name__ is "parentmodule" and not "__main__".

Sample output:-
C:\Python27\pgm>python parentmodule.py
I have run as application.

C:\Python27\pgm>python test.py
I have been imported.
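
The same behaviour can also be observed from a single script. The following is only a minimal sketch, written for Python 3 (unlike the Python 2 listings above); the module name demo_module and the variable observed_name are hypothetical names chosen for illustration. It writes a one-line module to a temporary directory, imports it once and runs it once as a script, and records the value of __name__ in each case.

```python
import importlib.util
import os
import runpy
import tempfile

# One-line module that simply remembers the value of __name__ it sees.
SOURCE = "observed_name = __name__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_module.py")  # hypothetical module name
    with open(path, "w") as handle:
        handle.write(SOURCE)

    # Case 1: load the file as an imported module.
    spec = importlib.util.spec_from_file_location("demo_module", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    imported_name = module.observed_name

    # Case 2: execute the same file the way "python demo_module.py" would.
    script_globals = runpy.run_path(path, run_name="__main__")
    script_name = script_globals["observed_name"]

print(imported_name)  # demo_module
print(script_name)    # __main__
```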

May 22, 2016

Level order traversal of Binary tree in Python

Level order traversal in python
Input:-
     5 
   3     7
2    4  6   8
Level order traversal of BST:  5 3 7 2 4 6 8
import sys

class Node:
    def __init__(self,data):
        self.right=self.left=None
        self.data = data
class Solution:
    def insert(self,root,data):
        if root==None:
            return Node(data)
        else:
            if data<=root.data:
                cur=self.insert(root.left,data)
                root.left=cur
            else:
                cur=self.insert(root.right,data)
                root.right=cur
        return root
    def levelOrder(self,root):
        # nothing to traverse for an empty tree
        if root is None:
            print "Level order traversal of BST: "
            return
        # list used as a FIFO queue: enqueue at index 0, dequeue from the end
        items = [root]
        elements =""
        while items:
            temp = items.pop()
            elements= elements+str(temp.data)+ " "
            if temp.left!=None:
                items.insert(0,temp.left)
            if temp.right!=None:
                items.insert(0,temp.right)
        print "Level order traversal of BST: "+ elements 
print "Enter number of elements to be added in tree: "  
N=int(raw_input())
myTree=Solution()
root=None
print "Enter elements: "
for i in range(N):
    data=int(raw_input())
    root=myTree.insert(root,data)
myTree.levelOrder(root)
Sample output:-
>>>
7
5
3
7
2
4
6
8
Level order traversal of BST: 5 3 7 2 4 6 8
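
As an alternative, here is a minimal sketch of the same traversal for Python 3, assuming collections.deque is used as the queue instead of a list. The function names insert and level_order are illustrative and are not part of the program above. The traversal returns a list rather than printing, which makes it easier to check.

```python
from collections import deque


class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def insert(root, data):
    """Insert data into the BST rooted at root and return the root."""
    if root is None:
        return Node(data)
    if data <= root.data:
        root.left = insert(root.left, data)
    else:
        root.right = insert(root.right, data)
    return root


def level_order(root):
    """Return the node values level by level, left to right."""
    if root is None:
        return []
    result = []
    queue = deque([root])
    while queue:
        node = queue.popleft()  # dequeue from the front in O(1)
        result.append(node.data)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return result


root = None
for value in [5, 3, 7, 2, 4, 6, 8]:
    root = insert(root, value)
print(level_order(root))  # [5, 3, 7, 2, 4, 6, 8]
```

A deque removes items from the front in constant time, whereas list.insert(0, ...) shifts every element on each call.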

May 21, 2016

Textual description of firstImageUrl

Python 2.0 list comprehensions - Sample examples

Comprehensions are constructs that allow sequences to be built from other sequences. List comprehensions were introduced in Python 2.0. Their concise and clean syntax later inspired dictionary and set comprehensions.

Components of List comprehensions :- 

1. An Input Sequence  - From which new sequence is built
2. A variable representing members of the input sequence.
3. A conditional expression [ Optional ]
4. An output expression producing elements of the output list - all elements of input sequence which satisfies conditional expression, if available. 
List comprehension structure
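
To relate the four components to code, here is a minimal sketch written for Python 3. The variable names are illustrative only. It builds the same list twice, once with a list comprehension and once with the equivalent explicit loop.

```python
numbers = range(0, 11)  # 1. input sequence

# [output expression  for variable in input sequence  if condition]
squares_of_even = [x * x for x in numbers if x % 2 == 0]

# The same result written as an explicit loop.
expanded = []
for x in numbers:          # 2. variable representing each member
    if x % 2 == 0:         # 3. optional conditional expression
        expanded.append(x * x)  # 4. output expression

print(squares_of_even)  # [0, 4, 16, 36, 64, 100]
```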

Examples:- 

1. Use a list comprehension to print even numbers from 0 to n (n is some arbitrary value). 
>>> values = [x for x in range(0, n+1) if x%2==0]
>>> values
[0, 2, 4]
Here range(0, n+1) is the input sequence, x is the variable representing members of the input sequence, if x%2==0 is the conditional expression and the x at the beginning is the output expression that generates the output sequence.

2. Using list comprehension print the Fibonacci Sequence in comma separated form for given input n.
def f(n):    
    if n == 0: 
            return 0    
    elif n == 1: 
            return 1    
    else: 
            return f(n-1)+f(n-2)
n=int(raw_input()) 
values = [str(f(x)) for x in range(0, n+1)] 
print ",".join(values)
O/P (for input 4):-
0,1,1,2,3
Here str(f(x)) (converting int to string) generates output sequence as list.

3. Using list comprehension, write a program to print the list after removing the 0th, 2nd, 4th and 6th elements in [12,24,35,70,88,120,155].
li = [12,24,35,70,88,120,155] 
li = [x for (i,x) in enumerate(li) if i%2!=0] 
print li
O/P:-  [24, 70, 120]

4. Using list comprehension, write a program to print the list with numbers which are divisible by 5 and 7 in [12,24,35,70,88,120,155]
>>> li =[12,24,35,70,88,120,155]
>>> li = [x for x in li if x%5 ==0 and x%7 ==0]
>>> li
[35, 70]

Note:- Set and dict comprehensions are also supported, in Python 2.7 and in Python 3.0 onwards.
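
As a small illustration of the set and dict comprehensions mentioned in the note, here is a sketch for Python 3 that reuses the list from example 4. The names remainders and divisible_by_5 are made up for this example.

```python
li = [12, 24, 35, 70, 88, 120, 155]

# Set comprehension: the distinct remainders when dividing by 5.
remainders = {x % 5 for x in li}

# Dict comprehension: map each number to whether it is divisible by 5.
divisible_by_5 = {x: x % 5 == 0 for x in li}

print(sorted(remainders))   # [0, 2, 3, 4]
print(divisible_by_5[35])   # True
print(divisible_by_5[12])   # False
```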

May 15, 2016

Height of Binary Search Tree in python

Find Height of Binary search tree in Python 

Input: 

      5 
   3     7
2    4  6   8
Height of tree is : 3
class Node:
    def __init__(self,data):
        self.right=self.left=None
        self.data = data
class Solution:
    def insert(self,root,data):
        if root==None:
            return Node(data)
        else:
            if data<=root.data:
                cur=self.insert(root.left,data)
                root.left=cur
            else:
                cur=self.insert(root.right,data)
                root.right=cur
        return root
        
    def getHeight(self,root):
        m = A()
        temp = m.getHeightUtil(root)
        return temp 
class A:
    def method1(self,root):
        print root.data
        
    def getHeightUtil(self,root):
        if root == None:
            return 0
        lh = self.getHeightUtil(root.left)
        rh = self.getHeightUtil(root.right)
        if lh>rh:
            return lh+1
        else:
            return rh+1
T=int(raw_input())
myTree=Solution()
root=None
for i in range(T):
    data=int(raw_input())
    root=myTree.insert(root,data)
height=myTree.getHeight(root)
print "Height of tree is: " + str(height)       
Sample output:-
>>>
7
5
3
7
2
4
6
8
Height of tree is: 3
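
The recursion can be written more compactly. The following is only a sketch for Python 3, assuming the same convention as above: an empty tree has height 0 and a single node has height 1. The function names insert and height are illustrative.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def insert(root, data):
    """Insert data into the BST rooted at root and return the root."""
    if root is None:
        return Node(data)
    if data <= root.data:
        root.left = insert(root.left, data)
    else:
        root.right = insert(root.right, data)
    return root


def height(root):
    """Number of nodes on the longest path from root down to a leaf."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


root = None
for value in [5, 3, 7, 2, 4, 6, 8]:
    root = insert(root, value)
print("Height of tree is: " + str(height(root)))  # Height of tree is: 3
```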

May 13, 2016

Database CRUD Operation in Python- SQLite and Python

In this post we will use Python to work with data in SQLite database files. We will create a table, insert records into it, update them and finally run a SELECT query based on some condition.

Problem statement:- Process this text file (mbox.txt) and count the number of email messages per organisation (the domain name of each email address is treated as the organisation name).

Database Schema:-
Table with two columns ORGANISATION and COUNT of Emails.
< CREATE TABLE EmailCounts (ORGANISATION TEXT, COUNT INTEGER) >

SQLite database:- SQLite is a light weight relational database management system. In contrast to many other database management systems, SQLite is not a client–server database engine. Rather, it is embedded into the end program.(Source: Wiki)

How to connect SQLite database ? :-

Python provides support for SQLite databases in the sqlite3 module. First we import the module, and then we use its connect method to connect to the database. The connect operation makes a "connection" to the database stored in the file "emaildb.sqlite" in the current directory. If the file does not exist, it will be created.
Using conn.cursor() we get a cursor, which is a handle to the database (like the file handle we obtain using the open() method). Once the cursor is obtained, we can use its execute() method to perform operations on the data stored in the database. The sample code below demonstrates these steps in sequence.

import sqlite3

conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS EmailCounts ')
cur.execute('CREATE TABLE EmailCounts (ORGANISATION TEXT, COUNT INTEGER)')

conn.close()

Now we will process this text file, which contains the raw data, and insert the data into the database. A sample of the raw data in the file is shown below.
Raw data:-
From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
Received: from murder (mail.umich.edu [141.211.14.90])
by frankenstein.mail.umich.edu (Cyrus v2.3.8) with LMTPA;
Sat, 05 Jan 2008 09:14:16 -0500
X-Sieve: CMU Sieve 2.3
... ......   . ..
......  . ..............

We need to open the file mbox.txt and read each line. When we find a line starting with "From: ", we process it and extract the domain name. If that domain name is not yet in the database we insert it, otherwise we update its count (increment the count by 1).
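
The insert-or-update step can be tried in isolation. This is only a sketch for Python 3, assuming an in-memory database (":memory:") and a few made-up sample lines in place of mbox.txt. The helper name domain_of and the example.org / example.com addresses are hypothetical.

```python
import sqlite3

# Made-up input lines standing in for the real file.
sample_lines = [
    "From: alice@example.org",
    "Subject: not a sender line",
    "From: bob@example.com",
    "From: carol@example.org",
]


def domain_of(line):
    """Return the domain of a 'From: user@domain' line, or None otherwise."""
    if not line.startswith("From: "):
        return None
    parts = line.split()
    if len(parts) < 2 or "@" not in parts[1]:
        return None
    return parts[1].split("@")[1]


conn = sqlite3.connect(":memory:")  # nothing is written to disk
cur = conn.cursor()
cur.execute("CREATE TABLE EmailCounts (ORGANISATION TEXT, COUNT INTEGER)")

for line in sample_lines:
    org = domain_of(line)
    if org is None:
        continue
    cur.execute("SELECT COUNT FROM EmailCounts WHERE ORGANISATION = ?", (org,))
    if cur.fetchone() is None:
        # first message from this organisation
        cur.execute("INSERT INTO EmailCounts (ORGANISATION, COUNT) VALUES (?, 1)", (org,))
    else:
        cur.execute("UPDATE EmailCounts SET COUNT = COUNT + 1 WHERE ORGANISATION = ?", (org,))
conn.commit()

counts = dict(cur.execute("SELECT ORGANISATION, COUNT FROM EmailCounts"))
conn.close()
print(counts)
```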

Sample program to count emails corresponding to organisation

#http://www.pythonlearn.com/code/mbox.txt

import sqlite3
import sys

conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()

#drop table in database, if exists
cur.execute('DROP TABLE IF EXISTS EmailCounts ')

#create table EmailCounts
cur.execute('CREATE TABLE EmailCounts (ORGANISATION TEXT, COUNT INTEGER)')

if ( len(sys.argv) < 2 ) :
    print 'Invalid arguments, Input file missing, exiting !!'
    sys.exit(1)
filename = sys.argv[1]
filehandle = open(filename)
print "Processing input data file......"
for line in filehandle:
    if not line.startswith('From: ') :
        continue
    emailPart = line.split()
    # e.g. stephen.marquard@uct.ac.za
    email = emailPart[1]
    org1=email.split("@")
    
    # domain part, e.g. uct.ac.za
    orgVal = org1[1]
    #print org1[1]
    cur.execute('SELECT COUNT FROM EmailCounts WHERE ORGANISATION = ? ', (orgVal, ))
    row = cur.fetchone()
    if row is None:
        cur.execute('''INSERT INTO EmailCounts (ORGANISATION, COUNT) 
                VALUES ( ?, 1 )''', ( orgVal, ) )
    else : 
        cur.execute('UPDATE EmailCounts SET COUNT=COUNT+1 WHERE ORGANISATION = ?', 
            (orgVal, ))
    # This statement commits outstanding changes to disk each 
    # time through the loop - the program can be made faster 
    # by moving the commit so it runs only after the loop completes
    conn.commit()

# https://www.sqlite.org/lang_select.html
sqlstr = 'SELECT ORGANISATION, COUNT FROM EmailCounts ORDER BY COUNT DESC LIMIT 10'

print "\n-------------------------"
print "Organization---Counts"
print "-------------------------"
for row in cur.execute(sqlstr) :
    print str(row[0]), row[1]

#close cursor and connection
cur.close()
conn.close()

Sample Output:-
[zytham@s158519-vm backup]$ /usr/bin/python emailCounts.py mbox.txt
Processing input data file......

-------------------------
Organization---Counts
-------------------------
iupui.edu 536
umich.edu 491
indiana.edu 178
caret.cam.ac.uk 157
vt.edu 110
uct.ac.za 96
media.berkeley.edu 56
ufp.pt 28
gmail.com 25
et.gatech.edu 17
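
The comment in the program notes that committing once after the loop is faster. One possible variation, sketched here for Python 3 under the assumption that all the counts fit in memory, is to count the domains with collections.Counter first and write them with a single executemany call and a single commit. The sample lines are made up and stand in for mbox.txt.

```python
import sqlite3
from collections import Counter

# Made-up input lines standing in for the real file.
sample_lines = [
    "From: alice@example.org",
    "From: bob@example.com",
    "From: carol@example.org",
    "From: dave@example.org",
]

# Count the domains in memory first.
counts = Counter(
    line.split()[1].split("@")[1]
    for line in sample_lines
    if line.startswith("From: ")
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmailCounts (ORGANISATION TEXT, COUNT INTEGER)")
# One batched insert and one commit instead of one commit per line.
conn.executemany(
    "INSERT INTO EmailCounts (ORGANISATION, COUNT) VALUES (?, ?)",
    counts.items(),
)
conn.commit()

top = conn.execute(
    "SELECT ORGANISATION, COUNT FROM EmailCounts ORDER BY COUNT DESC LIMIT 10"
).fetchall()
conn.close()

for organisation, count in top:
    print(organisation, count)
```

This trades the per-line SELECT and UPDATE round trips for in-memory counting, which is reasonable for a file of this size but is a design choice rather than a requirement.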