Sep 15, 2016

Big Data: An overview of big data technologies and why learning them matters (need-of-the-hour technologies)

Do You Want to Become a Big Data Hadoop Expert?

Hadoop is one of the most sought-after skills today. Big data professionals are expected to prove their skills with the tools and techniques of the Hadoop stack. Market analysts observe that although there are many Hadoop professionals in the market, demand for skilled and trained professionals remains high.

We are living through the Big Data revolution: many organizations rely on collecting, analyzing, organizing and synthesizing huge amounts of data in order to survive in a competitive market. Organizations, whether government or private, use big data to manage their products and services and to attract new customers. In this blog, let’s look at the career path of a Hadoop expert.

Introduction to Big Data
The term Big Data refers to collections of datasets that are so large and complex that traditional data management tools cannot store or process them; technologies such as Hadoop are used instead. The main challenges of processing such data include capturing, curating, storing, searching, sharing, transferring and analyzing it.

The following 5 Vs characterize the data processing challenges:
1. Volume: The sheer amount of data, which keeps growing day by day and becomes too large to process with traditional tools.

2. Variety: Big data comes from many sources, from legacy databases to social media, and it can be structured or unstructured.

3. Velocity: The pace at which different sources generate data varies. A big data system has to keep up with fast, continuous streams of incoming data.

4. Veracity: Data may be missing, inconsistent or incomplete. Veracity refers to this uncertainty about the quality and availability of data.

5. Value: Not all of the massive amount of available data is valuable. Turning it into information that benefits the organization is the goal of big data tools such as Hadoop.

Hadoop and Its Architecture

Hadoop's storage layer, HDFS, consists mainly of two components, NameNode and DataNode, while YARN and MapReduce handle resource management and processing. Each is described below:

NameNode is the master daemon that manages the file system namespace: the files and directories, their permissions and hierarchy, and every change made to the file system. As soon as a change is made, for example when a file is deleted, it is immediately recorded in the EditLog. The NameNode also receives a heartbeat and a block report from every DataNode to make sure that the DataNodes are live.

DataNodes are slave daemons that run on the slave machines. The actual data is stored on the DataNodes, and they serve read and write requests from clients. Following the NameNode's decisions, DataNodes also delete or replicate blocks. Resource management for the cluster is handled separately by YARN (Yet Another Resource Negotiator).

YARN Resource Manager
ResourceManager works at the cluster level and runs on the master machine. Its two main responsibilities are managing cluster resources and scheduling applications.

YARN Node Manager
NodeManager is the node-level component of YARN and runs on every slave machine. Its main responsibilities include managing and monitoring containers, managing logs and keeping track of node health. NodeManager continuously communicates with ResourceManager.

MapReduce is a core Hadoop component that provides the processing logic. This software framework helps in writing applications that process large data sets using parallel, distributed algorithms in a Hadoop environment. The map function performs operations such as filtering, grouping and sorting, while the reduce function is responsible for aggregation, summarization and producing the final result.
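
To make the map and reduce roles concrete, here is a minimal single-process sketch of a word count in plain Python. It is not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and a real job would run these phases in parallel across the cluster.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: aggregate the values collected for one key."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle and reduce one after another on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

counts = word_count(["big data hadoop", "hadoop big cluster"])
print(counts)
```

Each line is mapped independently, which is what lets a real cluster spread the map work over many machines.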

Career Path of a Hadoop Developer

There can be many challenges when starting a career as a Hadoop developer. Here we summarize some key factors on the path of Hadoop professionals, which will help you shape a successful career.

Required Educational and Technical Skill
The course can also be joined by candidates from non-software backgrounds, such as graduates in Statistics, Electronics, Physics, Mathematics, Business Analytics and Material Processing. As far as experience is concerned, newcomers and professionals with 1-2 years of experience can become Hadoop developers. Employers mainly judge candidates on their knowledge and their zeal to learn new concepts.

On the technical side, knowledge of Java concepts is required, though you may not need advanced Java for Hadoop. Professionals from other streams also learn Hadoop and switch their careers to this in-demand platform.

Hadoop Certifications and Learning Path
A common question among Hadoop developers is “What is the value of certification for this profession?” A certification gives employers a way to judge and verify a candidate’s skills. Various Hadoop certifications are available for developers, from vendors such as IBM, EMC and MapR, among others. One can apply and get certified in the technology.

As far as the learning path is concerned, candidates who fulfill the basic educational requirements, with or without relevant experience, can apply for the position of Hadoop developer.

A number of companies hire Hadoop professionals and offer them attractive salary packages. Because the demand for Hadoop professionals is higher than the supply, Hadoop has become one of the most sought-after skills among big data professionals.

Reported figures suggest that a Hortonworks Hadoop professional can earn around $170,472, while Walmart offers an average package of $128K to Hadoop professionals in California. In countries like the USA, Hadoop professionals earn a salary package of $145K on average. These figures give a sense of the demand for Hadoop professionals these days.

Final Words

For those who have a passion for data analytics and statistics, Hadoop is a great choice. You can dive deep into the technology, and it can prove to be a lucrative career option. Good luck!

About Author
Manchun Pandit loves pursuing excellence through writing and has a passion for technology. He has successfully managed and run personal technology magazines and websites. He currently writes for a global training company that provides e-learning and professional certification training.

Sep 13, 2016

Apache Hadoop Interview Questions - Set 1

1. What is Hadoop and how is it related to Big data?
Answer:- Gartner's 2012 definition of big data reads: "Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."
Hadoop is a framework that allows distributed processing of large data sets across clusters of commodity systems using programming models like MapReduce. (Commodity hardware is inexpensive hardware without high-availability traits.)
As a business expands, its volume of data also grows, and unstructured data gets dumped onto different machines for analysis. The major challenge is not storing large data but retrieving and analysing it, especially when the data is spread across many machines.
This is where the Hadoop framework comes to the rescue. Hadoop can analyse data spread across many machines quickly and cost-effectively. It uses the MapReduce programming model, which enables it to process data sets in parallel.

2. What is the Hadoop ecosystem and what are its building blocks?
Answer:- The Hadoop ecosystem refers to the various components of the Apache Hadoop software library, as well as to the accessories and tools provided by the Apache Software Foundation for these types of software projects, and to the ways that they work together.
Core components of Hadoop are:
1. MapReduce - a framework for parallel processing of vast amounts of data.
2. Hadoop Distributed File System (HDFS) - a sophisticated distributed file system.
3. YARN - the Hadoop resource manager.
In addition to these core elements of Hadoop, Apache has also delivered other complementary tools for developers. These include Apache Hive, a data analysis tool; Apache Spark, a general engine for processing big data; Apache Pig, a data flow language; HBase, a database tool; and Ambari, which can be considered a Hadoop ecosystem manager, as it helps to administer these various Apache resources together.

3. What is the fundamental difference between classic Hadoop 1.0 and Hadoop 2.0?
Hadoop 1.x | Hadoop 2.x
Limited to about 4000 nodes per cluster. | Potentially up to 10000 nodes per cluster.
Supports only the MapReduce processing model. | Along with MapReduce, supports other distributed computing models (non-MR) such as Spark, Hama, Giraph, Message Passing Interface (MPI) and HBase co-processors.
Job tracker is a bottleneck: it is responsible for resource management, scheduling and monitoring (MR does both processing and cluster resource management). | YARN (Yet Another Resource Negotiator) does cluster resource management, and processing is done using different processing models. Efficient cluster utilisation is achieved using YARN.
MapReduce slots are static. A given slot can run either a Map task or a Reduce task only. | Works on the concept of containers. Containers can run generic tasks.
Only one namespace for managing HDFS. | Multiple namespaces for managing HDFS.
The single NameNode is a single point of failure (SPOF), and a NameNode failure needs manual intervention. | SPOF is overcome with a standby NameNode, which can be configured for automatic recovery in case of NameNode failure.

4. What are Job tracker and Task tracker? How are they used in a Hadoop cluster?
Answer:- Job Tracker is a daemon that runs on a master node (in small Hadoop 1 clusters, often the same machine as the Namenode) for submitting and tracking MapReduce jobs in Hadoop. Some typical tasks of Job Tracker are:
- Accepts jobs from clients
- It talks to the NameNode to determine the location of the data.
- It locates TaskTracker nodes with available slots at or near the data.
- It submits the work to the chosen Task Tracker nodes and monitors progress of each task by receiving heartbeat signals from Task tracker.
Task tracker is a daemon that runs on Datanodes. It accepts tasks (Map, Reduce and Shuffle operations) from the Job Tracker. Task Trackers manage the execution of individual tasks on a slave node. When a client submits a job, the job tracker initialises the job, divides the work and assigns it to different task trackers to perform the MapReduce tasks. While performing the tasks, each task tracker keeps communicating with the job tracker by sending heartbeats. If the job tracker does not receive a heartbeat from a task tracker within a specified time, it assumes that the task tracker has crashed and assigns its tasks to another task tracker in the cluster.
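
The heartbeat-and-reassign behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the job tracker's actual code: the function names, the tracker ids and the 30-second timeout are all assumptions chosen for the example.

```python
def find_dead_trackers(last_heartbeat, now, timeout_seconds):
    """Return the trackers whose last heartbeat is older than the timeout."""
    return sorted(tracker for tracker, seen in last_heartbeat.items()
                  if now - seen > timeout_seconds)

def reassign_tasks(assignments, dead, live):
    """Move tasks away from dead trackers, handing them to live ones in turn."""
    new_assignments = dict(assignments)
    moved = 0
    for task, tracker in sorted(assignments.items()):
        if tracker in dead:
            new_assignments[task] = live[moved % len(live)]
            moved += 1
    return new_assignments

# Hypothetical state: tt2 last reported 65 seconds ago, the others recently.
last_heartbeat = {"tt1": 100.0, "tt2": 40.0, "tt3": 98.0}
assignments = {"map-0": "tt1", "map-1": "tt2", "reduce-0": "tt2"}

dead = find_dead_trackers(last_heartbeat, now=105.0, timeout_seconds=30.0)
live = sorted(set(last_heartbeat) - set(dead))
print(dead)
print(reassign_tasks(assignments, dead, live))
```

Tasks that were running on a healthy tracker stay where they are; only the work of the silent tracker is handed out again.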

5. What's the relationship between Jobs and Tasks in Hadoop?
Answer:- In Hadoop, jobs are submitted by the client, and each job is split into multiple tasks such as Map, Reduce and Shuffle.

6. What is HDFS (Hadoop distributed file system)? Why is HDFS termed a block structured file system? What is the default HDFS block size?
Answer:- HDFS is a file system designed for storing very large files. HDFS is highly fault-tolerant, offers high throughput, suits applications with large data sets and streaming access to file system data, and can be built out of commodity hardware (commodity hardware is inexpensive hardware without high-availability traits).
HDFS is termed a block structured file system because individual files are broken into blocks of a fixed size (the default size of an HDFS block is 128 MB). These blocks are stored across a cluster of one or more machines with data storage capacity. Changing the dfs.blocksize property in hdfs-site.xml changes the default block size for files subsequently placed into HDFS.

7. Why are HDFS blocks large compared to disk blocks (the HDFS default block size is 128 MB, whereas a disk block in Unix/Linux is typically only a few kilobytes, e.g. 4096 bytes)?
Answer:- HDFS is better suited to a large amount of data in a single file than to a small amount of data spread across many files. To minimise seek time during read operations, files are stored in large chunks of the HDFS block size.
If a file is smaller than 128 MB, its block uses only as much space as the file itself; the rest remains available for other files.
If a particular file is 110 MB, will the HDFS block still consume 128 MB as the default size?
No, only 110 MB will be consumed by the HDFS block, and the remaining 18 MB stays free to store something else.
Note:- In Hadoop 1 the default block size is 64 MB, and in Hadoop 2 the default block size is 128 MB.
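
The block arithmetic above can be checked with a small Python sketch. The helper name split_into_blocks is hypothetical; it only models how a file size divides into fixed-size blocks plus one smaller final block, not how HDFS actually allocates storage.

```python
MB = 1024 * 1024

def split_into_blocks(file_size, block_size=128 * MB):
    """Return the sizes (in bytes) of the blocks a file of file_size bytes occupies."""
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only takes up as much space as the data left over.
        sizes.append(remainder)
    return sizes

print([size // MB for size in split_into_blocks(110 * MB)])
print([size // MB for size in split_into_blocks(300 * MB)])
print([size // MB for size in split_into_blocks(300 * MB, block_size=64 * MB)])
```

With the 128 MB default, a 110 MB file occupies a single 110 MB block, and a 300 MB file occupies two full blocks plus a 44 MB block.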

8. What is the significance of fault tolerance and high throughput in HDFS?
Answer:- Fault tolerance:- In Hadoop, when we store a file, it is automatically replicated at two other locations as well (with the default replication factor of 3). So even if one or two of the systems fail, the file is still available on the third system. The chance of data loss is thus minimised, and the data can be recovered if one node fails.
Throughput:- Throughput is the amount of work done per unit of time. When a client submits a job, it is divided and shared among different systems, which execute their assigned tasks independently and in parallel on the data stored in HDFS. The work is therefore completed in a short time, and in this way HDFS provides good throughput.

9. What does "Replication factor" mean in Hadoop? What is default replication factor in HDFS ? How to modify default replication factor in HDFS ?
Answer:- The number of copies of a file kept in HDFS is termed the replication factor.
The default replication factor in HDFS is 3. Changing the dfs.replication property in hdfs-site.xml changes the default replication for all files placed in HDFS.
The actual number of replicas can be specified when the file is created. The default is used if replication is not specified at create time.
We can change the replication factor for a single file, or for all files in a directory, using the hadoop fs shell.
$ hadoop fs -setrep -w 3 /MyDir/file
$ hadoop fs -setrep -R -w 3 /RootDir/Mydir

10. What are Datanode and Namenode in HDFS?
Answer:- Datanodes are the slaves which are deployed on each machine and provide the actual storage. They are responsible for serving read and write requests from clients.
Namenode is the master node. It stores metadata about the actual storage of data blocks so that it can manage the blocks present on the datanodes (in Hadoop 1 the job tracker often runs on this machine as well). Because the entire HDFS relies on it, the Namenode should run on a high-availability machine rather than on commodity hardware.

11. Can the Namenode and Datanode systems have the same hardware configuration?
Answer:- In a single-node cluster there is only one machine, so the Namenode and Datanode run on the same machine. However, in a production environment the Namenode and datanodes run on different machines, and the Namenode should be a high-end, high-availability machine.

12. What is the fundamental difference between traditional RDBMS and Hadoop?
Answer:- Traditional RDBMS is used for transactional systems, whereas Hadoop is an approach to storing huge amounts of data in a distributed file system and processing it.
RDBMS | Hadoop
Data size is in the order of gigabytes | Data size is in the order of petabytes or zettabytes
Access methods: interactive and batch | Access method: batch only
Static schema | Dynamic schema
Nonlinear scaling | Linear scaling
High integrity | Low integrity
Suitable for reading and writing many times | Suitable for write once, read many times

12. What is the secondary Namenode and what is its significance in Hadoop?
Answer:- In Hadoop 1, the Namenode was a single point of failure. To keep the Hadoop system up and running, it was important to make the Namenode resilient to failure and able to recover from it. If the Namenode fails, no data access is possible from the datanodes, as the Namenode stores the metadata about the data blocks stored on the datanodes.
The main file written by the Namenode is called fsimage. This file is read into memory, and all future modifications to the filesystem are applied to this in-memory representation of the filesystem. The Namenode does not write out new versions of fsimage as changes are applied while it is running; instead, it writes another file called edits, which is a list of the changes that have been made since the last version of fsimage was written.
Secondary Namenode is used to periodically merge the Namespace image with the edit log to prevent the edit log from becoming too large. The secondary Namenode usually runs on a separate physical machine because it requires plenty of CPU and as much memory as the Namenode to perform the merge. It maintains a copy of the merged namespace image, which can be used in the event of the Namenode failing. However, the state of the secondary Namenode lags that of the primary, so in the event of total failure of the primary, data loss is almost certain.
Note:- The secondary Namenode is not a standby for the primary Namenode, so it is not a substitute for it.
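
The fsimage/edits relationship can be illustrated with a toy checkpoint in Python. This is a deliberately simplified model, not the real on-disk formats: the namespace is a plain dictionary from path to block ids, and the operation names (create, delete, rename) and helper names are assumptions made for the example.

```python
def apply_edit(namespace, edit):
    """Apply one logged change to an in-memory namespace (path -> list of block ids)."""
    op, path = edit[0], edit[1]
    if op == "create":
        namespace[path] = list(edit[2])
    elif op == "delete":
        namespace.pop(path, None)
    elif op == "rename":
        namespace[edit[2]] = namespace.pop(path)
    else:
        raise ValueError("unknown operation: %s" % op)

def checkpoint(fsimage, edits):
    """Merge the edit log into a copy of fsimage; return the new image and an empty log."""
    merged = {path: list(blocks) for path, blocks in fsimage.items()}
    for edit in edits:
        apply_edit(merged, edit)
    return merged, []

fsimage = {"/logs/a.txt": ["blk_1"]}
edits = [
    ("create", "/logs/b.txt", ["blk_2", "blk_3"]),
    ("rename", "/logs/a.txt", "/logs/old.txt"),
    ("delete", "/logs/b.txt"),
]
new_fsimage, new_edits = checkpoint(fsimage, edits)
print(new_fsimage)
print(new_edits)
```

After the merge the edit log starts empty again, which is why periodic checkpoints keep it from growing without bound.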

13. What is the importance of heartbeat in HDFS?
Answer:- A heartbeat is a signal that a node sends to indicate that it is alive. A datanode sends heartbeats to the Namenode, and a task tracker sends heartbeats to the job tracker.
If the Namenode or job tracker does not receive a heartbeat, it decides that there is a problem with the datanode or that the task tracker is unable to perform the assigned task.

14. What is HDFS cluster?
Answer:- HDFS cluster is the name given to the whole configuration of master and slaves where data is stored. In other words, the collection of commodity Datanode machines and the high-availability Namenode is collectively termed an HDFS cluster.

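14. What is the communication channel between client and namenode/datanode?
Answer:- Clients communicate with the Namenode and datanodes over TCP/IP. Metadata requests go to the Namenode through Hadoop's RPC protocol, and block data is streamed directly to and from the datanodes. SSH is used only by the cluster start/stop scripts to launch daemons on remote machines, not for client communication.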

15. What is a rack? What is the Replica Placement Policy?
Answer:- A rack is a physical collection of datanodes housed at a single location. There can be multiple racks in a single location.
When a client wants to load a file into the cluster, the content of the file is divided into blocks, and for every block the Namenode provides information about 3 datanodes, which indicates where the block should be stored.
While placing the replicas, the key rule followed is “for every block of data, two copies will exist in one rack, third copy in a different rack”. This rule is known as the “Replica Placement Policy”.
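
A minimal Python sketch of this rule is shown below. It assumes a hypothetical topology dictionary (rack name -> datanode names) and simply takes the first suitable nodes; the real Namenode also weighs factors such as node load and picks nodes randomly, none of which is modelled here.

```python
def choose_replica_nodes(topology, local_rack):
    """Return three datanodes: one on local_rack and two on a single other rack."""
    first = topology[local_rack][0]
    for rack, nodes in sorted(topology.items()):
        if rack != local_rack and len(nodes) >= 2:
            # Two copies share this rack, the remaining copy sits on local_rack.
            return [first, nodes[0], nodes[1]]
    raise ValueError("need a second rack with at least two datanodes")

# Hypothetical two-rack cluster.
topology = {
    "rack1": ["dn1", "dn2"],
    "rack2": ["dn3", "dn4", "dn5"],
}
placement = choose_replica_nodes(topology, "rack1")
print(placement)
```

Losing an entire rack still leaves at least one copy of the block on the other rack.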

Sep 4, 2016

Bash Script: Generate Pascal triangle

Problem statement:- Write a bash script to print Pascal's triangle.
Input:- Number of rows
Wiki:- In mathematics, Pascal's triangle is a triangular array of the binomial coefficients.

#generate pascal triangle
echo -n "Enter the number of Row "
read NR

typeset -A arr   #associative array keyed by "row,column"

for ((i=0;i<=NR;i++));do
 arr[$i,0]=1 #start is 1
 arr[$i,$i]=1  #end is 1
 for ((j=1;j<i;j++));do
  #each inner entry is the sum of the two entries above it
  arr[$i,$j]=$(( ${arr[$((i-1)),$((j-1))]} + ${arr[$((i-1)),$j]} ))
 done
done
#print result
for ((i=0;i<=NR;i++));do
 for ((j=0;j<=i;j++));do
  echo -n ${arr[$i,$j]} " "
 done
 printf "\n"
done

Sample output:-
[zytham@s158519-vm scripts]$ sh
Enter the number of Row 4
1  1
1  2  1
1  3  3  1
1  4  6  4  1
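
For comparison, the same recurrence (each inner entry is the sum of the two entries above it) can be sketched in Python. The function name pascal_rows is just an illustrative choice.

```python
def pascal_rows(n):
    """Return rows 0..n of Pascal's triangle as lists of integers."""
    rows = []
    for i in range(n + 1):
        row = [1] * (i + 1)          # first and last entries are always 1
        for j in range(1, i):
            row[j] = rows[i - 1][j - 1] + rows[i - 1][j]
        rows.append(row)
    return rows

for row in pascal_rows(4):
    print("  ".join(str(value) for value in row))
```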

Bash Shell Script Sample Code - Part 2

1.  Write a bash script to find GCD and LCM of two numbers.

#find GCD  and LCM of two numbers
printf "Enter first number: "
read n1
printf "Enter second number: "
read n2
m=$n1   #keep the original inputs for the final report
n=$n2
r=1
while [ $r -ne 0  ];do
  r=$(( n1%n2 ))
  if [ $r -eq 0 ];then
    break   #n2 now holds the GCD
  fi
  n1=$n2
  n2=$r
done
printf  "GCD of %d and %d is %d \n" $m $n $n2
printf  "LCM Of %d and %d is %d \n" $m $n $((($m*$n)/$n2))
Sample output:-
[zytham@s158519-vm scripts]$ sh 
Enter first number: 30
Enter second number: 12
GCD of 30 and 12 is 6
LCM Of 30 and 12 is 60

2. Write a bash script to find GCD and LCM, with the input passed via the command line while executing the script (command line arguments).
#find GCD and LCM of two numbers - input at runtime
m=$1; n=$2          #original inputs from the command line
n1=$m; n2=$n; r=1
while [ $r -ne 0  ];do
  r=$(( n1%n2 ))
  if [ $r -eq 0 ];then
    break   #n2 now holds the GCD
  fi
  n1=$n2; n2=$r
done
printf  "GCD of %d and %d is %d \n" $m $n $n2
printf "LCM Of %d and %d is %d \n" $m $n $((($m*$n)/$n2))

Sample output:-
[zytham@s158519-vm scripts]$ sh 30 12
GCD of 30 and 12 is 6
LCM Of 30 and 12 is 60

Explanation:- Here 30 and 12 are the inputs to the bash script. The script accesses them as $1 and $2, so n1 is 30 and n2 is 12.
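
The Euclidean algorithm used by both scripts can also be sketched in Python, which makes the remainder step easy to follow. The function names gcd and lcm are illustrative; the LCM uses the identity lcm(a, b) = a * b / gcd(a, b), as the scripts do.

```python
def gcd(a, b):
    """Euclidean algorithm: replace (a, b) by (b, a % b) until the remainder is 0."""
    while b:
        a, b = b, a % b
    return a

def lcm(a, b):
    """LCM via the identity lcm(a, b) * gcd(a, b) == a * b."""
    return a * b // gcd(a, b)

print(gcd(30, 12))
print(lcm(30, 12))
```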

3. Write a bash script to find factorial of a number. 

#find factorial of a number - input Passed as command line arguments 
n=$1 #Input passed via command line is stored in n 
fact=1
for ((i=1;i<=n;i++));do
  fact=$(( fact*i ))
done
printf "factorial of number %d is:  %d\n" $n $fact

Sample output:-
[zytham@s158519-vm scripts]$ sh 5
factorial of number 5 is:  120
[zytham@s158519-vm scripts]$ sh 13
factorial of number 13 is:  6227020800

4. Write a bash script to find the binomial coefficient C(M,N). The value of the coefficient is given by the expression M! / (N! * (M-N)!).
#find binomial coefficient of a two number - input Passed as command line arguments 
M=$1
N=$2
dif=$(( M-N ))

#function to find factorial of a number-Pass input argument to function
function factorial
{
 n=$1
 fact=1
 for ((i=1;i<=n;i++));do
  fact=$(( fact*i ))
 done
}

#call function for finding factorial of M
factorial $1
factM=$fact
#call function for finding factorial N
factorial $2
factN=$fact
#call function for finding factorial M-N
factorial $dif
factD=$fact

bico=$(( factM/(factN*factD) ))
printf "Binomial coefficient of C($M $N) is %d \n" $bico

Sample output:-
[zytham@s158519-vm scripts]$ sh 9 3
Binomial coefficient of C(9 3) is 84
[zytham@s158519-vm scripts]$ sh 8 4
Binomial coefficient of C(8 4) is 70
[zytham@s158519-vm scripts]$ sh 5 2
Binomial coefficient of C(5 2) is 10

Explanation:- Two inputs are passed from the command line and are propagated to the function one by one to find the factorials of M, N and M-N. Like a bash script, a function also accesses its passed argument as $1.

5. Write a bash script to check whether a number is a perfect number or not, with the input passed from the command line. (A perfect number is a positive integer that is equal to the sum of its proper positive divisors, that is, the sum of its positive divisors excluding the number itself.)
#To check perfect number or not input via command line
function perfectNumCheck
{
 m=$1
 sum=0
 iter=$(( m/2 ))   #no proper divisor is larger than m/2
 for ((i=1;i<=iter;i++));do
   if (( $m%i==0)); then
       sum=$((sum + i))
   fi
 done
 if [ $sum -eq $m ]; then
  echo "perfect"
 else
  echo "Not perfect"
 fi
}

perfectNumCheck $1
Sample output:-
[zytham@s158519-vm scripts]$ sh 4
Not perfect
[zytham@s158519-vm scripts]$ sh 6
perfect
[zytham@s158519-vm scripts]$ sh 28
perfect

6. Write a bash script to generate Fibonacci numbers. (The number of Fibonacci numbers to generate is passed via the command line.)

#fibonacci generation - Input via command line regarding how many fibonacci number
function fibonacci
{
 N=$1
 a=0
 b=1
 if [ $N -eq 1 ]; then
  printf 1
 else
  printf $((a+b))
  printf "\t"
  while [ $N -ne 1 ];do
   sum=$((a+b))
   printf $sum
   printf "\t"
   a=$b
   b=$sum
   N=$((N-1))
  done
 fi
}
fibonacci $1
printf "\n"

Sample output:-
[zytham@s158519-vm scripts]$ sh  5
1 1 2 3 5
[zytham@s158519-vm scripts]$ sh  8
1 1 2 3 5 8 13 21
[zytham@s158519-vm scripts]$ sh  9
1 1 2 3 5 8 13 21 34

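7. Write a bash script to check whether a given number is an Armstrong number or not (the script cubes each digit, so it applies to three-digit numbers).
#check armstrong number 
echo -n "Enter number : "
read n
input=$n   #keep the original value, n is consumed digit by digit
sum=0
while [ $n -gt 0 ]; do
 (( rem= n%10 ))
 ((sum=$sum+(rem**3) ))
 (( n= n/10 ))
 # echo $sum 
done
if [ $input -eq $sum ];then
 echo "$input is an armstrong number "
else
 echo "$input is not armstrong number "
fi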

Sample output:-
[zytham@s158519-vm scripts]$ sh
Enter number : 153
153 is an armstrong number
[zytham@s158519-vm scripts]$ sh
Enter number : 145
145 is not armstrong number

Sep 2, 2016

Vector Arithmetic in Python : Dot product and Cross product

Write a sample program to perform addition (+), subtraction (-), dot product and cross product between two vectors. Also find the angle between the two vectors.

from math import *

class vector:
    def __init__(self, x, y, z):
        self.x = float(x)
        self.y = float(y)
        self.z = float(z)
    def dot(self, other):
        return self.x*other.x + self.y*other.y + self.z*other.z
    def cross(self, other):
        return vector(self.y*other.z-self.z*other.y, self.z*other.x-self.x*other.z, \
                self.x*other.y-self.y*other.x)
    def mod(self):
        return pow(self.x**2+self.y**2+self.z**2, 0.5)
    def __sub__(self, other):
        return vector(self.x-other.x, self.y-other.y, self.z-other.z)
    def __add__(self, other):
        return vector(self.x+other.x, self.y+other.y, self.z+other.z)
    def __str__(self, precision=2):
        return str(("%."+"%df"%precision)%float(self.x))+'i'+('+' if self.y>=0 else '-')+ \
                str(("%."+"%df"%precision)%float(abs(self.y)))+'j'+('+' if self.z>=0 else '-')+\
                str(("%."+"%df"%precision)%float(abs(self.z)))+'k'
if __name__ == "__main__":
    print "Enter x,y,z(separated by space) value of vector-1" 
    A = vector(*map(float, raw_input().strip().split()))
    print "Enter x,y,z(separated by space) value of vector-2" 
    B = vector(*map(float, raw_input().strip().split()))
    print "A + B: " + str(A+B)
    print "A - B: " +str(A-B)
    print "A[.]B: " + str(A.dot(B))
    print "A[X]B: " + str(A.cross(B))
    print "Modulus of A (|A|): " + str(A.mod())
    print "Modulus of B (|B|): "+ str(B.mod())
    # acos returns radians; degrees() converts the angle for display
    print "Angle between them in degrees is %.2f"%degrees(acos(A.dot(B)/(A.mod()*B.mod())))
Sample output:-
Enter x,y,z(separated by space) value of vector-1
2 4 -5
Enter x,y,z(separated by space) value of vector-2
-1 3 -2
A + B: 1.00i+7.00j-7.00k
A - B: 3.00i+1.00j-3.00k
A[.]B: 20.0
A[X]B: 7.00i+9.00j+10.00k
Modulus of A (|A|): 6.7082039325
Modulus of B (|B|): 3.74165738677
Angle between them in degrees is 37.17

Here we have used the operator overloading concept to add and subtract vectors; the complex number post below gives another example of operator overloading in Python. To find the angle between the two vectors we used math.acos(x), which returns the arc cosine of x in radians, and converted the result to degrees with math.degrees(). The code is written for Python 2 (print statements and raw_input).

Operator overloading in python: Python Arithmetic on Complex numbers

Write a sample program to perform addition (+), subtraction (-), multiplication (*) and division (/) between complex numbers. Also compute the modulus, sqrt(real^2 + imaginary^2), of a complex number.
from math import pow
class complex:
    def __init__(self, real, imag):
        self.real = real
        self.imag = imag
    def __add__(self, other):
        return complex(self.real+other.real, self.imag+other.imag)
    def __sub__(self, other):
        return complex(self.real-other.real, self.imag-other.imag)
    def __mul__(self, other):
        return complex(self.real*other.real-self.imag*other.imag, self.real*other.imag+self.imag*other.real)
    def __div__(self, other):
        try:
            # a/b = a * conjugate(b) / |b|^2
            return self.__mul__(complex(other.real, -1*other.imag)).__mul__(complex(1.0/(other.mod().real)**2, 0))
        except Exception as e:
            print e
            return None
    def mod(self):
        return pow(self.real**2+self.imag**2, 0.5)
    def __str__(self, precision=2):
        return str(("%."+"%df"%precision)%float(self.real))+('+' if self.imag>=0 else '-')+str(("%."+"%df"%precision)%float(abs(self.imag)))+'i'

print "Enter real and imaginary part of complex No - 1(separated by space)" 
A = complex(*map(float, raw_input().strip().split()))
print "Enter real and imaginary part of complex No - 2(separated by space)" 
B = complex(*map(float, raw_input().strip().split()))

print "Addition: " + str(A+B)
print "Subtraction: " + str(A-B)
print "Multiplication: " + str(A*B)
print "Division: "+ str(A/B)
print "Modulus of complex Number A: " + str(A.mod())
Sample output:-
Enter real and imaginary part of complex No - 1(separated by space)
3 4
Enter real and imaginary part of complex No - 2(separated by space)
2 3
Addition: 5.00+7.00i
Subtraction: 1.00+1.00i
Multiplication: -6.00+17.00i
Division: 1.38-0.08i
Modulus of complex Number A: 5.0

Python lets a class define the special methods behind the built-in arithmetic operators, such as __add__() and __sub__(). This is similar to overriding the __str__() method to print formatted output, as we have done in the code above.

What happens internally when we perform the + operation between two objects (remember that even a primitive integer is an object in Python)? When we add two objects in Python, like
a + b
the __add__ method of the a object is called: a.__add__(b)
This means that we can control the result of adding two objects by defining our own __add__ method.
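
A small stand-alone sketch shows this dispatch in action. The Money class is a made-up example, not part of the programs above; the code avoids print statements with multiple arguments so that it runs under both Python 2 and Python 3.

```python
class Money(object):
    """Toy value type holding an amount in cents."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called for `self + other`; refuse operands we do not understand.
        if isinstance(other, Money):
            return Money(self.cents + other.cents)
        return NotImplemented

    def __repr__(self):
        return "Money(%d)" % self.cents

a = Money(150)
b = Money(275)
total = a + b            # the + operator dispatches to Money.__add__
explicit = a.__add__(b)  # the same call written out by hand
print(total)
print(explicit)
```

Returning NotImplemented for unsupported operands lets Python raise a TypeError for something like Money(1) + 5 instead of silently producing a wrong result.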