Database CRUD Operation in Python- SQLite and Python

In this post we will use Python to work with data in SQLite database files. We will create table, insert records into it, update table and finally perform SELECT based on some condition.

Problem statement
: - Process this text file and count the number email messages per organisation (i.e. Domain name of the email address consider as unique organisation name)

Database Schema:-
Table with two columns ORGANISATION and COUNT of Emails.
< CREATE TABLE EmailCounts (ORGANISATION TEXT, COUNT INTEGER) >

SQLite database:- SQLite is a light weight relational database management system. In contrast to many other database management systems, SQLite is not a client–server database engine. Rather, it is embedded into the end program.(Source: Wiki)

How to connect SQLite database ? :-

Python provides support for SQLite database in sqlite3. First, we need to import this in and followed by connect method is used to connect database. The connect operation makes a “connection” to the database stored in the file "emaildb.sqlite" in the current directory. If the file does not exist, it will be created.
Using conn.cursor() we get a hadle of file (like a file handle we obtain using open() method). Once cursor(handle) is obtained, we can use to perform operations on the data stored in the database using execute() method. Below sample code demonstrate the same in sequence.

import sqlite3

conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS EmailCounts ')
cur.execute('CREATE TABLE EmailCounts (ORGANISATION TEXT, COUNT INTEGER)')

conn.close()

Now we will process the this text file which contains raw data and insert data into database. Consider following code lines for processing the raw data in file.
Raw data:-
From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
Received: from murder (mail.umich.edu [141.211.14.90])
by frankenstein.mail.umich.edu (Cyrus v2.3.8) with LMTPA;
Sat, 05 Jan 2008 09:14:16 -0500
X-Sieve: CMU Sieve 2.3
... ......   . ..
......  . ..............

We need to open file mbox.txt and read each line, when we get Line starting from From ...  process it and filter out domain name. If That domain name is not in Database Insert into database else update the count(increment count by 1).

Sample program to count emails corresponding to organisation

#http://www.pythonlearn.com/code/mbox.txt

import sqlite3
import sys

conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()

#drop table in database, if exists
cur.execute('DROP TABLE IF EXISTS EmailCounts ')

#create table EmailCounts
cur.execute('CREATE TABLE EmailCounts (ORGANISATION TEXT, COUNT INTEGER)')

if ( len(sys.argv) < 2 ) :
    print 'Invalid arguments, Input file missing, exiting !!'
    exit
filename = sys.argv[1]
filehandle = open(filename)
print "Processing input data file......"
for line in filehandle:
    if not line.startswith('From: ') : 
  continue
    emailPart = line.split()
    #stephen.marquard@uct.ac.za
    email = emailPart[1]
    org1=email.split("@")
    
    #stephen.marquard
    orgVal = org1[1]
    #print org1[1]
    cur.execute('SELECT COUNT FROM EmailCounts WHERE ORGANISATION = ? ', (orgVal, ))
    row = cur.fetchone()
    if row is None:
        cur.execute('''INSERT INTO EmailCounts (ORGANISATION, COUNT) 
                VALUES ( ?, 1 )''', ( orgVal, ) )
    else : 
        cur.execute('UPDATE EmailCounts SET COUNT=COUNT+1 WHERE ORGANISATION = ?', 
            (orgVal, ))
    # This statement commits outstanding changes to disk each 
    # time through the loop - the program can be made faster 
    # by moving the commit so it runs only after the loop completes
    conn.commit()

# https://www.sqlite.org/lang_select.html
sqlstr = 'SELECT ORGANISATION, COUNT FROM EmailCounts ORDER BY COUNT DESC LIMIT 10'

print "\n-------------------------"
print "Organization---Counts"
print "-------------------------"
for row in cur.execute(sqlstr) :
    print str(row[0]), row[1]

#close cur/handle 
cur.close()

Sample Output:-
[zytham@s158519-vm backup]$ /usr/bin/python emailCounts.py mbox.txt
Processing input data file......

-------------------------
Organization---Counts
-------------------------
iupui.edu 536
umich.edu 491
indiana.edu 178
caret.cam.ac.uk 157
vt.edu 110
uct.ac.za 96
media.berkeley.edu 56
ufp.pt 28
gmail.com 25
et.gatech.edu 17

9 Comments

  1. This comment has been removed by the author.

    ReplyDelete
  2. I appreciate that you produced this wonderful article to help us get more knowledge about this topic.
    I know, it is not an easy task to write such a big article in one day, I've tried that and I've failed. But, here you are, trying the big task and finishing it off and getting good comments and ratings.
    IELTS Coaching in chennai

    German Classes in Chennai

    GRE Coaching Classes in Chennai

    TOEFL Coaching in Chennai

    spoken english classes in chennai | Communication training

    ReplyDelete
  3. The development of artificial intelligence (AI) has propelled more programming architects, information scientists, and different experts to investigate the plausibility of a vocation in machine learning. Notwithstanding, a few newcomers will in general spotlight a lot on hypothesis and insufficient on commonsense application. machine learning projects for final year In case you will succeed, you have to begin building machine learning projects in the near future.

    Projects assist you with improving your applied ML skills rapidly while allowing you to investigate an intriguing point. Furthermore, you can include projects into your portfolio, making it simpler to get a vocation, discover cool profession openings, and Final Year Project Centers in Chennai even arrange a more significant compensation.


    Data analytics is the study of dissecting crude data so as to make decisions about that data. Data analytics advances and procedures are generally utilized in business ventures to empower associations to settle on progressively Python Training in Chennai educated business choices. In the present worldwide commercial center, it isn't sufficient to assemble data and do the math; you should realize how to apply that data to genuine situations such that will affect conduct. In the program you will initially gain proficiency with the specialized skills, including R and Python dialects most usually utilized in data analytics programming and usage; Python Training in Chennai at that point center around the commonsense application, in view of genuine business issues in a scope of industry segments, for example, wellbeing, promoting and account.


    The Nodejs Projects Angular Training covers a wide range of topics including Components, Angular Directives, Angular Services, Pipes, security fundamentals, Routing, and Angular programmability. The new Angular TRaining will lay the foundation you need to specialise in Single Page Application developer. Angular Training

    ReplyDelete
  4. Great attempts are made to publish content that are quite useful to know. I'm really liking your blog. And positive comments foster positive relationships. You're doing a fantastic job. Keep up the good work. custom erp development

    ReplyDelete
Previous Post Next Post