What was a rather dull and boring day turned into an interesting one thanks to Python!
I was basking in a holiday mood as I prepared to leave London after 1.5 years on a highly satisfying development project. One of my teammates is doing a small project that involves taking a file, extracting the fields and creating an HTML report out of it. It was decided to use a transformation tool that we are using in another part of our project. This tool claims to be able to transform any input schema to any output schema. Our input is a raw flat file which looks like:
header
record
footer
header
record
record
...
footer
Pretty simple file, but we realized that the transformer tool is not able to handle it. I cannot discuss the specifics of the tool or the file, but it is a file that conforms to a widely used standard format. While we engaged the technical support to solve this, my manager started exploring alternatives. I looked at the file and said we could do it in Python using some basic regular expressions and string operations.
He let me give it a try, and here is what I did.
1. read the file and use regex to identify the kind of record represented by the line we read
2. extract the fields in each line using the simple string[x:y] operation
3. add these fields to a dict
4. create a list of each type of record and put the dict on that list
5. then using Cheetah, render an HTML page that can show this
Because I used lists to hold the dicts, most of my operations in the template were looping over a list and printing values from each dict. The data from the header records was put in global variables and passed to the template.
At the end of the day, in about 200 lines (thanks to me putting all those coding standards and OOP into the code) we had a credible Plan B that worked. Also, we could tweak this in a few minutes to work with another file that we needed to use!
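A minimal sketch of the steps above, not the original script. Since the real format cannot be shown here, the record markers (HDR, REC, FTR), the field names and the slice positions are all made up for illustration, and Python's built-in string.Template stands in for Cheetah so that the sketch runs without extra libraries.

```python
import re
from string import Template

# Hypothetical record markers: the real format is not shown in this post.
RECORD_TYPE = re.compile(r'^(HDR|REC|FTR)')

# Hypothetical fixed-width layouts: field name -> (start, end) slice positions.
LAYOUTS = {
    'HDR': {'batch': (3, 9), 'date': (9, 17)},
    'REC': {'account': (3, 11), 'amount': (11, 19)},
    'FTR': {'count': (3, 7)},
}

def parse_lines(lines):
    """Turn each line into a dict of fields, kept in one list per record type."""
    records = {'HDR': [], 'REC': [], 'FTR': []}
    for line in lines:
        match = RECORD_TYPE.match(line)
        if match is None:
            continue  # skip lines that are not a known record type
        kind = match.group(1)
        fields = {name: line[start:end].strip()
                  for name, (start, end) in LAYOUTS[kind].items()}
        records[kind].append(fields)
    return records

# string.Template is a stand-in for the Cheetah template used in the post.
ROW = Template('<tr><td>$account</td><td>$amount</td></tr>')
PAGE = Template('<html><body><h1>Batch $batch</h1><table>\n$rows\n</table></body></html>')

def render(records):
    """Loop over the list of record dicts and fill in the page template."""
    rows = '\n'.join(ROW.substitute(rec) for rec in records['REC'])
    return PAGE.substitute(batch=records['HDR'][0]['batch'], rows=rows)

sample = [
    'HDRB0000120090407',
    'REC1234567800001050',
    'REC8765432100002000',
    'FTR0002',
]
records = parse_lines(sample)
html = render(records)
print(html)
```

Keeping the layouts in a dict is what would make it quick to adapt the script to another file: only the markers and slice positions change, not the parsing loop.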
So now in case we are left in the lurch by the technical support team, we have something that will work!
Tuesday, 7 April 2009
Sunday, 22 March 2009
Issue with Ant Copy task
This is something that I faced a few months ago. We had spent a few weeks developing build scripts for our deployment. We were entering the period where we prepared things for our final go live, and one of the suggestions was that we build a deployment script that could replace the current one. The existing one did not create a single tar file that could be extracted; it created two files, one with all the property files for all the services to be deployed and one with the jar files. Extracting these two one after the other was all that was needed, but it was not elegant.
So we asked a developer to merge these two scripts and thought that would be it. On the day we did the deployment to DEV, the services refused to start. Reason: corrupted files. We switched to plan B, used the old scripts and voilà! The services started!
We duly fired off a failure mail and decided to dissect things the next day. And after half a day, when I finally figured it out, I kicked myself.
Ant has a copy task that lets you perform token replacement. This basically lets you prepare files that can vary depending on environment etc. and set the values when running the build. This is how it is used:
<copy todir='../backup/dir'>
  <fileset dir='src_dir'/>
  <filterset>
    <filter token='TITLE' value='Foo Bar'/>
  </filterset>
</copy>
Apparently, this task does not differentiate between ASCII and binary files. It does the replacement on ANY file. Our original scripts were separate. So the jar files were copied in a copy task that did not do filtering. When we merged the scripts, we put all the files into a single task.
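To see why filtering can damage a jar, here is a small Python illustration. It is only an analogy: Ant's internals are not shown in this post, and the decode/replace/encode round trip below is an assumption about how treating a binary file as text ends up changing its bytes. The function name filtered_copy and the sample bytes are made up.

```python
def filtered_copy(data, tokens):
    """Mimic a filtering copy: treat the bytes as text and replace @TOKEN@ markers."""
    # Assumption: the file is read as text, so undecodable bytes get replaced.
    text = data.decode('utf-8', errors='replace')
    for name, value in tokens.items():
        text = text.replace('@' + name + '@', value)
    return text.encode('utf-8')

tokens = {'TITLE': 'Foo Bar'}

# A property file is plain text, so it survives and gets its token filled in.
props = b'title=@TITLE@\n'
filtered_props = filtered_copy(props, tokens)
print(filtered_props)

# Jar files are zip archives; these bytes are a made-up stand-in for one.
jar_like = b'PK\x03\x04\xff\xfe\x80\x00'
filtered_jar = filtered_copy(jar_like, tokens)
print(filtered_jar == jar_like)
```

The text file comes out with its token replaced, while the binary stand-in no longer matches the original bytes even though it contains no token at all.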
Once I moved the jar files and the property files into separate tasks, the scripts worked fine.
So it was a simple Ant task that was the issue!!
Sunday, 15 March 2009
Street photography
I was watching a programme called "The Genius of Photography" on BBC today. It gave me an insight into photography back in the old days. I loved all those black and white photos. It told me that I need to look at photography as a way to capture life around me: how things are, what the world looks like. That's a good tip that I can use.
But there is a catch: I can't do street photography without getting into trouble. Just yesterday, I took my camera to work so that I could take a few shots around my office. I took my camera out in the evening and before I could take a picture, a security guard came to me and told me I could not take a picture without a pass. What? I was taking pictures of a building and I got told off. What if I try to take a few pictures of people on the street? I am sure they will report me or take my film or my memory card.
The reason why people do that is not important to me. I just am not comfortable with the fact that I cannot just go out there and capture the scenes that I like. I guess I live in a different time now and I might need a license to own a camera one day.
Friday, 13 March 2009
Python Script: Converting strings to camel case
A simple script to convert strings to camelCase
import sys

def processTokens(tokens):
    # Capitalise each remaining word and join the words together
    return ''.join(token.title() for token in tokens)

def processString(string, separator=' '):
    # The first word is lower-cased; the rest are title-cased
    li = string.split(separator)
    return li[0].lower() + processTokens(li[1:])

def getCamelCase(string, separator=' '):
    return processString(string, separator)

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print('usage: camelcase [input filename] [output filename]')
    else:
        # Convert the input file line by line; 'with' closes both files
        with open(sys.argv[1], 'r') as f, open(sys.argv[2], 'w') as o:
            for line in f:
                o.write(getCamelCase(line))
Tuesday, 10 March 2009
Python Script: Extract strings matching a regex from a file
A simple Python script to extract strings matching a regex from a file. It prints the first match found on each line.
import re
import sys

# initialize with some default pattern
mainpattern = re.compile('.*')

def matchLine(line):
    # Print the first match found on the line, if any
    result = mainpattern.search(line)
    if result is not None:
        print(result.group())

def matchLines(inputfile):
    for line in inputfile:
        matchLine(line)

def processFile(inputfilename):
    # 'with' makes sure the input file is closed when we are done
    with open(inputfilename, 'r') as inputfile:
        matchLines(inputfile)

def main():
    processFile(sys.argv[1])

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print('usage: match [infile] [pattern]')
    else:
        # Replace the default pattern with the one given on the command line
        mainpattern = re.compile(sys.argv[2])
        main()
Sunday, 8 February 2009
Timeline of my life
You can get your timeline at this site
| Event | Years | Description |
|---|---|---|
| Epoch | 1982 - 1982 | On the 11th day of the 11th month, when the sun is in Scorpio and doctors are irritated and tired, a 5 kilo baby is born |
| The sleeping tiger | 1982 - 1985 | The boy grows up, learning things fast and easy, apple of all eyes and troubling one and all |
| The beginning of schooling | 1985 - 1985 | He is admitted to school at the young age of 2.5 years, the Telugu speaking principal has high hopes, but has a change of heart after he runs all over the school screaming and tells the principal recipes for food that mom cooked at home. That was the beginning at Tiny Tots School Duliajan Assam |
| Initial schooling | 1985 - 1992 | Settled down at Tiny Tots, studied well and moved to Kendriya Vidyalaya. Not sure or cannot recollect how much he studied but he did do quite a few things that are better not told at home, was not a bad boy but was not innocent either |
| The urge to excel | 1992 - 1994 | Started getting interested in studies, painting etc. and begins to have dreams |
| The studious boy | 1994 - 1998 | Moved to Hyderabad, learnt proper Telugu, got class first, was even the leader of a class with 47 girls and 7 boys, and did everything a good boy would do. The evil streak died and he became a saint |
| Pre university | 1998 - 2000 | Realized how much he doesn't know, and how scared he is, and embarks on a journey to rectify things |
| The computer bug | 1994 - 1994 | Was introduced to the C programming language by a cousin, is hooked immediately |
| The computer bug #2 | 1999 - 1999 | Is good at C now, decides to look towards Microsoft, and starts using Visual Basic |
| The computer bug #3 | 2000 - 2000 | By the end of the millennium sets sights on Java, and gets the first computer he can play with without worries about spoiling it |
| Engineering | 2000 - 2004 | Swam through electronics and communications engineering, not sure whether he likes circuits or microprocessors, made friends and learnt many lessons of life, is officially labelled "ITEM" for being the odd one out in the class |
| The computer bug #4 | 2004 - 2004 | Has mastered Java, 8051 assembly and a whole lot of things and thinks he can conquer the world now |
| On the job | 2004 - 2009 | Realized there are a lot of things to be learnt on the job, is good at the job and is doing quite well, waiting to get married in August |
| Important Event #1 | 2007 - 2007 | In January decides to quit Infosys and join Oracle, but stays back, not sure why |
| Important Event #2 | 2007 - 2007 | In Feb starts blogging and writing supercoder stories |
| Important Event #3 | 2007 - 2007 | In July asks the most important question ever asked: "which building dear?" |
| Important Event #4 | 2008 - 2008 | In August life changed completely when on Aug 15 he got engaged to his sweetheart |
| Mega Event!! | 2009 - 2009 | Getting married on Aug 2 2009 |
Saturday, 7 February 2009
Extracting review comments with Python
Often I need to exchange code over email with teammates who cannot access our source control repository due to network issues. In many cases, especially after a code review, I find that they have removed the review comments that I put in, and I have no way to figure out what I commented on and where. This is also partly due to the not-so-good practice where I put the comments in the code using '//@@' as an indicator. To get over such cases, I wrote a simple Python script that reads all my Java source files, extracts the lines with '//@@' and puts them in a text file. This text file then serves as a rough guideline for me to review the code again.
Here is the script, hope it proves useful to someone else.
import re
import os

class Parser:
    def __init__(self, pattern, outfilename):
        self.pattern = re.compile(pattern)
        self.outfile = open(outfilename, 'a')

    def walkDir(self, base_dir):
        # Visit every file under base_dir, then close the report
        for dirpath, dirnames, filenames in os.walk(base_dir):
            for f in filenames:
                self.processFile(os.path.join(dirpath, f))
        self.outfile.close()

    def processFile(self, filename):
        self.outfile.write('Processing file: ' + filename + '\n')
        self.outfile.write('=============================\n')
        with open(filename, 'r') as source:
            for line in source:
                # Keep only the lines that carry a '//@@' review marker
                if self.pattern.match(line) is not None:
                    self.outfile.write(line.rstrip('\n') + '\n')

if __name__ == "__main__":
    p = Parser('^.*//@@.*', 'testresult.txt')
    p.walkDir('./code_with_review_comments')
Let me know how useful you find it!
