Friday, April 20, 2012

Igpay Atinlay

def pig_latinize(text):
    '''
    Returns the Pig Latin version of the supplied English text.
    @return (string) Pig Latin
    '''
    import string

    new_text = ''

    for word in text.split():
        punctuation_mark_begin = ''
        punctuation_mark_end   = ''

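        if word[-1] in string.punctuation:
            punctuation_mark_end   = word[-1]
            word = word[:-1]

        if word and word[0] in string.punctuation:
            punctuation_mark_begin = word[0]
            word = word[1:]

        if not word:
            # nothing but punctuation (e.g. a lone dash): pass it through untouched
            new_text += punctuation_mark_begin + punctuation_mark_end + ' '
            continue

        all_caps   = word == word.upper()
        title_caps = word[0] in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'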

        word = word.lower()

        new_word = None

        m = len(word)

        if m < 3 or word in ['and', 'the']:
            new_word = word + 'kay'

        else:
            vowel_offsets = [word.find(v) for v in 'aeiouy' if (word.find(v) != -1)]

            m = 0 if (len(vowel_offsets) == 0) else min(vowel_offsets)

            if m == 0:
                new_word = word + 'way'
            else:
                new_word = word[m:] + word[:m] + 'ay'

        if all_caps and (len(new_word) > 1):
            new_word = new_word.upper() 

        elif title_caps:
            new_word = new_word[0].upper() + new_word[1:]

        new_text += punctuation_mark_begin + new_word + punctuation_mark_end + ' '

    return new_text[:-1]


INKAY ONGRESSCAY, ULYJAY 4KAY, 1776WAY Thekay unanimousway Eclarationday ofkay thekay irteenthay unitedway Atesstay ofkay Americaway Enwhay inkay thekay Oursecay ofkay umanhay eventsway itkay ecomesbay ecessarynay orfay oneway eoplepay tokay issolveday thekay oliticalpay andsbay ichwhay avehay onnectedcay emthay ithway anotherway andkay tokay assumeway amongway thekay owerspay ofkay thekay earthway, thekay eparatesay andkay equalway ationstay tokay ichwhay thekay Awslay ofkay Aturenay andkay ofkay Ature'snay Odgay entitleway emthay, akay ecentday espectray tokay thekay opinionsway ofkay ankindmay equiresray atthay eythay ouldshay eclareday thekay ausescay ichwhay impelway emthay tokay thekay eparationsay.

Oodgay enoughway.

Thursday, November 17, 2011

str() is the Devil

Recently I discovered the folly of using str() within Python 2.7 scripts. My program would ingest arbitrary UTF-8 text from outside sources and then try to print it to a file, only to crash with a UnicodeDecodeError or UnicodeEncodeError exception. In good time, I realized how to do it The Right Way and converted all my str() calls to unicode().

But the error kept occurring. It took me quite a while to remember that

foobar = ''

is functionally equivalent to

foobar = str()

This problem shouldn't happen in Python 3, since all strings are Unicode by default there.
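
For what it's worth, here is a minimal sketch of the habit that would have saved me: decode bytes once, at the point where they enter the program, and keep everything as text from there on. It is written for Python 3. The helper name to_text is my own invention for illustration, not something from the standard library, and the sample bytes are made up.

```python
def to_text(raw, encoding='utf-8'):
    """Hypothetical helper: decode bytes to text, pass text through unchanged."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw

# On Python 3 a bare '' is already text. On 2.7 it is a byte string unless the
# module starts with "from __future__ import unicode_literals".
foobar = ''
foobar += to_text(b'caf\xc3\xa9')   # UTF-8 bytes from an outside source
foobar += to_text(' au lait')       # already text, left alone

# Encode only at the point where the text leaves the program again.
encoded = foobar.encode('utf-8')
```

The point is that the empty-string accumulator and everything appended to it are the same type, so nothing gets implicitly decoded as ASCII along the way.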

Thursday, September 15, 2011

How to turn a list outside-in



def turnListOutsideIn(aList):
	"""
	Returns a new list containing the 1st element, then the last,
	then the 2nd, then the next-to-last, etc.
	"""

	# create a reversed copy of the original list
	revList = list(aList)
	revList.reverse()

	# zip them together as a list of tuple pairs
	zippedList = zip(aList, revList)

	# flatten the list
	flatList = [inner for outer in zippedList for inner in outer]

	# return the first half of the list
	return flatList[0:len(aList)]


digits = range(10)
 
print turnListOutsideIn(digits)
[0, 9, 1, 8, 2, 7, 3, 6, 4, 5]
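
The same idea can be sketched for Python 3, where zip returns an iterator and print is a function. This is only a restatement of the function above under those assumptions; the snake_case name is mine.

```python
def turn_list_outside_in(a_list):
    """Return a new list: 1st element, last, 2nd, next-to-last, and so on."""
    # pair each element with its mirror image from the other end
    pairs = zip(a_list, reversed(a_list))

    # flatten the pairs into one list
    flat = [item for pair in pairs for item in pair]

    # the second half only repeats the first half in reverse, so drop it
    return flat[:len(a_list)]


print(turn_list_outside_in(list(range(10))))
```

Odd-length lists work too, since the middle element pairs with itself and the slice keeps only one copy of it.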

Sunday, September 11, 2011

September 11th, 2001

Today is the 10th anniversary of the attacks on the World Trade Center, Pentagon, and wherever it was that the hijackers of United Airlines flight 93 had intended to strike.

My girlfriend and I, living in Los Angeles, were asleep at 6:50am when a relative called and made us turn on our television. We were watching live coverage when the second plane struck the South Tower.

After the shock wore off, I decided I should go to work. I was a software tester at Symantec Corporation, working on security utilities. I figured that in case there were a cyberwarfare component to the attack, I wanted to be on hand to help out in any way that I could. But my manager sent us all home, the handful who showed up.

So I went back home and spent the rest of the day glued to the TV, like everybody else.

Tuesday, July 26, 2011

Don't use JSON for config files.

Why not? Because JSON does not allow comments.
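
A quick sketch of the complaint, in Python 3: the standard json module rejects a document containing a comment, while an INI-style file read with configparser accepts one. The setting name and section name here are made up for illustration, and configparser is just one comment-friendly alternative, not a recommendation from on high.

```python
import configparser
import json

# A JSON "config" with a comment in it. The json module refuses to parse it.
json_text = '{\n  // retry limit\n  "retries": 3\n}'
try:
    json.loads(json_text)
    json_accepts_comments = True
except ValueError:          # json.JSONDecodeError is a subclass of ValueError
    json_accepts_comments = False

# The same setting in INI form, comment and all.
ini_text = "[network]\n# retry limit\nretries = 3\n"
parser = configparser.ConfigParser()
parser.read_string(ini_text)
retries = parser.getint("network", "retries")

print(json_accepts_comments, retries)
```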

That is all.

Saturday, May 14, 2011

Process pipelines in Python

It took me long enough to figure this out that I figure somebody else might benefit from my effort.

When you're working in Python and you wish to launch an external program and capture the output, there's an easy solution: you use Popen from the subprocess module. All well and good. But suppose you need to fire up a pipeline of external programs, with the output of the first program being piped to the input of the second, and so on. Something like this:

$ cat /etc/passwd | grep -E '^root:' | tr "a-z" "A-Z"
ROOT:*:0:0:SYSTEM ADMINISTRATOR:/VAR/ROOT:/BIN/SH

The answer is still Popen, but things get a little complicated if you want to solve the general problem of hooking up n processes into a pipeline. Here's my solution.

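def launchProcessPipeline(cmdList):
    import fcntl, os, subprocess

    totalCmds = len(cmdList)
    if totalCmds > 0:

        procs = []

        for i in range(totalCmds):
            currPipe = None

            if i > 0:
                # feed the previous process's stdout into this one's stdin
                currPipe = procs[i-1].stdout

            procs.append(subprocess.Popen(cmdList[i], stdout=subprocess.PIPE, stdin=currPipe))

            if currPipe is not None:
                # the child has its own copy now; close ours so the previous
                # process sees a broken pipe if this one exits early
                currPipe.close()

        # set the final stdout file descriptor to nonblocking; the intermediate
        # pipes must stay blocking because each is shared with the child reading it
        flags = fcntl.fcntl(procs[-1].stdout.fileno(), fcntl.F_GETFL)
        fcntl.fcntl(procs[-1].stdout.fileno(), fcntl.F_SETFL, (flags | os.O_NDELAY | os.O_NONBLOCK))

        return procs[-1]

    return None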


def pollProcess(proc):
    import select
    output = None

    # wait 1 millisecond and check whether proc has written anything to stdout
    readReady, _, _ = select.select([proc.stdout.fileno()], [], [], 0.001)

    if len(readReady):

        try:
            for line in iter(proc.stdout.readline, ""):

                if output is None:
                    output = ''

                output += line

        except IOError:
            # Ignore any I/O errors reading from the pipe, which are infrequent but not rare.
            pass

    return output

And here's how you call it:

cmdlist = [
    ['cat', '/etc/passwd'],
    ['grep', '-E', '^root:'],
    ['tr', 'a-z', 'A-Z']
]
proc = launchProcessPipeline(cmdlist)
print pollProcess(proc)


>>> ROOT:*:0:0:SYSTEM ADMINISTRATOR:/VAR/ROOT:/BIN/SH
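
If you don't need to poll and can simply block until the whole pipeline finishes, a simpler variant is possible. The following is a sketch for Python 3, not a drop-in replacement for the functions above; the name run_pipeline is my own, and the sample input is fed in with printf instead of reading /etc/passwd so that the result doesn't depend on the machine.

```python
import subprocess


def run_pipeline(cmd_list):
    """Run the commands as a pipeline and return the last one's stdout as text."""
    if not cmd_list:
        return ''

    procs = []
    prev_stdout = None

    for cmd in cmd_list:
        proc = subprocess.Popen(cmd, stdin=prev_stdout, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # the child holds its own copy of the pipe; close the parent's
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)

    # block until the last process closes its stdout, then reap the others
    output, _ = procs[-1].communicate()
    for proc in procs[:-1]:
        proc.wait()

    return output.decode('utf-8')


result = run_pipeline([
    ['printf', 'root:x:0:0\nalice:x:1000:1000\n'],
    ['grep', '-E', '^root:'],
    ['tr', 'a-z', 'A-Z'],
])
print(result)
```

The trade-off is that communicate() holds all of the output in memory and doesn't return until the pipeline is done, which is exactly what the polling version avoids.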

Sunday, October 24, 2010

How Not to Lie on Your Resume, Part 1

The other day we were interviewing a prospective Senior QA Analyst. I noted with satisfaction that at his present job (going back two years), he listed Perl under "Programming Languages". This made me happy because I use Perl regularly to automate testing tasks.

When it came time for me to ask questions, I posed a simple task: Write a function in Perl which takes a multiline string as input, and returns an array of strings consisting of each valid IP address from the input.

He didn't know where to start. When I tried to jog his memory by asking whether a regexp would be a good way to extract the IPs, he admitted being completely unaware of the term Regular Expression. Oh my.

He finally took a stab at it by walking the input string one character at a time, looking for the first period and then backtracking to see if the previous char was a numeral. And then capturing subsequent periods and digits, until finally counting the number of captured periods and digits. A truly terrible solution.
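
For contrast, here is roughly what a regexp-based answer looks like. This is a sketch in Python 3 rather than Perl, it only handles dotted-quad IPv4 addresses, and the function name and the exact rules for what counts as "valid" (each octet 0-255, no leading zeros, not embedded in a longer run of digits and dots) are my own assumptions about the task.

```python
import re

# one octet: 0-255, no leading zeros
OCTET = r'(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)'

# four octets separated by dots, not glued to extra digits or octets on either side
IPV4 = re.compile(
    r'(?<!\d)(?<!\d\.)' + r'\.'.join([OCTET] * 4) + r'(?!\d|\.\d)'
)


def extract_ips(text):
    """Return every valid IPv4 address found in a (possibly multiline) string."""
    return IPV4.findall(text)


sample = "ok 192.168.1.1\nbad 256.1.1.1 and 1.2.3.4.5\nend 10.0.0.1."
print(extract_ips(sample))
```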

So, tip #1 for job seekers who don't want to look like frauds in their interviews:

Under the description for your current job, only list programming languages you actually do know.