[Tutor] findall()

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Mon Jul 5 02:22:35 CEST 2004


On Sat, 3 Jul 2004 Dragonfirebane at aol.com wrote:

> > I'm trying to get the following code to work:

[text cut]

Hi Dragonfirebane,

[Mailing list etiquette issue: Please quote only the part of the message that you're responding to; you're making it difficult for me to find your response within the older message. Whenever you're replying to someone else's message on a mailing list, try to be concise, in consideration of the others on the list.]

> I hope you don't mind if I split it up into multiple functions? It
> may help us to see what the code is really doing. Here's a start:

[refactored code cut]

That's fine. As long as it still does the same thing, it makes little difference.

Actually, it doesn't. Work the same way, that is. grin

The way I refactored it is deliberately broken, in the sense that the second and third subfunctions should raise a particular NameError. Did you try running the new code yet?

I hinted at the bug earlier with:

Each function does almost the same thing, but with subtle differences. In fact, the variation in the three blocks is almost certainly buggy; the third block tries to do a findall() search even before it reads in 'logagain'.

In any case, I strongly recommend that you use the refactored code so that we have a common base to talk from.

Here's the code again, but slightly revised so that the NameErrors shouldn't occur. I'm also cutting out the third subfunction for the moment:

###
import re

def getlog(activity, type, endtype):
    tryToPrintActivityLine(activity)     ## this works
    tryToPrintTypeLine(type)             ## this doesn't work yet.

def tryToPrintActivityLine(activity):
    log = open("Multivert_Log.txt","r")
    for line in log.readlines():
        ac = re.compile(activity)
        fac = re.findall(ac, line)
        if fac:
            print line
            break
        else:
            print "Uh-oh."
    log.close()

def tryToPrintTypeLine(type):
    logagain = open("Multivert_Log.txt","r")
    ty = re.compile(type)
    fty = []                      ## buggy line modified
    for line in logagain.readlines():
        if not fty:
            fty = re.findall(ty,line)
        elif fty:
            print line
            break
    logagain.close()
###

Once we get the program working to detect the two lines you want, we can then augment the working program to detect the math equation.

The expected output was:

CALCULATION
EXPONENTS
1 ** 1 = 1

because each time a line is printed, the for loop is broken and no more lines are printed, with the exception of the first time (where if fac is not immediately reached, "Uh-oh." is currently printed, merely so i can see that the findall() is working the way i want it to.)

Ok, now we have a goal: we want our program to output that expected text.

The value of refactoring that getlog() function into three smaller functions,

###
def getlog(activity, type, endtype):
    tryToPrintActivityLine(activity)     ## this works
    tryToPrintTypeLine(type)             ## this doesn't work
###

is this: since we know that the first part is working, we can now concentrate our efforts on the second part, and we can do our debugging on the second function alone, in isolation from the first subfunction.

There are some possible reasons why the 'EXPONENTS' line isn't being printed. If we look at the second subfunction 'tryToPrintTypeLine()':

###
def tryToPrintTypeLine(type):
    logagain = open("Multivert_Log.txt","r")
    ty = re.compile(type)
    fty = []
    for line in logagain.readlines():
        if not fty:
            fty = re.findall(ty,line)
        elif fty:
            print line
            break
    logagain.close()
###

we may want to make sure that the right 'type' variable is being passed to this function. If the 'type' is incorrect, then the function won't print a thing, since the regular expression won't match any line. Are you passing the uppercased string "EXPONENTS" as the 'type', or are you passing the lowercased string "exponents"?
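
The case question can be checked quickly at the interpreter, since regular expressions are case-sensitive by default. Here is a minimal sketch, written in current Python 3 syntax rather than the Python 2 used above; the sample line is an assumption about what the log contains:

```python
import re

line = "EXPONENTS\n"  # assumed sample line; the real log may differ

exact = re.findall("EXPONENTS", line)                   # pattern in the same case
lowered = re.findall("exponents", line)                 # lowercase pattern
relaxed = re.findall("exponents", line, re.IGNORECASE)  # case-insensitive search

print(exact)    # ['EXPONENTS']
print(lowered)  # []
print(relaxed)  # ['EXPONENTS']
```

If the lowercase string turns out to be what is passed in, either uppercasing it before the search or passing re.IGNORECASE would be one way around it.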

Also, the following point is important: the loop logic is broken in the sense that the code, in effect, will never report a match found on the last line of input. Imagine what happens if "Multivert_Log.txt" contains just two lines of input, like:

CALCULATION
EXPONENT

If you trace out the code, you will see that the loop does not provide a chance to print out the 'EXPONENT' line, even if a match is made, because the status report on the search is done on all lines, up to --- but not including! --- the last line of the log file.

The defect in the loop here probably doesn't account for 'EXPONENTS' not being printed, but it DOES account for the equation '1 ** 1 = 1' not being printed. The same looping bug occurs in all three subfunctions.
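
To make the trace concrete, here is a small sketch that mirrors the loop on an in-memory list instead of the log file. It is written in current Python 3 syntax, and the helper name buggy_scan is made up for illustration; it returns the line that the loop would have printed, or None if nothing would be printed:

```python
import re

def buggy_scan(lines, pattern):
    """Mirror the loop under discussion; return the line it would print."""
    ty = re.compile(pattern)
    fty = []
    for line in lines:
        if not fty:
            fty = ty.findall(line)
        elif fty:
            return line  # stands in for "print line; break"
    return None

log_lines = ["CALCULATION\n", "EXPONENT\n"]  # the two-line example

# A match on the last line is found but never reported.
print(buggy_scan(log_lines, "EXPONENT"))           # None

# A match on an earlier line reports the line that follows it.
print(repr(buggy_scan(log_lines, "CALCULATION")))  # 'EXPONENT\n'
```

The second call shows a related quirk: the report happens one iteration after the match, so the line that gets printed is the one following the matching line.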

Rewrite the loop to:

###
def tryToPrintTypeLine(type):
    logagain = open("Multivert_Log.txt","r")
    ty = re.compile(type)
    for line in logagain.readlines():
        fty = re.findall(ty,line)
        if fty:
            print line
            break
    logagain.close()
###

Not only does the code end up shorter, but it's more correct: it doesn't skip the last line of the file, and if a match occurs on the last line, this code has the opportunity to report that to the user.
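
For anyone following along on a current Python 3 interpreter, the same idea might be sketched as below. This is not the code from the thread: the function names are hypothetical, re.search() is swapped in for findall() because only the presence of a match matters here, and the search is separated from the file handling so that it can be tried on a plain list of lines.

```python
import re

def first_matching_line(lines, pattern):
    """Return the first line containing a match for pattern, or None."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            return line
    return None

def print_type_line(path, pattern):
    """Print and return the first matching line of the file at path."""
    # The with-block closes the file even if an error occurs.
    with open(path, "r") as log:
        line = first_matching_line(log, pattern)
    if line is not None:
        print(line, end="")
    return line

sample = ["CALCULATION\n", "EXPONENT\n"]
print(repr(first_matching_line(sample, "EXPONENT")))  # 'EXPONENT\n'
```

Because every line is tested and reported in the same iteration, a match on the last line is handled like any other.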

Does this make sense? Please feel free to ask any questions on this. Good luck to you!


