Recently, I had a bug that was difficult to track down. I had a nested list (somelist[][]) and was removing some of its elements with the remove() method. I noticed that not all of the matching elements were being removed. In fact, the code was removing only every other element. The cause turned out to be that each call to remove() shortens the list by one while the for loop's internal index keeps advancing, so the element that slides into the vacated slot is skipped. For example:
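import re

somelist = [['a','b'],['c','d'],['e','f']]
for i in somelist:
    # i is an inner list, so match against one of its strings
    if re.match(regexToMatch, i[1]):
        # shortens the list mid-iteration, so the next element is skipped
        somelist.remove(i)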
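To make the skipping effect easy to see, here is a minimal sketch that removes every element it visits. The list contents and the names items and visited are made up for illustration, and the regex test is dropped so that every element counts as a match.

```python
# Hypothetical data: every element is treated as a "match" and removed.
items = ['a', 'b', 'c', 'd', 'e', 'f']
visited = []

for item in items:
    visited.append(item)   # record which elements the loop actually sees
    items.remove(item)     # shifts all later elements one slot to the left

print(visited)  # ['a', 'c', 'e']
print(items)    # ['b', 'd', 'f']
```

The loop only ever sees every other element, because after each removal the next element has already moved into the slot the loop just finished with.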
I needed some way to rewind the list whenever remove() was called. At first, I thought this would be a perfect fit for recursion:
def removeProcesses(theList):
    for (hostname, process) in theList:
        for line in excludeProcesses:
            try:
                if re.search(line.strip(), process):
                    theList.remove([hostname, process])
                    removeProcesses(theList)
            except ValueError:
                continue
    return theList
Ah... recursion: sometimes elegant, sometimes a performance nightmare. Every function call pushes a new stack frame, and here each removal restarts the scan from the beginning of the list, so recursion is best kept to small data sets. (CPython also caps the recursion depth, at 1000 frames by default.) The code above worked great on a small subset of the data, but did not scale to what the project needed. I replaced the for loop with a while loop and a counter. When a match is found, the count is decremented by 1 so that every element in theList is evaluated. This also removes any duplicate matching entries in the list.
def removeProcesses(theList):
    count = 0
    while count <= len(theList) - 1:
        hostname = theList[count][0]
        process = theList[count][1]
        count += 1
        try:
            if re.search(excludeProcess, process):
                theList.remove([hostname, process])
                # reduce the count by 1, which resolves the "skipping" bug
                count -= 1
        except ValueError:
            continue
    return theList
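For comparison, a common way to avoid the problem altogether is to build a new list of the entries to keep instead of removing from the list being iterated. This is not the approach I used above, just a sketch of the alternative. The function name remove_processes, its parameters, and the sample host and process names are all made up for illustration.

```python
import re

def remove_processes(pairs, exclude_patterns):
    """Return a new list without the pairs whose process matches any pattern."""
    # Compile each pattern once instead of on every comparison.
    compiled = [re.compile(p.strip()) for p in exclude_patterns]
    return [
        [hostname, process]
        for hostname, process in pairs
        if not any(rx.search(process) for rx in compiled)
    ]

# Hypothetical sample data
sample = [['web1', 'httpd'], ['web1', 'sshd'], ['db1', 'sshd'], ['db1', 'mysqld']]
kept = remove_processes(sample, ['^ssh'])
print(kept)  # [['web1', 'httpd'], ['db1', 'mysqld']]
```

Because the original list is never modified during the loop, nothing is skipped and no counter bookkeeping is needed. If the caller needs the same list object updated in place, assigning the result to sample[:] should do that.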