I’d like to raise an issue that was partially discussed in 2006 ( http://groups.google.co.uk/group/comp.lang.python/browse_thread/thread/1811df36f2a131fd/435ba1cae670aecf?lnk=st&q=python+iterators+duck+typing#435ba1cae670aecf ) with the half-promise that it would be revisited before Python 3000. Now’s the last chance.
What is Duck Typing?
Ultimately, the goal is that if you do something stupid, Python will give you a big fat error message fairly soon after the stupid code was executed. Without effective duck typing, we’d be forced to put in lots of test code everywhere, something like:
assert isinstance(x, list)
Doing so would be bad because our Python code would become cluttered and less polymorphic and reusable. Nuff said.
Duck Typing doesn’t distinguish lists and iterators
Now, where does duck typing fail in modern Python? In this case:
    def foo(x):
        for i in x:
            doSomething(i)
        for i in x:
            somethingElse(i)
foo() is unsafe as part of any API because you never know whether someone is going to pass it a list or an iterator. For me, doing scientific programming, this is a very common use case: doSomething() may collect statistics or look for bad data, then somethingElse() does the main computation. If foo() is somehow passed an iterator, the first loop consumes it and the second loop silently does nothing, leading to much hair pulling and gnashing of teeth. Some might say “serves you right for making a mistake!”, but I’ve always suspected that such people say that to victims of traffic accidents, too.
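To make the failure concrete, here is a minimal sketch in Python 3 syntax. The two lists `seen_first` and `seen_second` are stand-ins invented for this illustration; they only record what each loop gets to see in place of doSomething() and somethingElse().

```python
def foo(x):
    """Two passes over x, mirroring the shape of foo() above."""
    seen_first = []
    seen_second = []
    for i in x:
        seen_first.append(i)   # stands in for doSomething(i)
    for i in x:
        seen_second.append(i)  # stands in for somethingElse(i)
    return seen_first, seen_second


data = [1, 2, 3]
list_result = foo(data)        # a list can be traversed twice
iter_result = foo(iter(data))  # an iterator is used up by the first loop

print(list_result)  # ([1, 2, 3], [1, 2, 3])
print(iter_result)  # ([1, 2, 3], [])
```

No exception is raised in the second call. The only symptom is that the second loop body never runs.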
Avoiding the problem
Of course there are ways to work around the problem. Using Java is one, adding assert statements is another, writing detailed docstrings is a third. However, none are nearly as good as duck typing. Adding “x = list(x)” near the top of the function should work, but it copies the whole input, a horrible cost in efficiency if it’s a big list.
It seems that the 2006 discussion barely missed the right solution:
- Create a new standard exception IteratorExhausted; it will be a subclass of StopIteration.
- StopIteration is raised when the iterator runs out of data. If it.next() is called again, then IteratorExhausted should be raised.
- For loops will be set to trap IteratorExhausted and raise an error (perhaps raise a TypeError, “Iterator used in two for loops”).
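The three points above can be approximated today with a wrapper class. This is only a sketch under stated assumptions: `IteratorExhausted` does not exist in Python and is defined here purely to mirror the proposal, `StrictIterator` is a hypothetical name, and the code uses Python 3's `__next__` where the text says it.next(). Pure Python cannot change what a for loop traps, so the TypeError is raised from `__iter__` instead, which a for loop calls before its first step.

```python
class IteratorExhausted(StopIteration):
    """Hypothetical exception from the proposal; not part of Python."""


class StrictIterator:
    """Wraps an iterable and complains loudly when reused after exhaustion."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False

    def __iter__(self):
        # Approximates "for loops trap IteratorExhausted": a for loop calls
        # iter() first, so a spent iterator is rejected before looping starts.
        if self._done:
            raise TypeError("Iterator used in two for loops")
        return self

    def __next__(self):
        if self._done:
            # A second next() after StopIteration gets the distinct exception.
            raise IteratorExhausted("next() called after StopIteration")
        try:
            return next(self._it)
        except StopIteration:
            self._done = True
            raise


it = StrictIterator([1, 2, 3])
for i in it:
    pass              # first pass consumes everything
try:
    for i in it:      # second pass is now a loud error
        pass
except TypeError as exc:
    print(exc)        # Iterator used in two for loops
```

A loop that breaks out early leaves the wrapper unexhausted, so a later loop simply continues where the first one stopped, as it does with ordinary iterators.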
This will reduce the transition difficulties to Python 3.0 due to changes of zip() and other functions from lists to iterators.
Any code of the form foo(filter(...)) or a few other things would become silently wrong in Python 3.0. With this modification, it will be noisy wrong. (Much better!)
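As an illustration of the foo(filter(...)) case, the sketch below runs a two-pass function on the result of filter(), which returns a one-shot iterator in Python 3. The function name `two_pass` and the sum/count bodies are invented for this example.

```python
def two_pass(x):
    """First pass sums the items, second pass counts them."""
    total = 0
    for i in x:
        total += i
    count = 0
    for i in x:
        count += 1
    return total, count


values = [1, 2, 3, 4, 5, 6]

# In Python 3, filter() returns an iterator, so the second loop sees nothing.
result_iter = two_pass(filter(lambda v: v % 2 == 0, values))

# Materializing the result first restores the list behaviour.
result_list = two_pass(list(filter(lambda v: v % 2 == 0, values)))

print(result_iter)  # (12, 0)
print(result_list)  # (12, 3)
```

The first call returns a wrong count without any error, which is the silent failure described above.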
Because IteratorExhausted is a subclass of StopIteration, normal uses of StopIteration will be unaffected. Code that sticks to the current PEP-234 will continue to work absolutely unchanged.
Code in the form below will fail noisily if it was intended to be used with current PEP-234 iterators and if the upper loop does not terminate early. (But it will work correctly if handed a list.)
    def bar(x):
        for i in x:
            if someThing(i):
                break
        for i in x:
            anotherThing(i)
However, note that this code will give different results depending on whether it is passed an iterator or a list, so it’s somewhat dangerous anyway. I suspect this is a rare case compared to all the Python 3.0 upheaval. However, it can be fixed fairly easily and efficiently by simply putting a try...except statement around the second loop.
I believe that it will add no silent failures to 2.5 code run on Python 3.0 and will convert many silent failures into noisy failures. In my book, that’s a Good Thing. Overall, I believe it will reduce the pain of Python 3.0.