Take that, Python.

Yes, I’ve started writing code in Python. The big breakthrough for me was realizing that I can overcome one of my worst issues with Python, the lack of visible scope annotation, with comments.

def myfunction(somearg):
#{
    print("Yep this is my function")
#}

How come I’m programming in Python at all?

I’ve finally seen the attraction of modern day Python: It’s not the language itself, but the vast array of things Python has its hooks into and which Python – like PHP – gives you sudden and wonderful access to.

Sudden and wonderful because the APIs are usually very clean, while my beloved Perl requires you to perform gymnastic semantics to get things up and running.

With Perl, you pay up front for smoother sailing down the line – you get more mileage out of your Perl “use” than your Python “import”, in the long run. Python programmers just find the next module to “import” and accept bloat instead of leg work.

Perl used to have CPAN for obtaining modules, but my experience the last few years has been that CPAN modules are increasingly abandoned, copious in number and wanting in documentation, perhaps as a result of the CPAN/RPM conflict (local OS bundling of CPAN modules via RPM and DEB packages has caused many an issue).

Python doesn’t have anything that truly compares with what CPAN was in its heyday. Any 3 Python programmers can give you 5 or more different ways to “easily” get modules on your system, which will work wonderfully for about 15% of the modules you’re going to wind up trying to install.

But I’ve yet to run into a Python module that introduced anything like the nightmarish dependency trees that some basic CPAN modules have thrown at me lately.

There is still, for me, the issue that the Python language is kind of awful.

Lua and Python are both capable of “supporting OOP”, the way the average human colon is capable of “supporting a pool cue”. In Lua it means pissing about with metatables.

Both languages strive for cleanliness of presentation. If you write really, really simple code, both languages pull this off quite nicely. But the moment you start trying to do anything not utterly-freaking-trivial all kinds of syntactic excrement crawls out of the woodwork.

Python started out by trying to reduce the amount of markup symbols being used, to make the code look more like text. In C/C++/Java you end a given code statement with a semicolon (;). In Python, you just hit return.

// C version
printf("Hello world\n") ;
# Python version
print("Hello world\n")

You can also use semicolons in Python, but the end-of-line character doubles as an end-of-statement character.

In most cases…

And herein lies the rub for me: Python is only sometimes smart enough to determine that you can’t possibly have finished a statement. For example:

# This will work...
listOfThings = [
'tea',
'milk',
'cookies'
]

# This won't
emptyList =
[
]

# You'd have to write this
emptyList = \
[
]
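For comparison, here is a small sketch of the ways to get the same result without the backslash. Python only joins lines implicitly once a bracket or parenthesis is already open, so the trick is to open one before the line ends. The variable names are just illustrative.

```python
# Opening the bracket on the assignment line lets the parser see an
# unfinished expression, so the following lines are joined implicitly.
empty_list = [
]

# Wrapping the right-hand side in parentheses has the same effect and
# still leaves the square bracket on a line of its own.
also_empty = (
    [
    ]
)

# The explicit backslash continuation from the example above.
backslash_empty = \
[
]
```

All three names end up bound to an empty list; only the amount of punctuation differs.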

Now this is minor and petty, but it was almost the very first thing I ran into. For those of you thinking I’m nit picking, that “\” character goes totally against the spirit of “avoid using symbol characters all over the place” and running into it so early on felt a bit like a betrayal of the ideal.

For me the next issue is the whole indentation stuff. Python uses white space counts (how many spaces or tabs in front of a line) to tell which lines belong to the same block of code.

We humans do this with written text, for example in email we indent quotes and we distinguish between quote and reply by levels of indentation.

The trouble is, you’re not quoting someone else’s replies here, you are creating the whole thing.

My first take on this has been: Well then make sure you don’t have lots and lots of levels of indentation in your code; make copious use of subroutines.

Sadly: Python has a really expensive overhead for function calls.

The indentation concept works fine when you have only one or two levels, but when you start getting to 6 and more, it actually starts to get a bit difficult to track what indentation level is where, or where you *intended* for indentation to be.

Going back to the mail-quoting analogy, very few people use *just* white space for quoting replies. Most people use some kind of markup, a “> ” or a “| ” or some fancy HTML indentation.

Go dig up a deep-quoting email from your inbox and strip out the markup and look at it as just pure white-space indented text and try following that. Bet you’re scratching your head within a few minutes. And yet, Python programmers voluntarily inflict this upon themselves for writing mission-critical computer code. OMFG.

    if something that you need to test for the start condition:
        if not assignedValue:
            if AlternateCalculationAvailable():
                assignedValue = AlternateCalculation()
            if not assignedValue:
                assignedValue = ThirdCalculation()
            if not assignedValue:
                assignedValue = DatabaseValue()
                if not assignedValue:
                    assignedValue = DefaultValue()
                    if assignedValue > MinimumValue():
                        if assignedValue > MaximumValue():
                            assignedValue = MaximumValue()

        if assignedValue < MinimumAllowedValue():
            alternateValue = CalculateForMinimum(assignedValue)

Now – did I mean for that last test to be out-dented so many levels? Ok – that code is contrived, but I’ve seen plenty of code that looks like it. It’s *begging* for some way to allow you to more easily define scope levels, other than functions.
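One hedged sketch of how a fallback chain like that could be flattened. This is not a faithful translation of the contrived code above (its intent is ambiguous, which is the point); `first_truthy`, `clamp` and the stand-in calculation functions are hypothetical names invented for illustration. It also leans on function calls, so the overhead complaint still applies.

```python
def first_truthy(*producers, default=None):
    """Call each producer in order and return the first truthy result."""
    for produce in producers:
        value = produce()
        if value:
            return value
    return default


def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(value, high))


# Hypothetical stand-ins for the calculations in the post.
def alternate_calculation():
    return 0        # pretend this source had nothing to offer


def database_value():
    return 250      # pretend the database had a value


assigned = first_truthy(alternate_calculation, database_value, default=10)
assigned = clamp(assigned, low=1, high=100)
```

The nesting depth stays at one level no matter how many fallbacks are added, at the price of wrapping each step in a callable.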

Lastly, I find the Python language to be crazily out of touch with its own paradigm of clean looking code: when laziness has won over, Python resorts to symbols; when verbosity has won over, Python resorts to serious verbosity.

Python uses the “:” character to separate pairs of items. It’s consistent about this through the language. To specify characters 5 through 9 of “theString” (the end index is exclusive), you write theString[5:10]; for a dictionary (hash) you write hash = { 'a':1 }, etc.
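A quick illustration of both uses of the colon. The names and values here are made up for the example.

```python
the_string = "0123456789abcdef"

# The slice start is inclusive and the end is exclusive,
# so this picks up indices 5, 6, 7, 8 and 9.
middle = the_string[5:10]

# In a dictionary literal the colon separates each key from its value.
prices = {'tea': 1, 'milk': 2}
```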

And so it uses it also between a conditional and compound statement, between a function/class declaration and the body. While this particular nuance of language design makes sense if you burrow that deep into the design process, it contravenes the “cleanliness” element.

# Actual python
if skyColor() == 'blue' :
    print("Sky is blue")

# Lua code
if skyColor() == 'blue' then
    print("Sky is blue")
end

For function and class definitions, it does make a sort of sense. It comes close to following the everyday English usage of the colon character:

Exhibit A:

class Waffles:
    # here's the definition of class Waffles
    def __init__(self):
        # Kinda abandoned the whole readable code here?
        pass

# Print the list of items in somelist, one per line.
somelist = ['tea', 'milk', 'cookies']
for item in somelist:
    print(item)

I can sort of see how it works with that for, and it does look nice when you’re writing short lines of code.

But – and this is the meat of this particular argument – in a language that is going out of its way to avoid symbols like ‘;’, the ‘:’ becomes errant and elusive in particularly long or complex statements.

It also introduces a somewhat silly-seeming input dependency. Consider the case of “def” (function definition): you have to type two end-of-statement characters, the colon and the carriage return. DUH.

This last is perhaps so that you can do

def foo(): print("foo()")

Ok, I guess I can see that, but it seems that the more common case should accept the following (note: no colon)

def foo()
    print("Foo")

For the more complex uses – if, for, while etc. – Python ought to revert to cleanliness:

# One liner.
for label in labels.items() do print(label)

# Compound version.
for label in labels.items()
    print(label)

Worst offender, the Python ternary operator:

# If the user typed 'hello' then respond 'world', otherwise 'hi'.
response = 'world' if input == 'hello' else 'hi'

(Note: suddenly Python can do “if” and “else” without colons, even handling both on one line???)

Now, while some argue the ternary operator is bad, I think that in attempting to avoid recreating the evil of

response = (input == 'hello' ? 'world' : 'hi')

Python created their very own special evil. In particular, this is a case where English would mandate some form of punctuation to help stipulate the precedence. The first alternative would have been to employ consistency:

response = if input == 'hello': 'world' else: 'hi'.

But given the way they did implement, there is a clear argument for allowing the following in Python:

if input == 'hello' response = 'world' else response = 'hi'

Perhaps the reason this is not done is to help the author/viewer to distinguish the conditional clause (” input == ‘hello’ “) from the effector (” response = ‘world’ “). Many languages use parentheses to do this:

// Javascript
if ( input == 'hello' ) response = 'world' ;

Python’s solution for the ternary operation is just a rabbit hole of badness.

# Does the if apply to "namespace + '::' + name"
# or just to name?

fullname = namespace+'::'+name if namespace else name


# Inverted version, even more confusing.
# Is the programmer trying to prefix name with name::
# when no namespace is present?

fullname = name if not namespace else namespace + '::' + name


# Here another programmer tries desperately to
# state that he wants either "name" or "namespace::name"

fullname = name if not namespace else namespace+'::'+name

# (apparently, he thinks python uses lack of spaces to
# denote increased precedence).
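For the record, a small sketch of how Python actually reads these. The conditional expression binds more loosely than `+`, so the whole concatenation is the "true" branch, and spacing plays no part. The function names are illustrative; the `ast` comparison simply shows that the spaced, unspaced and parenthesised spellings parse to the same tree.

```python
import ast


def full_name(namespace, name):
    # Parsed as: (namespace + '::' + name) if namespace else name
    return namespace+'::'+name if namespace else name


def full_name_explicit(namespace, name):
    # Same meaning, with the grouping spelled out for human readers.
    return (namespace + '::' + name) if namespace else name


def same_parse(first, second):
    """True if two expressions produce identical syntax trees."""
    return (ast.dump(ast.parse(first, mode="eval"))
            == ast.dump(ast.parse(second, mode="eval")))


spacing_is_irrelevant = same_parse(
    "namespace+'::'+name if namespace else name",
    "(namespace + '::' + name) if namespace else name",
)
```

Adding the parentheses costs two characters and removes the question entirely.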

14 Comments

I was really getting into this post, then it abruptly ended. Part 2 soon?

Maybe… Mostly I just ran out of concrete things to criticize Python for. As evidenced by the fact I’m actively using it now. I’m liking the ease with which you can “get stuff done”, but wincing at the semantics of the language every now and again.

“Perl used to have CPAN for obtaining modules, but my experience the last few years has been that CPAN modules are increasingly abandoned, copious in number and wanting in documentation, perhaps as a result of the CPAN/RPM conflict (local OS bundling of CPAN modules via RPM and DEB packages has caused many an issue).”

You don’t need to use packaged CPAN modules or the bundled Perl.
http://search.cpan.org/perldoc?local::lib
http://search.cpan.org/perldoc?App::perlbrew
http://www.onyxneon.com/books/modern_perl/index.html
will save you.

The up-and-coming CPAN of the Python world is PyPI. People seem to be converging on deployment of software written in Python using virtualenv. It allows one to separate environments of applications from each other, and PyPI (i.e. easy_install or pip) is the easiest and most global source for packages to install into that isolated environment.

What packages have you found to be missing?

Just wondering aloud, have you contributed any code to CPAN? Any tests? Any documentation? If not, why not?

Camel:

Nope. Nope. Nope.

Why not? Because when I used Perl, 7-8 years ago, CPAN was always on the ball, and there was little I could contribute.

Then I had a ~7 year lull of very little Perling (back to C/C++ for the most part, and Lua for scripting).

In the last year or so, when I actively tried to use Perl again, the handful of CPAN modules that could have saved me time appeared to have been abandoned, and on investigation of whether there was anything I could do, they were so bloated in terms of excess functionality and dependencies that I had no interest in picking them up.

Gabor: Totally missing the point there. CPAN used to rock because it didn’t need things like those. CPAN never stopped rocking, but OS bundlers like RedHat generally broke CPAN by distributing CPAN-unaware Perl bundles.

Remember, CPAN tracks dependencies. Doing something like updating LWP through CPAN becomes a nuisance when you have random-CPAN-built-but-RPM-installed LWP sitting in the system directory. You start your update and the dependency list becomes longer and longer and longer and longer and suddenly you’ve been at this for nearly an hour and … oh dear, the version of something really mundane is incompatible and now CPAN is talking about installing a whole new build of Perl.

If you’ve been lucky enough not to experience this, then it’s perhaps because you’ve kept up to date with Perl. If you haven’t, then that’s where you run into the wall of hurt. While it’s not CPAN’s fault, it creates a feedback loop of decay: developers get lazy and start worrying only about an RPM or DEB distribution of their module, the CPAN one becomes abandoned, and users start to follow.

And about the time all this was happening was when Python was blossoming, so maybe python-pan failed to get off the ground for the same reason.

However: the end result of this, in my experience and that of a few others I’ve spoken with, is that the lack of an absolute singular python source like Perl’s CPAN means that the Python RPM/DEB modules tend to be much better maintained.

Of course, in the case of both Python and Perl, people seem to be rapidly drifting towards github, which might ultimately prove to be a good thing for both languages and their users.

Also, Camel, the quote you quoted, re-read this particular piece:

> my experience the last few years has been that CPAN modules are increasingly abandoned, copious in number and wanting in documentation

There are itty-bitty little CPAN modules for all kinds of stuff, and I’ve no interest in contributing to that issue with itty-bitty “workaround” and “hack” modules that I’m sure as heck not going to maintain because I need them to make Script X work and then never look at it again. I believe the fact that so many people have done exactly that – regardless of the language – contributes to the general decay of all these kinds of centralized repositories, from sourceforge to cpan.

> Lua and Python are both capable of “supporting OOP”, the way the average human colon is capable of “supporting a pool cue”. In Lua it means pissing about with metatables.
Er, okay. In Perl, it means pissing about with bless and hash references. What’s your problem with Python’s OO?

> Now this is minor and petty, but it was almost the very first thing I ran into. For those of you thinking I’m nit picking, that “\” character goes totally against the spirit of “avoid using symbol characters all over the place” and running into it so early on felt a bit like a betrayal of the ideal.
Python allows breaking lines within brackets exactly so you don’t have to use backslashes.

> The indentation concept works fine when you have only one or two levels, but when you start getting to 6 and more, it actually starts to get a bit difficult to track what indentation level is where, or where you *intended* for indentation to be.
How do braces help with this at all? You still have tons of levels of indentation, but now the blocks are further apart vertically because there are a lot of lines with only a closing brace on them, and no indication what that brace actually closes.

> It also introduces a somewhat silly seeming input dependency, consider, in the case of “def” (function definition), it means you have to type two end of statement characters: the colon and the carriage return. DUH.
I conjecture the reason here is that Different Things Should Be Different. A line ending with a colon always introduces a block. Other lines are always single statements.

> (Note: suddenly Python can do “if” and “else” without colons, even handling both on one line???)
This is different syntax that happens to use the same keywords.

> # Does the if apply to "namespace + '::' + name" or just to name?
> fullname = namespace+'::'+name if namespace else name
The former, because the ternary operator — as in most languages — has very low precedence.
How does it work here?
fullname = !namespace ? name : namespace + '::' + name

The theme here seems to be that Python didn’t well enough reach its goal of being a clean language? But you’re lamenting that you can no longer use Perl, which doesn’t have that goal in the first place? I don’t understand.

> Er, okay. In Perl, it means pissing about with bless and hash references. What’s your problem with Python’s OO?

I didn’t even go there with Perl for exactly those reasons :)

Python’s OOP? Well, start with “self”. Python’s OO is vastly better than Perl and significantly better than Lua or JavaScript, but it’s still “hey it happens it can do this!” too. At least, that’s still my impression. Even as I increasingly grow to like Python more and more.

> Python allows breaking lines within brackets exactly so you don’t have to use backslashes.

Except in the example I cited…

> How do braces help with this at all? You still have tons of levels of indentation, but now the blocks are further apart vertically because there are a lot of lines with only a closing brace on them, and no indication what that brace actually closes.

An explicit statement of desire to end a level of indentation.

I concede – in a nice editor, in either case, you have nice little guide lines showing you blocks of code, but in Python’s case that doesn’t necessarily help you tell what the author was trying to do.

Just about every human language features some form of punctuation. On the one hand, one of the things I like about Python is that little extra saving you get not having to type “;” on the end of every line when scripting.

But when I’m writing actual code, I find that I revert to putting semicolons in it. Just like I tend not to put full stops at the end of text messages, but when I’m writing on my blog, I tend to endeavor to use punctuation correctly.

> I conjecture the reason here is that Different Things Should Be Different. A line ending with a colon always introduces a block. Other lines are always single statements.

Except when they are indented :) Again, another reason for having a compound statement syntax (e.g. braces, but it could just as easily be shell-like using keywords: if … fi).

> This is different syntax that happens to use the same keywords.

(Regarding x = y if … else …)

Handled by the same tokenizer and parser, though, and with the same capabilities as regular if and else… It doesn’t follow the rule of Different Things Should Be Different or the rule of Same Things Should Be The Same.

Ternary operators are invariably evil. Python deserves kudos for having gone so long without one. But ternary operators are invariably evil, so chances are that I would be complaining whatever solution they’d adopted :)

> The theme here seems to be that Python didn’t well enough reach its goal of being a clean language? But you’re lamenting that you can no longer use Perl, which doesn’t have that goal in the first place? I don’t understand.

Not sure where you picked up the idea that I’m lamenting no longer using Perl; probably from reading this post alone with no background.

Those who know me better, I hope, will read it as a semi-confessional: I saw the light as to why folks are using Python. That doesn’t change the fact that the language has its own particular set of flaws. But they aren’t insurmountable and they aren’t irredeemable. Most particularly, the rich set of APIs and tools that Python presents more than compensate for most of the language’s flaws (with the possible exception of the god-awful overhead of function calls).

I still use Perl, but in the spirit that Perl was developed – as a sort of hyper-awk. Anything more complex than that, scripting wise, and I fire up IDLE.

So why write a post looking for warts in Python? A sort of vocalized demon confrontation. Just like my posts on C++0X where I whoop about the upcoming features and cry about the god-awful syntax that the committees seem to be hell bent on.

I mean … in an effort to avoid a few esoteric syntax issues, they introduce the [[hiding]] and [[override]] things in virtual functions:

  virtual bool someFunction [[override]] ();

ARRRGH.

> Python’s OOP? Well, start with “self”. Python’s OO is vastly better than Perl and significantly better than Lua or JavaScript, but it’s still “hey it happens it can do this!” too. At least, that’s still my impression. Even as I increasingly grow to like Python more and more.

I still don’t understand your complaint. self is Python’s solution to a problem that plagues every OO system: what to do with the invocant. Perl and Python make it an explicit argument, JavaScript makes it an implicit magical context thing, C++ and Java make it outright optional. I’ve had fewest headaches with the former.
Python’s OO has first-class classes, metaclasses, transparent getters and setters, class methods, and so forth. Perhaps it’s missing interface support in core, but that doesn’t make a lot of sense to bake into a duck-typed language, and there are several third-party implementations (e.g. zope.interface). Classes are a Real Thing in Python, whereas in Perl and JavaScript and Lua they started as Hashes Plus A Class Name. What do you think is missing?
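To make a few of those features concrete, here is a minimal sketch showing the explicit `self`, a class method, a transparent getter/setter pair via `property`, and a class being passed around as an ordinary value. The `Temperature` class is a made-up example, not something from the post.

```python
class Temperature:
    def __init__(self, celsius):
        # "self" is the invocant, received as an explicit first argument.
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Class methods receive the class itself rather than an instance.
        return cls((fahrenheit - 32) * 5 / 9)

    @property
    def fahrenheit(self):
        # Reads like a plain attribute, computed on access.
        return self.celsius * 9 / 5 + 32

    @fahrenheit.setter
    def fahrenheit(self, value):
        # Assignment to .fahrenheit is routed through this method.
        self.celsius = (value - 32) * 5 / 9


boiling = Temperature(100)
boiling_f = boiling.fahrenheit      # getter, no call parentheses needed

# Classes are ordinary objects: bind one to a name and call it later.
kind = Temperature
freezing = kind(0)
freezing.fahrenheit = 212           # setter updates the stored celsius value
```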

> Except in the example I cited…

In the example you cited, you broke the line outside brackets. So, yes, you need to either keep the opening bracket on the same line or make clear your intention; a lone identifier is a valid statement by itself, albeit not a useful one. The only other approach would be for Python to guess what you mean, and down that way lies madness.

> An explicit statement of desire to end a level of indentation.

I’m having a hard time imagining how you could accidentally end a level of indentation. It’s a structural and visual change. On the other hand, closing braces are just emphasis on something you can already see at a glance — and because what you look at and what the parser looks at are different, it’s easy to write misleading code. (See, for example, the problem with braceless C blocks.) If the indentation and braces don’t match up, what does that tell me about what the author was trying to do? Did he miss a closing brace? Is this copy/paste gone awry? Is he just messing with me?

> But when I’m writing actual code, I find that I revert to putting semicolons in it. Just like I tend not to put full stops at the end of text messages, but when I’m writing on my blog, I tend to endeavor to use punctuation correctly.

I leave periods off of text messages, too—because it’s obvious where a sentence ends when there’s one sentence per line.

> Except when they are indented :) Again, another reason for having a compound statement syntax (e.g. braces, but it could just as easily be shell-like using keywords: if … fi).

Except when what are indented? Indented lines are still single statements. They might be part of a block, or they might be part of the preceding un-indented line.
A colon followed by an indented block is compound statement syntax. It just ends when the block stops being indented, rather than when there’s a squiggle to note that the end of this paragraph is in fact the end of this paragraph.
That’s actually a decent analogy: we use whitespace to delineate paragraphs (like these ones!) in English text. We have the ¶ symbol to indicate it, too, but nobody uses it because the gap is already visually obvious.

> Handled by the same tokenizer and parser, though, and with the same capabilities as regular if and else… It doesn’t follow the rule of Different Things Should Be Different or the rule of Same Things Should Be The Same.

They aren’t quite the same as regular if/else. The inline ternary form can only contain expressions, not statements or blocks; for example, you can’t do x = 3 if condition else x = 2. And of course, there’s no colon precisely because the colon introduces an indented block. It’s Different Enough™, and the most Englishy option.
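That difference can be checked without running anything dangerous by asking Python to compile a few snippets. A small sketch; `compiles` is a helper name made up for this example, and `condition` is never evaluated because the source is only compiled, not executed.

```python
def compiles(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Expressions on both sides of the inline form: accepted.
inline_ok = compiles("x = 3 if condition else 2")

# An assignment statement in the else branch: rejected by the parser.
inline_with_statement = compiles("x = 3 if condition else x = 2")

# The statement form, colons and all, takes statements in each branch.
statement_form = compiles("if condition: x = 3\nelse: x = 2\n")
```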

> Ternary operators are invariably evil. Python deserves kudos for having gone so long without one. But ternary operators are invariably evil, so chances are that I would be complaining whatever solution they’d adopted :)

To the best of my knowledge, it exists because people were using the grotesque hack a and b or c, and most of them didn’t realize that it doesn’t work when b is falsish. Adding real support is a lesser evil than encouraging programmers to continue using an opaque and broken hack.
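A short sketch of why the old idiom is broken. The function names are invented for the example; the interesting case is a true condition paired with a falsy "then" value such as 0.

```python
def and_or_hack(condition, if_true, if_false):
    # The pre-ternary idiom: silently wrong whenever if_true is falsy,
    # because "condition and if_true" is then falsy and "or" falls through.
    return condition and if_true or if_false


def conditional(condition, if_true, if_false):
    # The real conditional expression only ever tests the condition.
    return if_true if condition else if_false


hack_result = and_or_hack(True, 0, 5)      # falls through to 5
proper_result = conditional(True, 0, 5)    # correctly yields 0
```

With a truthy "then" value the two agree, which is exactly why the bug tended to go unnoticed.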

> Those who know me better, I hope, will read it as a semi-confessional: I saw the light as to why folks are using Python. That doesn’t change the fact that the language has its own particular set of flaws. But they aren’t insurmountable and they aren’t irredeemable.

Oh, sure. My objection, insofar as I have one, is that there are plenty of better flaws you could pick on. 8)

> So why write a post looking for warts in Python? A sort of vocalized demon confrontation. Just like my posts on C++0X where I whoop about the upcoming features and cry about the god-awful syntax that the committees seem to be hell bent on.

Lordy. I gave up on C++ a while ago, and 0x has not won my favor back. I have high hopes that Rust and/or Go will take off, or I’ll be stuck with Cython should I need to do systemsy programming.

Boy I wish this comment form would let me preview. I have no idea how screwed-up this is going to look.

> Classes are a Real Thing in Python, whereas in Perl and JavaScript and Lua they started as Hashes Plus A Class Name. What do you think is missing?

Where, exactly, are you getting the impression that my post is a dismissal of python in favor of perl, javascript or lua?

I just re-read my original post again to see where some of your seeming desire for linguistic fisticuffs is coming from. (I.e. if you’re trolling, hats off, I can’t tell).

“self”: The need for explicit self-referentials is generally a crutch to assist what I’d call “tier 2” OOP programming. “self” should be the default frame of reference.

The need to explicitly state “self” in an OO language introduces an unnecessary opportunity for error and a tendency to think in an un-OO way. It’s the programming equivalent of talking about yourself in the 3rd person.

if else … again, by importing a false external assumption of linguistic competitiveness, you’re continuing to read out of context. I liked Python better when it didn’t have one, but the one they hacked in disproves several long standing statements about the parser imposed structure of the language. But *shrug* I just used to write parsers for fun, so what would I know.

> Oh, sure. My objection, insofar as I have one, is that there are plenty of better flaws you could pick on. 8)

Mostly these are objections I felt folks I’ve spoken with previously about language choices would likely share – and my post largely counters them or explains them, at least in so far as how I surmounted my annoyance at them.

For instance, contrary to your interpretation, I don’t feel that I “hate” on the indentation approach; although I am amused at your claiming you can’t imagine someone failing to get indentation right. I can only refer you back to the context in which I describe it in the OP.

We do indeed use white space for paragraph separation, however when we have to implement multiple levels of indentation in written language we very quickly begin to introduce an assortment of punctuation or markup such as bullet points, quotes, etc.

The original context I gave was that copious use of indentation begins to become cloudy when the indentation levels are relatively small, or illegible if the indentation levels are large.

I do think that explicit block commencement and termination is better – an opinion developed through my own experimentation with whitespaced languages 26 years ago, and I believe to be borne out by its continual recreation across languages everywhere, from opening and closing quote marks to parentheses, to markup tags.

But I would also admit your point that you can put braces in the wrong place just as easily as you could add one tab too few or too many. However, in this case I favor the omissive approach less.

So on one level I don’t have a problem with the indentation approach, it should irk the programmer into avoiding deep nesting through the use of functions etc, but in Python the cost of a function call is quite staggering. We’re not talking about a little bit of overhead as in JavaScript, Perl, Lua or pretty much any language I can think of.

Partly because of the decorator system that Python provides to allow you to validate function arguments or inject your own customized tracing/debugging tools, and partly because of the approach they’ve used to implement named arguments … the overhead of a Python function call can literally be hundreds to thousands of times that of a similar call in another language. And it actually gets worse on newer CPUs :(

> Partly because of the decorator system that Python provides to allow you to validate function arguments or inject your own customized tracing/debugging tools, and partly because of the approach they’ve used to implement named arguments … the overhead of a Python function call can literally be hundreds to thousands of times that of a similar call in another language. And it actually gets worse on newer CPUs :(

Do you have links/references for that? Interesting to note, as I’d have said that if you’re worried about bashing a 70 char width limit then you should be looking at breaking out into a different class/function at that point of complexity (and we’ve been moving from shell etc. to Python for a lot of our system stuff, and I’d like to know up front :)

Ofc we could always just use ruby *cough, hack cringe*

Note: I’m talking about the relative cost of a function call, not of calling a specific function or of the execution of said function.

It’s because Python provides a variety of ways to intercept function calls, as with function decorators etc.

The trouble is that means passing thru several additional layers of non-trivial conditional logic before getting from the invocation to the actual called function itself.

If you aren’t using any decorators or any special argument passing, the overhead is acceptably low (although still many times that of calls in other languages). But when you start using named arguments and/or function decorators (such as the hidden decorators involved in the handling of member function calls), then it seriously starts to ramp up the overhead of function calls.
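Rather than take anyone's word for the size of that overhead, it is easy to measure on your own machine. A rough sketch using the standard `timeit` module; `add` and `time_variants` are made-up names, the numbers will vary by interpreter version and hardware, and this only compares Python against itself, not against other languages.

```python
import timeit


def add(a, b):
    return a + b


def time_variants(number=100_000):
    """Return the seconds taken by each variant over `number` iterations."""
    env = {"add": add, "a": 1, "b": 2}
    return {
        "inline": timeit.timeit("a + b", globals=env, number=number),
        "positional call": timeit.timeit("add(a, b)", globals=env, number=number),
        "keyword call": timeit.timeit("add(a=a, b=b)", globals=env, number=number),
    }


if __name__ == "__main__":
    for label, seconds in time_variants().items():
        print(f"{label:>16}: {seconds:.4f}s")
```

Comparing the "inline" row with the two call rows gives a feel for the relative cost of the call itself, separate from the work the function does.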

It’s akin to the issues we used to run into back at Demon with Perl 3 where scripts would go from behaving nicely to killing boxes when it turned out that somewhere strings were being passed by value instead of carefully massaged into pass by reference; I guess.
