Case-insensitive identifiers

Recently I came across Jeff Atwood’s article about the idea of case-insensitive identifiers. I think this is an interesting idea, and here’s why.

Why have case-sensitive identifiers at all? The question applies to function names, variable names and object member names alike. Having two variables in your program whose scopes overlap and whose names differ only by case is generally a bad idea. To somebody who tries to read and understand the program they are nearly indistinguishable, and most likely they are a programming error or a leftover from an older version of the code. It would probably be a good idea for statically typed languages to forbid two variables whose names differ only by case.

Let’s take dynamically typed languages, such as Python or JavaScript. One of their advantages over statically typed languages is a faster development cycle: less text is needed to write a program, so the source code is more concise. More concise source code tends to be easier to read, review and therefore maintain. The disadvantage of dynamically typed languages is that variable references are not checked at compile time but resolved at run time. Hence it is easy for bugs to hide in Python or JS programs: the kind of bugs that can only be detected at run time, in very specific situations.
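To make the point about run-time resolution concrete, here is a small sketch. The function describe and its misspelled variable are invented for illustration. Python accepts the function definition without complaint, and the typo only surfaces when the rarely taken branch actually runs.

```python
def describe(values):
    """Return a short summary of a list of numbers."""
    total = sum(values)
    if not values:
        # Rarely taken branch: 'Total' is a typo for 'total'.
        # Nothing checks this name until the line is executed.
        return "empty, total=" + str(Total)
    return "count=%d, total=%d" % (len(values), total)


print(describe([1, 2, 3]))  # works: the faulty branch is not executed

try:
    describe([])  # only now does the typo surface
except NameError as exc:
    print("NameError:", exc)
```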

Let’s consider the following function in Python:

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

def AddVectors(v1, v2):
    return Vector(v1.x + v2.x, v1.y + v2.y)

Most of the time there won’t be a problem with it, but sometimes the caller may pass malformed input (e.g. input read from a wrong file) and the function will raise an exception. That’s the drawback of dynamically typed languages.

But in some code path the programmer may just make a mistake:

class Empty:
    pass
v1 = Empty()
v1.X = 1
v1.Y = 2
v2 = AddVectors(v1, v1)

This program will fail with an AttributeError: v1 has the attributes X and Y, but AddVectors reads x and y. It is a programming error, but does it have to be?
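For reference, here is the failing program in self-contained form, with the exception caught so the message can be inspected. The try/except wrapper and the error_message variable are additions for illustration only.

```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def AddVectors(v1, v2):
    return Vector(v1.x + v2.x, v1.y + v2.y)


class Empty:
    pass


v1 = Empty()
v1.X = 1  # upper-case attribute names
v1.Y = 2

try:
    AddVectors(v1, v1)
except AttributeError as exc:
    # The lookup of lower-case 'x' fails; to Python, 'X' is a different name.
    error_message = str(exc)
    print("AttributeError:", error_message)
```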

I argue that no, it should not really be a programming error. This kind of bug may be very annoying if it occurs in a rarely traversed path, and causes rare, unnecessary crashes for the end user.

Two variables differing only by case should be avoided anyway, because they lead to confusion and therefore bugs. So it would actually be useful if identifiers in dynamically typed languages were case-insensitive.
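Python itself is case-sensitive, but the behaviour can be approximated for instance attributes. The following is only a sketch under that assumption: the CaseInsensitive base class is hypothetical, it folds names with lower(), and it does not cover methods, class attributes or plain variables.

```python
class CaseInsensitive:
    """Hypothetical base class: instance attribute names ignore case."""

    def __setattr__(self, name, value):
        # Store every attribute under its lower-case spelling.
        object.__setattr__(self, name.lower(), value)

    def __getattr__(self, name):
        # Called only when the normal lookup fails; retry with the folded name.
        folded = name.lower()
        if folded == name:
            raise AttributeError(name)
        return getattr(self, folded)


class Vector(CaseInsensitive):
    def __init__(self, x, y):
        self.x = x
        self.y = y


def AddVectors(v1, v2):
    return Vector(v1.x + v2.x, v1.y + v2.y)


v1 = CaseInsensitive()
v1.X = 1
v1.Y = 2
v2 = AddVectors(v1, v1)  # no exception: X and x name the same attribute
print(v2.x, v2.Y)
```

With this base class the earlier mistake stops being an error, at the cost of an extra function call on every attribute assignment.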

This problem does not directly apply to statically typed languages, because all variable references are resolved at compile time, so the programmer has the opportunity to catch all spelling errors before the program is executed for the first time. It would still not hurt if the compiler (say for C++ or Java) rejected two variables differing only by case; this would lead to cleaner and better code.
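The "no two names differing only by case" rule can also be approximated as a lint-style check. Below is a rough sketch for Python source using the standard ast module. The helper find_case_collisions is hypothetical, not part of any existing tool, and it ignores scopes entirely, so it is stricter than the rule described above.

```python
import ast
from collections import defaultdict


def find_case_collisions(source):
    """Return groups of identifiers in `source` that differ only by case."""
    spellings = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            name = node.id    # plain variable reference
        elif isinstance(node, ast.Attribute):
            name = node.attr  # attribute such as obj.x
        elif isinstance(node, ast.arg):
            name = node.arg   # function parameter
        else:
            continue
        spellings[name.lower()].add(name)
    return sorted(sorted(group) for group in spellings.values() if len(group) > 1)


sample = """
v1.X = 1
v1.Y = 2
total = v1.x + v1.y
"""
print(find_case_collisions(sample))
```

Run on the sample, the check reports the X/x and Y/y pairs before the code is ever executed.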

Categories: Computing