11 confounding programming language features

Programming languages are full of peculiarities, but these oddities tend to make developers say 'WTF?' more than most.

Every programming language has its own quirks, such as weird syntax, unusual functionality, or non-standard implementations. These can cause developers new to the language, and even seasoned pros, to scratch their heads in wonder. Sometimes such an oddity is a real stumbling block; at other times programmers come to understand, appreciate, or even really like the unique features of a particular language.

While the list of such programming language idiosyncrasies is long, a handful come up again and again when developers discuss the subject. Here are 11 programming language features that seem to have programmers repeatedly asking “WTF?”

Empty strings are NULL in Oracle SQL

The head scratcher: The Oracle RDBMS considers character strings of zero length to be null values. This is contrary to numerous other databases and ANSI/ISO standard SQL, which consider the former to be a known (but empty) value and the latter unknown, and hence not the same. This can make converting code to or from another RDBMS or writing code to run across multiple RDBMS more painful.

The reason: This appears to be a relic left over from Oracle’s early days, which date back to the very first commercially available implementation of SQL in 1979, before there were even SQL standards. Oracle itself warns developers that this behavior may change in the future.

Quotes: “Yep, that's one of the more ‘awesome’ features of Oracle database.” Lukas Eder

“The string ‘’ should no more be treated as NULL than should the integer 0.” Ben Blank

“WHAT!? I am so glad we don't use Oracle. That is frightening.” Jeff Davis

+ is a concatenation operator in JavaScript

The head scratcher: The + operator is overloaded in JavaScript, being an addition operator for numbers and a concatenation operator for strings. If either operand is a string, JavaScript converts the other operand to a string and concatenates the two, so that '1' + 1 yields the string '11'.

The reason: This is ultimately due to JavaScript’s loose typing. Python, for example, also uses + for string concatenation but, being a strongly typed language, it will raise an error if one tries to add a string and an integer.
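
As a minimal sketch of the Python behavior mentioned above (the helper name try_add is ours, for illustration only), mixing a string and an integer with + raises a TypeError rather than silently concatenating. An explicit conversion is needed to pick one meaning or the other:

```python
def try_add(left, right):
    """Apply + to the operands; return the result or the name of the exception raised."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__


print(try_add("1", 1))        # TypeError: Python refuses to guess
print(try_add("1", str(1)))   # 11 (string concatenation, chosen explicitly)
print(try_add(int("1"), 1))   # 2 (numeric addition, chosen explicitly)
```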

Quotes: “The problem is silent type coercion in conditionals can be unpredictable.” Anonymous

“Javascript should throw an exception in this case.” crgwbr

“+ for string concatenation is horrible”  Matteo Riva

Perl modules must return a true value

The head scratcher: Perl modules almost always end with the statement 1; because if the last statement in the file doesn’t return a true value, loading the module raises an error.

The reason: Perl modules can contain initialization code as well as subroutines. After the file is loaded, Perl checks that any such code executed successfully by looking for a true return value. Even if there is no initialization code, Perl still expects the final statement to return a true value, or it raises an exception.

Quotes: “That always gave me a queasy feeling ...” Drew Hall

“It doesn't have a practical use any more, not compared to the continuous annoyance it provides.” Schwern

Trigraphs in C and C++

The head scratcher: C (and C++) support a set of nine trigraphs, three-character sequences that are automatically replaced by single characters before any other processing; for example, ??! is replaced by |. They can produce unexpected behavior and make source code harder to read.

The reason: Trigraphs allowed early C programmers to enter certain characters, such as curly braces, that their keyboards often didn’t support.

Quotes: “Google doesn't help with search terms like ??!??!” Isaac

“Everybody hated trigraphs so much that C99 added digraphs as an alternative ... This provides for even more obfuscation ..." dododge

“The joy of trigraphs -- making C unreadable since 1977.”  Martin Beckett

PHP's case insensitivity

The head scratcher: While identifiers in many other languages are usually case sensitive, in PHP function names (as well as class and method names) are case insensitive. To really confuse developers, variable names, constants, and class properties in PHP are case sensitive.

The reason: Most likely an artifact of PHP’s organic development from a set of CGI scripts to a full-fledged programming language.

Quotes: “Well, it's PHP, you shouldn't be suprised.” Grzechooo

“So that's why php programmers use underscore instead of camelcase when naming their functions.” paperstreet7

“I have absolutely no idea how to write a programming language ...” PHP creator Rasmus Lerdorf

“Is there anything in PHP that doesn't cause a ‘WTF’?” Lambda Fairy

In Ruby 0 is truthy

The head scratcher: In Ruby, the value 0 is treated as true in a Boolean context. This is contrary to many other languages, such as C and Python, where 0 is treated as false, and so it often surprises new Ruby developers.

The reason: In Ruby, only the Boolean false and nil are treated as false; everything else is treated as true. 0 is handled like any other number.
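
For contrast, here is a short Python sketch of the convention Ruby departs from. It shows only the Python side of the comparison; the helper name is_truthy is hypothetical and exists just for this illustration:

```python
def is_truthy(value):
    """Return True if Python treats the value as true in a Boolean context."""
    return bool(value)


# In Python, 0 and None are both treated as false; nonzero numbers are true.
# Per the description above, Ruby would treat every number here, 0 included, as true.
for value in (0, 1, -1, None):
    print(repr(value), is_truthy(value))
```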

Quotes: “… it can be a WTF, even though I think it's a good thing.” Chris Lutz

“That's why I like Ruby. I don't need to deal with such zero-false evil.”  Edditoria

“0==true // argh the c compiler in my brain is exploding!!” kenny

Whitespace used to denote blocks in Python

The head scratcher: Rather than punctuation or keywords, Python uses indentation level to denote the block to which a line of code belongs. An incorrect amount of whitespace (or a mix of spaces and tabs) can cause errors.

The reason: Makes for more readable code and reduces the amount of typing, since many editors automatically indent.
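
The kinds of whitespace errors described above can be sketched with Python's built-in compile function, which parses a snippet without running it. The helper name check and the three sample snippets are assumptions made up for this illustration:

```python
def check(source):
    """Compile a snippet and return "ok" or the name of the syntax error raised."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except SyntaxError as exc:  # IndentationError and TabError are subclasses
        return type(exc).__name__


well_indented = "if True:\n    x = 1\n    y = 2\n"
bad_dedent = "if True:\n        x = 1\n    y = 2\n"  # dedent matches no outer level
mixed = "if True:\n        x = 1\n\ty = 2\n"          # eight spaces, then a tab

print(check(well_indented))  # ok
print(check(bad_dedent))     # IndentationError
print(check(mixed))          # TabError
```

The second and third snippets look nearly identical to the first in many editors, which is why pasted code with mixed tabs and spaces tends to be the most confusing case.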

Quotes: “Really, that's what preventing me for ever liking Python.” wazoox

“I love Python's whitespace ... The one big caveat, though, is when copy/pasting code from the Web -- it often causes mixed spaces and tabs which requires an extra step to clean up ...” Greg

“If you need indentation to be forced upon you, you're too lazy.” Joris Meys

Array indexing in C behaves like pointer arithmetic

The head scratcher: In addition to referencing element i of array a as a[i], C also allows you to reference the same element as i[a].

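The reason: In C, an array used in an expression acts like a pointer to its first element, and a[i] is defined as *(a + i). Because addition is commutative, *(a + i) is the same as *(i + a), which is i[a].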

Quotes: “... it is invaluable if you want to compete in the obfuscated C contest ...” Confusion

“I don't see it as a feature -- so much as exposing the core of what C is about. Its all about pointers and getting to the memory directly with as little indirection as possible. Kind of beautiful, really.”  Michael Neale

Perl's predefined variables

The head scratcher: Perl has a long list of special variables with obfuscated names (though they also have longer English equivalents). For non-Perl gurus, they can require repeatedly referring to Perl documentation and make code harder to read.

The reason: These variables provide information about and access to a variety of aspects of program execution, such as the process ID ($$), error messages from eval ($@), and the result of the last embedded-code regex assertion ($^R).

Quotes: “Quite annoying.” MatrixFrog

“They're nice for one liners, though.” niXar

“I think the worst ones are $@ and @$. Both work and I still have no idea what the 2nd one does ...” Artem Russakovskii

“By the way, one other slight problem with these variables: They're ungooglable!” malvim

JavaScript's automatic semicolon insertion

The head scratcher: JavaScript makes the use of semicolons to end certain statements optional by automatically inserting them where it thinks they belong, such as at certain line breaks. This often leads to errors or baffling behavior with no exceptions raised.

The reason: Semicolon insertion was meant as a convenience, to make JavaScript's C-like syntax easier for new developers.

Quotes: “You always run into problems when you design language features around the assumption that your users will mostly be idiots.” Rob Van Dam

“My advice: Figure out where the semicolons go, put them in the right place. You’ll be much better off.” Doug Crockford

“Semicolon insertion is one of the most evil parts of JavaScript.” fennec

Java's autoboxing with Integer caching

The head scratcher: Java will automatically convert primitive types to objects (autoboxing), such as an int to an Integer object. It will also, by default, cache Integer objects for values from -128 to 127. This can lead to unexpected behavior when using == to compare autoboxed Integers with the same value (true for values from -128 to 127; false otherwise).

The reason: Autoboxing reduces the amount of code that developers need to write, while Integer caching improves performance.

Quotes: “That's the result of premature optimization.” Joschua

“This isn't a common mistake, but it is a good reason to use the native Java types for numbers, booleans, etc ...” Ravi Wallau

“So glad I'm a C# programmer.” Will

Copyright © 2015 IDG Communications, Inc.