20050226

Consumption frenzy

Got the books! Yay!

Quickly read Money 201 and managed to be uplifted and depressed at the same time. Looks like I've been doing many things right, but a few wrong. Unfortunately, to do those things right, I'll require more money. Or slack off on pre-paying parts of the mortgage. But I hate paying interest, regardless of what finance books say. I should write my own book, maybe.

Read Exceptional C++ Style rather quickly. I feel relatively good about it--it's a good book, and it looks like I'm not too rusty. But, to my dismay, I'm also not that interested in all those dark corners anymore (something I've alluded to in previous posts). Still, I wish the smaller, cleaner language Bjarne Stroustrup said was struggling to get out of C++ would come to light. The closest I've found so far is Python, and its library is growing a bit messy.

Speaking of libraries, I've had the opportunity to dig through the Java libraries for a few things. I was trying to make some substring appends efficient. Unfortunately, StringBuffer has no append(String, int, int) operation--only append(char[], int, int). So I had to call substring() (which is bad, but better, from a garbage-generation point of view, than toCharArray()). Man, I wish for the nth time I could get access to the internal array. Actually, StringBuffer could, if String's array were package-private, and in my view that would be a very sensible design.
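
Here's a minimal sketch of the dilemma, assuming the 1.4-era API this entry describes (the strings and indices are made up for illustration):

    public class SliceAppend {
        public static void main(String[] args) {
            String s = "hello, world";
            StringBuffer buf = new StringBuffer();

            // Option 1: substring() allocates a small String wrapper
            // (which, in the Sun JDK of the day, shares the backing char[]).
            buf.append(s.substring(7, 12));

            // Option 2: toCharArray() copies the *whole* string into a
            // fresh char[] just so append(char[], int, int) can take
            // five characters of it.
            buf.append(s.toCharArray(), 7, 5);

            System.out.println(buf); // worldworld
        }
    }

Neither option appends the slice without an intermediate allocation, which is the whole complaint.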

Anyhow, I looked around for a solution, and it turns out Writer has a substring write operation. So, I thought, maybe I can change my code to work through a writer instead. But, being curious, I wondered whether they go through the whole array char-by-char, or maybe chunk stuff through a pooled buffer.

The answer is: none of the above. They call substring().

Also annoying: they don't let you create a StringWriter on an existing StringBuffer, yet they give you access to the underlying StringBuffer anyhow. This is incredibly asymmetrical and quite stupid.
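
A quick illustration of the asymmetry, using the 1.4-era java.io API (the strings are placeholders):

    import java.io.StringWriter;

    public class WriterAsymmetry {
        public static void main(String[] args) {
            // There is no StringWriter(StringBuffer) constructor, so an
            // existing buffer can't be wrapped:
            StringBuffer existing = new StringBuffer("can't wrap me");

            // ...yet the writer freely hands out the buffer it created
            // internally:
            StringWriter w = new StringWriter();
            w.write("hello, world", 7, 5); // internally calls substring()
            StringBuffer internal = w.getBuffer();
            System.out.println(internal); // world
        }
    }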

Coming from the C++ world, I find myself constantly annoyed by the sheer lack of rigor in the design of the core parts of the Java libraries. Newer parts (such as the Collections framework) show more care, but some very fundamental classes (in java.lang and java.io) show sloppiness. So you get new classes with better concepts, such as java.nio and Collections. But what can they do about String? It's such a fundamental type, and yet there's no way to extend it easily. You can't even access the internal array. Of course, that's done for safety reasons (so nobody can modify a string in place, since strings are supposed to be immutable), but you can get at it anyhow with reflection and a custom class loader. Worse, as far as I can tell from the StringBuffer code, it's not as efficient as it could be, because it goes through the public interface of the String object.
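
For the skeptical, here's a sketch of the reflection end run. It worked on the Sun JDKs of that era, but note the caveats: the private field name "value" is an implementation detail of Sun's JDK, not part of any specification, and a SecurityManager would block the setAccessible() call.

    import java.lang.reflect.Field;

    // Don't do this in real code; it only demonstrates that String's
    // "safety" rests on access modifiers, not on actual impossibility.
    public class PeekString {
        public static void main(String[] args) throws Exception {
            String s = "immutable, supposedly";
            Field f = String.class.getDeclaredField("value");
            f.setAccessible(true);
            char[] raw = (char[]) f.get(s);
            raw[0] = 'I'; // the "immutable" string just changed in place
            System.out.println(s); // Immutable, supposedly
        }
    }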

This may sound like a nit, but efficient string manipulation is extremely important. You want a language that lets you do as much as possible with as few temporary buffers as possible, especially when object allocation and garbage collection are as slow as they are in many JVMs. I've seen many sites run out of memory because they made extra copies of incoming requests and outgoing responses. Granted, they shouldn't do that, but given the API provided, it's the most natural way. I mean, I keep seeing (and writing!) code that does things like "string(" + s + ")" even though it's inefficient. If it's inefficient, why is it the most natural way of doing things? In C++, at least, the compiler has a chance to collapse the temporaries; the Java language specification prevents any such optimization for this construct. Bad.
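
To make the point concrete, here's roughly what a 1.4-era javac does with the natural spelling, versus the buffer reuse the specification won't let it produce for you (a sketch; the data is made up):

    public class ConcatCost {
        public static void main(String[] args) {
            String[] items = { "a", "b", "c" };

            // The natural way: each iteration is rewritten into roughly
            // new StringBuffer().append(...)...toString(), so every pass
            // allocates a buffer, a char[], and a String -- O(n^2)
            // copying overall.
            String out = "";
            for (int i = 0; i < items.length; i++) {
                out = out + "item(" + items[i] + ")";
            }

            // The version the compiler is not allowed to produce for you:
            StringBuffer buf = new StringBuffer();
            for (int i = 0; i < items.length; i++) {
                buf.append("item(").append(items[i]).append(")");
            }
            String out2 = buf.toString();

            System.out.println(out.equals(out2)); // true, at very different costs
        }
    }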


Rant aside, I got one other thing--Xenosaga II. So far, I like it, although I couldn't quite believe it when they asked me to switch disks (I was a mere 8 hours into the game...). I hope the second disk is a bit longer. A lot of online reviews have complained that it's tedious and so on, but if they had a slightly longer memory, they'd recall that Xenogears was pretty much the same. This new game feels more like the original Xenogears, with its long dungeons and somewhat higher level of difficulty. There are very few boss fights I finished on a single try; I usually get killed at least once. This is a refreshing change from most modern RPGs, like Final Fantasy X-2 (which never felt very difficult--you have so much flexibility with jobs, and changing them is so fast, that it's hard to get stuck in an attrition battle with your enemy).

A couple of things are somewhat suboptimal with the game, though. First, I don't know why they messed with KOS-MOS' voice acting. It was excellent in the first Xenosaga: perfectly neutral and emotionless, except on a few (intentional) occasions. The new acting varies between somewhat neutral and somewhat whiny. It just doesn't work; KOS-MOS is supposed to kick ass, not whine.

I'm also annoyed by the poor treatment they gave Yuki Kajiura's soundtrack. It's an awesome soundtrack, but in the game, the tracks are often cut off before they finish (unforgivable in a game that uses voice acting! The player does not control the rate of delivery, so efforts should be made to time the script to the music; Xenosaga I pulled it off much better), buried under too many sound effects (diminishing their impact), and sometimes used in strange contexts. How dare they make Kajiura's work sound so bland!

Load time in combat isn't that wonderful either. However, it's possible to level up relatively quickly, which takes a lot of the tedium out of it. I'd rather have fewer, more difficult fights than have to fight weak enemies 100 times to level up (as I've often done in FF VII). I might as well put a rubber band on the "O" button if I'm going to do that. I prefer games that treat me like a thinking being rather than an automaton who just presses "O".

On the plus side, the fact that there are almost no segments without BGM helps me enjoy the game quite a bit. The non-movie soundtrack is nothing earth-shattering, but some tracks are very, very solid. Unlike many reviewers, I don't think it was a mistake to move from a symphonic soundtrack to a synthetic one. Symphonic soundtracks are popular for SF themes, mostly due to Star Wars and Star Trek, but synthetic soundtracks can work well too--witness early Babylon 5. It's a matter of balance: strong melodies should accent strong points, and trance-like tracks should be used as BGM in the more repetitive parts.

And you gotta love the new character models. Too bad the hands are always done with a thumb, an index finger, and a single block for the three remaining fingers. Final Fantasy X and X-2 used that trick when there were too many characters in a scene, but Xenosaga II uses it all the time. It's a bit sloppy. But the nice face models and expressions make up for it.

I'll post a full review when I've finished the game. Which, at the rate I'm playing, will probably be next weekend or so.

20050217

Not so humbling after all, to humanity's great sorrow

OK, so I was really tired yesterday. It turns out I had good reasons to abstract file lookup: I needed easy unit tests. Granted, I could have mocked a ServletContext, but I think it's cleaner this way. So, much to everyone's chagrin, I'm not really humbled by my experience.
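
Something like this is what I mean (a hypothetical sketch--the interface and class names are made up, not from the actual codebase, and it assumes the servlet API on the classpath):

    import java.io.ByteArrayInputStream;
    import java.io.InputStream;
    import javax.servlet.ServletContext;

    interface ResourceLookup {
        InputStream open(String path);
    }

    // Production implementation: delegate to the servlet container.
    class ServletContextLookup implements ResourceLookup {
        private final ServletContext ctx;
        ServletContextLookup(ServletContext ctx) { this.ctx = ctx; }
        public InputStream open(String path) {
            return ctx.getResourceAsStream(path);
        }
    }

    // Test implementation: no container, no mock framework needed.
    class CannedLookup implements ResourceLookup {
        public InputStream open(String path) {
            return new ByteArrayInputStream("canned data".getBytes());
        }
    }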

I have, however, rediscovered the benefit of a good night's sleep. It's something I won't get next Tuesday, because I have to go to the dentist. Why that would interest you, I have no idea, but you never know!

In other news, the guys at work are making fun of my work habits and my love of mechanical keyboards[1]. They are mean to me. But then again, I complain all the time, so I suppose I deserve it.


[1] The text in the picture is in French, and reads "Directly from the Espace Logient/Benoit Goudreault Emond/Concerto in Mechanical Keyboard/And Bouts of Anger". This is a loose translation, but it's probably accurate enough. "Espace Logient" is a pun on a showroom in Montreal known as "l'Espace Go" (which is, by the way, quite a nice showroom).

20050216

Humbling Experience

You know how it's customary for software developers to whine about everyone else's code. Too complex. Too convoluted. Ad nauseam.

Well, read your own code.

Today, I was trying to retrofit some local file-reading capabilities into a system that was mostly meant to read stuff off a network request. Since the network-request bits are semi-auto-configured, I wanted the same capabilities for the local file-reading stuff. So I made an interface. This removed some bindings between the system and the application framework. Made it cleaner, more standalone, and so on.

Then, in the bus, it hit me: bad idea. The system is completely dependent on the application framework anyway, because database tables, file names, and so on all follow an implicit convention that cannot be found outside said framework. So all this wonderful isolation just made things more complex for no reason whatsoever. At worst, I should ask callers to supply the appropriate framework object and use it directly; if I need abstraction later, I can always add it--later. Given that it's code for more junior programmers, why make it more complex than it needs to be? It's already a bit complex, with a singleton spawning a query engine, which spawns a stateful query and a result.

In my defence, I got very little sleep yesterday :-)

In other news, I just ordered a bunch of books from Amazon.ca. If you want software-engineering or C++ books, they have killer rebates right now (50% off selected books). Some of the books on sale are not that great (if I see one more "Enterprise Java with xyz" book, I'm going to hurl), but others are the Herb Sutter classics, plus some rather new books on working with legacy code and configuration management. I got myself the Sutter classic I didn't have, and "Working Effectively with Legacy Code", which is something I really need to read RFN. Especially since today I was writing, effectively, legacy code, and it was my fault.

Also ordered a personal finance book, because it's tax season and my new financial adviser seems determined to make me feel inferior. But looking back with a cool head, my gut feeling is that, besides a little neglect (mostly money sitting in an ING account instead of being invested in, say, a dividend fund), I've done pretty well. The adviser's tactic was probably just to sell me a credit line, something I'm not really open to.

Finally, I realized, to my disappointment, that cool template tricks don't really do it for me anymore. I had the chance to get the Template Metaprogramming book at 50% off, and I passed. It looks cool, but I have very dim hopes of ever using it, because:

  1. It tortures compilers, including G++, and
  2. I don't think I'll ever be allowed to use this stuff outside a few toy or hobby projects, because any company I work for will have a very hard time finding people able to understand it.

In many companies, there's a "language guru" and a number of journey(wo)men. I've seen few places with many gurus (though Silanis was one of them at one time), and even then, they're hampered by management's fear that nobody will be able to figure out the code. A sane fear in some ways, but less so than one would think, once you realize that code written by non-gurus tends to be just as obscure as guru code--except not for reasons of technical proficiency!

I also realized today that what I'd really like is to rewrite my current company's codebase in Python. But I'm sure nobody would want that, even though Python is easy, because we hired Java programmers, of course. Everyone's so damn specialized.

Well, that was today's rant. I need sleep. Especially since re-reading my previous lines makes me realize that the experience is having less and less of a humbling effect as I get riled up...

20050208

Whither Moore's law's application to everyday computing?

An interesting article: Where have my cycles gone?

This article asks the question I've asked myself for the longest time.

Nowadays, it's not as bad, because I run Linux on my home computer, so I have a good idea where my cycles have gone. The computer is pretty snappy at the moment (granted, it's an AMD Athlon XP 2000+, but with a mere 256 MB of RAM), despite the resolution I drive it at, the anti-aliasing I've enabled, and the background services I run.

But whenever I use a Windows XP machine, a Java application, or even some Linux desktops (those with GNOME or KDE come to mind... I use XFce, which avoids much of the madness), I really wonder: given Moore's law, how come many tasks appear slower than they have ever been?

The author cites some understandable reasons. Here's my take on reasons I do not understand.

  1. Incorrect algorithms: somewhere, programmers have gotten really sloppy. Trying to sort linked lists with a classical quicksort (hint: don't!), running through data structures many times where a single pass would do, doing all sorts of really stupid things in the name of performance (such as adding a cache to an O(n²) algorithm that could really be done in O(n log n)), and so on and so forth. I'm always surprised (in a bad way) at how many things are done with such sloppy algorithms; if programmers would just think for a few seconds, they would avoid such problems (see the sketch after this list for a classic example). Complexity problems like this are usually trivial on small data sets, but what if your small data set is the set of pixels in the GUI blit routines that get used all the time, hmmm?
  2. Wrongheaded ideas: some OSs and applications have patently bad ideas at their core. Like frequently searching data that is not indexed. Like scattering files all over the disk and expecting the filesystem to have an efficient lookup algorithm tailored to your application. Like opening multiple database transactions where everything should be done in one (bad for performance, but also for data integrity). The list goes on. I see this in commercial software all the time, and I can think of no fundamental reason why system applications would be in a better state, given that they are marketed in nearly the same way as user applications (which, IMHO, is really wrongheaded!). User applications, of course, have all of those problems. Yuck.
  3. Speed/memory tradeoffs: for some reason, everybody has had it drilled into them that you should always spend memory to save cycles. So people read a whole file into memory and cursor through it with pointer operations. So they put stuff in sparse hash tables where a sorted array would do and give comparable performance. So they introduce lots of caches to gain a small 5% in performance. It doesn't work, and here's why. Today's systems are usually starved for I/O time or for memory; the CPU is rarely running at 100% while you work (take a look at a system monitor or the Windows Task Manager during the day; you may be very surprised at what your computer is doing). As people run several programs at once and keep them resident, the problem worsens. Once physical memory is exhausted, the system becomes starved for I/O time as the OS swaps. In that situation, the memory-hungry approach will always be much slower than the "slower" algorithm that uses almost no memory. There's also an interesting effect I've seen in some cases: because of CPU cache effects and the importance of keeping the working set small, heavier memory use can hurt performance even when the OS isn't starved for RAM.
  4. Memory/resource leaks: they're everywhere. Garbage-collected languages are great, but they shouldn't be taught as a first language. People who learned on garbage-collected languages tend to think the garbage collector takes care of all resource allocation. Hate to break it to you, kids: it only takes care of memory, and on top of that, if you keep references to an object for too long (as in caches... that thing again!), it never gets collected. Garbage collection is no excuse to be sloppy about object ownership. See the previous point for why resource leaks eat up time.
  5. Freakin' objects everywhere and the lack of stack allocation: this is mostly a problem for languages like Java, which couples an asinine lack of stack allocation for simple objects with a really slow allocator and garbage collector. Mix in a really high per-object overhead (every object gets a monitor and a vtable, whether it needs them or not!) and you've got a recipe for high temporary memory usage. That would be OK if the Java VM contracted its memory use once in a while. But NOOO! I sometimes think those who wrote the JVM disregarded 30 years of computing, both in VM design and in garbage collection algorithms; I can think of no other explanation for a JVM 1.0 that ridiculous. And I still don't understand how so many of my Python scripts have more predictable performance than many of my Java programs, despite the JIT and the fact that the Python programming model does not lend itself to much optimization.
  6. Buzzword mania: why is it that we need EJB? Do you do transactions across multiple databases? Do you really need to distribute your objects (remember the first law of distributed object design: don't distribute your objects)? Replace EJB and those questions with the buzzword of the month. Many products are built with technologies that don't really fit the problem; they increase complexity and memory use and decrease performance, for no visible gain in capabilities. This is really dumb. See my earlier article on why software projects fail.
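
To illustrate point 1, here's the classic Java version of the sloppy-algorithm trap (a sketch with made-up sizes). The first loop looks like a single pass but is really quadratic:

    import java.util.Iterator;
    import java.util.LinkedList;
    import java.util.List;

    public class AccidentalQuadratic {
        public static void main(String[] args) {
            List list = new LinkedList();
            for (int i = 0; i < 20000; i++) {
                list.add(new Integer(i));
            }

            // Sloppy: get(i) walks the linked list from an end on every
            // call, so this "single loop" is really O(n^2).
            long sum = 0;
            for (int i = 0; i < list.size(); i++) {
                sum += ((Integer) list.get(i)).intValue();
            }

            // Thinking for a few seconds: an Iterator makes it O(n).
            long sum2 = 0;
            for (Iterator it = list.iterator(); it.hasNext();) {
                sum2 += ((Integer) it.next()).intValue();
            }

            System.out.println(sum == sum2); // true, but at very different costs
        }
    }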

Is that it? Probably not. But I think it covers a lot of things. If you're studying CS or Comp. Eng., I really recommend that you do the following:

  1. Study algorithms. Know the main ones. Know which data structures offer which complexity guarantees (see the sketch after this list). This will help you choose the right structure and algorithm for the job.
  2. Pay attention to the more low-level courses. Assembly looks like prehistory, but the principles of how machines work will always remain useful. C and C++ feel like nailing your toes to the desk in an awkward position, but you'll learn to be careful about resource ownership, and that's a valuable skill regardless of the language.
  3. Remain skeptical of those who claim your designs aren't "elegant" enough. There's always a sweet spot, but in any case, a design with less code will almost always be more elegant from a maintainability, understandability, and performance point of view. In my experience, university "elegant" means "complicated": cool-looking, full of design patterns and objects and inheritance. But when 60% of what you typed is syntactic and semantic sugar, you'll end up with a mess sooner or later, and that will make it harder to figure out whether you picked the right algorithm. Don't misunderstand: elegant design does exist, but you'll have to develop your own sense of it. The understanding of elegant design in academic circles varies widely. Be especially wary of teachers who don't code, or instructors who never had to maintain any of their projects. Design sense mostly comes from learning what not to do by having done it and being stuck maintaining it.
  4. Remain humble when you're about to take on a task. You may be smart enough to implement the equivalent of a database by hand, but why take the chance? And even if you are smart enough, keep in mind you don't really have the time anyhow. Solved problems may be fun to solve again, but good commercial-grade code is always developed under time pressure; time you spend on your fun problem is taken away from time you should spend making the overall system design maintainable.
  5. Try to understand what you're doing. Try to understand the libraries you're using, at least in a general way. Otherwise, it's going to be very hard to pick one routine (or even one method overload!) over another.
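
For point 1, the sketch below shows the sort of guarantee-driven choice I mean (pre-generics style; the data is made up):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.TreeMap;

    public class PickTheRightMap {
        public static void main(String[] args) {
            // HashMap: O(1) expected get/put, but no ordering guarantee.
            Map byName = new HashMap();
            byName.put("carol", new Integer(3));

            // TreeMap: O(log n) get/put, but keys stay sorted. Pay the
            // logarithm only if you actually need the ordering.
            TreeMap sorted = new TreeMap();
            sorted.put("carol", new Integer(3));
            sorted.put("alice", new Integer(1));
            System.out.println(sorted.firstKey()); // alice -- the guarantee you paid for
        }
    }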

Well, that's my advice, for what it's worth. I just realized that a lot of it applies to people who already program professionally, and I can think of a few who don't follow it. I try to follow it carefully, and it has served me well so far. I've been programming commercially for nearly 5 years, and as a hobbyist since I was 14, so I like to think I've learned a few things by now. I'm sure there are other things programmers should be careful about, but the ones I've noted in this post are supposedly "obvious", and people still don't do them.

Hence this rant.

Hopefully, if people apply these, Moore's law's promise for everyday computing--faster, more capable computers that do more for their users--will become reality.