Shareware Beach

Wednesday, 26 October 2005

A New Era of Computing

Filed under: Software Development — Jan @ 10:31

With the limited release of the EditPad Pro 6 beta, JGsoft has entered a new era of computing. Previously, all our products were designed for computers with a single processing core (CPU). Almost all present Windows applications and utilities fall into this category, because multi-processor systems used to be very expensive. Putting multiple CPUs into a single computer only made–and still makes–sense for high-end workstations and servers.

The new PC I got a few months ago has a Pentium 820 CPU. I got this system because it was the cheapest dual-core system locally available at the time. In fact, the Pentium 820 was cheaper than Intel’s fastest single-core Pentiums.

Soon, run-of-the-mill computers and even notebooks will come equipped with multi-core CPUs. Intel has announced quad-core server CPUs for 2006, and quad-core mainstream CPUs for 2007. If Moore’s law continues to hold true–the number of transistors that can be put cost-effectively into a single CPU doubles every 18 months–we’ll likely see the number of CPU cores increase further in the future.

The significance of this is not that we’ll have faster computers. In fact, when running the applications I have, my 2.8 GHz Pentium 820 isn’t much faster than the 2.4 GHz Pentium 4 I bought three years ago. The reason is that applications designed with a single active thread in mind don’t take advantage of the extra CPU core.

I’ve made a screen recording of EditPad Pro 5 opening a huge file. The example is somewhat artificial, since few people put a million lines of source code into a single file. But it makes the situation easier to observe in the Windows Task Manager. As you’ll see in the movie, EditPad Pro 5 “goes stupid” the whole time while it scans the file for line breaks and applies syntax coloring. All the while, CPU usage is pegged at 50%. Both cores share the load, but it’s only 50% since there’s only one active thread. The other running threads–the Windows Task Manager and other OS threads–don’t use any measurable amount of CPU time. But they do cause EditPad Pro’s thread to be switched between the cores.
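
The 50% figure is simple arithmetic. Here is a minimal sketch in Python, assuming Task Manager reports the share of total CPU capacity in use and that each busy thread can saturate at most one core. The function name is made up for illustration.

```python
def task_manager_cpu_percent(busy_threads, cores):
    """Share of total CPU capacity used by fully busy threads."""
    # A thread can only run on one core at a time, so extra cores stay idle.
    return 100.0 * min(busy_threads, cores) / cores


print(task_manager_cpu_percent(1, 2))   # 50.0  (one busy thread, dual core)
print(task_manager_cpu_percent(2, 2))   # 100.0 (two busy threads, dual core)
print(task_manager_cpu_percent(1, 4))   # 25.0  (one busy thread, quad core)
```

On a quad-core machine the same single-threaded application would show up at just 25%.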

The significance of affordable multi-core systems is that consumers will become increasingly intolerant of applications that “go stupid”, particularly when the application isn’t even using 100% of the CPU. An application can always use one CPU core for foreground GUI handling, while lengthy operations run in the background.

EditPad Pro 6 will use up to 4 threads. The main thread handles all GUI and editing tasks, while 3 background threads take care of finding line breaks and word wrapping, applying syntax coloring, and building the file navigation tree.
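
The general shape of such a design can be sketched as follows. This is an illustration in Python rather than the Delphi that EditPad Pro is written in, and it is not EditPad Pro's actual code. The names `run_in_background`, `main_loop` and `handle_gui_events` are hypothetical, and error handling is left out. Note also that the standard CPython interpreter does not run Python code on several cores at once, so this shows the structure rather than the speedup.

```python
import queue
import threading


def run_in_background(tasks):
    """Start one worker thread per task; return the queue they report to."""
    results = queue.Queue()

    def worker(name, func):
        # The lengthy work happens here, off the main thread.
        results.put((name, func()))

    for name, func in tasks.items():
        threading.Thread(target=worker, args=(name, func), daemon=True).start()
    return results


def main_loop(tasks, handle_gui_events):
    """Keep servicing the GUI while collecting results as they arrive."""
    results = run_in_background(tasks)
    finished = {}
    while len(finished) < len(tasks):
        handle_gui_events()          # the foreground thread never waits for long
        try:
            name, value = results.get(timeout=0.01)
        except queue.Empty:
            continue                 # nothing ready yet, go back to the GUI
        finished[name] = value
    return finished


text = "program demo;\nbegin\nend."
done = main_loop(
    {
        "line_breaks": lambda: text.count("\n"),
        "word_count": lambda: len(text.split()),
    },
    handle_gui_events=lambda: None,  # stand-in for real message handling
)
print(done["line_breaks"], done["word_count"])
```

The main loop only ever waits a few milliseconds for a result, so it stays free to repaint and respond to input however long the workers take.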

Even this modest amount of threading makes a dramatic difference, as you can see in the recording of EditPad Pro 6 opening a huge file. The file appears instantly, complete with syntax coloring. The only time you’d have to wait is when you want to jump to the end of the file and EditPad Pro 6 hasn’t finished scanning for line breaks yet. If syntax coloring isn’t done yet, the end of the file is simply displayed without it. (Since Pascal supports multi-line comments, the whole file has to be colored.) The movie clearly shows that the line break scanning and syntax coloring threads each tie up one CPU core. The foreground thread doesn’t use much CPU, since I’m simply scrolling through the file.

Making your application multi-threaded isn’t enough. My first attempt at separating line break scanning into its own thread worked wonderfully on a single core CPU. But on a dual core system, CPU usage was still pegged at 50% while scrolling and line break scanning at the same time. The reason is that while my benchmark app had two threads, they weren’t running simultaneously.

Whenever the foreground thread had to repaint the screen, it would block the line scanning thread completely. Doing that is fine on a single-core computer, but kills performance on a multi-core system. The final solution uses critical sections so that either thread is blocked only for the brief moment when line break information is updated.
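
The idea of holding the lock only while the shared data changes can be sketched like this. It is a Python illustration under stated assumptions, not the code EditPad Pro uses: `LineIndex` and `scan_line_breaks` are hypothetical names, a `threading.Lock` stands in for a Windows critical section, and only `\n` is treated as a line break. In standard CPython the two threads still won't run on two cores at once, so the sketch demonstrates the locking pattern only.

```python
import threading


class LineIndex:
    """Line-start offsets shared by a scanner thread and a painting thread."""

    def __init__(self):
        self._lock = threading.Lock()   # plays the role of the critical section
        self._starts = [0]              # offset at which each known line begins
        self._done = False

    def publish(self, new_starts, done):
        # Held only while the shared list is updated, never while scanning.
        with self._lock:
            self._starts.extend(new_starts)
            self._done = done

    def snapshot(self, first_line, count):
        # Held only long enough to copy the offsets a repaint needs.
        with self._lock:
            return self._starts[first_line:first_line + count], self._done


def scan_line_breaks(data, index, chunk_size=64 * 1024):
    """Find line starts chunk by chunk, taking the lock once per chunk."""
    for base in range(0, len(data), chunk_size):
        chunk = data[base:base + chunk_size]
        found = []                              # private to this thread
        pos = chunk.find(b"\n")
        while pos != -1:
            found.append(base + pos + 1)        # next line starts after "\n"
            pos = chunk.find(b"\n", pos + 1)
        index.publish(found, done=False)
    index.publish([], done=True)


data = b"first\nsecond\nthird"
index = LineIndex()
scanner = threading.Thread(target=scan_line_breaks, args=(data, index))
scanner.start()
scanner.join()
print(index.snapshot(0, 10))   # ([0, 6, 13], True)
```

The scanner does all its searching on a private list and touches the shared one for a single `extend` per chunk, so a repaint calling `snapshot` never waits for a scan to finish.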

In the end, what matters is perceived speed. How many seconds of CPU time your app clocks up in the task manager doesn’t matter. How many seconds your customers spend waiting for your software is what it’s all about. The ideal is obviously not to make them wait at all.

In EditPad Pro 5, performance of the syntax coloring mechanism was vital. If it was slow, users would spend time waiting for it. Therefore, EditPad Pro 5 came with a lot of built-in coloring schemes for various popular file formats and programming languages. I coded these schemes directly into the Delphi source code, so they’d run as fast as possible. By contrast, user-contributed schemes used a system based on regular expressions, which precluded many of the assumptions and optimizations the built-in schemes could make. It also made it impossible for users to customize the built-in schemes other than choosing the color palette.

In EditPad Pro 6, syntax coloring performance is irrelevant, as long as it’s reasonable. If a scheme is too complex or a file too large for the syntax coloring to keep up, the coloring simply disappears temporarily until it’s done. As a result, all syntax coloring schemes included with EditPad Pro 6 now use the regex-based custom scheme system, making them flexible and easy to customize. Though EditPad Pro 6 needs significantly more CPU time for syntax coloring, it feels much faster since you don’t have to wait for it, and it’s still fast enough to keep up while you’re scrolling. Only quickly hitting Ctrl+End on a large file or using a ridiculously complicated scheme (some users love ‘em) might make it disappear for a while.
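
To make the trade-off concrete, here is a rough Python sketch of a regex-driven scheme plus the "paint plain text if coloring isn't there yet" fallback. Everything in it is an assumption made for illustration: the `SCHEME` format, the three Pascal-like rules, and the function names are invented and say nothing about EditPad Pro's real scheme format.

```python
import re

# Hypothetical scheme: an ordered list of (token name, pattern) pairs.
# Because it is plain data, a user could edit it without touching the editor.
SCHEME = [
    ("comment", re.compile(r"\{.*?\}", re.DOTALL)),   # { ... } may span lines
    ("string", re.compile(r"'[^'\n]*'")),
    ("keyword", re.compile(r"\b(?:begin|end|var)\b", re.IGNORECASE)),
]


def color_spans(text):
    """Return (start, end, token) spans, left to right; first matching rule wins."""
    spans, pos = [], 0
    while pos < len(text):
        for token, pattern in SCHEME:
            m = pattern.match(text, pos)
            if m and m.end() > pos:
                spans.append((m.start(), m.end(), token))
                pos = m.end()
                break
        else:
            pos += 1            # no rule matched here: leave the character plain
    return spans


def spans_for_repaint(spans, colored_upto, start, end):
    """Spans overlapping [start, end), or [] if coloring has not got that far."""
    if colored_upto < end:
        return []               # paint plain text now, color it on a later repaint
    return [s for s in spans if s[0] < end and s[1] > start]


source = "var s = 'hi'; { note }"
spans = color_spans(source)
print(spans)
print(spans_for_repaint(spans, colored_upto=10, start=0, end=len(source)))
```

Trying every rule at every position is far slower than a hand-coded scanner, which is the cost the paragraph above accepts. The fallback in `spans_for_repaint` is what keeps that cost from turning into waiting time.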

When benchmarking your application, think about human time rather than CPU time, and parallelize those tasks that users shouldn’t have to wait for.

2 Comments

  1. This is excellent. One of the reasons I moved to a commercial text editor instead of the free Notepad2 was because of the poor large file performance of the Scintilla colorization engine.

    It’s too bad you chose the dual-core P4 over the Athlon 64, though. The A64 has a far better dual core architecture and is substantially faster, particularly for software developers.

    Also, I appreciate the incremental search in EditPad Pro 6 very much!

    Comment by Jeff Atwood — Tuesday, 1 November 2005 @ 16:07

  2. As I said, I got the cheapest dual core CPU available locally (Bangkok) at the time. Borland Delphi compiles extremely fast, so I have little need for a bleeding edge CPU.

    I’m glad you like the new search facility in EditPad Pro 6.

    Comment by Jan Goyvaerts — Tuesday, 1 November 2005 @ 22:14
