Scripting language choice for initial performance
I have a small lightweight application that is used as part of a larger solution. Currently it is written in C but I am looking to rewrite it using a cross-platform scripting language. The solution needs to run on Windows, Linux, Solaris, AIX and HP-UX.
The existing C application works fine but I want to have a single script I can maintain for all platforms. At the same time, I do not want to lose a lot of performance but am willing to lose some.
Startup cost of the script is very important. This script can be called anywhere from every minute to many times per second. As a consequence, keeping it's memory and startup time low are important.
So basically I'm looking for the best scripting languages that is:
- Cross platform.
- Capable of XML parsing and HTTP Posts.
- Low memory and low startup time.
Possible choices include but are not limited to: bash/ksh + curl, Perl, Python and Ruby. What would you recommend for this type of a scenario?
Because of your requirement for fast startup time and a calling frequency greater than 1Hz I'd recommend either staying with C and figuring out how to make it portable (not always as easy as a few ifdefs) or exploring the possibility of turning it into a service daemon that is always running. Of course this depends on how
Python can have lower startup times if you compile the module and run the .pyc file, but it is still generally considered slow. Perl, in my experience, in the fastest of the scripting languages so you might have good luck with a perl daemon.
You could also look at cross platform frameworks like gtk, wxWidgets and Qt. While they are targeted at GUIs they do have low level cross platform data types and network libraries that could make the job of using a fast C based application easier.
Lua is a scripting language that meets your criteria. It's certainly the fastest and lowest memory scripting language available.
"called anywhere from every minute to many times per second. As a consequence, keeping it's memory and startup time low are important."
This doesn't sound like a script to me at all.
This sounds like a server handling requests that arrive from every minute to several times a second.
If it's a server, handling requests, start-up time doesn't mean as much as responsiveness. In which case, Python might work out well, and still keep performance up.
Rather than restarting, you're just processing another request. You get to keep as much state as you need to optimize performance.
When written properly, C should be platform independant and would only need a recompile for those different platforms. You might have to jump through some #ifdef hoops for the headers (not all systems use the same headers), but most normal (non-win32 API) calls are very portable. For web access (which I presume you need as you mention bash+curl), you could take a look at libcurl, it's available for all the platforms you mentioned, and shouldn't be that hard to work with.
With execution time and memory cost in mind, I doubt you could go any faster than properly written C with any scripting language as you would lose at least some time on interpreting the script...
I concur with Lua: it is super-portable, it has XML libraries, either native or by binding C libraries like Expat, it has a good socket library (LuaSocket) plus, for complex stuff, some cURL bindings, and is well known for being very lightweight (often embedded in low memory devices), very fast (one of the fastest scripting languages), and powerful. And very easy to code!
It is coded in pure Ansi C, and lot of people claim it has one of the best C biding API (calling C routines from Lua, calling Lua code from C...).
If Low memory and low startup time are truly important you might want to consider doing the work to keep the C code cross platform, however I have found this is rarely necessary.
Personally I would use Ruby or Python for this type of job, they both make it very easy to make clear understandable code that others can maintain (or you can maintain after not looking at it for 6 months). If you have the control to do so I would also suggest getting the latest version of the interpreter, as both Ruby and Python have made notable improvements around performance recently.
It is a bit of a personal thing. Programming Ruby makes me happy, C code does not (nor bash scripting for anything non-trivial).
Python is good. I would also check out The Computer Languages Benchmarks Game website:
It might be worth spending a bit of time understanding the benchmarks (including numbers for startup times and memory usage). Lots of languages are compared such as Perl, Python, Lua and Ruby. You can also compare these languages against benchmarks in C.
As others have suggested, daemonizing your script might be a good idea; that would reduce the startup time to virtually zero. Either have a small C wrapper that connects to your daemon and transmits the request back and forth, or have the daemon handle requests directly.
It's not clear if this is intended to handle HTTP requests; if so, Perl has a good HTTP server module, bindings to several different C-based XML parsers, and blazing fast string support. (If you don't want to daemonize, it has a good, full-featured CGI module; if you have full control over the server it's running on, you could also use mod_perl to implement your script as an Apache handler.) Ruby's strings are a little slower, but there are some really good backgrounding tools available for it. I'm not as familiar with Python, I'm afraid, so I can't really make any recommendations about it.
In general, though, I don't think you're as startup-time-constrained as you think you are. If the script is really being called several times a second, any decent interpreter on any decent operating system will be cached in memory, as will the source code of your script and its modules. Result: the startup times won't be as bad as you might think.
Dagny:~ brent$ time perl -MCGI -e0 real 0m0.610s user 0m0.036s sys 0m0.022s Dagny:~ brent$ time perl -MCGI -e0 real 0m0.026s user 0m0.020s sys 0m0.006s
(The parameters to the Perl interpreter load the rather large CGI module and then execute the line of code '0;'.)
I agree with others in that you should probably try to make this a more portable C app instead of porting it over to something else since any scripting language is going to introduce significant overhead from a startup perspective, have a much larger memory footprint, and will probably be much slower.
In my experience, Python is the most efficient of the three, followed by Perl and then Ruby with the difference between Perl and Ruby being particularly large in certain areas. If you really want to try porting this to a scripting language, I would put together a prototype in the language you are most comfortable with and see if it comes close to your requirements. If you don't have a preference, start with Python as it is easy to learn and use and if it is too slow with Python, Perl and Ruby probably won't be able to do any better.
Remember that if you choose Python, you can also extend it in C if the performance isn't great. Heck, you could probably even use some of the code you have right now. Just recompile it and wrap it using pyrex.
You can also do this fairly easily in Ruby, and in Perl (albeit with some more difficulty). Don't ask me about ways to do this though.
Can you instead have it be a long-running process and answer http or rpc requests? This would satisfy the latency requirements in almost any scenario, but I don't know if that would break your memory footprint constraints.
At first sight, it's sounds like over engineering, as a rule of thumb I suggest fixing only when things are broken.
You have an already working application. Apparently you want to want to call the feature provided from few more several sources. It looks like the description of a service to me (maybe easier to maintain).
Finally you also mentioned that this is part of a larger solution, then you may want to reuse the language, facilities of the larger solutions. From the description you gave (xml+http) it seems quite an usual application that can be written in any generalist language (maybe a web container in java?).
more details may trigger more ideas :)
Port your app to Ruby. If your app is too slow, profile it and rewrite the those parts in C.