
Node.js vs Python vs PyPy – A Simple Performance Comparison – Updated

n_queens_graph
n_queens_table

Some History

This is a follow-up to my original post: Node.js vs Python vs PyPy – A Simple Performance Comparison.  This article corrects a discrepancy caused by a slight difference in the JavaScript implementation, which skewed the Node.js results.

The Algorithm

As stated in the previous article, I’ve attempted to implement the same single-thread, brute force, recursive algorithm in many different languages.  There is nothing overly special about this algorithm and I’ve made no attempts to optimize it.
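For readers who have not seen the puzzle, here is a minimal Python sketch of a single-threaded, recursive, unoptimized N-Queens counter.  It is an illustration written for this article, not a copy of the code in the repository; the function names (count_n_queens, is_safe, place) are made up for the example, and the repository version may differ in its details.

```python
def count_n_queens(n):
    """Count every way to place n non-attacking queens on an n x n board."""

    def is_safe(queens, row, col):
        # queens[r] holds the column of the queen already placed in row r.
        for r in range(row):
            c = queens[r]
            # Same column, or same diagonal (column gap equals row gap).
            if c == col or abs(c - col) == row - r:
                return False
        return True

    def place(queens, row):
        # Every row has a queen: one complete solution found.
        if row == n:
            return 1
        total = 0
        # Try every column in this row and recurse into the next row.
        for col in range(n):
            if is_safe(queens, row, col):
                queens[row] = col
                total += place(queens, row + 1)
        return total

    return place([0] * n, 0)


if __name__ == "__main__":
    for size in range(4, 9):
        print(size, count_n_queens(size))
```

The work grows very quickly with the board size, which is what makes the puzzle a convenient, if simplistic, benchmark.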

The Findings

Node is fast, very fast.  It easily outperforms any of the other implementations I’ve included in the puzzle’s repository.  As you can see in the included charts, the performance difference between Node.js and out-of-the-box Python is very significant, and the difference between Node.js and PyPy, while less pronounced, is still significant.

Special Notes

  • I’ve placed my source on GitHub at the following URL: https://github.com/chaddotson/puzzles.  It now contains functional N-Queens puzzle implementations in JavaScript, Python, Lua, Scala, and Ruby.  There is also a version in Rust, but that needs to be updated to the latest syntax before it can be run again.
  • This is just one type of algorithm.  The best solution might, and probably does, change depending on what type of application you are researching.  For webserver performance, Node.js is slightly better than PyPy running Tornado.
  • This algorithm is a simple brute force algorithm.  There are many faster and better ones out there.
  • See the original article, Node.js vs Python vs PyPy – A Simple Performance Comparison, for more details on memory performance.
Posted by Chad Dotson in Misc, Programming, Software Engineering, Technology, 5 comments

Node.js vs Python vs PyPy – A Simple Performance Comparison

NQueensGraph

IMPORTANT NOTE: The NodeJS algorithm had a slight discrepancy in it.  See this article for a correction to the performance comparison section of this article.

The Algorithm

Yesterday, I decided to try translating my algorithm for calculating N-Queens to JavaScript.  I’ve implemented the same single-thread, brute force, recursive algorithm in many different languages, with the biggest difference being the syntax of the language.  Once I completed the JavaScript implementation, I ran the program with the latest version of Node.js.

The Findings

I knew Node was fast, but it still surprised me.  As you can see in the included charts, the performance difference between Node.js and out-of-the-box Python is pretty significant.  It’s not until the algorithm’s complexity and recursion depth hit certain limits that Node.js’s performance starts to falter.

Node.js and CPython – What’s The Difference?

You might ask what is behind this performance difference.  The answer is actually pretty simple.  It all boils down to how the code is being executed.  Node.js uses the V8 JavaScript Engine (Wikipedia | Google), which was written by Google and is part of the Chrome browser.  V8 includes a just-in-time compiler that compiles the JavaScript to machine code before execution and then continuously optimizes the compiled code.  The default Python implementation (CPython) is a bytecode interpreter, meaning that it doesn’t execute the Python source directly.  Instead, it first compiles the source to an intermediate bytecode, which is then interpreted at runtime.
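To make the bytecode step concrete, here is a small sketch that uses the standard library’s dis module to show what CPython compiles a function into.  The function add_one is just a made-up example, and the exact instructions printed vary between CPython versions, so no particular output is shown.

```python
import dis


def add_one(x):
    # A trivial function whose compiled form we want to inspect.
    return x + 1


def instruction_names(func):
    """Return the opcode names CPython generated for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    # The compiled bytecode is stored on the function's code object as raw bytes.
    print(type(add_one.__code__.co_code))
    # dis.dis prints a human-readable listing of that bytecode.
    dis.dis(add_one)
```

The interpreter loop walks these instructions one at a time, which is the overhead a just-in-time compiler avoids by emitting machine code.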

Ways To Get Better Performance

If you want to use Python, you can overcome the differences between Node.js and vanilla Python by using PyPy, an alternative implementation of Python that includes a just-in-time compiler.  For the algorithm I wrote, you can see a pretty good performance boost over Node.js when using PyPy.

Special Notes

  • I’ve placed my source on GitHub at the following url: https://github.com/chaddotson/puzzles
  • This is just one type of algorithm.  The best solution might, and probably does, change depending on what type of application you are researching.  For webserver performance, Node.js is slightly better than PyPy running Tornado.
  • This algorithm is a simple brute force algorithm.  There are many faster and better ones out there.
  • At a board size of 15, Node.js could no longer run the algorithm due to its maximum recursion limit.
memory_usage
memory_usage_chart

Edit – A Follow-Up

The original focus of this article was sheer performance, but I’ve received a question regarding the memory footprint of the three methods.  I think that is a very good and valid question.  So, I reran the tests to capture the peak memory utilization of each.  For this test I used “/usr/bin/time -l” to capture the maximum resident set size.  While this isn’t exactly the peak amount of memory utilized by the algorithm, it is sufficiently close to report on.
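If you would rather collect the same kind of number from a script, the sketch below is one possible approach on Unix-like systems using Python’s standard resource module.  It is not how the measurements in this article were taken; the helper name peak_rss_of_command is made up for the example.  Two caveats: ru_maxrss is reported in kilobytes on Linux but in bytes on macOS, and RUSAGE_CHILDREN covers every child process that has been waited for so far, so run one command per measuring process.

```python
import resource
import subprocess
import sys


def peak_rss_of_command(argv):
    """Run argv to completion, then return the maximum resident set size
    recorded for this process's waited-for children (units are OS-dependent)."""
    subprocess.run(argv, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Example: measure a tiny Python child process that builds a large list.
    command = [sys.executable, "-c", "data = list(range(1000000))"]
    print(peak_rss_of_command(command))
```

Like the time utility, this reports the resident set size of the whole process, interpreter included, not just the memory the algorithm itself needs.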

New Findings

Upon rerunning the tests to capture memory utilization, I found that, for the most part, memory utilization is the inverse of performance: the faster implementations use more memory.  Higher memory utilization isn’t really unexpected, if you think about it.  Essentially, the JIT is sacrificing memory for performance.  In most cases, this isn’t really that bad.  Using a JIT is just a cheap way of boosting the performance of code written in an interpreted language.  The boost in performance, the speed with which the code was written, and its maintainability outweigh memory utilization concerns in many cases.

The Oddity

As you can see, I’ve included a chart covering all the solutions for boards 8×8 to 14×14.  During most increments in board size, the memory utilization seems to increase exponentially; however, when we hit the 14×14 board size we see all the cases level off at relatively the same memory utilization of around 300 MB.  At this time, I really don’t have a good answer for this.  I could certainly speculate, but I’d rather not until I know more.

 

Posted by Chad Dotson in Programming, Software Engineering, Technology, 28 comments

Knowing When You’ve Written Crappy Code

Note: This article was kicked off by one I read over at LosTechies.

We write lots of code, and statistically speaking, some of it is what we’d deem “crappy.”  If you’re doing things right and progressing in your career and your understanding, your definition of “crappy code” should change over time.  This is a very important concept for a good Software Engineer; it’s one of the ways we get better.  We recognize our past coding mistakes and work to better them.

Dangers of Crappy Code:

  • It could hide technical debt.  “I don’t really know how or why this works, but here it is. Done.”
  • You’ve not taken the time to make what works, right.  This potentially leads to bloated code, duplication, and an overall poor product quality.
  • Is it robust?
  • It is, potentially, not very reusable.
  • Can I hand this code off to someone else and they understand it?

Preventing Crappy Code:

The prospect of writing crappy code should not prevent you from getting a project working.  However, it is very important to make it as right as possible before committing the code changes to the repository.

  • Write self-documenting code.  Some developers may scoff at this statement, but self-documenting code is very possible in just about any language.  Remember, comments are your failure to communicate.
  • Remember to refactor, refactor, and refactor.
  • Ask for the opinion of a peer.  This should be someone you consider qualified enough to give an opinion.
  • Use “TODO” comments so your thought process is not lost.  They also serve as a reminder that you must still make something right.  This doesn’t prevent crappy code from making it into a project, but it documents its existence.
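As a small illustration of the self-documenting point above, compare the two Python functions below.  Both are invented for this example (the login rule and the names chk, can_log_in, and MAX_FAILED_LOGINS are assumptions, not code from a real project).  They behave identically, but only the first needs a comment to explain itself.

```python
# Harder to read: the names say nothing, so a comment has to carry the intent.
def chk(a, b):
    # Return True if the user can still log in.
    return a < 3 and not b


# Self-documenting: the names and the constant carry the intent themselves.
MAX_FAILED_LOGINS = 3


def can_log_in(failed_login_count, is_account_locked):
    return failed_login_count < MAX_FAILED_LOGINS and not is_account_locked


if __name__ == "__main__":
    print(can_log_in(failed_login_count=1, is_account_locked=False))
```

The second version also reads clearly at the call site, which is where most readers will meet it.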

What To Do When You Find Crappy Code:

For the sake of this topic, let’s assume you run across some crappy code 2 years into maintenance of a software product.

  • Identify why it is “crappy.”
  • Does it work or has it unknowingly introduced bugs into the system?
  • Perhaps leave a NOTE comment in the code, especially if it can add insight into the function of the crappy code.
  • Only change the code if it is within the scope of your current task.  Remember that it has worked for 2 years and changing it now could potentially introduce errors.  If you do end up changing the code, attempt to make it right before you finish.  Also, make sure to update your unit tests.

Are you Ready?  A Simple Test

Review a code base that you wrote 2 years ago.  Did you find “crappy code?”  The answer should almost certainly be a “Yes.”

 

Posted by Chad Dotson in Doing Things Better, Key Concepts, Programming, Software Engineering, 5 comments

Presentation Is Everything

PRESENTATION IS EVERYTHING

There is something that I’ve learned over the past couple of years: presentation is everything. It is quite possibly the most important element in all we do.

Presentation from the aspect of leadership can mean the difference between a shared vision and unengaged worker bees. Presentation from the aspect of a product (in my case, a software product) means the difference between phenomenal success and utter failure.

In short, no matter how good or revolutionary your product is, if it looks bad, it will fail. For a physical product example, look no further than the current mobile phone market. Look at two products: one you like and one you hate. What are the specs of each one? Are they comparable, or does one have an advantage over the other? Now, for argument’s sake, let’s say the “cooler” one has a better physical appearance and a flashier user interface, while the less “cool” one has a plain user interface but better specs and capabilities. Which one would you buy? I’m betting (and society proves) that the “cooler” one with the flashier interface wins out more often than the other device. I’ve got no facts in front of me to prove this directly, only experience and my own beliefs.

FIXING OUR SOFTWARE

So how do we fix our software? Well here are some design concepts to keep in mind.

  1. The API and/or the user interface for software must be intuitive.
  2. The software features must be discoverable. This means that users can learn to use the software by using the software. There is no need for an involved manual to pick up the software and begin using it.
  3. The software must be easy to use. This is an ambiguous and loaded statement, I know, but nevertheless it must be true. Just think about the software you are writing from the user’s perspective.

These are some of my most important concerns for a software product, because if a user cannot easily pick up your software and start using it immediately, then you will have problems with adoption within your targeted user base. As with the mobile phone example above, if you have two pieces of software equivalent in terms of capabilities and specs, but one is presented a lot better and is more user friendly, which would you choose?

Going Forward

What we’ve learned here can be summed up in the following two statements. Software must be intuitive, discoverable, and easy to use. Also, presentation and appearance are quite possibly the most important part of software.

Posted by Chad Dotson in Doing Things Better, Featured, Key Concepts, Software Engineering, 0 comments