A while back I had some thoughts on communication. If you’ve ever played World of Tanks Blitz, you know that it’s basically one team of tanks against another. Given the pick-up, fast-paced nature of the game, communication is minimal at best (sometimes limited to a single “<<<<<<<<<” or “>>>>>>>>>” indicating which direction to take the offense). I found that teams that could coordinate with minimal communication, play their tank roles (scouts, mediums, heavies, and destroyers), and move fast could achieve overwhelming victories. Something similar is probably true in an agile/teamwork environment: know your stuff, know your role, take opportunities, work together, succeed.

Have you ever encountered a situation where it would be useful to save whatever is written to the console across browser sessions? If you have, you know there are not many solutions. I recently decided that localStorage would be a good foundation for one.

logsaver.js

logsaver.js is a lightweight, simple log preservation script. It uses the flexibility of JavaScript to replace the default console.log function with a customized one. Along with writing messages to the original console.log, the custom console.log also writes them to the specified localStorage key.

  • To start saving console.log messages, call startSavingLog().
  • To stop saving console.log messages, call stopSavingLog().
  • To clear saved console.log messages, call clearSavedLog().
  • To completely remove the localStorage key used by logsaver, call removeSavedLog().

For more information on options and how to use them, check out the README.

While you can copy the source below, I strongly recommend grabbing the latest from GitHub here: logsaver.js.

Notes:

  • It doesn’t handle objects well: they appear as [object Object] in the localStorage key value. This is because localStorage only stores strings.
  • It is possible to exceed the space allocated to you in localStorage. For desktop browsers, Chrome and Firefox have a 10 MB limit, while Safari and IE both have a 5 MB limit. These numbers differ for mobile browsers. More details. In this event, logsaver.js will stop trying to save the log and display additional error messages in the console.

v0.1.0 Source

With the exploration of space and the eventual colonization of Mars, humanity needs to establish a method for improving internet connectivity for people on the surface.

One method of achieving this is to place one or more satellites in high orbit around the planet. These would use interplanetary internet (delay-tolerant) technology and protocols to service requests as well as actively mirror content for primary sites located on Earth. The goal of this strategy is to reduce the latency of user requests from 8-48 minutes down to at most a few seconds for popular content.
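As a rough sanity check on that latency range, the request/response delay can be estimated from the speed of light alone. This is only a back-of-the-envelope sketch: the two distances are approximate values I am assuming for closest approach and for Mars on the far side of the Sun, and the function name is made up for illustration.

```python
# Back-of-the-envelope light-travel delay between Earth and Mars.
# Both distances are approximate, assumed values (not figures from this post).
SPEED_OF_LIGHT_KM_S = 299_792.458
CLOSEST_KM = 54.6e6     # assumed distance near closest approach
FARTHEST_KM = 401e6     # assumed distance with Mars beyond the Sun


def round_trip_minutes(distance_km):
    """Return the request + response light delay in minutes."""
    one_way_seconds = distance_km / SPEED_OF_LIGHT_KM_S
    return 2 * one_way_seconds / 60


print("best case:  %.1f minutes" % round_trip_minutes(CLOSEST_KM))
print("worst case: %.1f minutes" % round_trip_minutes(FARTHEST_KM))
```

With these assumed distances the round trip works out to roughly 6 to 45 minutes, the same ballpark as the range quoted above, and that is before any protocol overhead or relay hops.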

What is needed for a single node:

  • High bandwidth communications equipment similar to what would have been on the Mars Telecommunications Orbiter.
  • A cluster of servers in a spacecraft the size of a supply module. Something similar to Microsoft’s Project Natick would be about right for a first generation. Redundant capacity would be built in to cover eventual server failure.
  • A heat management system.
  • A nuclear power generation system.

Now we just need NASA, Microsoft, SpaceX, Amazon (w/Blue Origin) AWS – Mars, or Google to make it happen!

Have you encountered the following scenario?

You are trying to solve a problem (or helping solve a problem) and know or at least think you know the solution.  You are in the middle of implementing it when someone else looks at it and says, “why don’t you do it this way, isn’t this way easier/better?”  Taking a step back, you realize that the question not only has merit but is a better and much more obvious solution; you can’t believe you missed it.

What happened?

I think it’s because you were too close to the problem and had developed a very narrow focus. That narrow focus prevented you from seeing the better solution. Perhaps this is even a variation of functional fixedness, in that we’ve latched onto an idea of how to solve a problem and our minds may not easily see alternatives.

What can we do?

  • Think about the broad (or product) level goals regularly.
  • Entertain questions and/or suggestions from others.
  • Ask: “Is this the best way?”
  • Ask: “Is this the practical way?”
  • Don’t overthink the problem.
  • Get it working then evaluate the solution and/or do a code review!

The Python logging module offers a wide variety of logging options and handlers. One thing missing from the documentation is guidance on when to use each level.

A quick foreword

You really should familiarize yourself with the logging package, including how to create new loggers (I find creating them by module very useful). There are many ways to configure logging; I tend to like dictConfig from logging.config (but start off with basicConfig from logging).

A Word on Optimal Setups

I prefer to set up my logging with each module having its own logger. This allows me to configure logging levels at a package and/or module level. I typically do the following in each module to create a logger.
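The original listing is not shown here. A minimal sketch of the usual per-module pattern, assuming nothing beyond the standard library, looks like this:

```python
import logging

# One logger per module, named after the module's import path
# (for example "foo.bar" for the bar module inside the foo package).
logger = logging.getLogger(__name__)

logger.debug("logger %s created", logger.name)
```

Because logger names are dotted paths, a logger named foo.bar inherits its settings from foo unless it is configured explicitly, which is what makes package-level and module-level configuration possible.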

Assuming my package structure consists of the following:

foo (package)
    core (module)
    bar (module)

We can configure varying levels of logging for each element, as seen in the following snippet from a dictConfig.

In this example, the root ('') logger, which covers every logger not configured by another entry, reports INFO level and up messages. With the exception of the bar module, the foo package only reports WARNING level and up messages. The bar module is set to the more verbose DEBUG level, to show information needed for debugging.
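The configuration snippet itself is not shown here. A dictConfig matching that description might look like the sketch below; the formatter, the handler name, and the format string are my own assumptions, and only the three logger levels come from the text.

```python
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        # Assumed format; adjust to taste.
        "simple": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "simple"},
    },
    "loggers": {
        # Root logger: anything not configured below reports INFO and up.
        "": {"handlers": ["console"], "level": "INFO"},
        # The foo package as a whole reports WARNING and up...
        "foo": {"level": "WARNING"},
        # ...except foo.bar, which is switched to verbose DEBUG output.
        "foo.bar": {"level": "DEBUG"},
    },
}

logging.config.dictConfig(LOGGING_CONFIG)

for name in ("foo.core", "foo.bar", "some.other.module"):
    level = logging.getLogger(name).getEffectiveLevel()
    print(name, logging.getLevelName(level))
```

Only the root logger has a handler attached; the foo loggers rely on propagation, so their records still reach the console without being emitted twice.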

Selecting A Log Message Level

Out of the box, there are six default logging levels recognized by the logging module; most are self-explanatory. I’ll just make some notes about usage. (From here on out, I’ll refer to my logging instance as logger.)

For general status messages, use logger.info (INFO). For errors, use either logger.critical (CRITICAL) or logger.error (ERROR). For exceptions, call logger.exception (ERROR) from inside the except block; it automatically includes the exception’s stack trace in the log. When you want verbose debugging information, use logger.debug (DEBUG).
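To make those choices concrete, here is a small sketch. The parse_port function and its messages are hypothetical, invented only to show one call per level.

```python
import logging

logger = logging.getLogger(__name__)


def parse_port(text):
    """Hypothetical helper: turn a string into a port number, or None."""
    # Verbose detail that is only useful while debugging.
    logger.debug("parsing port from %r", text)
    try:
        port = int(text)
    except ValueError:
        # Logged at ERROR level, with the traceback attached automatically.
        logger.exception("could not parse port %r", text)
        return None
    # General status message.
    logger.info("using port %d", port)
    return port


parse_port("8080")
parse_port("not-a-number")
```

Note that logger.exception is only meaningful inside an except block, since that is where the active exception information comes from.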

In Closing

  • Use the logging module instead of print statements.
  • Always use logger.exception for logging exceptions.
  • Favor logger.debug for verbose log statements.
  • Favor logger.info for most other log statements (with the exception of errors).
  • Don’t forget that each of the logging functions uses C-style formatting.

Background

Sometimes we may generate or retrieve a list, set, or even dict when creating a collection of things that we will be testing against. Theoretically a set, frozenset, or dictionary should be the fastest (and equivalent) forms of storage for this operation. However, what I’ve found is that on some systems the set is faster and on others the dict is faster. This difference may be very important if you’re writing real-time or close to real-time software. So where am I going with this?

Big-O Notation – Advertised Complexity

Python has published the expected time complexities of its collection types. I’ve copied the ones for the in operator below. These Big-O numbers are exactly what you would expect, since everything but a list is implemented using a hash table. It should be noted, however, that the speed of set, frozenset, and dict can be compromised if the stored objects do not implement a good hash function.

Type        Average   Worst
list        O(n)
set         O(1)      O(n)
frozenset   O(1)      O(n)
dict        O(1)      O(n)


More: Python Time Complexity
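A quick way to see these complexities in practice is to time the in operator against each container type. This is a minimal sketch using timeit, not the benchmark code from this post; the container size, the repeat count, and the helper name are arbitrary assumptions.

```python
import timeit


def time_membership(container, needle, repeats=200):
    """Time `needle in container`, repeated `repeats` times, in seconds."""
    return timeit.timeit(lambda: needle in container, number=repeats)


values = list(range(10_000))
containers = {
    "list": values,
    "set": set(values),
    "frozenset": frozenset(values),
    "dict": dict.fromkeys(values),
}

# Look up the last element: the worst case for the list's linear scan.
needle = values[-1]
timings = {name: time_membership(c, needle) for name, c in containers.items()}

for name, seconds in timings.items():
    print("%-10s %.6f s" % (name, seconds))
```

The absolute numbers depend heavily on the machine and interpreter, which is exactly the effect discussed in the rest of this post.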

What I Found

Going back to my statement above, I found that on certain machines Python sets were faster and on others Python dicts were faster. I could not directly replicate the case where sets were faster, so I tried to replicate it on a RHEL 7.1 machine on AWS. Given that I was testing an optimal case for the collections (no collisions), I would have thought that set, frozenset, and dict would at least perform on par with each other. I was surprised to find that, with the default Python interpreter, my tests showed dicts to be faster. So I reran the tests with the corresponding version of PyPy and found that the expected results held: set and frozenset operated at virtually the same speed as dict. I suspect the primary reason for the differences is the compiler used to build the Python binaries. It was interesting, however, that PyPy performed as expected on all systems.

The Data

I ran the benchmarks on OSX, Ubuntu 14.04, and RHEL 7.1 (courtesy of AWS Free Tier), though I opted not to record the RHEL results as they are similar to the Ubuntu results.

Type          Benchmark   Fastest   % Difference

OSX, Python
  list        5.47                  150.641
  set         0.85                  9.877
  frozenset   0.85                  9.877
  dict        0.77        0.77      0.000

OSX, PyPy
  list        0.34                  89.362
  set         0.13        0.13      0.000
  frozenset   0.13        0.13      0.000
  dict        0.13        0.13      0.000

Ubuntu, Python
  list        6.07                  123.733
  set         1.44                  0.697
  frozenset   1.49                  4.110
  dict        1.43        1.43      0.000

Ubuntu, PyPy
  list        0.78                  102.913
  set         0.25        0.25      0.000
  frozenset   0.25        0.25      0.000
  dict        0.26                  3.922

Recommendations

If you need to create a collection to test for existence, as in this example, favor set, frozenset, or dict, whichever makes sense for your situation. If you are working with a list you’re given and you want to speed up the system, consider converting the list to a set.
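As a small illustration of that last recommendation, the sketch below converts a list once and then tests membership against the resulting frozenset. The names and values are hypothetical.

```python
# Hypothetical list handed to us by some other part of the system.
allowed_ids = [3, 8, 15, 42]

# Convert once, up front; this costs O(n) but every later lookup is O(1)
# on average instead of a linear scan.
allowed_lookup = frozenset(allowed_ids)


def is_allowed(item_id):
    """Return True if item_id is one of the allowed ids."""
    return item_id in allowed_lookup


print(is_allowed(15))
print(is_allowed(16))
```

The conversion only pays off when the collection is checked many times, and it requires the elements to be hashable.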

The Code

I’ve uploaded all the code to GitHub. It is available here: https://github.com/chaddotson/container-membership-benchmark/.

The Scenario

This scenario illustrates two possible mistakes people make when using the Python logging module. Analyze the following code and look for issues.
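The original listing is not shown here. As a stand-in, a hypothetical function with the same kind of logging call might look like this; load_setting and its message are invented for illustration and are not the original code.

```python
import logging

logger = logging.getLogger(__name__)


def load_setting(settings, key):
    """Hypothetical helper: look up a setting and log a failure."""
    try:
        return settings[key]
    except KeyError as error:
        # The logging call under discussion.
        logger.error("Failed to load {0}: {1}".format(key, error))
        return None


load_setting({}, "timeout")
```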

So what is wrong with that?

First and foremost, the code fails to use the existing logger.exception function that could, and in most cases should, be used when logging exceptions. That function automatically adds the exception info to the log, meaning that you get the stack trace! Secondly, this sample uses the str.format method to format the log message before handing it to the logging library, when the logging library can in fact handle string formatting itself via old-style format specifiers.

Fixing it

If I were to ignore the first problem, the following code is what I should have written. The benefit here is that the formatting is only executed if the log message is actually going to be captured, unlike in the first version.
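The deferred formatting can be demonstrated with a small sketch. The Counted class and the logger name are hypothetical; the class simply counts how often it is converted to a string.

```python
import io
import logging

logger = logging.getLogger("lazy_formatting_demo")  # hypothetical name
logger.setLevel(logging.WARNING)
logger.propagate = False
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))


class Counted:
    """Counts how many times it is turned into a string."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive value"


eager, lazy = Counted(), Counted()

# Eager: str.format runs first, then the DEBUG message is thrown away.
logger.debug("value: {0}".format(eager))
# Lazy: the DEBUG message is filtered out, so the argument is never formatted.
logger.debug("value: %s", lazy)
print(eager.calls, lazy.calls)

# The same lazy call at ERROR level passes the filter and is formatted once.
logger.error("value: %s", lazy)
print(lazy.calls, repr(stream.getvalue()))
```

Passing the arguments separately leaves the decision about whether to format at all to the logging library.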

Taking both errors into account, we should have used the exception function instead of the error function on the logger, as well as the built-in formatting. Given both of these, the code becomes the following.
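The original listing is not shown here. A sketch of the combined fix, using a hypothetical load_setting helper and an in-memory stream so the output can be inspected, might look like this:

```python
import io
import logging

logger = logging.getLogger("exception_demo")  # hypothetical name
logger.propagate = False
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))


def load_setting(settings, key):
    """Hypothetical helper: look up a setting and log a failure."""
    try:
        return settings[key]
    except KeyError:
        # exception() logs at ERROR level and appends the traceback;
        # the %s argument is only formatted if the record is emitted.
        logger.exception("Failed to load %s", key)
        return None


load_setting({}, "timeout")
output = stream.getvalue()
print(output)
```

The captured output contains the message followed by the full traceback of the KeyError, with no explicit formatting of the exception needed.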

More Data

This scenario led me to quantify the cost in execution time. The first set of data covers logging alone; the second set times the two string formatting options on their own. In the logging data, calling format up front is a good bit slower than the built-in “old-style” formatting in the logging package. While it will add up, it isn’t a world-ending difference on a small scale. Again, the time difference is largely due to the fact that no formatting takes place unless the message has a high enough level. In the formatting-only data, the “old style” is marginally slower than the format style.
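For anyone who wants to repeat the formatting-only comparison, a minimal timeit sketch is shown below. It is not the benchmark used for the data in this post; the message, the values, and the repeat count are assumptions, and the results will vary by machine and interpreter.

```python
import timeit

# Kept in variables so the interpreter cannot pre-compute the result.
name, count = "cart", 3


def old_style():
    return "%s has %d items" % (name, count)


def new_style():
    return "{0} has {1} items".format(name, count)


timings = {
    "old style (%)": timeit.timeit(old_style, number=100_000),
    "new style (format)": timeit.timeit(new_style, number=100_000),
}

for label, seconds in timings.items():
    print("%-20s %.4f s" % (label, seconds))
```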

Comparing old style to new style string formatting

 

In the end

You should use the functionality the API gives you. In most cases, and this is the case with Python, it has been engineered to work, be fast, and be maintainable. For more information on the logging module, check the Python docs for 2.7 or 3.5.

So, you want to add your PyCharm project files to a VCS, but you constantly deal with problems because each of your team members has a different name/location for their project interpreter. There is a rather simple solution to this problem. Basically, at the project level, PyCharm only cares about the name of the interpreter, not the location. Follow these instructions to give your interpreter a name that is consistent across developers, and then use that name in your project files to fix the issue.

    1. Open PyCharm
    2. Select File >> Settings (or Configure >> Preferences if you don’t have a project open).
    3. Search for the Project Interpreter setting (this will also work if you don’t use the project interpreter in your run configurations).
    4. Hit the little gear icon next to the project interpreter, then select More.
    5. Select the interpreter from the virtual environment of your choice and hit the edit button.
    6. Now give it a unique name (preferably identifiable and related to your project).
    7. Now, select that unique name as your default project interpreter or in your run configurations.
    8. Commit the project files.
    9. Make sure each team member does the same.

Now whenever you update or commit, you won’t constantly see changes associated with people selecting their interpreter.

If you want to know more, I definitely recommend reading a thread over at JetBrains’ Support.