Creating Downloadable CSV (Or Any Other) Files In Browser With JavaScript

Many of us have wanted to create downloadable content straight from the browser.  I remember when, several years back, this was more difficult and the solutions were not clean.  Today it is much simpler and cleaner.  Enter Blob URLs.

In this example, I have an object of arrays.  These arrays may or may not all be the same length.  The csvify function accounts for this by letting the user optionally specify a default value and whether to pad at the beginning or at the end of any array that is shorter than the longest one.

It’s what comes after the csvify function that allows all this to work.  To make this csvified data downloadable, I first create a Blob.  This Blob contains the CSV data and is created with the CSV MIME type.  I then create a URL from that Blob using URL.createObjectURL and set that URL as the href attribute on the download link.

Posted by Chad Dotson in Programming, Software Engineering, Technology, Tips, 0 comments

“On The Job Training” vs “On The Job Learning”

A career in programming consists more of “on the job learning” than “on the job training.”

Training

Training is something someone else teaches you to do.  There may or may not be any real expectation of advancing the concepts covered in the training.  As an analogy, if you work in a sprocket factory, you are not taught about the design of the sprocket or the concepts behind it; you are taught how to make sprockets given direction.  Training is limited; it can only take you so far in a programming career.

Learning

Learning is progressive and continual, and it is a critical necessity for a career in programming.  You should spend lots of time learning concepts, principles, and techniques.  Then you should attempt to apply and further those ideas.  A good programmer is always learning from their experiences and asking, “How can I do this better?”, “Have I clearly communicated my intentions?”, and “Did I do a good job?”  Will you ever find that you have done a bad job or written crappy code?  Most certainly!  I think that realization is a sign of progress.

TL;DR: Be a good programmer; don’t stop learning.

Posted by Chad Dotson in Key Concepts, Programming, Software Engineering, 0 comments

Installing Technical Analysis Library for Python

I’m tinkering with some financial analysis scripts, so when I went looking for useful Python packages, the Technical Analysis Library popped up.  The Python bindings require TA-Lib (Technical Analysis Library), which on OS X is available via Homebrew.  When I originally installed it, I didn’t want a global install, so I’ve got the less preferred local install setup.  With a local install, the following commands are necessary to get the pip package to install correctly.

Now that I’ve brew installed TA Lib and set the new include and library path, I can install the python bindings via pip.

 

Posted by Chad Dotson in Programming, 2 comments

Authorizing a twitter-bot

Last night I ran into a small issue: how do you authorize a Twitter bot to use an application without stubbing together a website and logging in with the bot account?  The answer is this little script using twython:

Source: https://gist.github.com/moonmilk/035917e668872013c1bd#gistcomment-1398946
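
The script itself is not reproduced here, so what follows is only a rough sketch of the PIN-based flow as I understand it, not the code from the linked comment.  It uses twython's get_authentication_tokens and get_authorized_tokens calls; the function name authorize_bot, the read_pin parameter, and passing "oob" as the callback are my own assumptions (if the application has no callback URL configured, the "oob" argument may be unnecessary).

```python
def authorize_bot(app_key, app_secret, read_pin=input):
    """Run a PIN-based OAuth 1 flow and return the bot's access tokens.

    Hypothetical helper: the name and the read_pin hook are illustrative.
    """
    # Imported here so this module loads even where twython is not installed.
    from twython import Twython

    # Step 1: ask Twitter for temporary request tokens.
    # "oob" (out-of-band) is assumed to select the PIN flow.
    twitter = Twython(app_key, app_secret)
    auth = twitter.get_authentication_tokens(callback_url="oob")

    # Step 2: open this URL while logged in as the bot and approve the app.
    print("Visit:", auth["auth_url"])
    pin = read_pin("PIN shown by Twitter: ").strip()

    # Step 3: trade the request tokens plus the PIN for long-lived tokens.
    twitter = Twython(
        app_key, app_secret, auth["oauth_token"], auth["oauth_token_secret"]
    )
    final = twitter.get_authorized_tokens(pin)
    return final["oauth_token"], final["oauth_token_secret"]
```

The two values returned are the ones the bot would then pass to Twython, together with the application key and secret, whenever it posts.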

Posted by Chad Dotson in Programming, Programming Live Blog, 0 comments

Virtualenvwrapper / Updating Tools

If you are using virtualenvwrapper with the default version of Python on your system (OS X in particular), I recommend updating pip and setuptools as part of your postmkvirtualenv hook.  Just add a line that upgrades both, for example pip install --upgrade pip setuptools, to YOURVIRTUALENVDIRECTORY/postmkvirtualenv.

 

Posted by Chad Dotson in Programming Live Blog, Tips, 0 comments

Don’t use parse_requirements in your code

I just ran into trouble while building a setuptools package.  Specifically, I was using pip.req.parse_requirements to process package dependencies from a requirements.txt file and noticed that upgrading pip would break my code.  Upon further investigation, I found out that the authors of pip do not guarantee the API of their internal code.  This is because they officially do not provide a programmatic API to pip (see https://github.com/pypa/pip/issues/2286).  My suggestion: just use readlines.  For a plain list of requirements it does the same job, and it’s dirt simple.
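
A minimal sketch of the readlines approach, assuming a requirements file that holds one requirement per line.  The helper name read_requirements is mine, and skipping comment and option lines is an extra I added; it does not attempt to follow "-r other.txt" includes the way pip does.

```python
def read_requirements(path="requirements.txt"):
    """Return the requirement strings listed in a plain requirements file."""
    with open(path) as handle:
        lines = handle.readlines()

    requirements = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and pip options such as "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements
```

In setup.py the result can then be passed along as install_requires=read_requirements().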

Posted by Chad Dotson in Programming, Programming Live Blog, 0 comments

Thoughts on MVP

We all know that getting to a minimum viable product (MVP) is a race.  It is a race against competition, market need, industry direction, etc.  I came to a realization recently that reaching MVP can also be a race against yourself and how long your technological choices hold out.

For example, let us say that I have chosen a specific UI framework.  I chose it because it satisfied a good portion of my initial requirements out of the box, was well established, and was generally well maintained.  This UI framework saves me a lot of time and money and lets me get something up and going quickly.  Now that I’ve selected a UI framework and done some work, the requirements grow and evolve, and the framework begins showing its age.  Assuming that this happens before I reach MVP, I am left with a bit of a problem: reevaluate and possibly retool the system, or keep moving forward.  Retooling the system will mean a step backward and slow down my time to market, but continuing means incurring more technical debt that will have to be paid down later.  What is the right choice in this situation?  I believe you have to play it by ear, but favor sticking with your choices for as long as possible.

Posted by Chad Dotson in Programming, Software Engineering, Technology, 0 comments

The Python “in” Operator – Theoretical vs Actual Time Complexity

Background

Sometimes we may generate or retrieve a list, set, or even dict when creating a collection of things that we will be testing against.  Theoretically a set, frozenset, or dictionary should be the fastest (and equivalent) forms of storage for this operation.  However, what I’ve found is that on some systems the set is faster and on others the dict is faster.  This difference may be very important if you’re writing real-time or close to real-time software.  So where am I going with this?

Big-O Notation – Advertised Complexity

Python has published the expected time complexities of its collection types.  I’ve copied the ones for the in operator below.  These Big-O numbers are exactly what you would expect, since everything but a list is implemented using a hash table.  It should be noted, however, that the speed of the set, frozenset, and dict can be compromised if the objects stored do not implement a good hashing algorithm.

Type        Average   Worst
list        O(n)
set         O(1)      O(n)
frozenset   O(1)      O(n)
dict        O(1)      O(n)


More: Python Time Complexity
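
To make the point about hashing concrete, here is a small sketch that counts equality checks instead of measuring time.  The Key class and the comparisons_for_missing_key helper are made up for illustration; they are not part of my benchmark code.  On CPython I would expect the count to stay near zero with a sensible hash and to grow with the size of the set when every object returns the same hash, which is the O(n) worst case in the table.

```python
class Key:
    """Wrapper that counts how often equality is checked during lookups."""

    eq_calls = 0  # shared counter, reset before each measurement

    def __init__(self, value, constant_hash=False):
        self.value = value
        self.constant_hash = constant_hash

    def __hash__(self):
        # A constant hash forces every key into the same collision chain.
        return 1 if self.constant_hash else hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


def comparisons_for_missing_key(size, constant_hash):
    """Count the __eq__ calls made by one failed `in` test against a set."""
    members = {Key(i, constant_hash) for i in range(size)}
    Key.eq_calls = 0  # ignore comparisons made while building the set
    found = Key(-1, constant_hash) in members
    if found:
        raise RuntimeError("the probe key should not be in the set")
    return Key.eq_calls


if __name__ == "__main__":
    print("good hash:", comparisons_for_missing_key(1000, constant_hash=False))
    print("bad hash: ", comparisons_for_missing_key(1000, constant_hash=True))
```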

What I Found

Going back to my statement above, I found that on certain machines Python sets were faster and on others Python dicts were faster.  I could not directly reproduce the case where sets are faster, so I tried to replicate it with a RHEL 7.1 machine on AWS.  Given that I was at an optimal case for the collection (no collisions), I would have thought that set, frozenset, and dict would at least perform on par with each other.  I was surprised to find that, with the default Python interpreter, my tests showed that dicts are actually faster.  So I reran the tests with the corresponding version of PyPy and found that the expected results hold true: set and frozenset operate at virtually the same speed as dicts.  I suspect the primary reason for the differences is the compiler used to build the Python binaries.  It was interesting, however, that PyPy performed as expected on all systems.

The Data

I ran the benchmarks on OSX, Ubuntu 14.04, and RHEL 7.1 (courtesy of the AWS Free Tier).  I opted not to record the RHEL results, though, as they are similar to the Ubuntu results.

Benchmark          Time    Fastest   % Difference

OSX, Python
  list             5.47              150.641
  set              0.85                9.877
  frozenset        0.85                9.877
  dict             0.77    0.77        0.000

OSX, PyPy
  list             0.34               89.362
  set              0.13    0.13        0.000
  frozenset        0.13    0.13        0.000
  dict             0.13    0.13        0.000

Ubuntu, Python
  list             6.07              123.733
  set              1.44                0.697
  frozenset        1.49                4.110
  dict             1.43    1.43        0.000

Ubuntu, PyPy
  list             0.78              102.913
  set              0.25    0.25        0.000
  frozenset        0.25    0.25        0.000
  dict             0.26                3.922

Recommendations

If you need to create a collection to test for existence, like in this example, favor set, frozenset, or dict, whichever makes sense for your situation.  If you are working with a list you’re given and you want to speed up the system, you can consider changing the list to a set.
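
As a rough illustration of that recommendation, the sketch below times the in operator against a list and against the set, frozenset, and dict built from it.  This is not the benchmark code from my repository; the function names, the sizes, and the choice of a missing value as the probe are assumptions made for the example.  Keep in mind that converting a list to a set is itself O(n), so it pays off when the same collection is tested many times.

```python
import timeit


def time_membership(container, probe, repeats=1000):
    """Return the seconds taken to run `probe in container` `repeats` times."""
    return timeit.timeit(lambda: probe in container, number=repeats)


def benchmark(size=10000, repeats=1000):
    """Time a lookup of a missing value for each collection type."""
    items = list(range(size))
    containers = {
        "list": items,
        "set": set(items),  # one-off O(n) conversion from the list
        "frozenset": frozenset(items),
        "dict": dict.fromkeys(items),
    }
    probe = -1  # not present, so the list has to scan every element
    return {
        name: time_membership(container, probe, repeats)
        for name, container in containers.items()
    }


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name:10s} {seconds:.6f}")
```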

The Code

I’ve uploaded all the code to GitHub.  It is available here: https://github.com/chaddotson/container-membership-benchmark/.

Posted by Chad Dotson in Doing Things Better, Programming, Software Engineering, Tips, 0 comments