23 November 2010

Inspired by a conversation with a few people on twitter I came up with a
mind-map laying out a personal syllabus of learning goals. I've been out
of school for a few months and I have a sinking feeling that I'd better
keep my gears turning or the proverbial *"use it or lose it..."* bit
will completely grab me with its forgetfulness.

One of the branches or areas that I want to keep fresh is algorithm
evaluation. Its also an area of weakness for me. Its been at least two
years since I had a class that focused on BigO notation or on speed,
I've never taken a formal algorithms class, and lastly on a professional
note I've never seen anyone do an analysis on a performance problem
using anything other than capturing run times in a log. From my
experience developers sprinkle log captures throughout their code in
areas they *think* are bottlenecks.

Professionally I haven't seen a discussion/analysis on why certain bits of code are faster or slower other than watching developers do it by trial and error. This next bit is taken from Project Euler. I've done two implementations. Can you see why one might run faster than the other?

**Example**: If we list all the natural numbers below 10 that are
multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is
23. (Problem #1)

**Question:** Find the sum of all the multiples of 3 or 5 below 1000.

*Assumption if we input 16 the multiples are: 3, 5, 6, 9, 10, 12, 15,
15*

def multiple1(limit): count = 0 for i in range(1, limit): if (i % 3) is 0: count += i if (i % 5) is 0: count += i return count

Let's run the first algorithm with a value of 10,000,000.

ordo-grande:Desktop matt$ time ./play.py 1000000 the answer is: 266666333333 real 0m0.423s user 0m0.373s sys 0m0.047s

And now the second algorithm.

def multiple2(limit): count = 0 for i in range(3, limit, 3): count += i for i in range(5, limit, 5): count += i return count

ordo-grande:Desktop matt$ time ./play.py 1000000 the answer is: 266666333333 real 0m0.104s user 0m0.083s sys 0m0.018s

I drew up these two methods on the whiteboard in my team room yesterday along with the problem declaration. Note, I didn't include the run times. I asked my team which method was more efficient and over half the team guessed wrong. I won't spell out the discussion or the evaluation of the two examples, but can you see why one is faster/more efficientthan the other?

Tweetblog comments powered by Disqus