Sunday, June 07, 2020

Volume of a unit hypersphere

Inspired by Foundations of Data Science, the area of the unit circle is

$\int_{x=-1}^{x=1}\int_{y=-\sqrt{1-x^2}}^{y=\sqrt{1-x^2}}dydx$

Extending this to a sphere gives

$\int_{x=-1}^{x=1}\int_{y=-\sqrt{1-x^2}}^{y=\sqrt{1-x^2}}\int_{z=-\sqrt{1-x^2-y^2}}^{z=\sqrt{1-x^2-y^2}}dzdydx$

Implemented in Maxima, this is

(%i17) integrate(integrate(integrate(1, z, -sqrt(1-x^2-y^2), sqrt(1-x^2-y^2)),
        y, -sqrt(1-x^2), sqrt(1-x^2)), x, -1, 1);

Maxima goes on to ask:
"Is "(x-1)*(x+1)" positive or negative?"
Since $x$ lies in $(-1, 1)$, this value is definitely negative, and voila, we get $4\pi/3$ as the volume of the sphere. Higher dimensions lead to interesting results.
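The same integrals can be cross-checked in Python with SymPy (a sketch of mine, not from the post, assuming SymPy is installed):

```python
from sympy import symbols, sqrt, integrate, pi

x, y = symbols('x y')

# area of the unit circle as the double integral above
area = integrate(integrate(1, (y, -sqrt(1 - x**2), sqrt(1 - x**2))),
                 (x, -1, 1))
print(area)  # pi
```

SymPy evaluates the inner integral to $2\sqrt{1-x^2}$ and the outer one to $\pi$, without ever asking about the sign of $(x-1)(x+1)$.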

For 4 dimensions, we use

integrate(integrate(integrate(integrate(1, x4, -sqrt(1-x1^2-x2^2-x3^2), 
          sqrt(1-x1^2-x2^2-x3^2)), x3, -sqrt(1-x1^2-x2^2), 
          sqrt(1-x1^2-x2^2)), x2, -sqrt(1-x1^2), sqrt(1-x1^2)), 
          x1, -1, 1);

and get the volume as $\pi^2/2$, which matches what the book predicts.
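These values agree with the closed form $V_n = \pi^{n/2}/\Gamma(n/2+1)$ for the volume of the unit $n$-ball, which is easy to check numerically (my sketch, not from the book):

```python
from math import pi, gamma

def unit_ball_volume(n):
    # V(n) = pi^(n/2) / Gamma(n/2 + 1)
    return pi ** (n / 2) / gamma(n / 2 + 1)

print(unit_ball_volume(2))  # ~ pi        (area of the circle)
print(unit_ball_volume(3))  # ~ 4*pi/3    (volume of the sphere)
print(unit_ball_volume(4))  # ~ pi**2/2
```

Interestingly, this formula peaks around $n = 5$ and then tends to zero as $n$ grows, one of the surprises the book discusses.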

Saturday, May 30, 2020

Passive reaction: Computer Science has changed/is changing

In their book Foundations of Data Science, the authors state in the preface that:

"Courses in theoretical computer science covered finite automata, regular expressions, context-free languages, and computability. In the 1970’s, the study of algorithms was added as an important component of theory. The emphasis was on making computers useful. Today, a fundamental change is taking place and the focus is more on a wealth of applications. There are many reasons for this change. The merging of computing and communications has played an important role. The enhanced ability to observe, collect, and store data in the natural sciences, in commerce, and in other fields calls for a change in our understanding of data and how to handle it in the modern setting. The emergence of the web and social networks as central aspects of daily life presents both opportunities and challenges for theory.

While traditional areas of computer science remain highly important, increasingly researchers of the future will be involved with using computers to understand and extract usable information from massive data arising in applications, not just how to make computers useful on specific well-defined problems. With this in mind we have written this book to cover the theory we expect to be useful in the next 40 years, just as an understanding of automata theory, algorithms, and related topics gave students an advantage in the last 40 years. One of the major changes is an increase in emphasis on probability, statistics, and numerical methods.
...

Modern data in diverse fields such as information processing, search, and machine learning is often advantageously represented as vectors with a large number of components"

Sounds like I am a student of the past, one who focused on the traditional topics of automata theory and algorithms. I have yet to catch up with the emerging topics and opportunities the authors mention. Time to learn new things; it's going to be tricky to do by myself, but I am glad to see someone thinking of the next 40 years.


Sunday, November 03, 2019

Frustration with the state of math in schools

I remember growing up enjoying math in class. The learning method used at school was not very exploratory, but my dad (a genius, in my opinion) made it a lot of fun. He was also very protective about what I learnt: I used to look at my elder brother's books and pick up calculus in grade 8, but my dad suggested that the wrong background would lead me down a path of losing interest. I guess he was right, but by 8th grade I was able to calculate square and cube roots using the Newton-Raphson method.

NOTE: I had learnt the formulae for squares and cubes without necessarily mastering how we arrived at the derivative; it was an algorithm/formula for me to follow.

My frustration is with the math teaching of today, partly with the teachers at the extra classes I enrolled my son in, where he was asked to remember formulae without the underlying concepts. I see why those topics are important, but it seems like:


  • Teachers who want to clarify concepts and teach nicely are too slow in catching up with topics to be covered
  • Teachers catching up with topics and keeping good pace, don't spend enough time on concepts

Short of explaining everything to my son and writing a book or blog posts, it's going to be hard to find a balance. I have some hope in Khan Academy, but my son is not too interested in it at the moment.

If someone has pointers to great graphics/programs that can help, please do. Otherwise, I'll try and cover ratio and proportions for my first post.

Sunday, July 09, 2017

Dynamic programming for the binomial coefficient

More fun things, this time with some visualisation of what happens when memoisation is used and what happens when it isn't. I don't think these are the most efficient algorithms.


#
# dynamic programming for the binomial coefficient
#
def dynamic_bc(n, k, indent=0):
    #
    # return (n k) using dynamic programming (memoized in arr)
    # base cases are k == 0, k == 1 and n == k:
    # (n 0) = 1, (n 1) = n and (n n) = 1
    print("  " * indent + "%d:%d" % (n, k))
    if k == 0:
        arr[n][k] = 1
        return 1
    if k == 1:
        arr[n][k] = n
        return n
    if n == k:
        arr[n][k] = 1
        return 1
    # Pascal's rule: (n k) = (n-1 k) + (n-1 k-1),
    # recursing only when the value is not already cached
    if arr[n-1][k-1] == 0:
        arr[n-1][k-1] = dynamic_bc(n-1, k-1, indent+1)
    if arr[n-1][k] == 0:
        arr[n-1][k] = dynamic_bc(n-1, k, indent+1)
    return arr[n-1][k] + arr[n-1][k-1]

def bc(n, k, indent=0):
    #
    # plain recursion, no memoization
    # base cases are k == 0, k == 1 and n == k:
    # (n 0) = 1, (n 1) = n and (n n) = 1
    print("  " * indent + "%d:%d" % (n, k))
    if k == 0:
        return 1
    if k == 1:
        return n
    if n == k:
        return 1
    # Pascal's rule: (n k) = (n-1 k) + (n-1 k-1)
    return bc(n-1, k, indent+1) + bc(n-1, k-1, indent+1)

number = 6
select = 3
arr = [[0 for x in range(0, number)] for x in range(0, number)]

print(bc(number, select))
print(dynamic_bc(number, select))
The output for the first call, with full recursion, is:

6:3
  5:3
    4:3
      3:3
      3:2
        2:2
        2:1
    4:2
      3:2
        2:2
        2:1
      3:1
  5:2
    4:2
      3:2
        2:2
        2:1
      3:1
    4:1
20

The output for the second call, with memoization, is:

6:3
  5:2
    4:1
    4:2
      3:1
      3:2
        2:1
        2:2
  5:3
    4:3
      3:3
20
Python supports memoization via functools (wraps, lru_cache, etc.). I am using a wrapper (the decorator pattern). Using the following pattern makes the program so transparent:
#
# dynamic programming for the binomial coefficient
#
from functools import wraps

def dynamic_bc2(func):
    cache = {}
    @wraps(func)
    def wrap(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrap

@dynamic_bc2
def bc2(n, k, indent=0):
    #
    # return (n k) using dynamic programming
    # base cases are k == 0, k == 1 and n == k:
    # (n 0) = 1, (n 1) = n and (n n) = 1
    print("  " * indent + "%d:%d" % (n, k))
    if k == 0:
        return 1
    if k == 1:
        return n
    if n == k:
        return 1
    # Pascal's rule: (n k) = (n-1 k) + (n-1 k-1)
    return bc2(n-1, k-1, indent+1) + bc2(n-1, k, indent+1)

number = 6
select = 3

print(bc2(number, select))
Comparing the outputs shows the effect of functools. The output with functools is:
6:3
  5:2
    4:1
    4:2
      3:1
      3:2
        2:1
        2:2
  5:3
    4:3
      3:3
20
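For comparison (my addition, not in the original post), functools.lru_cache gives the same memoization without a hand-written decorator:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def bc3(n, k):
    # base cases: (n 0) = (n n) = 1, (n 1) = n
    if k == 0 or n == k:
        return 1
    if k == 1:
        return n
    # Pascal's rule: (n k) = (n-1 k) + (n-1 k-1)
    return bc3(n - 1, k - 1) + bc3(n - 1, k)

print(bc3(6, 3))  # 20
```

One subtlety with the decorator approach above: since indent is part of the argument tuple, the same (n, k) reached at a different depth would miss the cache; keying only on (n, k) as here avoids that.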

Thursday, July 06, 2017

Iterative combinations algorithm

The algorithm is quite straightforward; see the Python code below.


#
# Do iterative combinations generation
# From Knuth's observation, we know for (6 3) that
# [0, 1, 2]
# [0, 1, 3]
# ..
# [3, 4, 5]
# Basically we can see a set of nested for loops; if we call the
# columns c1, c2 and c3 then
#
# for c3 = 2 to n-1
#  for c2 = 1 to c3-1
#   for c1 = 0 to c2-1
#
# The loops give us what we need.
# My implementation is based on an index (i) and max (n)
# which are used to determine the next value to generate.
#
def iterative_combination(n, k):
    a = [i for i in range(0, n)]
    index = k - 1
    done = False
    while not done:
        print(a[0:k])
        while a[index] == n - (k - index):  # boundary for that index
            index = index - 1
            if index < 0:
                done = True
                break
        if done:
            break
        a[index] = a[index] + 1
        while index < k - 1:
            index = index + 1
            if a[index] == (n - (k - index)):  # initialize the next neighbour
                a[index] = a[index - 1] + 1
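The same scheme works as a generator, yielding tuples instead of printing and returning cleanly once the first index underflows; this variant (mine, not from the post) can be checked directly against itertools.combinations:

```python
from itertools import combinations

def iter_comb(n, k):
    # same index-walking scheme as above, but yielding tuples
    a = list(range(n))
    index = k - 1
    while True:
        yield tuple(a[0:k])
        while a[index] == n - (k - index):   # boundary for that index
            index -= 1
            if index < 0:
                return
        a[index] += 1
        while index < k - 1:                 # initialize the next neighbour
            index += 1
            if a[index] == n - (k - index):
                a[index] = a[index - 1] + 1

print(list(iter_comb(4, 2)) == list(combinations(range(4), 2)))  # True
```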

Random CS picture

Here is a random picture from a computer science topic.

[Image: a counter-example for greedy selection on the set cover problem]

The picture is a counter-example of why greedy selection does not work optimally for the set cover problem. If the picture seems not so well done, it's because my Asymptote skills are lacking :)

You can probably guess what I'm reading by connecting set cover with combinatorial generation of combinations.

Wednesday, July 05, 2017

Combinations revisited

I've been trying to relearn some of the combinatorics I used to know. I redid a nice recursive algorithm (from early university days); in Python it looks beautiful. A simple implementation is below.



#
# I'm inspired by two algorithms I saw, both recursive.
#
# Example:
# ((1, 2, 3, 4), 2) - means select 2 out of 4
# should produce (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)
#
# The idea from a recursive perspective is this:
# we pick an index - say 1, then we pick the remaining (k-1)
# elements to go with it
#

def rec_comb(a, i, j, k, n):
    #
    # array a built up to index i,
    # selecting k at a time out of a maximum
    # length of n
    #
    if i == k:
        print(a[0:k])
        return
    for l in range(j, n):
        a[i] = l
        rec_comb(a, i+1, l+1, k, n)
    return

def combination(n, k):
    #
    # n (numbers from 0..n-1), chosen k at a time
    #
    a = [0 for i in range(0, n)]
    rec_comb(a, 0, 0, k, n)
The sample output for ${6 \choose 3}$ is:

[0, 1, 2]
[0, 1, 3]
[0, 1, 4]
[0, 1, 5]
[0, 2, 3]
[0, 2, 4]
[0, 2, 5]
[0, 3, 4]
[0, 3, 5]
[0, 4, 5]
[1, 2, 3]
[1, 2, 4]
[1, 2, 5]
[1, 3, 4]
[1, 3, 5]
[1, 4, 5]
[2, 3, 4]
[2, 3, 5]
[2, 4, 5]
[3, 4, 5]
The output lends itself to a simple and nice iterative algorithm, which I'll follow up on.
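The same recursion can also be expressed as a Python generator (a sketch of mine, not the original code), which makes it easy to verify against itertools.combinations:

```python
from itertools import combinations

def rec_comb_gen(j, k, n, prefix=()):
    # choose k more elements from {j, ..., n-1}, accumulated in prefix
    if k == 0:
        yield prefix
        return
    # leave room for the remaining k-1 picks
    for l in range(j, n - k + 1):
        yield from rec_comb_gen(l + 1, k - 1, n, prefix + (l,))

# agrees with itertools.combinations, in the same order
print(list(rec_comb_gen(0, 3, 6)) == list(combinations(range(6), 3)))  # True
```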

The interesting follow-up to combinatorial generation algorithms is ranking and unranking. At this point, I think we need an $O(n)$ way to identify whether two combinations are the same -- given in any order of elements. For example, to compare $\{1, 2, 3\}, \{1, 3, 2\}, \{2, 1, 3\}, \{3, 1, 2\}, \{2, 3, 1\}, \{3, 2, 1\}$ we need to identify them and associate the same rank with each. I think for a subset of size $k$, the cost of ranking is $\Theta(k)$, but I need to see if there is a better way to do it.
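One concrete way to do this (my sketch, not from the post) is the combinatorial number system: sort the subset, then rank $c_0 < c_1 < \dots < c_{k-1}$ as $\sum_i \binom{c_i}{i+1}$. The sort makes it $O(k \log k)$ rather than $\Theta(k)$, which is exactly the "is there a better way" question:

```python
from math import comb

def rank_combination(subset):
    # Sort first so any ordering of the same subset maps to the
    # same rank, then apply the combinatorial number system:
    # rank = sum_i C(c_i, i+1) for sorted c_0 < c_1 < ...
    c = sorted(subset)
    return sum(comb(ci, i + 1) for i, ci in enumerate(c))

# every ordering of {1, 2, 3} gets the same rank
orderings = [(1, 2, 3), (1, 3, 2), (2, 1, 3), (3, 1, 2), (2, 3, 1), (3, 2, 1)]
print({rank_combination(p) for p in orderings})  # {3}
```

The ranks are a bijection from the k-subsets of $\{0, \dots, n-1\}$ onto $\{0, \dots, \binom{n}{k}-1\}$, so they also serve directly for unranking.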


Friday, July 01, 2016

Thoughts on Agile programming - team size

I've been thinking about this for a very long time: what would be an ideal agile team size -- a core team size, to be more precise? I've been told in the past that it's 5, but I think it's 4, and here is how I think the team should be organized.

Roles: Developer1, Developer2, Reviewer/Test1, Reviewer/Test2

Here the Reviewer/Testers and Developers should exchange roles if required and be flexible to work both ways for maximum efficiency. Note that the role of Reviewer/Tester could easily be done by another Developer.

Here is how I would schedule their interactions

Day | Room1                                                     | Room2
1   | Developer1 and Reviewer/Test1                             | Developer2 and Reviewer/Test2
2   | Developer2, Developer1 and Reviewer/Test1                 | Reviewer/Test2
3   | Developer2, Developer1 and Reviewer/Test2                 | Reviewer/Test1
4   | Reviewer/Test2, Developer1 and Reviewer/Test1             | Developer2
5   | Reviewer/Test2, Developer2 and Reviewer/Test1             | Developer1
6   | Developer1, Developer2, Reviewer/Test1 and Reviewer/Test2 | -

Teams will think I am crazy for suggesting working 6 days a week; this problem can be resolved by splitting day 6 into some hours each day. For example, 1.5 hours could be set aside each day for the developers/reviewer-testers to interact, and day 6 could be eliminated altogether.

In my scheme, each person gets a day (or some time) to themselves each week. All of them collaborate and work together frequently. In this scheme, independence is as important as the interdependence between members.

What do you think?

Live patching on PPC64LE

Michael Ellerman has a great blog post with all the technical details. There was a lot of work, brainstorming and discussion involved. Check out http://mpe.github.io/posts/2016/05/23/kernel-live-patching-for-ppc64le/

