The tax calculation challenge
People seems to be more interested in answering the question than the code that solved it. Actually, people seemed to be more interested in outdoing one another in creating answers to that. What I found most interesting is that a large percentage of the answers (both in the blog post and in the interviews) got a lot of that wrong.
So here is the question in full. The following table is the current tax rates in Israel:
Tax Rate | |
Up to 5,070 | 10% |
5,071 up to 8,660 | 14% |
8,661 up to 14,070 | 23% |
14,071 up to 21,240 | 30% |
21,241 up to 40,230 | 33% |
Higher than 40,230 | 45% |
Here are some example answers:
- 5,000 –> 500
- 5,800 –> 609.2
- 9,000 –> 1087.8
- 15,000 –> 2532.9
- 50,000 –> 15,068.1
This problem is a bit tricky because the tax rate doesn’t apply to the whole sum, only to the part that is within the current rate.
Comments
Thomas, Did it got cut off?
Oops, my code was truncated because of the '<' sign... here it is on pastebin http://pastebin.com/NCmjVJE0
Hmm, everything on this page is monospaced now...
Ayende, I hereby file a bug. :-)
What's the tax rate for 5,070.5?
Wow, tax rates in Isreal are freakin' high!
https://gist.github.com/1237076 tem minutes
Jonty, Do the math.
ten*
At least the submitted answers use decimal instead of int..
@Daniel Lang: Stay away from Belgium then..
11 240 to 18 730 euro 40 % 18 730 to 34 330 euro 45 %
Stay away from Sweden too... :) Our taxes starts at 30 % and the sky is the limit...
decimal GetTax(decimal amount) { var brackets = new[] { new TaxBracket(0, 5070, 0.1m), new TaxBracket(5070, 8660, 0.14m), new TaxBracket(8660, 14070, 0.23m), new TaxBracket(14070, 21240, 0.3m), new TaxBracket(21240, 40230, 0.33m), new TaxBracket(40230, decimal.MaxValue, 0.45m) };
return brackets.Sum(r => r.GetTaxInBracket(amount)); }
public class TaxBracket { public TaxBracket(decimal lower, decimal upper, decimal rate) { Lower = lower; Upper = upper; Rate = rate; }
public readonly decimal Lower; public readonly decimal Upper; public readonly decimal Rate;
public decimal GetTaxInBracket(int amount) { if (Lower > amount) return 0;
} }
I think Geert wins in terms of readability...
You should stay away from germany. You would need a hole army of developers to calculate the tax rate because the laws are so freakin complex... :)
The trick is: don't calculate the value for each rate, just calculate the value for the bigger rate and apply a fixed discount.
For example: 5800 -> 5800 * 0,14 - 202,84 (202,84 is the fixed discount) 9000 -> 9000 * 0,23 - 982,33
I know I didn't use any fancy linq tricks, which I'd probably do to make it more elegant, but this seems to pass the tests listed above: http://pastebin.com/zJADQGdx
I'm quite pleased, actually. I can never normally do this kind of thing; my brain isn't geared up for it.
Well, my solution... Basically the same as all the others...
https://gist.github.com/1237185
It's pretty funny how similar the ayende blog readers are writing their code. here's mine: https://gist.github.com/1237210
Here's a Scala version http://mysticpaste.com/view/10034
Here's mine : http://pastebin.com/A5fzvFqt
Hmm, after reading the comments mine is similar but I didn't need the Lower bounds recorded. Ugliest bit was updating the upper boundary taxable amounts based on their previous amounts but it works (incl. income of 0 and decimal.max (I wish:)
https://gist.github.com/1237303
Spent 20 min on it, mostly trying to find a cleaner way to generate/update the tax brackets.
Hi, How about this Python program: taxRates=[ { 'min':40230 , 'rate':0.45 }, { 'min':21240 , 'rate':0.33 }, { 'min':14070 , 'rate':0.3 }, { 'min':8660 , 'rate':0.23 }, { 'min':5070 , 'rate':0.14 }, { 'min':0 , 'rate':0.1 } ]
def taxRate(amount): tax=0; for taxRate in taxRates: if amount > taxRate['min']: tax = tax + round((amount - taxRate['min']) * taxRate['rate'],1) amount = taxRate['min'] return round(tax,1)
my using components:
http://pastebin.com/VCqiwbuk
My ruby implementation is naive and assumes you enter the tax steps' boundaries with relation to each other (instead of absolute upper bounds), but a proper design would use some kind of builder, where you can add steps one after the other and finish with a "final step" (I don't like this decimal.MaxValue thing, it's smelly). The resulting calculator would keep precalculated discounts like in Diogo's suggestion to avoid running through all the steps all the time.
Are the solutions breaking o/c principle ? what if a new tax comes into play ?
Wow, the Scala version is nice.
Got it working relatively neatly in C# after working out that I was moving the wrong bound to account for the inter-rate gap.
If this was real-life though, I'd be going to the local tax expert and asking them what happens with the unspecified shekels rather than trying to reverse engineer it.
Ayende,
Is there a error in the problem statement? They way it is written states that if income falls between two tax rates, for example 5070.50, then no tax rate would apply. Does income get rounded to the nearest whole number before calculation? Or should the problem statement be changed to take into consideration fractional units of income?
Wow! The scala version is nice.
Managed to get it working in C# relatively neatly after I realised that I was moving the wrong bound in order to account for the inter-rate gap.
http://pastebin.com/4qL8NHi9
If this was real life though, I'd probably just go to the local tax expert and ask them how the unspecified shekels should be accounted for, rather than trying to reverse engineer the rules.
Here is my solution:
http://pastebin.com/ZtMd4gzJ
Any comments would be appreciated.
use the Force, Luke...from LinqPAD:
int GetSlice(int salary, int startRange, int endRange) { if (salary <= startRange) return 0; if (salary >= endRange) return endRange - startRange; return salary - startRange; }
IEnumerable<decimal> TaxTable(int salary) { yield return GetSlice(salary, 0, 5070) * .1m; yield return GetSlice(salary, 5070, 8660) * .14m; yield return GetSlice(salary, 8660, 14070) * .23m; yield return GetSlice(salary, 14070, 21240) * .30m; yield return GetSlice(salary, 21240, 40230) * .33m; yield return GetSlice(salary, 40230, Int32.MaxValue) * .45m; }
void Main() { TaxTable(5000).Aggregate((a,b) => a + b).Dump(); TaxTable(5800).Aggregate((a,b) => a + b).Dump(); TaxTable(9000).Aggregate((a,b) => a + b).Dump(); TaxTable(15000).Aggregate((a,b) => a + b).Dump(); TaxTable(50000).Aggregate((a,b) => a + b).Dump(); }
Solution with nice small DSL :) http://jsfiddle.net/val2048/pY7Vm/1/
Phil, That is how it is define by law. In practice, it means that you can safely ignore this. That is, you can say that if you are getting 5,070.5 that half a shekel isn't taxed.
@Mike: "Are the solutions breaking o/c principle ?" Yes & No.
Most of the solutions are focused on building a reasonable code segment to solve a specific code problem.
Even given an application that had a similar requirement with fixed boundaries and no matching requirement to make boundary to add/remove/change boundaries (even if it looked plausable that this is something they'd want in the future) I personally would still implement something clean and simple like many of these solutions. Otherwise you're having to write code and unit tests (and/or include test cases) for non-requirements. If the customer never asks for it, it's a waste. If the customer needs it in the future then it should be kept easy to re-factor and they'd pay for the new requirements.
Interesting, in Canda they use the wording "If amount is X or less then Y, If the amount is more than X but not more than W then Z"
Here you go.
https://gist.github.com/1237614
Ok, here you go. https://gist.github.com/1237614
F#: http://pastebin.com/g9UDu5vF
This would make an excellent codegolf challenge. Semi readable using linq: http://pastebin.com/vduGhdJY
https://gist.github.com/1237707
another one
Sorry for the duplicate comment. I wish there was a way to delete one.
Apparently I lost some lines in my cut/pasting...this should be at the top:
called like this in case its not obvious, sorry for triple-posting,should've just paste-bin'd this...
(yes I'm lame and did a console app instead of wiring up tests)
F# solution from yesterday: Gist at: https://gist.github.com/1236106
taxOfIsrael 5800 = 609.20
with missing important piece included :)
Here's a fairly straightforward option in c#:
static readonly Dictionary<decimal, decimal> tiers = new Dictionary<decimal, decimal> { {40230, .45m},
{21240, .33m}, {14070, .30m}, {8660, .23m}, {5070, .14m}, {0, .10m} };
static decimal CalculateTax(decimal salary) { decimal salaryRemaining = salary; decimal tax = 0; while (salaryRemaining > 0) { var tier = tiers.Where(t => salaryRemaining > t.Key).First(); var rate = tier.Value; var amountToTax = (salaryRemaining - tier.Key); tax += rate * amountToTax; salaryRemaining -= amountToTax; } return tax; }
Every single one of these answers is wrong. Where are the tests?
I wrote a solution using TDD. http://technofattie.blogspot.com/2011/09/solving-ayendes-tax-woes.html
Now that I am looking over the other solutions it is clear that a lot of people solved it in a very similar way. Oh well, it was a nice break from normal work for a little bit.
"Hmm, everything on this page is monospaced now..." That is because this blog uses HTML blacklisting sanitization instead of encoding it. Blacklisting is nearly always wrong (because buggy), encoding is nearly always the right solution. Sry for banging on this point but this issue drives me nuts. I event went through whole SO and commented on every wrong sanitization code example. It is just wrong.
To make this more concrete: Your regex based blacklisting does not catch imbalanced tags or singe lt/gt chars. It does not catch unclosed tags either. That is because it is humanly impossible to write correct blacklisting code ;-)
As someone who considers themselves a novice and definitely in the "would be applying to beginner positions", how is this code? http://pastebin.com/VirBFdtB
I don't know, seemed like a fun problem. My solution isn't as elegant as many of the others, but I'm curious if it would be considered a turnoff to an employer.
@Jamie -- every single one eh? Did you even look at them all? The second one added (hazzik's) had tests. So did Damien Powell's example. Get off your high horse and look at them before making a blanket statement like that.
Another way to look at this problem. If I have a salary of 50,000... The answer is calculated as:
(50000 - 40230) * .45 + 10671.6 = 15068.1
So I wrote this is a series of rules, check for a match and calculate tax. Could even convert this into a Boo DSL.
http://pastebin.com/yzxRqSx0
I've been working for a big payroll house the past two years. We didn't calculate tax, as theire are third party libraries for that sort of thing, but I did enough excel spreadsheets for creating test data to know how tax is calculated.
Readability is a personal preference thing, but I decided to take a bunch of the C# samples and benchmark their performance and test that they actually worked... Here are the results:
Timed tests...
Steve Py x 5000: 9.0005ms Alexander x 5000: 7.0004ms Damien x 5000: 23.0013ms Dmitry x 5000: 8.0004ms Eric x 5000: 8.0005ms Geert x 5000: 13.0007ms Link Goron x 5000: 9.0005ms Peter x 5000: 12.0007ms Thomas x 5000: 9.0005ms Steve Py x 100000: 103.0059ms Alexander x 100000: 97.0056ms Damien x 100000: 375.0215ms Dmitry x 100000: 128.0073ms Eric x 100000: 146.0083ms Geert x 100000: 225.0129ms Link Goron x 100000: 140.008ms Peter x 100000: 173.0099ms Thomas x 100000: 150.0085ms Steve Py x 1000000: 1014.058ms Alexander x 1000000: 972.0556ms Damien x 1000000: 3782.2164ms Dmitry x 1000000: 1321.0755ms Eric x 1000000: 1459.0835ms Geert x 1000000: 2281.1304ms Link Goron x 1000000: 1411.0808ms Peter x 1000000: 1714.098ms Thomas x 1000000: 1504.086ms
Unit tests...
Steve Py... Steve Py x Passed! Alexander... Alexander x Passed! Damien... Damien x Passed! Dmitry... Dmitry x Passed! Eric... Eric x Passed! Geert... Geert x Passed! Goron... Goron x Passed! Peter... Peter x Passed! Thomas... Thomas x Passed!
Sorry to anyone that had c# examples that I missed...
The unit tests section just ran through Ayende's examples plus $0. Since most of the samples used decimal, I converted the double/int versions to decimal for consistency of test sets. This was just measured with DateTime & Timespan checks not a precision timer so it's not going to be terribly accurate. There were swings between test runs but the performance comparison was pretty-much the same.
Notables: Alexander's was noticably better performing than the rest, and all passed the test scenarios.
Damien's sample nearly ended up with an all-fail by a factor of 100 until I spotted he chose to pass tax rates as percentages.
Geert's sample didn't compile as provided. 2 minor adjustments needed. Additionally Geert's sample was improved a bit by extracting out the set of taxbrackets as a member variable instead of each time the method was called: Geert x 1000000: 2281.1304ms to Geert x 1000000: 1944.1112ms
This brought him a bit closer in-line with the other samples.
I think calculating benchmarks for something like this is a bit pointless as the fastest algorithm is also likely to be the hardest to maintain. So just for kicks I created a fast version of in C# - which funnily enough is similar in spirit to the interviewee's submission :)
Full Gist at: https://gist.github.com/1238962
public static decimal CalcTaxFast(int salary) { const int band1 = 5070; const decimal rate1 = 0.10m; const int band2 = 8660; const decimal rate2 = 0.14m; const int band3 = 14070; const decimal rate3 = 0.23m; const int band4 = 21240; const decimal rate4 = 0.30m; const int band5 = 40230; const decimal rate5 = 0.33m;
}
P.S: For ultimate perf, someone should rewrite this in ASM :)
Correction: Oddly enough Damien's test failed when working with decimals and passing the decimal rate (I.e. .23) while removing the / 100.
Damien x Failed. : $5800, expected $609.2, actual $609.34| $9000, expected $1087.8, actual $1088.17| $15000, expected $2532.9, actual $2533.57| $50000, expected $15068.1, actual $15069.55
It seems the rounding may have been concealing a bit of a bug. (or I may have lost something in translation.) One thing that did look a bit odd was that there was overlap between ranges. (5070-8660, 8660-14070) This may be throwing off the calculation.
For fun when I was looking over Peter's example (which is quite a bit of code to maintain) it got me thinking perhaps a double - linked list might work, but the best I could get out of it without it looking obfuscated was still slightly slower than my original.
Additionally I made one pretty obvious mistake in the test sets: I had it generating a random range of values from $0-3M. This is going to result in a very disproportionate # of values falling into the upper tax bracket. I revised the generation to between $0-100k and the results changed a bit. Apparently Alexander's solution handles a lot of rich people better than mine. The extra load I had at start up though pays off in the larger # of requests.
I also incorporated Frank's sample from the previous post since that was one lean looking piece of code. It covered all of the test cases however the performance was surprisingly unimpressive:
Steve Py x 5000: 8.0004ms Steve Py 2 x 5000: 9.0006ms Alexander x 5000: 7.0004ms Damien x 5000: 17.0009ms Dmitry x 5000: 7.0004ms Eric x 5000: 7.0004ms Geert x 5000: 12.0007ms Link Goron x 5000: 8.0005ms Peter x 5000: 12.0007ms Thomas x 5000: 9.0005ms Frank x 5000: 8.0004ms Frank x 5000: 7.0004ms Steve Py x 100000: 95.0054ms Steve Py 2 x 100000: 95.0054ms Alexander x 100000: 101.0058ms Damien x 100000: 237.0135ms Dmitry x 100000: 114.0065ms Eric x 100000: 127.0073ms Geert x 100000: 204.0117ms Link Goron x 100000: 129.0074ms Peter x 100000: 163.0093ms Thomas x 100000: 144.0083ms Frank x 100000: 150.0085ms Frank x 100000: 129.0074ms Steve Py x 1000000: 921.0527ms Steve Py 2 x 1000000: 935.0534ms Alexander x 1000000: 1076.0616ms Damien x 1000000: 2413.138ms Dmitry x 1000000: 1145.0655ms Eric x 1000000: 1278.0731ms Geert x 1000000: 2079.1189ms Link Goron x 1000000: 1298.0742ms Peter x 1000000: 1637.0937ms Thomas x 1000000: 1381.079ms Frank x 1000000: 1464.0837ms
but the original was declaring the arrays in each call, a quick tweak turned: Frank x 1000000: 1464.0837ms to: Frank x 1000000: 1261.0721ms
In any case, not a concern for such a slick little bit of code.
Hi @Steve Py
Out of curiosity what numbers do you get if you plug CalcTaxFast from: https://gist.github.com/1238962 in?
Just another C# solution: https://gist.github.com/9cbe9ab68f75f52eebfb
Java solution. It's slightly different from others. I used a factory to create calculators and some parameter validations (like a precondition). I think that practices are important, even in naive implementations. https://gist.github.com/1239534
Here's the simplest solution I could come up with:
Regarding the benchmarking: The idea was not to try and come up with the fastest solution, it was to measure the potential performance impact of what people thought were good solutions. The goal of the test should be that
1. (Obviously) the code compiles and runs.
2. The code passes a set of test scenarios.
3. The code is reasonably well structured and easy to follow.
Depending on the whims of the project lead they are bound to look for other things such as it being easy to extend or swap out interest rates, validates itself robustly, is unit tested effectively, is efficient, etc. The idea of a benchmark isn't always to make things faster, it can be to measure the potential trade off in performance to try making something better.
If anyone wants to play around with the benchmark and add their own sample that I missed, it is available for download at: http://www.mediafire.com/?d91x97jvzx14jd6
So got I the job?
I compulsively order from small to large, hence the Reverse :D Not the most efficient but nice and small. https://gist.github.com/1241154
In Australia, progressive tax is usually specified by the tax office as:
(Between {min} and {max}, {base amount} plus {% on each $ above min}.)+
Which has the nice side effect of the primary source material easily translating into a ordered list which makes a LINQ query a readable SkipWhile, First.
While most of the answers are quite "clever" and show an understanding of some nice coding methods, I would pick Daniel Lidström's answer if I was hiring.
It's short, to the point, simple, easy to follow, and shows a greater understanding of the customer's needs than the programmer's needs.
Tax for the portion below each bracket can be pre-calculated, it's static, you simply have to tax the remainder.
Of course it would probably load the rates from a database with pre-calculated values (that should be calculated on data entry instead of on use), but there was no requirement for that in the "test" above. This could be easily refactored later to allow for this.
Most things turn out to be a fold:
@Simon,
This function can actually be generalised to deal with any ranges of objects without any code changes the only thing required is that the objects passed in as
rates
support comparison (IComparable) and the addition operator. It is only the naming that makes this function specific to taxes.Of couse this could be done in C# aswell through generics but the ceremony is far higher.
@Colin,
You're hired.
@Bruno, you're fired.
public class TaxCalc { public static decimal Calculate(int salary) { var brackets = new List<TaxBracket>{ new TaxBracket(0, 5070, 10), new TaxBracket(5000, 6000, 10), new TaxBracket(5071, 8660, 14), new TaxBracket(8661, 14070, 23), new TaxBracket(14071, 21240, 30), new TaxBracket(21241, 40230, 33), new TaxBracket(40231, Decimal.MaxValue, 45) };
}
public class TaxBracket { public decimal MinValue {get;set;} public decimal MaxValue {get;set;} public decimal Rate {get;set;}
}
readability<--->performance. Just my 2c.
http://pastebin.com/Ezi5BL7u
Let me join the party. My approach is to allow more trivial tax rate setting. But the implementation is a bit tricky.
http://pastebin.com/YkUB8RvG
And now, my super-over-engineered (and quite useless) solution!
http://nbaldi.codeplex.com/SourceControl/list/changesets
To be the gadfly: there is no question. Just saying.
F#: https://gist.github.com/1247782
From C# Program in LINQPad
http://pastebin.com/gmwKbwws
My solution - https://gist.github.com/1248932
@Robert Thanks for read my implementation. Please, give me a more detailed feedback. It's important to my learning.
I over engineered a C# solution: http://pastebin.com/VDQ6VFhY . It's based on Geert Baeyaert solution.
Comment preview