Figure this code?

time to read 3 min | 547 words

I have just finished writing this code, and it occurred to me that on its own, it makes very little sense.

private static int RedactedFunctionName(List<int> a, List<int> b)
{
    List<int> n,m;
    if (a.Count > b.Count)
    {
        n = a;
        m = b;
    }
    else
    {
        n = b;
        m = a;
    }

    int nSize = n.Count;
    int mSize = m.Count;

    double o1 = nSize + mSize;
    double o2 = nSize * Math.Log(mSize, 2);

    int result = 0;
    if (o1 < o2)
    {
        int mi = 0, ni = 0;
        while (mi < mSize && ni < nSize)
        {
            if (n[ni] > m[mi])
            {
                mi++;
            }
            else if (n[ni] < m[mi])
            {
                ni++;
            }
            else
            {
                ni++;
                mi++;
                result++;
            }
        }
    }
    else
    {
        for (int i = 0; i < mSize; i++)
        {
            if (n.BinarySearch(m[i]) >= 0)
                result++;
        }
    }
    return result;
}

Can you figure out what this function does, why is it behaving in this manner, and what preconditions it expects?

Tweet Share Share 15 comments

Tags:

challanges

Comments

11 Feb 2015
10:03 AM

Uri

Sets intersection count

11 Feb 2015
10:16 AM

Carsten Hansen

You believe a binarysearch is better when n*log(m)<n+m where m is the smallest set.

And you expect both list to be sorted.

Is the *.count cached or do the program use the painters algoritm?

11 Feb 2015
10:31 AM

Carsten Hansen

See also:

http://stackoverflow.com/questions/7574311/efficiently-compute-intersection-of-two-sets-in-java

http://stackoverflow.com/questions/7164572/c-sharp-fastest-intersection-of-2-sets-of-sorted-numbers

11 Feb 2015
10:48 AM

Traceur

Preconditions -Lists must be sorted(for binary search to have a meaning) -Than it checks for size of the lists and it will take the most effective in terms of memory consumption The result is: The intersection count.

11 Feb 2015
13:11 PM

Ayende Rahien

Carsten, I'm not sure that I understand you. The sets that this method is called with aren't the same all the time

11 Feb 2015
13:12 PM

Martijn

double o2 = mSize * Math.Log(nSize, 2);

11 Feb 2015
13:17 PM

Tim

It looks like its finding equalites in the set, but I would say it could probably be markedly more efficient by dropping the list structure for arrays. And I think that Count is an enumerating operation.

11 Feb 2015
14:39 PM

Wouter Van Schandevijl

Count is a property and is cached. It is cheap (as a property should be).

The Linq Count() tries to cast the IEnumerable to a few types that have a cached count/length and only enumerates if there is no other obvious way. (In the original Linq implementation Count() always enumerated)

11 Feb 2015
19:20 PM

Philip

I'm with Martijn:

o2 = mSize * Math.Log(nSize,2)

11 Feb 2015
19:24 PM

Jason

List's count is not Linq Count(). Completely different thing, http://www.dotnetperls.com/list Count, clear. To get the number of elements, access the Count property. This is fast—just avoid the Count extension method. Count, on the List type, is equal to Length on arrays.

11 Feb 2015
20:03 PM

Catalin Pop

This function tries to optimize intersection when m is far smaller than n.

IE: intersecting m:100 n:1000000 will result in only 1993 compares on average due to binary search, while comparing all items will result in 1000100 compares worst case.

Also Martijn is corect.

n.BinarySearch(m[i]) is Log(n,2) average and you are iterating through m.

Therefore o2 should be double o2 = mSize * Math.Log(nSize, 2);

12 Feb 2015
13:09 PM

Tim

I happily stand corrected.

12 Feb 2015
23:37 PM

Sagan Marketing

The fact that you're using one letter variable names is terrible! I wouldn't even hire you if you wrote code like that in an interview... have you not read Clean Code (http://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882)

13 Feb 2015
15:19 PM

Catalin Pop

Dear Sagan Marketing,

Well if you would not hire Ayende, I doubt you would be able to spot any kind of talented programmers.

Good job in finding real talent by looking at superficial details instead of essence. I'm guessing you would hire a programmer that writes a O N^2 complexity algorithm for intersection that uses multi letter variables instead of someone who writes an O (Log N) one with one letter variables.

14 Feb 2015
09:27 AM

Rob

One letter variable names are perfectly acceptable in a function like this.

It's small, self contained, has no obvious side effects, and performs a single operation.

It's basically a low level maths function, and using the idiom of a mathematical function is entirely appropriate.

Slavishly following coding style dictats like "goto considered harmful", "use long variable names always" and projecting those onto interviewees is so closed minded as to be bordering on bigotry and is a sure fire way to shoot yourself in the foot.

Suffice to say that many talented thoughtful programmers probably wouldn't want to work for someone like that.

Working on and maintaining a high level code base is an entirely different thing, where good code hygiene is essential. But It should still be practiced with thought, consideration, and appropriate strictness.

Comment preview

Comments have been closed on this topic.

Markdown turns plain text formatting into fancy HTML formatting.

Phrase Emphasis

*italic*   **bold**
_italic_   __bold__

Links

Inline:

An [example](http://url.com/ "Title")

Reference-style labels (titles are optional):

An [example][id]. Then, anywhere
else in the doc, define the link:
  [id]: http://example.com/  "Title"

Images

Inline (titles are optional):

![alt text](/path/img.jpg "Title")

Reference-style:

![alt text][id]
[id]: /url/to/img.jpg "Title"

Headers

Setext-style:

Header 1
========
Header 2
--------

atx-style (closing #'s are optional):

# Header 1 #
## Header 2 ##
###### Header 6

Lists

Ordered, without paragraphs:

1.  Foo
2.  Bar

Unordered, with paragraphs:

*   A list item.
    With multiple paragraphs.
*   Bar

You can nest them:

*   Abacus
    * answer
*   Bubbles
    1.  bunk
    2.  bupkis
        * BELITTLER
    3. burper
*   Cunning

Blockquotes

> Email-style angle brackets
> are used for blockquotes.
> > And, they can be nested.
> #### Headers in blockquotes
> 
> * You can quote a list.
> * Etc.

Horizontal Rules

Three or more dashes or asterisks:

---
* * *
- - - -

Manual Line Breaks

End a line with two or more spaces:

Roses are red,   
Violets are blue.

Fenced Code Blocks

Code blocks delimited by 3 or more backticks or tildas:

```
This is a preformatted
code block
```

Header IDs

Set the id of headings with {#<id>} at end of heading line:

## My Heading {#myheading}

Tables

Fruit    |Color
---------|----------
Apples   |Red
Pears	 |Green
Bananas  |Yellow

Definition Lists

Term 1
: Definition 1
Term 2
: Definition 2

Footnotes

Body text with a footnote [^1]
[^1]: Footnote text here

Abbreviations

MDD <- will have title
*[MDD]: MarkdownDeep

Oren Eini

Oren Eini

CEO of RavenDB

Figure this code?

Comments

Comment preview

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication

Main feed
Comments feed

Oren Eini

CEO of RavenDB

Comments

Comment preview

Markdown formatting

Phrase Emphasis

Links

Images

Headers

Lists

Blockquotes

Horizontal Rules

Manual Line Breaks

Fenced Code Blocks

Header IDs

Tables

Definition Lists

Footnotes

Abbreviations

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication