Make Mostly Read, Seldom-Written Lists Much More Efficient

One of the many things that I do at work is run a full-blown Search Engine which I also developed from scratch. This Search Engine feeds all product related information to our websites. A search index consists of a pre-computed collection of products, their properties, a list of words that are correctly spelled, and some pre-computed faceted/guided navigation. A search index, until this week, took up approximately 10.7 gigs of memory. This was becoming too large as we added new products every single day.
8 minutes to read

Who Loves Interns?

The topic at hand is interning. More specifically, string interning.

“What is string interning?” you ask? Good question. As you may or may not know, strings are immutable reference types. This means that they are read-only and a pointer will refer to the string’s location on the heap. Typically, a new string is created and stored within your application’s memory each time that you assign a string – even if the same string is defined repeatedly. What this means is that you can define the same string N times and have it take up the string’s memory N times. This sucks when dealing with repeating string data.

6 minutes to read