Andreas Grabner About the Author

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi

SharePoint: List Performance – How list column indices really work under the hood

Have you ever wondered what is really going on under the hood when you index a column in SharePoint? Everybody keeps talking about indexing columns once your lists exceed the 2000 items limit. Knowing that everything in SharePoint is basically stored in one single table made me wonder how indexing individual columns is actually implemented. You might be surprised!

Scenario 7: Boosting performance with indexed columns?

Let’s first have a look at the main table in the SharePoint Content Database: AllUserData is the table that stores all items in any list of the SharePoint site. For backward compatibilty, a SQL View with the name UserData exists that returns data from AllUserData.
Besides the main columns in AllUserData that are used to identify the list, site, version, author, … we also see multiple columns for every possible data type that SharePoint columns can have. We can for instance see 64 nvarchar columns used to e.g.: store text columns, 16 ints, 12 floats, 8 datetime, …

Parts of the AllUserData Table Schema in the SharePoint Configuration Database

Parts of the AllUserData Table Schema in the SharePoint Configuration Database

This table schema makes it impossible to define database column indices on the custom columns of a SharePoint list. Why? Because every list is stored in this table and every list would have different columns with different index settings.

How does the index work?

Once you decide to index a column in SharePoint you will see that the table NameValuePair[_{SQLCollation}]has been filled with the values of the columns that you indexed. Lets have a look at the table schema.

Each indexed column value is stored in this table with a reference to its list item

Each indexed column value is stored in this table with a reference to its list item

The table will be filled with the current column values once you decide to index a column. When editing, deleting or updating values in your SharePoint list – the changed values of the index column will also be updated in the NameValuePair table.

How is the index table accessed?

If you define a view or if you use the SharePoint object model to execute a CAML query the NameValuePair table will be joined with the UserData view in case you specify a filter on your indexed column. I defined a list where I indexed one of my text columns. I also defined a view that specified a filter on this column. Browsing to this SharePoint view now only shows me the items in the list that match my filter leveraging the NameValuePair table. Here is a sample SQL Statement that was executed when accessing my view:

NameValuePair table JOIN's UserData and is used in WHERE clause

NameValuePair table JOIN

You can see the join as well as the WHERE clause that was defined on the indexed column in the NameValuePair table.

What is the overhead?

As we now know – every list item that has an indexed column will create a new row in the NameValuePair table. The table also needs to be updated whenever indexed columns change their value or when items are added or deleted. Having a list with 100000 items and an index column for instance means that you also add 100000 items to the NameValuePair table. And – every time you update your list – NameValuePair need to be updated as well.

Conclusion

The SharePoint Content database schema doesn’t allow indexing in the AllUserData table itself. To overcome this limitation the NameValuePair table was introduced. Using SharePoint indices will definitely speed up selective access to items stored in large lists or when the overall numbers of items in your content database is high. You should however consider the side-effects of indexed columns. Every indexed value is stored in a separate table and those values must be kept up-to-date when items are changed.

As SharePoint is built on the .NET Platform you might also be interested in my latest White Papers about Continuous Application Performance for Enterprise .NET Systems

Comments

  1. Hey, thanks for Sharing!

  2. Thanx a lot for sharing. It was very helpful.

  3. Glad that it helps. Let me know if you have any insight that you can share with the SharePoint community

  4. Thanks for this information, it proved very usefull to fix the problem described in the linked page.

  5. Thanks so much. This has helped us understand SharePoint better. We will be showing it to our customers. Thanks.

  6. We work with a lot of large SharePoint lists and SharePoint data. This article is definitely helpful in understanding how columns indices are created. Thanks

  7. Meenu Shekhawat says:

    what does “column name(r)” means in any sharepoint site column name?

  8. Hi Meenu
    Not sure if I understand your question. Could you be more specific? I would also try a SharePoint specific user discuss forum. My blog targets Application Performance and SharePoint is just one of the application frameworks we talk about
    Andi

  9. how does this is pointshare?

Trackbacks

  1. [...] Besides the main columns in AllUserData that are used to identify the list, site, version, author, … we also see multiple columns for every possible data type that SharePoint columns can have. We can for instance see 64 nvarchar columns … Read more [...]

  2. [...] Here is a great in-depth article on what happens when you define an index on a SharePoint column.  You [...]

  3. [...] Andreas Grabner wrote an article on how list column indexing works under the hood. [...]

  4. [...] SharePoint: List Performance – How list column indices really work under the hood [...]

Comments

*


2 + five =