Using Linq with MongoDB

Last week we looked at accessing MongoDB using the C# driver. This week we will look at accessing it using Linq.

If you followed the instructions from last week's post, you already have everything you need to get up and running, so let's jump right in.

Referencing MongoDB.Linq

To access the Linq functionality of the MongoDB C# driver, add a reference to MongoDB.Linq.dll, which ships in the same distribution as the C# driver.


Write the Linq Code

Next we want to replace the query code from last week with the Linq query code. So find the following code in last week’s Program.cs file.

// Create a specification to query the orders collection.
var spec = new Document();
spec["customerName"] = "Elmer Fudd";

// Run the query.
Document result = orders.FindOne( spec );

And replace it with this code…

// Query the orders collection.
var results =
	from doc in orders.AsQueryable()
	where doc.Key( "customerName" ) == "Elmer Fudd"
	select doc;
var result = results.FirstOrDefault();

The IMongoCollection interface exposes the AsQueryable() method, which gives you access to all the Linq goodness.
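
If you need more than the first match, the same query can simply be enumerated like any other Linq query. Here is a minimal sketch using the same Document key syntax (the customer name is just last week's sample data):

// Enumerate every order for the customer.
var fuddOrders =
	from doc in orders.AsQueryable()
	where doc.Key( "customerName" ) == "Elmer Fudd"
	select doc;

foreach ( Document doc in fuddOrders )
	Console.WriteLine( doc );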

I realize this was a very short post this week, but that is more a testament to the ease of use than to the laziness of the blogger.

Next week we’ll dig deeper into querying the db.

-Chris


Accessing MongoDB via C#

Last week we looked at setting up a MongoDB instance, and working with it through the Mongo shell. This week, we look at accessing it through the C# driver.

Getting the Driver

There are several .NET drivers available for MongoDB (you can check out the list here). I chose the mongodb-csharp driver; it seems to have the widest feature set and the most support. To use it, you can either build it from source or download the binaries directly here.

Using the Driver

To use the driver, create a new project and add a reference to it. For our sample, create a new console application, right-click References, choose Add Reference, and browse to MongoDB.Driver.dll.


After adding the reference we are ready to write code. Here is a very simple snippet of code to test out Mongo. This is the full content of Program.cs.

using System;
using MongoDB.Driver;

namespace MongoTest
{
	internal class Program
	{
		private static void Main( string[] args )
		{
			// Connect to the mongo instance.
			var mongo = new Mongo();
			mongo.Connect();

			// Use the myorders database.
			Database db = mongo.GetDatabase( "myorders" );

			// Get the orders collection.
			IMongoCollection orders = db.GetCollection( "orders" );

			// Create a new order.
			var order = new Document();
			order["orderAmount"] = 57.22;
			order["customerName"] = "Elmer Fudd";

			// Add the new order to the mongo orders collection.
			orders.Insert( order );
			Console.WriteLine( string.Format( "Inserted: {0}", order ) );

			// Create a specification to query the orders collection.
			var spec = new Document();
			spec["customerName"] = "Elmer Fudd";

			// Run the query.
			Document result = orders.FindOne( spec );
			Console.WriteLine( string.Format( "Found: {0}", result ) );

			Console.WriteLine("Press Enter to Exit.");
			Console.ReadLine();
		}
	}
}

The comments should explain most of what is happening. One thing worth noting is the Document class, which behaves much like a dictionary. Remember that MongoDB does not have a static schema, so a dictionary-like structure serves the purpose well since it is dynamic by nature.
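
Because a Document is just a bag of name/value pairs, two documents in the same collection do not have to share the same fields. As a quick illustration (the shippingAddress field here is made up for the example):

// Two orders with different shapes can live in the same collection.
var order1 = new Document();
order1["orderAmount"] = 57.22;
order1["customerName"] = "Elmer Fudd";

var order2 = new Document();
order2["orderAmount"] = 25.00;
order2["customerName"] = "Bob Smith";
order2["shippingAddress"] = "123 Carrot Lane"; // Only this order carries an address.

orders.Insert( order1 );
orders.Insert( order2 );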

There are other drivers out there (such as NoRM) that map a static C# class into MongoDB much like NHibernate or another ORM. You lose some of the benefit of the dynamic schema that way, but depending on your needs, that trade-off may be acceptable.

Starting up MongoDB

Note that to use this, you need to have an instance of MongoDB up and running. You can follow the instructions from last week's post to get it installed. If it's already installed, just open a command prompt, go to the bin folder in the install directory (in my case C:\mongodb\bin), and execute the command mongod.
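
In other words, something like this at the command prompt (adjust the path to wherever you installed MongoDB):

cd C:\mongodb\bin
mongod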


Running the Code

Now you are ready to run the application. Doing so gives the following output.

Inserted: { "_id": "4bde57114076a60500000001", "orderAmount": 57.22, "customerName": "Elmer Fudd" }
Found: { "_id": "4bde41cd4076a60df0000001", "orderAmount": 57.22, "customerName": "Elmer Fudd" }
Press Enter to Exit.

Next week we'll either dive a bit deeper into the driver's API or look at the auto-sharding capabilities of MongoDB. Not sure which yet. See you then!

-Chris


Test Driving MongoDB

Lately, I have been looking into MongoDB and other alternatives to traditional relational databases. MongoDB seems to be the best fit for the kinds of scenarios I am working with here at BancVue.

What I am looking for…

The project I am currently working on is using messaging to transfer large amounts of data from many remote source systems. This data then hydrates a master database that represents a consolidated view of all the data from those remote systems. This master system can then feed other systems the data they need. Based on this system, here is a list of the qualities I need:

Fast Inserts
Because there is little querying but a large volume of inserts, the inserts need to be fast.

Handle Large Data Volumes
The consolidated dataset can get very large. Therefore we need a system that can handle massive amounts of data.

Horizontally Scalable
Housing that much data requires us to scale across multiple machines. Ideally, the system would scale out to additional machines as needed (something SQL-based databases don't do very well).

Parallelized Queries
Since we do have a need to query the database to get the data out, it would be nice if the system could parallelize the queries so they would be more performant when scaled out.

Easy to Setup and Use
I hate the thought of working on something that takes a UNIX guru to set up (because I am not that guru), and I don’t have a whole heck of a lot of time to devote to learning a new system at the moment.

MongoDB fulfills all these needs. It has very fast insert speed, can scale to thousands of machines, can automatically shard the data across those machines to store large volumes, and can run map/reduce queries in parallel across those machines to produce results. Not to mention that it is easier to get running than any database I have ever used.

Setting up MongoDB

Setting up an instance of MongoDB could not be much easier. Simply follow the instructions on the MongoDB Quickstart page:

  1. Download the binaries and extract them.
  2. Create a folder for the data. (C:\data\db or /data/db depending on OS).
  3. Execute mongod.exe in the extracted bin folder.

Now you have a running instance of MongoDB!

[Screenshot: terminal running mongod]

Connecting to the MongoDB Instance

MongoDB comes with its own interactive shell. Run mongo.exe from the extracted bin folder to start it up. It will automatically connect to the instance we started on our local box. (You can, of course, use command-line arguments to connect to instances on other machines.)
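
For example, something along these lines should point the shell at a named database on another machine (somehost is, of course, hypothetical):

mongo.exe somehost:27017/myorders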


From here, you can interactively execute commands against the database.

Working in the Shell

MongoDB does not use SQL; instead, it uses JavaScript as its query language. This is all very well documented on the MongoDB website. Let's walk through a few commands to get you started. Lines beginning with > show the commands you type in; the lines that follow show the server's response.

Getting a list of databases

The show dbs command will display a list of all the databases on this server.

> show dbs
admin
local
mongo_session
test
>

This is the list of all the databases in your mongo server.

Switching to use a different database

To switch to another database you use the use command just like in SQL.

> use myorders
switched to db myorders
>

Notice that you may pass a brand-new database name to the use command and it will work. MongoDB will not actually create the database until you insert something. You can see this by executing show dbs again: the myorders database does not yet show up in the list.

Inserting data into a collection

MongoDB uses collections in the same way SQL uses tables. However, since MongoDB is a document database, it does not constrain all the objects in a collection to the same structure the way tables do. Each object can have its own structure (or schema). This is why MongoDB is called a "schema-less" database.

Let's insert some data into a collection now.

> order1 = {orderAmount:25.00, customerName:"Bob Smith"};
{ "orderAmount" : 25, "customerName" : "Bob Smith" }
> db.orders.save(order1);
>

So what did we just do?
We created a variable called order1 and assigned it a JSON object (did I mention it is all JavaScript-based?). Then we inserted order1 into the orders collection.

Where did the orders collection come from?
It was created automatically by MongoDB, and the database was created at the same time. If you run show dbs again now, you will see the myorders database.

> show dbs
admin
local
mongo_session
myorders
test
>

Querying the database

Let's now look at what was inserted. First, we'll look at all the records in the collection. We do this by calling the find() function on the collection with no arguments.

> db.orders.find();
{ "_id" : ObjectId("4bd…"), "orderAmount" : 25, "customerName" : "Bob Smith" }
>

You will notice the introduction of a new field called _id. This is an autogenerated id that serves as the primary key. You can override this key if you like, but that is beyond the scope of this test drive.

This is the simplest kind of query. We can also query by giving a prototype object to match on:

> db.orders.find( { orderAmount: 25 } );
{ "_id" : ObjectId("4bd…"), "orderAmount" : 25, "customerName" : "Bob Smith" }
>

Notice we got the same result; it found the one record. If you supply an object to the find method, it will return every document that matches all of the fields provided.
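
For example, supplying both fields only matches documents where both values are equal. Restating the record we inserted above:

> db.orders.find( { orderAmount: 25, customerName: "Bob Smith" } );
{ "_id" : ObjectId("4bd…"), "orderAmount" : 25, "customerName" : "Bob Smith" }
>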

Conclusion

Well, I hope you found this test drive useful. Check out the documentation for more information.

I am impressed with the ease of setup and use of MongoDB so far. Next time, we’ll look at accessing MongoDB through C#.

-Chris


A Streaming Message Writer & Reader in C# & Json.NET

Here is a C# implementation of a high-performance message writer and reader that can write messages to, and read them from, any stream using Json.NET. We are using a similar implementation here at BancVue as our message store, and it is performing quite well.

I settled on Json.NET after trying several other serializers and reading a few posts on serialization performance. Objects serialized as JSON are much smaller than XML, and the Json.NET project seems to have the fastest serializer as well as pretty wide support in the developer community.

Overview

What I set out to create was something that can serialize millions of messages to a temporary holding place, then deserialize them for processing later. This implementation simply writes the JSON to a file, but it could write to any stream. There is a MessageWriter and a MessageReader; together they can be used to form a “Message Store” as mentioned in some of Greg Young’s posts. Below is a unit test modeling how I want to use these objects.

[ DataContract ]
internal class TestMessage
{
	[ DataMember ]
	public string Text { get; set; }
}

public class Given_a_message_writer_and_reader_tied_to_the_same_stream : ContextSpecification
{
	protected MemoryStream _stream;
	protected IMessageWriter _writer;
	protected IMessageReader _reader;

	protected override void SharedContext()
	{
		_stream = new MemoryStream();
		_writer = new MessageWriter( _stream );
		_reader = new MessageReader( _stream );
	}
}

[ Concern( typeof ( MessageReader ) ) ]
public class When_a_message_is_written_to_the_stream : Given_a_message_writer_and_reader_tied_to_the_same_stream
{
	private TestMessage _message;

	protected override void Context()
	{
		_message = new TestMessage {Text = "TestValue"};
	}

	protected override void Because()
	{
		_writer.WriteMessage( _message );
		_writer.Flush();

		// Need to rewind to beginning of stream so we can read.
		_stream.Seek( 0, SeekOrigin.Begin );
	}

	[ Observation ]
	public void Should_be_able_to_retrieve_that_message_from_the_reader()
	{
		var actualMessage = _reader.ReadMessage< TestMessage >();
		Assert.That( actualMessage.Text, Is.EqualTo( _message.Text ) );
	}
}

Simply calling writer.WriteMessage() writes a message to the stream, and calling reader.ReadMessage() is all that is needed to read one back. I added Flush() to the writer so that I can use a buffer when writing to files; calling Flush() simply flushes the underlying StreamWriter.
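
For what it is worth, here is a rough sketch of how the pair can be pointed at a file instead of a MemoryStream. The path and message count are made up, and the usual System and System.IO usings (plus the TestMessage type above) are assumed:

// Write a batch of messages out to a file (hypothetical path).
using ( var writer = new MessageWriter( File.Create( @"C:\temp\messages.log" ) ) )
{
	for ( int i = 0; i < 1000; i++ )
		writer.WriteMessage( new TestMessage { Text = "Message " + i } );
	writer.Flush();
}

// Read them back until the end of the stream is reached.
using ( var reader = new MessageReader( File.OpenRead( @"C:\temp\messages.log" ) ) )
{
	while ( !reader.Eof )
		Console.WriteLine( reader.ReadMessage< TestMessage >().Text );
}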

The Interfaces

Here are the interfaces for the writer and reader.

public interface IMessageWriter : IDisposable
{
	void WriteMessage( object message );
	void Close();
	void Flush();
}

public interface IMessageReader : IDisposable
{
	T ReadMessage< T >();
	bool Eof { get; }
	void Close();
}

The Implementation

Now lets look at the implementations. First, the writer…

public class MessageWriter : IMessageWriter
{
	private readonly StreamWriter _writer;
	private readonly JsonSerializerSettings _jsonSerializerSettings;

	public MessageWriter( Stream stream )
	{
		_writer = new StreamWriter( stream );

		_jsonSerializerSettings =
			new JsonSerializerSettings
				{
					DefaultValueHandling = DefaultValueHandling.Ignore,
					NullValueHandling = NullValueHandling.Ignore,
					MissingMemberHandling = MissingMemberHandling.Ignore,
					TypeNameHandling = TypeNameHandling.Objects
				};
	}

	public void WriteMessage( object message )
	{
		_writer.WriteLine( JsonConvert.SerializeObject( message,
		                                                Formatting.None,
		                                                _jsonSerializerSettings ) );
	}

	public void Close()
	{
		_writer.Close();
	}

	public void Flush()
	{
		_writer.Flush();
	}

	public void Dispose()
	{
		Close();
	}
}

Something to note: I am using WriteLine() to write each object, which separates the messages with a CRLF. I had to do this because Json.NET currently expects a single root element per document, unless you want to deserialize all the root elements as one array. Since I have millions of messages, I can't build an array of that size in memory without running out of RAM. Until this changes in Json.NET, I will just use the WriteLine mechanism; it has worked well so far.

As you may have guessed, I am using ReadLine() in the reader to retrieve each message. Here is the reader’s implementation…

public class MessageReader : IMessageReader
{
	private readonly StreamReader _reader;
	private readonly JsonSerializer _serializer;

	public MessageReader( Stream stream )
	{
		_reader = new StreamReader( stream );

		_serializer = new JsonSerializer
				{
					DefaultValueHandling = DefaultValueHandling.Ignore,
					NullValueHandling = NullValueHandling.Ignore,
					MissingMemberHandling = MissingMemberHandling.Ignore,
					TypeNameHandling = TypeNameHandling.Objects
				};
	}

	public T ReadMessage< T >()
	{
		return (T)ReadMessage();
	}

	public object ReadMessage()
	{
		string line = _reader.ReadLine();

		return _serializer.Deserialize(
				new JsonTextReader( new StringReader( line ) ) );
	}

	public bool Eof
	{
		get { return _reader.EndOfStream; }
	}

	public void Close()
	{
		_reader.Close();
	}

	public void Dispose()
	{
		Close();
	}
}

Getting Json.NET to deserialize your types

Notice the JsonSerializerSettings. Most of these settings shrink the JSON by omitting default and null values, but take note of the last one:

TypeNameHandling = TypeNameHandling.Objects

This setting tells the serializer to embed the type names of your objects in the JSON itself. The serializer can then use them when deserializing to determine what type to create. It does increase the size of your JSON, but it makes the messages much easier to work with since you don't have to know exactly what type of message you are deserializing.
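
For example, with these settings the TestMessage from the earlier test serializes to a single line roughly like this (the namespace and assembly name are made up; yours will be the assembly-qualified name of your message type):

{"$type":"MyMessages.TestMessage, MyMessages","Text":"TestValue"}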

Closing

All in all, this serializer/deserializer set is very fast. We are using a similar implementation here at BancVue in our production environment and are very pleased with the results.

FYI: Multiple message test

Here is a test that shows a bit more about how we use the serializer when working with multiple messages.

[ Concern( typeof ( MessageReader ) ) ]
public class When_multiple_messages_are_written_to_the_stream : Given_a_message_writer_and_reader_tied_to_the_same_stream
{
	private TestMessage _message1;
	private TestMessage _message2;

	protected override void Context()
	{
		_message1 = new TestMessage {Text = "TestValue1"};
		_message2 = new TestMessage {Text = "TestValue2"};
	}

	protected override void Because()
	{
		_writer.WriteMessage( _message1 );
		_writer.WriteMessage( _message2 );
		_writer.Flush();

		// Need to rewind to beginning of stream to read.
		_stream.Seek( 0, SeekOrigin.Begin );
	}

	[ Observation ]
	public void Should_be_able_to_retrieve_that_message_from_the_reader()
	{
		var actualMessage1 = _reader.ReadMessage< TestMessage >();
		Assert.That( actualMessage1.Text, Is.EqualTo( _message1.Text ) );

		var actualMessage2 = _reader.ReadMessage< TestMessage >();
		Assert.That( actualMessage2.Text, Is.EqualTo( _message2.Text ) );
	}
}

-Chris


WiX: Setting the Install Directory from an Environment Variable

I pulled my hair out for hours trying to figure out how to get WiX to generate an installer that pulled the target install directory from an existing environment variable. This is easy to do if the location is one of the standard Windows folders like "Program Files" or "App Data". However, if the environment variable is a custom one, it's not so straightforward.

To get it to work, I had to use a custom action to set the directory value. And if I wanted it to show up in the UI, I had to run the custom action in the InstallUISequence (not the InstallExecuteSequence).

Below is the code that I ended up using to get it all to work.

<?xml version="1.0" encoding="UTF-8"?>

<Wix xmlns="http://schemas.microsoft.com/wix/2006/wi">
  <?include VariableDefinitions.wxi ?>

  <Product Id="86f82e0c-04e1-4c3d-9c8a-a295ee9cee0d"
           Name="$(var.ProductName) $(var.Version)"
           Language="1033"
           Version="$(var.Version)"
           Manufacturer="$(var.ProductManufacturer)"
           UpgradeCode="$(var.ProductUpgradeCode)">

    <Package Id="*"
             InstallerVersion="200"
             Platform="x86"
             Manufacturer="$(var.ProductManufacturer)"
             Description="$(var.PackageDescription)"
             Compressed="yes"
             Comments="$(var.PackageComments)" />

    <!-- The custom action to set the install directory from an environment var. -->
    <CustomAction
      Id="SetApplicationRootDirectory"
      Directory="APPLICATIONROOTDIRECTORY"
      Value="[%CustomFolderName]bin" />

    <!-- Media here -->

    <Directory Id="TARGETDIR" Name="SourceDir">
      <Directory Id="ProgramFilesFolder">
        <!-- DefaultFolderName will be overridden by the environment var value. -->
        <Directory Id="APPLICATIONROOTDIRECTORY" Name="DefaultFolderName">
          <!-- Components here -->
        </Directory>
      </Directory>
    </Directory>

    <!-- Features here -->

    <InstallUISequence>
      <!-- Execute the custom action to set the install folder. -->
      <Custom Action="SetApplicationRootDirectory" After="CostFinalize" />
    </InstallUISequence>

  </Product>
</Wix>

I hope this post helps you avoid wasting the time I did trying to solve this issue.

-=Chris=-
